2008-01-10

Pointless Polymorphism

The echo chamber started resonating with Albert Y. C. Lai's post. The point of it is that function composition is a more pervasive paradigm (or abstraction device, rather) than just higher-order functional programming. In Haskell, point-freeness is a usual stylistic device (in the literary sense); in the ML family of languages it is seldom seen.

Andreas Farre pitched in with the composition combinators defined in F#. F# shares with OCaml the syntactic quirk of arbitrary infix operators built from fixed operator characters, whose precedence and associativity are fixed in advance and depend on the first character in the operator name. For instance, in OCaml, operators starting with | are left-associative, but those starting with @ are right-associative and at a higher precedence level. I pointed out on the discussion over at reddit that this, coupled with the value restriction makes point-free style in ML-family languages (including OCaml) inconvenient.

Aside: If you don't see the point of the puns, you're not alone: I don't either, but it seems that any discussion of point-free programming induces everybody to start shooting bad puns point-blank. I'll make an effort to try and stop it.

What this means from a practical standpoint is that η-conversion doesn't quite work symmetrically: whereas f is always equivalent to fun x -> f x, the converse is not true. For example, let id x = x be the identity function, with type α → α. Then, fun x -> List.map id x has type α list → α list, but the η-converted, point-free List.map id does not: the result type in OCaml is '_a list → '_a list, where the type variables are monomorphic and unbound. The first application of this function to an argument of concrete type T will fix the point-free function's type to be TT, for ever.

I like to define the composition operator in OCaml as:

let ( % ) f g x = f (g x)

since % is relatively lightweight and it is not defined elsewhere in the standard library. The purportedly more obvious alternative would be using @, as it is reminiscent of the ring operator ∘ that usually denotes composition in Mathematics. It has two inconvenients, however: it is already in use to denote list concatenation, and it has the wrong precedence for composition, namely right-to-left.

With this, we can see that the value restriction won't let us define a point-free polymorphic reverse sort:

# let revsort = List.rev % List.sort Pervasives.compare ;;
val revsort : '_a list -> '_a list = <fun>

Even though compare is polymorphic on its arguments, the defined function is not:

# revsort [1;2;3;4] ;;
- : int list = [4; 3; 2; 1]

# revsort ;;
- : int list -> int list = <fun>

The first application of revsort to a value fixes its type to that of the value. The fix is to η-expand the function and lose point-freeness:

# let revsort x = (List.rev % List.sort Pervasives.compare) x ;;
val revsort : 'a list -> 'a list = <fun>

Here's where F# operators |> and >> are handy, as they build the composition applied to the point, in the right order (left to right). There is, however, a case where point-free programming remains applicable and expressive under a Hindley-Milner type discipline: when composing monomorphic functions.

Browsing type specimens, I came across some pangrams I didn't know. They looked like pangrams, but I didn't want to check every letter to see if they effectively were. Programming to the rescue! First, I needed to define the isomorphism between strings and lists of characters, which is built-in in Haskell but not in OCaml:

let unpack s =
  let r = ref [] in
  String.iter (fun c -> r := c :: !r) s;
  List.rev !r

let pack l =
  let b = Buffer.create 16 in
  List.iter (Buffer.add_char b) l;
  Buffer.contents b

These are imperative for efficiency, but they need not be. A predicate to test if a character is alphabetic is quite easy:

let is_alpha = function 'A' .. 'Z' | 'a' .. 'z' -> true | _ -> false

Range and or matchings are compiled very efficiently, so this is almost as good as a table lookup. Many letters will be repeated many times; the easiest way to filter duplicates is to assume that the list is sorted, and the repetitions occur in contiguous blocks:

let rec unique = function
  [] | [_] as l -> l
| x :: (y :: _ as l) -> if x = y then unique l else x :: (unique l)

The alternative would be the equivalent to Haskell's nub that removes duplicates anywhere on the list, retaining the first and respecting the order of the elements in the original list. This constitutes the generic scaffolding. For convenience, I'll alias a couple of things:

open List
open String

let sort l = List.sort Pervasives.compare l

Note that the sort is polymorphic, as it is a syntactic value (which happens to be functional). And now for the punchline, we get to the point of closing:

let letters = pack % unique % sort % filter is_alpha % unpack % lowercase

The use of monomorphic lowercase, unpack and is_alpha forces the type inferencer to derive monomorphic types for filter, sort and unique, and thus making the value restriction irrelevant here. The processing pipeline is as concisely and elegantly expressed as any in Haskell (or in Unix shell, for that matter). With that, we can test for panalphabets like this:

let is_panalphabet = (=) 26 % length % letters

For instance (à ta santé, OCaml!):

# is_panalphabet "Portez ce vieux whisky au juge blond qui fume" ;;
- : bool = true

(no diacritics were harmed in this production). Or:

# List.map is_panalphabet [
"jaded zombies acted quaintly but kept driving their oxen forward" ;
"the quick brown fox jumps over the lazy dog" ;
"pack my box with five dozen liquor jugs" ;
"sphinx of black quartz judge my vow" ;
];;
- : bool list = [true; true; true; true]

So, there is a use for point-free programming in OCaml, especially when dealing with type-based DSLs, as the functions operating on values are usually monomorphic. And this is the point I wanted to make.

No comments: