2007-11-06

Counting Cattle

Well, not exactly cattle, but a different kind of animals, the polyominos (or "ominos", for short). These are the kind of tiles you can form by gluing squares by their edges. The article "Generating Polyominoes" in The Monad Reader 5 presents a direct implementation of the naïve algorithm for generating the set of polyominos comprised of n squares. I decided to try my hand at doing the same in OCaml, while trying to write the simplest code I could come up with that solved the problem.

I try to use, whenever possible (and the monomorphic restriction allows!) composition of higher order functions. To make it more pleasant, I use the following operator:

let (%) f g x = f (g x)

Whenever introducing an operator, I find it sound practice to add to it enough structure to make it as generally applicable as possible. The identity of the composition is, unsurprisingly, the identity function:

let id x = x

I will be using sets extensively for computing ominos and their (ha!) sets. Unfortunately, the standard Set module in OCaml lacks a couple of useful functions. Fortunately, OCaml's module system is extremely flexible, and allows a kind of "extension by subclassing" that models pretty well the object-oriented mechanism of extension by inheritance:

module FSet(O : Set.OrderedType) = struct
  include Set.Make(O)
  let map f s = fold (add % f) s empty
  let of_list l = List.fold_right add l empty
end

I simply include the original module, and add the functions I need. The first addition to this module FSet (short for "functorial set") is a monomorphic map that applies a function f to each element of the set s, returning the set of results. The second function is the fix for a plain oversight of the module authors, that is, the lack of a function that returns the list of elements of a set, which is the inverse of Set.elements.

A cell is simply a pair of integer coordinates:

type cell = int * int

Cells will be treated individually, or as cell sets:

module C = FSet(struct
  type t = cell
  let compare (i, j) (i', j') =
    let c = compare j j' in
    if c <> 0 then c else
    compare i i'
end)

Cells are sorted so that those in the first row come before those in the rows above; in other words, the one with the least y-coordinate is lesser. An omino is a cell set together with its integer extents:

type omino = { wid : int; hgt : int; cel : C.t; }

Knowing how many cells high and wide the region occupied by the cells is will make some of the operations on ominos more efficient. Now, an omino set is, plainly, an FSet of ominos:

module O = FSet(struct
  type t = omino
  let compare o o' =
    let c = compare o.hgt o'.hgt in
    if c <> 0 then c else
    let c = compare o.wid o'.wid in
    if c <> 0 then c else
    C.compare o.cel o'.cel
end)

Again, the lesser omino is the shorter one, or the skinnier one, or the one whose cell set comes first. But, how do we know what the region occupied by a set of cells is? By traversing the cell set and recording the minimum and maximum coordinates seen:

let extents s =
  C.fold (fun (i, j) (mi, mj, xi, xj) ->
    (min mi i, min mj j, max xi i, max xj j))
    s
    (max_int, max_int, min_int, min_int)

Since I am not storing the origin of the region but just its extents, I need a way to normalize the cell set by translating it so that it lies in the first quadrant:

let normalize s =
  let (mi, mj, xi, xj) = extents s in
  { wid = xi - mi + 1; hgt = xj - mj + 1;
    cel = C.map (fun (i, j) -> (i - mi, j - mj)) s; }

(Here's where the extended map in FSet comes handy. Note that the extents are one more than the maximum x- and y-coordinates in the set.) Now, given a list of otherwise unconstrained cells, I build an omino from it simply by normalizing the resulting cell set:

let of_list = normalize % C.of_list

I want to enumerate ominos disregarding plane symmetries. That is, if two ominos can be transformed one into the other by a sequence of rotations and reflections, they are treated as one and the same omino. The first symmetry we have already seen, it is the identity transformation, as implemented by id. A rotation by a quarter turn transforms coordinates (x, y) into (-y, x). By taking into account the omino's extents, I don't need to normalize the result:

let rot o =
  { wid = o.hgt; hgt = o.wid;
    cel = C.map (fun (i, j) -> (o.hgt - 1 - j, i)) o.cel; }

Of course, the old width is the transformed height and vice versa. The third basic transformation is a flip that swaps cells around the horizontal axis by changing coordinates (x, y) into (x, -y):

let flip o =
  { o with
    cel = C.map (fun (i, j) -> (i, o.hgt - 1 - j)) o.cel; }

Again, I take advantage of the extents to perform the transformation in such a way that avoids the need for normalization. These basic transformations are sufficient to carry all the plane symmetries of a rectangular region. In technical terms they finitely generate, or give rise to the so-called dihedral group of order 4, or D4. The remaining transformations are built up from these by composition:

let transforms = [
  id;
  rot;
  rot % rot;
  rot % rot % rot;
  flip;
  rot % flip;
  rot % rot % flip;
  rot % rot % rot % flip
]

The symmetries of a polyomino is the set resulting from applying to it each of these transforms in turn:

let symmetries o = O.of_list (List.map (fun x -> x o) transforms)

Aside: I could have fused the map and the fold implied in O.of_list, but doing this is mechanical and would have obscured this rather simple expression.

Given that I want to treat ominos related by a symmetry as identical, I need a way to pick a canonical representation of an omino. This is an arbitrary but fixed element in the set of transforms of the omino. Since ominos are totally ordered by O.compare, this can mean either the least or the greatest element of said set. I opted for the former:

let canon = O.min_elt % symmetries

I am now ready to start growing legs on our animals. I mean, start adding cells to our ominos. Since cells must be connected by an edge, the flower of a cell is the set of all four such neighbors:

let flower (i, j) =
  (C.add (i + 1, j) %
   C.add (i - 1, j) %
   C.add (i, j + 1) %
   C.add (i, j - 1)) C.empty

To extend an omino, we first need to find the candidate set of neighbors of any of its cells that actually fall outside the omino. Then, given a valid neighbor, I extend the omino by normalizing its cell set extended with the new cell. Then, focusing on a particular cell in the set, the set of extensions at that cell is generated by the flower of candidate neighbors. Note that the result of extend is a function from omino sets to omino sets; we will use it to accumulate into the final result the ominos generated by considering in turn all the extensions possible at each of the original omino's cells (I annotate the range types of the functions for documentation):

let extensions o =
  let candidates c : C.t = C.diff (flower c) o.cel
  and extend c : omino = canon (normalize (C.add c o.cel)) in
  let extend_at c : O.t -> O.t = C.fold (O.add % extend) (candidates c) in
  C.fold extend_at o.cel O.empty

The lifting of the extension of an omino to the extension of an omino set is straightforward:

let extend_set s = O.fold (O.union % extensions) s O.empty

(This is what is usually called a flat map, or a monadic join.) We are now ready to enumerate!

I start with the simplest possible omino set, the set of 1-celled ominos. Its only member is the (unique) one-celled omino:

let o1 = O.singleton (of_list [0,0])

To make things easy, instead of manually constructing the extension, the extension of the extension and so on, I'll use the iterated composition function:

let rec nest n f x =
 if n == 0 then [] else
 if n == 1 then [x] else
 x :: nest (pred n) f (f x)

This repeatedly applies the function f to its argument, up to n times. Instead of just returning the n-th iterate, fn(x), it returns the list of partial iterates. This makes simple to compute the enumeration:

let enumerate_to n = List.map O.cardinal (nest n extend_set o1)

For instance, let's enumerate the ominos up to the decominos (the critters with exactly ten tiles in their bodies):

# enumerate_to 10 ;;
- : int list = [1; 1; 2; 5; 12; 35; 108; 369; 1285; 4655]

This takes a while, but the result does indeed coincide with the beginning of the On-Line Encyclopedia of Integer Sequences' A000105!

I'm usually surprised at the conciseness and elegance most Haskell programs achieve, through a combination of laziness, a great standard library and higher-order combinators. OCaml programs, in my experience, are much more low-level, and they tend to have a "procedural" flavor to them caused by too much explicit scaffolding being visible. Here, a judicious combination of the proper data structure (Sets) and some general-purpose recursive higher order functions make this implementation short, simple and manifestly correct.

No comments: