2008-07-30

Curried Intuitions

This is not a culinary but a logical exploration. There is continuous interest on the Curry-Howard correspondence not only from logicians but also from practicioners of typeful programming. I link just to the last two to appear on Reddit; as Baader-Meinhof would have it, I was reading the entry about the development of intuitionistic logic on the Stanford Encyclopedia of Philosophy when it occurred to me that the different axiomatizations presented must have immediate corresponding λ-terms.

The Curry-Howard correspondence states that if intuitionistic implication is interpreted as a typed λ-abstraction, then types represent constants, polymorphic type variables represent logical variables, and each step in a logical deduction corresponds to a β-reduction step. The consequence of this is Wadler's Theorems for Free!, that basically states that given a polymorphic-typed term, its shape is so constrained by its types that it is, essentially, unique, and thus it is not necessary to actually carry out the proof, or the reduction, but just checking that the types match suffices.

Not quite. Negation in IL is a though concept to systematize, because the intuition (!) behind a proof is that it is a process of deriving the consequent from the given antecedents, and negation would seem to require showing that there is no way to "build" the thing being negated. This is codified in the usual way as ¬AA → ⊥, where the latter denotes "false", or "undefined", which is interpreted under the isomorphism as the empty type empty. This type is not usually present in Hindley-Milner languages like OCaml because it introduces more problems than it solves, so there is not really a direct way to "program" negation in OCaml. The indirect route is to use first-class continuations; these are present in Standard ML of New Jersey, but again not in OCaml. I had to fake it somehow.

Obviously, writing in continuation-passing style makes using call-with-current-continuation (or callcc, as it is affectionately called), which I could use to implement negation, trivial; the problem with that is that I had to encode all my axioms using CPS. Leroy shows how a monad can make this painless, by structuring the CPS semantics inside one:

module Cont :
 sig
  type empty
  type 'a cont = 'a -> empty
  type 'a t = 'a cont cont
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
  val callcc : ('a cont -> 'a t) -> 'a t
  val throw : 'a cont -> 'a -> 'b t
  val run : 'a t -> 'a
 end = (* …

The empty type I can fake it using an abstract type; this is actually an existential type that makes the type α cont of continuations mean "a continuation is a function from things of arbitrary type to some type that I know but I won't tell you what is". The type of the monad α t is a continuation of continuations; this is forced by the fact that, under the CH correspondence, the intuitionistic axiom ¬¬AA is precisely translation to CPS. With that, return and bind have the expected types. These allow me to write callcc, that invokes a function with the current continuation, and throw, that aborts the current continuation with the one supplied to it. In order to "escape" the monad, run unpacks the computation and gives its final value, if at all possible.

The implementation follows closely Leroy's, except for run that uses a reference to achieve fully polymorphic continuations. This uses Oleg Kiselyov's insight about delimited continuations, namely that mutation is implicitly a delimited continuation, which has a recursive type:

… *)
 struct
  type empty = unit
  type 'a cont = 'a -> empty
  type 'a t = 'a cont cont
  let return x = fun k -> k x
  let bind m f = fun k -> m (fun x -> f x k)
  let callcc f = fun k -> f k k
  let throw k x = fun _ -> k x
  let run m =
    let r = ref None in
    let () = m (fun v -> r := Some v) in
    match !r with Some v -> v | None -> failwith "throw"
 end

Note that my empty is as simple as it could be short of being really the empty type. Then return and bind exactly implement the CPS-transform; I use anonymous functions in the bodies to emphasize the continuation argument. This makes callcc's working transparent: it feeds the (now explicit) continuation to its argument; in the same way, it is obvious that throw works by discarding the continuation and invoking immediately its argument.

The case of run is subtle but not especially complicated: it feeds its CPS-transformed argument a continuation that stores the result in a reference; this reference must be initialized to something not yet existing and so it has type α option. When the continuation returns, a pattern-match is used to recover the resulting value. Except that if there isn't a resulting value, because the CPS function aborted to a top-level continuation instead of running through to the end, the effect is equivalent to an unhandled exception.

The first axiomatization given in the SEP is Kolmogorov's; I won't show it here because it is the minimal set of axioms required to show an embedding of classical logic in intuitionistic logic. The second axiomatization is Glivenko's:

  1. pp
  2. (pq) → ((qr) → (pr))
  3. (pq) → p
  4. (pq) → q
  5. (rp) → ((rq) → (r → (pq)))
  6. p → (pq)
  7. q → (pq)
  8. (pr) → ((qr) → ((pq) → r))
  9. (pq) → ((p → ¬q) → ¬p)

According to the CH correspondence, the product type (that is, the type of pairs) encodes conjunction and the disjunction maps to a sum type, otherwise standard in Haskell:

type ('a, 'b) either = Left of 'a | Right of 'b

The first five axioms are straightforward; they almost write themselves, even with names that are all but forced by their shapes:

module G = struct
  let id x = x
  let seq f g x = g (f x)
  let fst (x, _) = x
  let snd (_, y) = y
  let pair f g x = (f x, g x)

The following three are no more difficult:

let left x = Left x
  let right x = Right x
  let cond f g = function Left x -> f x | Right y -> g y

The last term completes the axiomatization with a sort of rule of contradiction: if from the same antecedent it can be proved both a consequent and its negation, the antecedent is false. The problem while encoding this term is that there isn't an obvious way to magic in the particular proposition involved. The idea is to invoke the continuation of the term, hoping that it somehow provides the value required, and apply it to the first function. If it however raises an exeption, the value is fed to the second function, which functions as a kind of handler:

let catch handle raise =
    Cont.run 
      (Cont.callcc (fun k ->
        let v = Cont.run k in
        Cont.throw (raise v) (handle v)))
end

The module signature encodes the axioms:

module G :
 sig
  val id : 'a -> 'a
  val seq : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c
  val fst : 'a * 'b -> 'a
  val snd : 'a * 'b -> 'b
  val pair : ('a -> 'b) -> ('a -> 'c) -> 'a -> 'b * 'c
  val left : 'a -> ('a, 'b) either
  val right : 'a -> ('b, 'a) either
  val cond : ('a -> 'b) -> ('c -> 'b) -> ('a, 'c) either -> 'b
  val catch : ('a -> 'b) -> ('a -> 'b Cont.cont) -> 'a Cont.cont
 end

where pairs must be read as conjunctions, and occurrences of the either type denote disjunctions. The third axiomatization given is Heyting's:

  1. p → (pp)
  2. (pq) → (qp)
  3. (pq) → ((pr)→ (qr))
  4. ((pq) ∧ (qr)) → (pr)
  5. q → (pq)
  6. (p ∧ (pq)) → q
  7. p → (pq)
  8. (pq) → (qp)
  9. ((pr) ∧ (qr)) → ((pq) → r)
  10. ¬p → (pq)
  11. ((pq) ∧ (p → ¬q)) → ¬p

These axioms are both more basic and harder to work with; for instance, there is no axiom that directly allows proving a proposition from a conjunction:

module H = struct
  let diag x = (x, x)
  let swap (x, y) = (y, x)
  let first f (x, y) = (f x, y)
  let seq (f, g) x = g (f x)
  let const x _ = x
  let apply (x, f) = f x
  let left x = Left x
  let exch = function Left x -> Right x | Right y -> Left y
  let cond (f, g) = function Left x -> f x | Right y -> g y

Now, the axioms dealing with negation can be resolved directly with the continuation monad, as I did before. Axiom number 10 has exactly the type of throw, except for the monadic type. It corresponds to the ex falso quod libet principle (in Buenos Aires we have a saying that basically goes "if my granny had wheels, I would be a cart" which is as silly as it is true). Axiom number 11 is exactly as before:

let throw k x = Cont.run (Cont.throw k x)
  let catch (raise, handle) =
    Cont.run
      (Cont.callcc (fun k ->
        let v = Cont.run k in
        Cont.throw (handle v) (raise v)))
end

Again, the module's signature matches the axioms:

module H :
 sig
  val diag : 'a -> 'a * 'a
  val swap : 'a * 'b -> 'b * 'a
  val first : ('a -> 'b) -> 'a * 'c -> 'b * 'c
  val seq : ('a -> 'b) * ('b -> 'c) -> 'a -> 'c
  val const : 'a -> 'b -> 'a
  val apply : 'a * ('a -> 'b) -> 'b
  val left : 'a -> ('a, 'b) either
  val exch : ('a, 'b) either -> ('b, 'a) either
  val cond : ('a -> 'b) * ('c -> 'b) -> ('a, 'c) either -> 'b
  val throw : 'a Cont.cont -> 'a -> 'b
  val catch : ('a -> 'b) * ('a -> 'b Cont.cont) -> 'a Cont.cont
 end

Now strictly speaking both axiomatizations ought to be equivalent; in other words, one's axioms are theorems in the other; and so I could very well expect to find expressions for H's functions solely in terms of G's. However, things are not that easy. By the CH-correspondence, application of λ-terms corresponds to modus ponens; however, a proof is free, in order to complete a given step, to recall any axiom or prior logical derivation whatsoever; in effect, it requires binding as a metalogical operation. Should I insist with equating proofs with combinators, I must introduce one or more structural axioms that allow me to simulate abstraction and application; in effect, I'd be using SK calculus. An intermediate possibility would be to allow naming the arguments but restricting the use of composition and application of the base system's functions to construct the corresponding functions of the derived system:

module H' = struct
  let diag x = G.pair G.id G.id x
  let swap x = G.pair G.snd G.fst x
  let first f = G.pair (G.seq G.fst f) G.snd
  let seq p = G.seq (G.fst p) (G.snd p)
  let const x _ = x
  let apply p = (G.snd p) (G.fst p)
  let left = G.left
  let exch d = G.cond G.right G.left d
  let cond p = G.cond (G.fst p) (G.snd p)
  let throw k x = Cont.run (Cont.throw k x)
  let catch p = G.catch (G.fst p) (G.snd p)
end

Note that diag, swap, left and exch are combinators, and the argument is forced by OCaml's value restriction. Note also that const in Heyting's formulation is purely structural and identical to the K combinator.

With respect to seq, apply, cond and catch, they name their argument only to extract both components of the pair, in effect uncurrying the corresponding axioms in Glivenko's formulation. With an explicit metalogic axiom (a → (bc)) → ((ab) → c):

let uncurry f (x, y) = f x y

more theorems are actually combinators:

module H'' = struct
  let diag x = G.pair G.id G.id x
  let swap x = G.pair G.snd G.fst x
  let second f = G.seq (G.seq G.snd) (G.pair G.fst) f
  let first' f = G.seq (G.seq G.fst) (G.pair G.snd) f
  let first  f = G.seq (G.seq G.fst) (fun h -> G.pair h G.snd) f
  let seq p = uncurry G.seq p
  let const x _ = x
  let apply p = G.seq swap (uncurry G.id) p
  let left = G.left
  let exch d = G.cond G.right G.left d
  let cond p = uncurry G.cond p
  let throw k x = Cont.run (Cont.throw k x)
  let catch p = uncurry G.catch p
end

The importance of theorems being implemented as combinators is that the proofs are structurally very simple, with the outermost redex the only choice point where to apply modus ponens. As it is evident, by avoiding naming the arguments the only option left for pair is to fix its first, making the point-free expression of second easy but tht of first impossible under this restriction; the best I could manage was first', where the resulting tuple is in reverse order. Also, the theorem for apply shows that it is indeed uncurry under another guise; maybe this is a heuristic argument that points to the fact that uncurry could be expressed under the axioms G.

Going the other way would need its counterpart, the axiom ((ab) → c) → (a → (bc)), or currying:

let curry f x y = f (x, y)

In effect, this pair of functions witness the isomorphism α × βγαβγ, where I switch from the logical to the typed view to emphasize the fact that this is not just an equivalence (a fact easily proved in classical logic with the identity ab ≡ ¬ab and de Morgan's Law).

module G' = struct
  let id x = H.seq (H.diag, uncurry H.const) x
  let seq f = curry H.seq f
  let fst p = uncurry H.const p
  let snd p = H.seq (H.swap, uncurry H.const) p
  let pair f g = H.seq (H.first f, H''.second g)
  let left = H.left
  let right x = H.seq (H.left, H.exch) x
  let cond f = curry H.cond f
  let catch f = curry H.catch f
end

I crucially needed both named capture and the auxiliary lemma H''.second in the definition of pair; I won't even try to define it in combinator form.

So as you can see, both axiomatizations are, indeed, interexpressible; a purely logical proof would simply need to simulate reduction of the corresponding terms, although this is not necessary, since as a consequence of freeness it only suffices to check that the types come out right.

As a final theorem for free, let me show you the term that proves the law of noncontradiction:

let fail x = G.catch G.fst G.snd x

(and indeed it does fail, as entering G.catch G.fst G.snd in the console shows) whose type (α × α cont) cont corresponds to the theorem ¬(a ∧ ¬a).

2008-07-12

Picturing Networks

This is what a sorting network looks like:

It is a sorting network of width 8 with a minimal number of 19 exchange boxes and 6 parallel stages. Each item to be sorted is represented by a horizontal line, and there are eight of them. Each exchange box is represented by a thicker vertical line between inputs. If the exchange boxes can be scheduled to run in parallel without interference, they are brought closer together, to emphasize grouping. It is not difficult to generate these graphics. I'll emit EPS as it is a simple textual format which Preview in Mac OS X can rasterize readily enough.

As before, a network is an array of pairs of integer indices ij representing an exchange operation between inputs i and j:

type network = (int * int) array

The program will take networks represented in textual format: each exchange box is a pair of integers separated by a colon:

let re_box = Str.regexp "\\([0-9]+\\):\\([0-9]+\\)"

sequences of such boxes are separated by whitespace, commas or colons:

let re_sep = Str.regexp "[\t ]*[,;][\t ]*"

A box is parsed with the former, by matching and converting. If the matching succeeds, it is sufficient validation to let me convert both indices to decimal without fear:

let box_of_string s =
  if Str.string_match re_box s 0 then
    let i = int_of_string (Str.matched_group 1 s)
    and j = int_of_string (Str.matched_group 2 s) in
    if i > j then j, i else i, j
  else failwith "malformed exchange box"

Converting an entire network is as simple as splitting the input line according to the latter regular expression, and applying the preceding function to each pair. Afterwards, I count the greatest referenced input line in the network to obtain its width:

let network_of_string s : int * network =
  let l = List.map box_of_string (Str.split re_sep s) in
  succ (List.fold_left (fun width (_, j) -> max width j) 0 l), Array.of_list l

For instance, the above network can be parsed like this:

# network_of_string "4:5,0:6,3:7,1:2;2:5,0:4,1:3,6:7;2:6,3:4,0:1;1:3,5:7,4:6;4:5,2:3;5:6,1:2,3:4" ;;
- : int * network =
(8,
 [|(4, 5); (0, 6); (3, 7); (1, 2); (2, 5); (0, 4); (1, 3); (6, 7); (2, 6);
   (3, 4); (0, 1); (1, 3); (5, 7); (4, 6); (4, 5); (2, 3); (5, 6); (1, 2);
   (3, 4)|])

In order to determine which group of comparators can be scheduled in parallel, I simulate placing them onto wires and "pushing" them as far to the left as previously placed comparators would allow them. The following function abstract this logic into a generic fold, which lets me reuse it in a couple of places without bothering with this logic:

let fold_network f e width (net : network) =
  let bounds = Array.make width 0 in
  fst (Array.fold_left (fun (e, depth) (i, j) ->
    if i == j then (e, depth) else
    let d = max depth (1 + max bounds.(i) bounds.(j)) in
    bounds.(i) <- d; bounds.(j) <- d;
    (f (d > depth) e (i, j), d)) (e, 0) net)

The array bounds keeps tab of how many comparators are already placed on line i. The variable depth in the fold accumulator tracks the overall depth of the network. An exchange box i = j with both terminals on the same input is considered a "no-operation" and not counted. Otherwise, the endpoints would accumulate on the respective entries in bounds; if the resulting number is larger than the recorded depth, the accumulating function is notified and the accumulator is updated accordingly.

With this, is not difficult to pretty print a network in a format that round-trips correctly via parsing:

let string_of_network width (net : network) =
  let buffer = Buffer.create 16 in
  ignore (fold_network (fun is_sequential any (i, j) ->
    if any then
      Buffer.add_char buffer (if is_sequential then ';' else ',');
    Buffer.add_string buffer (string_of_int i);
    Buffer.add_char   buffer ':';
    Buffer.add_string buffer (string_of_int j);
    true) false width net);
  Buffer.contents buffer

Exchange boxes that can be grouped in parallel are separated by commas, and sequential groups are separated by semicolons.

An EPS document is characterized by a reduced number of PostScript commands; the file itself must follow the DSC set forth by Adobe in the respective manual. The most important comment is the bounding box that must accurately enclose the drawing. This requires calculating the total width and height the network will occupy, and to avoid clipping I use a little margin:

let open_document out ?(margin=0.) width height =
  Printf.fprintf out "%%!PS-Adobe-3.0 EPSF-3.0\n";
  Printf.fprintf out "%%%%BoundingBox: 0 0 %d %d\n"
    (truncate (ceil (width  +. 2. *. margin)))
    (truncate (ceil (height +. 2. *. margin)));
  Printf.fprintf out "%%%%DocumentData: Clean7Bit\n";
  Printf.fprintf out "%%%%EndComments\n"

Similarly, the document must be finalized with a standard prelude:

let close_document out =
  Printf.fprintf out "showpage\n";
  Printf.fprintf out "%%%%EOF\n"

To ease the measurements in PostScript, a conversion from millimeters to points:

let mm = ( *. ) (72. /. 25.4)

This lets me express sizes in metric units (even if neither Preview's nor Photoshop's conversions accurately reflect these sizes). The distance between input lines is:

let line_distance = mm 5.

and the distances between exchange boxes are:

let parallel_distance   = mm 2.
and sequential_distance = mm 5.

The lines have their different thicknesses, too:

let line_thickness     = 0.5
and exchange_thickness = 2.

To compute the bounding box I need to calculate the overall width of the laid out network. As you can see in the graphic above, the first and last boxes are inset by a sequential_distance from the input lines' ends. Then it's just a matter of determining if the box must be laid in a group or after one, and add the corresponding distance:

let network_width width (net : network) =
  fold_network (fun is_sequential width _ ->
    width +. if is_sequential then sequential_distance else parallel_distance)
  sequential_distance width net

Note that I have no use for the indices in the fold function, and that the initial width is a sequential_distance as explained above. Note also that the parameter width is actually the number of input lines (the network's width), not its graphical horizontal dimension.

Generating the actual PostScript is a little messy, so I'll intersperse these comments with the code instead of presenting it as a complete function:

let output_network out width (net : network) =

First I compute the dimensions of the graphic. I will use an exchange_thickness of margin around the picture, so that no line or terminal gets clipped. The height is the line_distance times the number of inter-line gaps, that is, one less the network width:

  let m = exchange_thickness
  and h = line_distance *. float (pred width)
  and w = network_width width net in

With that I can output the document header:

  open_document out ~margin:m w h;

I lay first the input line chord:

  Printf.fprintf out "%g setlinewidth\n" line_thickness;
  for i = 0 to pred width do
    let y = m +. line_distance *. float i in
    Printf.fprintf out "%g %g moveto %g %g lineto stroke\n"
      m y (m +. w) y
  done;

I use %g format specifiers to minimize the number of decimal digits printed. Also, I take into account the margins both in the horizontal and vertical coordinates. Now the actual placing of the exchange boxes is a matter of folding over the network:

  ignore (fold_network (fun is_sequential width (i, j) ->

The horizontal coordinate is accumulated over, and updated for each new exchange box being placed:

    let x  = width +. if is_sequential then sequential_distance else parallel_distance

The vertical coordinates present no mystery:

    and y  = m +. line_distance *. float i
    and y' = m +. line_distance *. float j in

With exchange_thickness I lay the vertical lines:

    Printf.fprintf out "%g setlinewidth\n" exchange_thickness;
    Printf.fprintf out "%g %g moveto %g %g lineto stroke\n" x y x y';

And with a zero line to avoid overstepping the bounding box I place connector dots on both endpoints:

    Printf.fprintf out "0 setlinewidth\n";
    Printf.fprintf out "%g %g %g 0 360 arc fill\n" x y  exchange_thickness;
    Printf.fprintf out "%g %g %g 0 360 arc fill\n" x y' exchange_thickness;

The fold accumulates the horizontal coordinate starting with a margin:

    x) m width net);

Finally, I close the document:

  close_document out
;;

The terminating double-colon is just to emphasize that this function must be reconstructed from its component pieces. A minimally simple command-line driver for this pretty-printer takes its argument from the line, and writes its output to standard output:

let () =
  let width, network = network_of_string Sys.argv.(1) in
  output_network stdout width network

A robust driver with options, on-screen usage and an exception handler is left to the reader.

2008-07-09

Sorting out Sorters

Previously I proved how the zero-one principle can test if a comparator network is actually a sorting network or not. This can be efficiently implemented as a program. I'll represent a comparator network of width w as an array of exchange boxes:

type network = (int * int) array

Each entry of the array is a pair (i, j) with 0 ≤ ij < w, representing an exchange box between "wires" or indices i and j. If i = j, the "exchange" box is the null operation, as xx = xx = x. Now the quickest way to generate binary sequences in a binary computer is to count. For every 0 ≤ x < 2w,

x = ⟨∑i : 0 ≤ i < x : 2ixi

with xi ∈ {0, 1} a binary digit. With that, a binary sequence is sorted if, as a binary word, it is represented as a block of zeros followed by a block of ones. It is well known that it is very easy to test if the block of ones is contiguous in the least significant positions, that is, if there is a kw such that for all 0 ≤ i < k, xi = 1 and for k < i < x, xi = 0:

let is_sorted word = word land (word + 1) == 0

Why does that work? The condition on x is equivalent to requiring that x = 2k - 1, hence the bitwise land will not find a 1 digit in common between this and 2k. Now, viewed as an operation on bits, the exchange box computes the following binary function:

ABA↓BA↑B
0000
0101
1001
1111

That is, (AB) = (AB, AB). An exchange box, hence, needs to select bits and apply the corresponding functions according to a mask:

let exchange (i, j) word =
  (* i <= j *)
  word
    lor  (word lsr (j - i) land     (1 lsl i))
    land (word lsl (j - i) lor lnot (1 lsl j))

Note that the shifts required to align the i-th bit with the j-th one require crucially that ij. Note also that, strictly speaking, the function sorts bits in a word in descending order, that is, the greater elements (the ones) are moved to the lowest positions (the least significant bits). To apply an exchange network to a binary word, it suffices to fold each exchange box over it:

let sort (n : network) word =
  Array.fold_left (fun word (i, j) -> exchange (i, j) word) word n

With that, a function testing a comparator network is simply:

exception Unsorted of int

let is_sorting_network width (n : network) =
  let tests = 1 lsl width in
  try
    for i = 0 to pred tests do
      if not (is_sorted (sort n i)) then raise (Unsorted i)
    done;
    true
  with Unsorted _ -> false

For instance, the canonical 4-sorting network passes:

# is_sorting_network 4 [|0,1;2,3;0,2;1,3;1,2|] ;;
- : bool = true

But a slight variation thereof doesn't:

# is_sorting_network 4 [|0,1;2,3;0,2;1,3;1,3|] ;;
- : bool = false

I use an exception not only to bail out early from the testing loop, but also because elsewhere I make use of the word that failed to sort in choosing the best individual among a collection of random networks. But about that, more later.

The Zero-One Principle

Continuing my treatment of comparator networks, I've experimenting with the discovery of minimal sorting networks using evolutionary programming, about which I plan to write in the future. One of the things needed for that is a means of quickly evaluating if a given network actually sorts all sequences of a given length or not. There is a very powerful theoretical result, called the zero-one principle that aids in identifying sorting networks from general comparator networks. It says:

(The zero-one principle): A comparator network N of width n sorts all sequences of length m if it sorts all 2n binary sequences of length n.

The proof is not difficult, and is a good showcase for the equational reasoning I'm so fond of. First, some definitions.

Given an ordered set (A, ≤), the minimum of elements a, a′A is the unique element aa′ such that:

aa′aaa′a′

Dually, the maximum of a, a′A is the unique element aa′ such that:

aaa′a′aa′

A monotonic function from ordered set (A, ≤) to ordered set (B, ⊑) is a function that preserves orderings, that is:

⟨∀a, a′A : aa′ : f.a ⊑ f.a′

With that, I can prove the first

Lemma: monotonic functions preserve minima (dually, maxima)

Let f be monotonic on A. Then ∀a, a′A, by definition of minimum:

   aa′aaa′a′
⇒ { monotonicity of f }
   f.(aa′) ≤ f.a ∧ f.(aa′) ≤ f.a′
≡ {definition of ↓ }
   f.(aa′) = f.a ↓ f.a′

And dually for maxima.

Recall that an comparator box is a function (xy) = (xy, xy). Applied to a sequences, a comparator [ij] with 0 ≤ ij < n sorts elements i and j of [l0, l1,… ln-1] into [l0,… li-1, lilj, li+1,… lj-1, lilj, lj+1,… ln-1]. A comparator network N is a sequence of such boxes, and as a sequence, it has a monoidal structure under composition. For a function f : AB, the pointwise extension map f : AnBn maps a sequence [l0, l1,… ln-1]∈An to the sequence [f.l0, f.l1,… f.ln-1]∈Bn. Hence, the following

Lemma: comparator networks commute with the pointwise extension of monotonic functions

Let f be monotonic on A. By induction on the structure of N, we have for the base case a single comparator box [ij]:

   ([ij] ∘ map f).l
= { defn. pointwise extension }
   [ij].[f.l0, f.l1,… f.ln-1]
= { defn. comparator box }
   [f.l0,… f.li-1, f.li ↓ f.lj, f.li+1,… f.lj-1, f.li ↑ f.lj, f.lj+1,… f.ln-1]
= { lemma }
   [f.l0,… f.li-1, f.(lilj), f.li+1,… f.lj-1, f.(lilj), f.lj+1,… f.ln-1]
= { defn. pointwise extension }
   map f.[l0,… li-1, lilj, li+1,… lj-1, lilj, lj+1,… ln-1]
= { defn. comparator box }
  (map f ∘ [ij]).l

For the inductive case, a comparator network N can be decomposed as the composition of comparator networks NL and NR, such that:

   N ∘ map f
= { compositionality of network }
   (NL ∘ NR) ∘ map f
= { associtativity of composition, base case }
   NL ∘ (map f ∘ NR)
= { associtativity of composition, base case }
   map f ∘ (NL ∘ NR)
= { compositionality of network }
   map f ∘ N

Now I can prove the

Theorem: The zero-one principle as stated above is valid.

Suppose there is a sequence lAn that is not sorted by N, that is, for m = N.l there is an 0 ≤ k < n-1 such that mkmk+1. For all cA, define f : A → {0, 1} as f.c = 1 if mkc, and f.c = 0 otherwise. Now f is monotonic:

   f.c ≤ f.c′
⇐ { range of f }
   f.c = 0 ∨ f.c′ = 1
≡ { defn. f }
   mkcmkc′mkcmkc′
⇐ { transitivity ≤ }
    cc′

Furthermore, f.mk = 1 and f.mk+1 = 0 which means that (map f ∘ N).l is unsorted. But since f is monotonic, by the second Lemma, (N ∘ map f).l is unsorted too. In other words, if there's a sequence over An that N doesn't sort, then there is a sequence over 2n that N doesn't sort either; by contrapositive the theorem follows.

In the next post I'll show how this translates in an efficient test for comparator networks.

2008-07-02

Sorting out Sorting

I believe that refactoring in the functional milieu is as important as it is in the object-oriented world. There's usually room for further improvement of one's definitions. Often, it is no more than the trivial application of selective inlining of definitions that clarify code that turned out more cluttered and opaque that it could have been, possibly from working at a remove from the problem domain by dint of excessive generalization.

More rarely this refactoring occurs at a higher conceptual level than is known in object-oriented programming: instead of indentifying and rooting out repeated code it abstracts common patterns of recursion and applications into higher-order, closed combinators. This is somewhat in tension with the drive to write code as simple as possible, and in a sense it goes in opposition to the previous refactoring.

With practice, category theory guides one's decisions, so that the code one writes comes out "pre-refactored"; in general, having black-box higher-order combinators means that one effectively programs (or in principle could program) in an incomplete (in the technical sense) but total fragment of the language.

Looking back to the code I wrote the last time, I had opportunity for performing both kinds of refactoring. One good chance to apply the first one I took advantage of was the lifting of swap to lists. Inlining all the definitions, it came out like:

let swap = function
| [x; y] when x > y -> [y; x]
| l                 -> l

made especially simple by the application of a pattern guard. This function is the natural (again, in a technical sense) lifting of the min-max binary operator to lists, as is apparent from the code. Now exchange reduces to:

let exchange p = join % List.map swap % pairup $ p

without undue clutter generated by the domain injections and projections. I carried out the second kind of refactoring when I noticed that the recursion schemata for both bitonic and sort were identical. The gist of it is that after an even-odd shuffle (carried out by unzip), the result of two parallel invocations of the procedure on each half are merged back into a complete network. This parallel invocation is, in a sequential setting, a function application to both members of a pair:

let par f (x, y) = f x, f y

(incidentally, par is Spanish for "pair"). Now call this recursion schema the shuffle:

let rec shuffle merge = function
| []  -> []
| [x] -> [x]
| l   -> merge % par (shuffle merge) % unzip $ l

The trivial networks pass unchanged; otherwise, my description is word for word (backwards!) the transformation performed on the network. Let me recall Knuth's recipe:

[W]e can construct a bitonic sorter of order p […] by first sorting the bitonic subsequences ⟨z1, z3, z5, …⟩ and ⟨z2, z4, z6, …⟩ independently, then comparing and interchanging z1:z2, z3:z4, … (Knuth, Sorting and Searching, p 232)

Now, taking advantage of the abstracted recursion scheme, I applied again the first kind of refactoring. "Comparing and interchanging" the sorted pair of sequences ended up like this:

let bitonic p = shuffle exchange % uncurry List.rev_append $ p

And the complete sort like this:

let sort l = shuffle bitonic l

This schema, sort ≡ shuffle merge for a suitable merge phase is due to Batcher. A marvelous exposition is section 4.6 of Jayadev Misra's paper Powerlist: A Structure for Parallel Recursion, which not only inspired this program, but about which I hope to write more in the future.