2007-10-02

Algebraic Data Types and Java Generics (I)

See also Part II.

Andrew Kennedy reminds us that, yes, C# is a functional programming language. So is Java; or at least, Java can encode some of the most advanced idioms that are native to expressive functional languages like Haskell or OCaml. These examples should be straightforwardly translatable to C#.

A algebraic data type or ADT is the type of a choice of values, each of which is a collection, or tuple of other ADT values. In other words, an ADT is a (possibly recursive) "sum of products of types" type. Each option is introduced by a constructor that permits determining to which alternative a value corresponds, and to match it as a pattern to gain access to the constructor's values. Important special cases of ADTs are the list:

type 'a list = Nil | Cons of 'a * 'a list

where Nil —the empty list— is usually called [] and Cons (h, t) —the list with head h and tail t— is denoted by h :: t; and the sum type:

type ('a,'b) sum = Left of 'a | Right of 'b

A special type of sum is the one parameterized on the left by unit, the one-point (uni-valued) type. This special case is called the option type:

type 'a option = None | Some of 'a

This is a good representation for values that "might not be there". Options are used against a destructor, in this case maybe: a higher-order function that discriminates and either applies a function to Some x, or returns a default value for None:

let maybe f e v = match v with
| Some x -> f x
| None   -> e

There is an object-oriented encoding of ADTs that is especially suited for Java: the type is an abstract class declaring visitor methods, and each (algebraic, not object-oriented) constructor or variant is an inner class extending the outer abstract base. With some care with the protection level for the class constructors, this can be tightly sealed so that no extra cases can be added in an external program. I'll illustrate with an encoding for 'a option.

First, a preliminary definition. Java (unlike C#) doesn't have a notion of first-class functions, that is, methods cannot be passed around as they are. The only constructible values are objects; hence we must "simulate" first-class functions with functional classes:

interface Function<X, Y> {
  Y apply(X x);
}

We begin by declaring an Option class, parameterized by the type X of the value it encapsulates:

abstract class Option<X> {

The functionality that this class declares, and that its subclasses must implement, is the maybe destructor:

  public abstract <Y> Y maybe(Function<X, Y> f, Y e);

The type of the argument is fixed by the class parameter, but the type of the result depends on the particular Function f being used. To enforce encapsulation, this class will not be open to subclassing:

  private Option() { }

Now each of Option's subclasses will encode a different variant. First, the inner class corresponding to Some:

  static final class Some<X> extends Option<X> {
    public final X value;

    private Some(X value) {
      this.value = value;
    }

Only the enclosing class will be able to construct objects of this type. The destructor for a Some object applies the function to the value it represents, and returns the result:

    public <Y> Y maybe(Function<X, Y> f, Y e) { return f.apply(this.value); }
  }

Now the class corresponding to the None variant. Again, it will be closed to external instantiation:

  static final class None<X> extends Option<X> {
    private None() { }

Destructing a None value entails simply returning the default value:

    public <Y> Y maybe(Function<X, Y> f, Y e) { return e; }
  }

Again, only the abstract Option class will be able to construct instances of its subclasses:

  public static <X> Option<X> some(X x) { return new Some<X>(x); }
  public static <X> Option<X> none()    { return new None<X>( ); }
}

This is not as general as a parametric visitor over the composite type, but is much simpler. With sufficiently rich functional objects, both approaches are interexpressible. We can test that everything works:

public class OptionTest {
  public static void main(String[] args) {
    Function<Integer, String> string_of_int = new Function<Integer, String>() {
      public String apply(Integer value) { return value.toString(); }
    };
    Option<Integer> p = Option.some(10);
    Option<Integer> q = Option.none();
    System.out.println("p = " + p.maybe(string_of_int, "no value"));
    System.out.println("q = " + q.maybe(string_of_int, "no value"));
  }
}

We'll see next how Option naturally gives rise to a monad, and how to implement a monad in Java.

No comments: