Programming with Streams

Let's write some functions that manipulate streams. It will help to have a notation for streams to use as part of documentation. Let's use <a; b; c; ...> to denote the stream that has elements a, b, and c at its head, followed by infinitely many other elements.

Here are functions to square a stream, and to sum two streams:

(* [square <a;b;c;...>] is [<a*a;b*b;c*c;...]. *)
let rec square (Cons (h, tf)) =
  Cons (h*h, fun () -> square (tf ()))

(* [sum <a1; a2; a3; ...> <b1; b2; b3; ...>] is
 * [<a1+b1; a2+b2; a3+b3; ...>] *)
let rec sum (Cons (h1, tf1)) (Cons (h2, tf2)) =
  Cons (h1+h2, fun () -> sum (tf1 ()) (tf2 ()))

Their types are:

val square : int stream -> int stream
val sum : int stream -> int stream -> int stream

Note how the basic template for defining both functions is the same:

  • Pattern match against the input stream(s), which must be Cons of a head and a tail function (a thunk).

  • Construct a stream as the output, which must be Cons of a new head and a new tail function (a thunk).

  • In constructing the new tail function, delay the evaluation of the tail by immediately starting with fun () -> ....

  • Inside the body of that thunk, recursively apply the function being defined (square or sum) to the result of forcing a thunk (or thunks) to evaluate.

Of course, squaring and summing are just two possible ways of mapping a function across a stream or streams. That suggests we could write a higher-order map function, much like for lists:

(* [map f <a;b;c;...>] is [<f a; f b; f c; ...>] *)
let rec map f (Cons (h, tf)) =
  Cons (f h, fun () -> map f (tf ()))

(* [map2 f <a1;b1;c1;...> <a2;b2;c2;...>] is
 * [<f a1 b1; f a2 b2; f a3 b3; ...>] *)
let rec map2 f (Cons (h1, tf1)) (Cons (h2, tf2)) =
  Cons (f h1 h2, fun () -> map2 f (tf1 ()) (tf2 ()))

let square' = map (fun n -> n*n)
let sum' = map2 (+)

And their types are as we would expect:

val map : ('a -> 'b) -> 'a stream -> 'b stream
val map2 : ('a -> 'b -> 'c) -> 'a stream -> 'b stream -> 'c stream
val square' : int stream -> int stream
val sum' : int stream -> int stream -> int stream

Now that we have a map function for streams, we can successfully define nats in one of the clever ways we originally attempted:

# let rec nats = Cons(0, fun () -> map (fun x -> x+1) nats);;
val nats : int stream = Cons (0, <fun>)

# take 10 nats;;
- : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9]

Why does this work? Intuitively, nats is <0; 1; 2; 3; ...>, so mapping the increment function over nats is <1; 2; 3; 4; ...>. If we cons 0 onto the beginning of <1; 2; 3; 4; ...>, we get <0; 1; 2; 3; ...>, as desired. The recursive value definition is permitted, because we never attempt to use nats until after its definition is finished. In particular, the thunk delays nats from being evaluated on the right-hand side of the definition.

Here's another clever definition. Consider the Fibonacci sequence <1; 1; 2; 3; 5; 8; ...>. If we take the tail of it, we get <1; 2; 3; 5; 8; 13; ...>. If we sum those two streams, we get <2; 3; 5; 8; 13; 21; ...>. That's nothing other than the tail of the tail of the Fibonacci sequence. So if we were to prepend [1; 1] to it, we'd have the actual Fibonacci sequence. That's the intuition behind this definition:

let rec fibs =
  Cons(1, fun () ->
    Cons(1, fun () ->
      sum fibs (tl fibs)))

And it works!

# take 10 fibs;;
- : int list = [1; 1; 2; 3; 5; 8; 13; 21; 34; 55]

Unfortunately, it's highly inefficient. Every time we force the computation of the next element, it required recomputing all the previous elements, twice: once for fibs and once for tl fibs in the last line of the definition. By the time we get up to the 30th number, the computation is noticeably slow; by the time of the 100th, it seems to last forever.

Could we do better? Yes, with a little help from a new language feature: laziness. We discuss it, next.

results matching ""

    No results matching ""