A Theorem about Folding

When we studied List.fold_left and List.fold_right, we discussed how they sometimes compute the same function, but in general do not. For example,

  List.fold_left (+) 0 [1; 2; 3]  
= (((0 + 1) + 2) + 3
= 6
= 1 + (2 + (3 + 0))
= List.fold_right (+) lst [1;2;3]

but

  List.fold_left (-) 0 [1;2;3]
= (((0 - 1) - 2) - 3
= -6
<> 2
= 1 - (2 - (3 - 0))
= List.fold_right (-) lst [1;2;3]

Based on the equations above, it looks like the fact that + is commutative and associative, whereas - is not, explains this difference between when the two fold functions get the same answer. Let's prove it!

First, recall the definitions of the fold functions:

let rec fold_left f acc lst =
  match lst with
  | [] -> acc
  | h :: t -> fold_left f (f acc h) t

let rec fold_right f lst acc =
  match lst with
  | [] -> acc
  | h :: t -> f h (fold_right f t acc)

Second, recall what it means for a function f : 'a -> 'a to be commutative and associative:

Commutative:  forall x y, f x y = f y x  
Associative:  forall x y z, f x (f y z) = f (f x y) z

Those might look a little different than the normal formulations of those properties, because we are using f as a prefix operator. If we were to write f instead as an infix operator op, they would look more familiar:

Commutative:  forall x y, x op y = y op x  
Associative:  forall x y z, x op (y op z) = (x op y) op z

When f is both commutative and associative we have this little interchange lemma that lets us swap two arguments around:

Lemma (interchange): f x (f y z) = f y (f x z)

Proof:

  f x (f y z)
=   { associativity }
  f (f x y) z
=   { commutativity }
  f (f y x) z
=   { associativity }
  f y (f z x) 

QED

Now we're ready to state and prove the theorem.

Theorem: If f is commutative and associative, then
  forall lst acc, 
    fold_left f acc lst = fold_right f lst acc.

Proof: by induction on lst.
P(lst) = forall acc, 
  fold_left f acc lst = fold_right f lst acc

Base case: lst = []
Show: forall acc, 
  fold_left f acc [] = fold_right f [] acc

  fold_left f acc []
=   { evaluation }
  acc
=   { evaluation }
  fold_right f [] acc

Inductive case: lst = h :: t
IH: forall acc, 
  fold_left f acc t = fold_right f t acc
Show: forall acc, 
  fold_left f acc (h :: t) = fold_right f (h :: t) acc

  fold_left f acc (h :: t)
=   { evaluation }
  fold_left f (f acc h) t
=   { IH with acc := f acc h }
  fold_right f t (f acc h)

  fold_right f (h :: t) acc
=   { evaluation }
  f h (fold_right f t acc)

Now, it might seem as though we are stuck: the left and right sides of the equality we want to show have failed to "meet in the middle." But we're actually in a similar situation to when we proved the correctness of facti earlier: there's something (applying f to h and another argument) that we want to push into the accumulator of that last line (so that we have f acc h).

Let's try proving that with its own lemma:

Lemma: forall lst acc x, 
  f x (fold_right f lst acc) = fold_right f lst (f acc x)

Proof: by induction on lst.
P(lst) = forall acc x, 
  f x (fold_right f lst acc) = fold_right f lst (f acc x)

Base case: lst = []
Show: forall acc x, 
  f x (fold_right f [] acc) = fold_right f [] (f acc x)

  f x (fold_right f [] acc)
=   { evaluation }
  f x acc

  fold_right f [] (f acc x)
=   { evaluation }
  f acc x
=   { commutativity of f }
  f x acc

Inductive case: lst = h :: t
IH: forall acc x, 
  f x (fold_right f t acc) = fold_right f t (f acc x)
Show: forall acc x, 
  f x (fold_right f (h :: t) acc) = fold_right f (h :: t) (f acc x)

  f x (fold_right f (h :: t) acc)
=  { evaluation }
  f x (f h (fold_right f t acc))
=  { interchange lemma }
  f h (f x (fold_right f t acc))
=  { IH }
  f h (fold_right f t (f acc x))

  fold_right f (h :: t) (f acc x)
=   { evaluation }
  f h (fold_right f t (f acc x))

QED

Now that the lemma is completed, we can resume the proof of the theorem. We'll restart at the beginning of the inductive case:

Inductive case: lst = h :: t
IH: forall acc, 
  fold_left f acc t = fold_right f t acc
Show: forall acc, 
  fold_left f acc (h :: t) = fold_right f (h :: t) acc

  fold_left f acc (h :: t)
=   { evaluation }
  fold_left f (f acc h) t
=   { IH with acc := f acc h }
  fold_right f t (f acc h)

  fold_right f (h :: t) acc
=   { evaluation }
  f h (fold_right f t acc)
=   { lemma with x := h and lst := t }
  fold_right f t (f acc h)

QED

It took two inductions to prove the theorem, but we succeeded! Now we know that the behavior we observed with + wasn't a fluke: any commutative and associative operator causes fold_left and fold_right to get the same answer.

results matching ""

    No results matching ""