# Scheme: An Interpreter for Extended Lambda Calculus/Section 3

## Section 3: Substitution Semantics and Programming Styles

In the previous section we showed several different SCHEME programs for computing the factorial function. How are they different? We intuitively distinguish recursive from iterative programs, for example, by noting that recursive programs "call themselves" but in the last section we claimed to do iteration with a seemingly recursive program. Experienced programmers "know" that recursion uses up "stack" so a program implemented recursively will run out of stack on a sufficiently large problem. Can we make these ideas more precise? One traditional approach is to model the computation with lambda calculus.

### Reviewing the Lambda Calculus

Traditionally language constructs are broken up into two distinct classes: imperative constructs and those with side-effects—such as assignment and go-to; and applicative constructs—those executed for value—such as arithmetic expressions. In addition, compiled languages often require a third class, declarative constructs, but these are provided primarily to guide the compilation process and do not directly affect the semantics of execution, and so will not concern us here.

Lambda calculus is a model for the applicative component of programming languages. It models all non-imperative constructs as applications of functions and specifies the semantics of such expressions by a set of axioms or rewrite rules. One axiom states that a combination, i.e. an expression formed by a function applied to some arguments, is equivalent to the body of that function with the appropriate arguments substituted for the free occurrences of the formal parameters of the function in its body:

```((LAMBDA <vars> <body>) <args>) = Subst[<args> <vars> <body>]
```

Another axiom requires that the meaning of an expression be independent of the names of the formal parameters bound in the expression:

```(LAMBDA <vars> <body>)
= (LAMBDA <newvars> Subst[<newvars> <vars> <body>])
provided that none of <newvars> appears free in <body>.
```

These constraints force `Subst` to be defined in such a way that an important kind of referential transparency is obtained. Besides these "structural" axioms, others are provided which specify the result of certain primitive functions applied to specific arguments. We shall not be concerned with these problems here—we will assume a small reasonable set of primitive functions.

### Recursive programs

Now, let's see how lambda calculus may be used (informally) to model a computation. Consider the standard definition of the factorial function:

```(DEFINE FACT
(LAMBDA (N) (IF (= N 0) 1
(* N (FACT (- M 1))))))
```

We are being very informal—lambda calculus as presented by [Church] does not include such constructs as `DEFINE`, `IF`, or `=`, `*`, or even `1`! The "usual" lambda calculus construct for defining recursive functions is a rather obscure object called the "fixed-point" operator. We have been lax to avoid the hassle of "rigor mortis" in this tutorial paper. Similarly, `IF` is the SCHEME conditional construct we will use for convenience, it reduces to its second or third argument depending on whether the first reduces to `TRUE` or `FALSE`. The objects `*`, `=`, `0`, `1`, etc. may be thought of as abbreviations for complex lambda expressions (such as Church numerals) whose details we are not interested in. On the other hand, we may think of them as primitive expressions, defined by additional axioms; this viewpoint leads to practical interpreter implementations.

Now let's reduce the expression `(FACT 3)`. We will perform the expression reductions, except for the `IF` primitive, in Applicative Order (call by value), though this is not necessary, as we will discuss later. We display a "trace" of the substitutions:

```=>    (FACT 3)
=>    (IF (- 3 0) 1 (* 3 (FACT (- 3 1))))
=>    (* 3 (FACT (- 3 1)))
=>    (* 3 (FACT 2))
=>    (* 3 (IF (= 2 0) 1 (* 2 (FACT (- 2 1)))))
=>    (* 3 (* 2 (FACT (- 2 1))))
=>    (* 3 (* 2 (FACT 1)))
=>    (* 3 (* 2 (IF (= 1 0) 1 (* 1 (FACT (- 1 1))))))
=>    (* 3 (* 2 (* 1 (FACT (- 1 1)))))
=>    (* 3 (* 2 (* 1 (FACT 0))))
=>    (* 3 (* 2 (* 1 (IF (= 0 0) 1 (* 0 (FACT (- 0 1)))))))
=>    (* 3 (* 2 (* 1 1)))
=>    (* 3 (* 2 1))
=>    (* 3 2)
=>    6
```

You will note that we have calculated `(fact 3)` by a process wherein each expression is replaced by an expression which is provably equivalent to it via an axiom or which is produced by application of a primitive function.

Consider the "iterative" definition of `FACT`. Although it appears to be recursive, since it "calls itself", we will see that it captures the essence of our intuitive notion of iteration.

```(DEFINE FACT
(LAMBDA (N)
(LABELS ((FACT1
(LAMBDA (M ANS)
(IF (= M 0) ANS
(FACT1 (- M 1) (* M ANS))))))
(FACT1 N 1))))
```

Let us now compute `(fact 3)`.

```=>    (FACT 3)
=>    (FACT1 3 1)
=>    (IF (= 3 0) 1
(FACT1 (- 3 1) (* 3 1)))
=>    (FACT1 (- 3 1) (* 3 1))
=>    (FACT1 2 (* 3 1))
=>    (FACT1 2 3)
=>    (IF (= 2 0) 3
(FACT1 (- 2 1) (* 2 3)))
=>    (FACT1 (- 2 1) (* 2 3))
=>    (FACT1 1 (* 2 3))
=>    (FACT1 1 6)
=>    (IF (= 1 0) 6
(FACT1 (- 1 1) (* 1 6)))
=>    (FACT1 (- 1 1) (* 1 6))
=>    (FACT1 0 (* 1 6))
=>    (FACT1 0 6)
=>    (IF (= 0 0) 6
(FACT1 (- 0 1) (* 0 6)))
=>    6
```

Notice that the expressions involved have a fixed maximum size independent of the argument to `FACT`! In fact, as Marvin Minsky pointed out, successive reductions produce a cycle of expressions which are identical except for the numerical quantities involved. Looking back, we may note by way of comparison that the recursive version caused creation of expressions proportional in size to the argument. This is why we think that this version of `FACT` is iterative rather than recursive. At each stage of the iterative version the "state" of the computation is summarized in two variables, the counter and the answer accumulator, while at each stage of the recursive version the "state" contains a chain of pieces each of which contains a component of the state. In the recursive version of `FACT`, for example, the state contains the sequence of multiplications to be performed upon return from the bottom. It is true that the iterative factorial also can produce expressions of arbitrary size, since the number of bits needed to express factorial of n grows with n; but this is a property of the numbers calculated by the function which is implemented in iterative style, and not of the iterative control structure itself. A recursive control structure inherently creates expressions of unbounded size as a function of the recursion depth, while an iterative control structure produces a cycle of equivalent expressions, and so the expressions are of approximately the same size no matter how many iteration steps are taken. This is the essence of the difference between the notions of iteration and recursion. Hewitt [MAC, p. 234] made a similar observation in passing, expressing the difference in terms of storage used in program execution rather than in terms of intermediate expressions produced by substitution semantics.

### Continuation Passing Recursion

Remember the other way to compute factorials?

```(DEFINE FACT
(LAMBDA (N C)
(IF (= M 0) (C 1)
(FACT (- N 1)
(LAMBDA (A) (C (* N A)))))))
```

This looks iterative on the surface! but in fact it is recursive. Let's compute `(FACT 3 ANSWER)`, where `ANSWER` is a continuation which is to receive the result of `FACT` applied to `3`; that is, the last thing `FACT` should do is apply the continuation `ANSWER` to its result.

```=>    (FACT 3 ANSWER)
=>    (IF (= 3 0) (ANSWER 1)
(FACT (- 3 1) (LAMBDA (A) (ANSWER (* 3 A)))))
=>    (FACT (- 3 1) (LAMBDA (A) (ANSWER (* 3 A))))
=>    (FACT 2 (LAMBDA (A) (ANSWER (* 3 A))))
=>    (IF (= 2 0) ((LAMBDA (A) (ANSWER (* 3 A))) 1)
(FACT (- 2 1)
(LAMBDA (A)
((LAMBDA (A) (ANSWER (* 3 A)))
(* 2 8)))))
=>    (FACT (- 2 1)
(LAMBDA (A)
((LAMBDA (A) (ANSWER (* 3 A)))
(* 2 A))))
=>    (FACT 1
(LAMBDA (A)
((LAMBDA (A) (ANSWER (* 3 A)))
(* 2 A))))
```
```=>    (IF (= 1 0)
((LAMBDA (A)
((LAMBDA (A) (ANSWER (* 3 A)))
(* 2 A)))
1)
(FACT (- 1 1)
(LAMBDA (A)
((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
(* 1 A)))))
=>    (FACT (- 1 1)
(LAMBDA (A)
((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
(* 1 A))))
=>    (FACT 0
(LAMBDA (A)
((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
(* 1 A))))
=>    (IF (= 0 0)
((LAMBDA (A)
((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
(* 1 A)))
1)
(FACT (- 0 1)
(LAMBDA (A)
((LAMBDA (A)
((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
(* 1 A)))
(* 0 A)))))
=>    ((LAMBDA (A)
((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
(* 1 A)))
1)
```
```=> ((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
(* 1 1))
=> ((LAMBDA (A)
((LAMBDA (A)
(* 2 A)))
1)
=> ((LAMBDA (A)
(* 2 1))
=> ((LAMBDA (A)
2)
```

Note that we have computed factorial of 3 (and are about to give this result to the continuation), but in the process no combination with `FACT` in the first position has ever been reduced except as the outermost expression. If we think of the computation in terms of evaluation rather than substitution, this means that we never returned a value from any application of the function `FACT`! It is always possible, if we are willing to specify explicitly what to do with the answer, to perform any calculation in this way: rather than reducing to its value, it reduces to an application of a continuation to its value (cf. [Fischer]). That is, in this continuation-passing programming style, a function always "returns" its result by "sending" it to another function. This is the key idea.

We also note that by our previous observation, this program is essentially recursive in that the expressions produced as intermediate results of the substitution semantics grow to a size proportional to the depth. In fact, the same information is being stored in the nested continuations produced by this program as in the nested products produced by the traditional recursion—what to do with the result.

One might object that this `FACT` is not the same kind of object as the previous definition, since we can't use it as a function in the same manner. Note, however, that if we supply the continuation `(LAMBDA (X) X)`, the resulting combination `(FACT 3 (LAMBDA (X) X))` will reduce to `6`, just as with traditional recursion.

One might also object that we are using function values—the primitives `=`, `-`, and `*` are functions which return values, for example. But this is just a property of the primitives; consider a new set of primitives `==`, `--`, and `**` which accept continuations (indeed, let `==` take two continuations: if the predicate is `TRUE` call the first, otherwise call the second). We can then define `fact` as follows:

```(DEFINE FACT
(LAMBDA (N C)
(== N 0
(LAMBDA () (C 1))
(LAMBDA ()
(-- N 1
(LAMBDA (M)
(FACT M (LAMBDA (A) (** A N C)))))))))
```

We can see here that no functional application returns a value in a computation of factorial in this situation. We believe that functional usage, where applicable (pun intended), is more perspicuous than continuation-passing. We shall therefore use functional primitives such as `*` rather than `**` wherever possible, keeping in mind that we could use `**` instead if we wished.