Lambda: The Ultimate Declarative/A Different View of LAMBDA
| ←Abstract | Lambda: The Ultimate Declarative by A Different View of LAMBDA |
Lexical and Dynamic Binding→ |
A Different View of LAMBDA [edit]
Historically, LAMBDA expressions in LISP have been viewed as functions: objects which, when applied ordered sets of arguments, yield single values. These single values typically then become arguments for yet other functions. The consistent use of functions in LISP leads to what is called the applicative programming style. Here we discuss a more general view, of which the functional view will turn out to be a special case. We will consider a new interpretation of LAMBDA as an environment operator which performs the primitive declarative operation of renaming a quantity, and we will consider a function call to be a primitive unconditional imperative operator which includes GOTO as a special case. (In an earlier paper [Steele 76] we described LAMBDA as "the ultimate imperative". Here we assert that this was unfortunately misleading, for it is function invocation which is imperative.)
Primitive Operations in Programming Languages [edit]
What are the primitive operations common to all high-level programming languages? It is the data manipulation primitives which most clearly differentiate high-level languages: FORTRAN has numbers, characters, and arrays; PL/I has strings and structures as well; LISP has list cells and atomic symbols. All have, however, similar notions of control structures and of variables.
If we ignore the various data types and data manipulation primitives, we find that only a few primitive ideas are left. Some of these are:
- Transfer of control
- Environment operations
- Side effects
- Process synchronization
Transfer of control may be subdivided into conditional and unconditional transfers. Environment operations include binding of variables on function entry, declaration of local variables, and so on. Side effects include not only modifications to data structures, but altering of global variables and input/output. Process synchronization includes such issues as resource allocation and passing of information between processes in a consistent manner.
Large numbers of primitive constructs are provided in contemporary programming languages for these purposes. The following short catalog is by no means complete, but only representative:
- Transfer of control
- Sequential blocks
GOTOIF-THEN-ELSEWHILE-DO,REPEAT-UNTIL, and other loopsCASESELECTEXIT(also known asESCAPEorCATCH/THROW)- Decision tables
- Environment operations
- Formal procedure parameters
- Declarations within blocks
- Assignments to local variables
- Pattern matching
- Side effects
- Assignments to global (or
COMMON) variables - Input/output
- Assignments to array elements
- Assignments to other data structures
- Assignments to global (or
- Process synchronization
- Semaphores
- Critical regions
- Monitors
- Path expressions
IF-THEN-ELSE, and WHILE-DO form a complete set of control operations. One can even do without IF-THEN-ELSE, though the technique for eliminating it seems to produce more rather than less complexity. {Note No IF-THEN-ELSE} A minimal set should contain primitives which are not only universal but also easy to describe, simple to implement, and capable of describing more complex constructs in a straightforward manner. This is why the semaphore is still commonly used; its simplicity makes it easy to describe more complex synchronization operators. The expositors of monitors and path expressions, for example, go to great lengths to describe them in terms of semaphores [Hoare 74] [Campbell 74]; but it would be difficult to describe either of these "high-level" synchronization constructs in terms of the other.
With the criteria of simplicity, universality, and expressive power in mind, let us consider some choices for sets of control and environment operators. Side effects and process synchronization will not be treated further in this paper.
Function Invocation: The Ultimate Imperative [edit]
The essential characteristic of a control operator is that it transfers control. It may do this in a more or less disciplined way, but this discipline is generally more conceptual than actual; to put it another way, "down underneath, DO, CASE, and SELECT all compile into IFs and GOTOs". This is why many people resist the elimination of GOTO from high-level languages; just as the semaphore seems to be a fundamental synchronization primitive, so the GOTO seems to be a fundamental control primitive from which, together with IF, any more complex one can be constructed if necessary. (There has been a recent controversy over the nested IF-THEN-ELSE as well. Alternatives such as repetitions of tests or decision tables have been examined. However, there is no denying that IF-THEN-ELSE seems to be the simplest conditional control operator easily capable of expressing all others.)
GOTO, however, is that to communicate information from the code gone from to the code gone to it is necessary to use global variables. This was a fundamental difficulty with the CONNIVER language [McDermott 74], for example; while CONNIVER allowed great flexibility in its control structures, the passing around of data was so undisciplined as to be completely unmanageable. It would be nice if we had
Who Pops the Return Address? [edit]
Earlier we showed a translation of BAR into "machine language", and noted that there was no code which explicitly popped a return address; the buck was always passed to another function (F, G, or H). This may seem surprising at first, but it is in fact a necessary consequence of our view of function calls as "GOTOs with a message". We will show by induction that only primitive functions not expressible in our language (SCHEME) perform POPJ; indeed, only this nature of the primitives determines the fact that our language is functionally oriented!
What is the last thing performed by a function? Consider the definition of one:
(DEFINE FUN (LAMBDA (X1 X2 ... XN) <body>))
Now <body> must be a form in our language. There are several cases:
- Constant, variable, or closure. In this case we actually compiled a
POPJin the case ofFACTabove, but we could view constants, variables, and closures (in general, things which "evaluate trivially"in the sense described in [Steele 76]) as functions of zero arguments if we wished, and soGOTOa place which would get the value of the constant, variable, or closure intoRESULT. This place would inherit the return address, and so our function need not pop it. Alternatively, we may view constants, etc. as primitives, the same way we regard integer addition as a primitive (note thatCTA2above required aPOPJ, since we had "open-coded" the addition primitive). (IF <pred> <exp1> <exp2>). In this case the last thing our function does is the last thing<exp1>or<exp2>does, and so we appeal to this analysis inductively.(LABELS <defns> <exp>). In this case the last thing our function does is the last thing<exp>does. This may involve invoking a function defined in theLABELS, but we can consider them to be separate functions for our purposes here.- A function call. In this case the function called will inherit the return address.
Since these are all the cases, we must conclude that our function never pops its return address! But it must get popped at some point so that the final value may be returned.
Or must it? If we examine the four cases again and analyze the recursive argument, it becomes clear that the last thing a function that we define in SCHEME eventually does is invoke another function. The functions we define therefore cannot cause a return address to be popped. It is, rather, the primitive, built-in operators of the language which pop return addresses. These primitives cannot be directly expressed in the language itself (or, more accurately, there is some bases set of them which cannot be expressed). It is the constants (which we may temporarily regard as zero-argument functions), the arithmetic operators, and so forth which pop the return address. (One might note that in the compilation of CURRIED-TRIPLE-ADD above, a POPJ appeared only at the point the primitive "+" function was open-coded as ADD instructions.)