--- Copyright 1994 Paul R. Wilson Until 1998, you may use these notes freely for noncommercial purposes, as long as this notice is preserved in all copies. After January 1, 1998, you must obtain permission from the author to copy or redistribute. --- ENVIRONMENTS AND CONTINUATIONS In most conventional languages, binding environment frames are allocated in an ACTIVATION STACK; calling a procedure allocates an ACTIVATION RECORD (or "stack frame") that holds the return address for the call and the variable bindings for the procedure being called. In most language implementations, the stack is a contiguous area of memory, and pushing an activation record onto the stack is done by incrementing a pointer into this contiguous area by the size of the stack frame. Removing a stack frame is done by decrementing the pointer by the size of the stack frame. Scheme implementations are quite different. As we've explained previously, variable bindings are NOT allocated in a stack, but instead in environment frames on the garbage-collected heap. This is necessary so that closures can have indefinite extent, and can count on the environments they use living as long as is necessary. The garbage collector will eventually reclaim the space for variable bindings in frames that aren't captured by closures. Scheme implementations also differ from conventional language implementations in how they represent the saved state of callers. (In a conventional language implementation, the callers' state is in two places: the variable bindings are in the callers' own stack frames, and the return address (where to resume the caller after a procedure returns) is stored in the callee's stack frame.) In a Scheme implementation, the caller's state is saved in a record on the garbage-collected heap, called a PARTIAL CONTINUATION. It's called a continuation because it says how to resume the caller when we return into it. It's called a PARTIAL continuation because that record, by itself, it only tells us how to resume the caller, not the caller's caller or the caller's caller's caller. On the other hand, each partial continuation holds a pointer to the partial continuation for ITS caller, so a chain of continuations represents how to continue the whole computation: how to resume the caller, and when it returns, how to resume ITS caller, and so on until the computation is complete. This chain is therefore called a FULL CONTINUATION. In most Scheme implementations, a special REGISTER called the CONTINUATION REGISTER is used to hold the pointer to the partial continuation for the caller of the currently-executing procedure. When we call a procedure, we can package up the state of the caller as a record on the heap (a partial continuation), and push that partial continuation onto the chain of continuations hanging off the continuation register. part. cont. (saved state of caller's /\ caller's caller) | | part. cont. (saved state of caller's caller) /\ | +-------+ | CONT | +---+-----> part. cont. (saved state of caller) +-------+ (It is often convenient to draw stacks and continuations as growing downward, which is our convention here---the newer elements are on the bottom.) Note that the continuation register may be a register in the CPU, or it may just be a particular memory location that our implementation uses for this purpose. The point is just that when we're executing a procedure, we always know where to find a pointer to the partial continuation that lets us resume its caller. We will sometimes abbreviate this register's name as CONT. A typical implementation of Scheme using a compiler has several important registers that encode the state of the currently-executing procedure: * the ENVIRONMENT REGISTER (ENVT) holds the pointer to the chain of environment frames that make up the environment that the procedure is executing in. * the PROGRAM COUNTER register (PC) holds the pointer to the next instruction to execute. In a normal system that compiles to normal machine code, this is the actual program counter of the underlying hardware. * the CONTINUATION register (CONT), as we've said, holds the pointer to the chain of partial continuations that lets us resume callers. This is very roughly the equivalent of an activation stack pointer. Before we call a procedure, we must save a continuation if we want to resume the current procedure after the callee returns. (We don't have to do this for tail-calls, as we'll explain later.) Since the important state of the currently-executing procedure is in the registers listed above, we will create a record that has fields to hold them, and push that on the continuation chain. We will save the value of the CONT, ENVT, and PC registers in the partial continuation, then put a pointer to this new partial continuation in the continuation registers. We also need to save any other state that the caller will need when it resumes, as suggested by the ellipsis below. (We'll discuss what else goes in a partial continuation when we talk about compilers in detail.) old cont. /\ | +-------+ | +-------+ |p.cont.| | CONT | +---+------->+=======+ | +-------+ cont | +---+-------+ +-------+ envt | +---+-------->old envt +-------+ pc | +---+-------->return address +-------+ | + ... | | +-------+ Notice that since we saved the old value of the continuation register in the partial continuation, that serves as the ``next'' pointer in the linked list that makes up the full continuation. This is exactly as it should be. The value of the continuation register is part of the caller's state, and saving it naturally constructs a linked list, because each procedure's state is fundamentally linked to the state of its caller. Saving the return address is a little bit special---rather than just copying the program counter and saving it, we must save the address we want to resume at when we resume this procedure. Once a procedure has pushed a continuation, it has saved its state and can call another procedure. The other procedure can use the ENVT, CONT, and PC registers without destroying the values of those registers needed by the caller. This is called a CALLER SAVES register usage convention; the assumption is that the callee is allowed to freely clobber the values in the registers, so it's the caller's responsibility to save any values it will need when it resumes. To do a PROCEDURE RETURN, it is only necessary to copy the register values out of the continuation that's pointed to by the CONT register. This will restore the caller's environment and its pointer to its caller's continuation, and setting the PC will branch to the code in the caller where execution should resume. We often call this ``popping'' a continuation, because it's a stacklike operation---saving a (partial) continuation pushes the values in registers onto the front of the ``stack'', and restoring one pops the values back into the registers. (As we will explain later, however, Scheme continuation chains don't always observe this simple stack discipline, which is why they can't be implemented in contiguous arrays.) If we save state and do a procedure call, and before returning our caller save's ITS state and does a procedure call, the chain of continuations gets longer. For the most part, this is like pushing activation information on a stack. /\ | +-------+ | |p.cont.| | +=======+ | cont | +---+-------+ +-------+ envt | +---+-------->old envt +-------+ pc | +---+-------->return address +-------+ | + ... | | +-------+ _ |\ \ \ \ /\ \ . +-------+ | +-------+ |p.cont.| | CONT | +---+------->+=======+ / +-------+ cont | +---+---' +-------+ envt | +---+-------->old envt +-------+ pc | +---+-------->return address +-------+ | + ... | | +-------+ Notice that when we say we save the "state" of the caller, we mean the values in our important registers, but we don't directly save particular variable values---when we save the environment pointer, we don't save the values in the bindings in the environment. If other code then executes in that same environment and changes those values, the new values will be seen by this procedure when it returns and restores the environment pointer. This policy has two important consequences: * we can save an environment pointer into a continuation very quickly, and restore it quickly, because we're just saving and restoring one pointer, and * it ensures that environments have the right semantics: closures that live in the same environment SHOULD see each others' changes to variables. This is one of the ways that procedures are supposed to be able to communicate. (Consider implementing a module using LETREC---the local procedures may want to communicate by shared access to a private variable.) Executing a return ("popping a continuation") does not modify the partial continuation being popped---it just involves getting values out of the continuation and putting them into registers. Continuations are thus created and used NONDESTRUCTIVELY, and the continuations on the heap form a graph that reflects the pattern of non-tail procedure calls. Usually, that graph is just a tree, because of the tree-like pattern of call graphs, and the current ``stack'' of partial continuations is just the rightmost path through that graph, i.e., the path from the newest record all the way back to the root. Consider the following procedures, where A calls B twice, and each B is called, it calls C twice: (define (a) (b) (b) #t) (define (b) (c) (c) #t) (define (c) #f) All of these calls are non-tail calls, because none of the procedures ever ends in a (tail) call. Suppose we call A after pushing a continuation for A's caller, then A calls B the first time. A will push a continuation to save its state, then call B. While executing B, B's state will be in the registers, including a pointer to the continuation for A in the CONT register. cont for A's caller / / cont. for A /\ +---+ | CONT | +-+----+ +---+ B will then push a continuation and call C. cont for A's caller / / cont. for A / / cont. for B /\ | +-|-+ CONT | + | +---+ When C returns, it will restore B's state by popping the partial continuation's values into registers. At this point, the CONT register will point past the continuation for B to the continuation for A. cont for A's caller / / cont. for A / /\ / | cont. for B | | +---+ | CONT | +-+------+ +---+ Then B will push another continuation and call C again. cont for A's caller / / cont. for A / \ / \ cont. for B cont for B /\ | +---+ | CONT | +-+----------+ +---+ Again, C will return, restoring B's state, and the CONT register will point past the continuation for B at the continuation for A. cont for A's caller / / cont. for A <-------+ / \ | / \ | cont. for B cont for B | | +---+ | CONT | +-+-------------------+ +---+ After returning to A, the CONT register will point past the continuation for A to the continuation for A's caller. Then before A calls B again, it will push another continuation to save its state. cont for A's caller / \ / \ cont. for A cont for A / \ /\ / \ | cont. for B cont for B | | +---+ | CONT | +-+----------------------+ +---+ Then A will return and the CONT register will point past the continuation for A to the continuation for A's caller. cont for A's caller <--+ / | / | cont. for A | / \ | / \ | cont. for B cont for B | | +---+ | CONT | +-+----------------------------+ +---+ This continues in the obvious way, so that at the time of the fourth and last call to C, the continuations on the heap look like this: cont for A's caller / \ / \ cont. for A cont. for A / \ / \ / \ / \ cont for B cont for B cont for B cont for B /\ | +---+ | CONT | +-+-------------------------------+ +---+ Most of the time, the rest of this graph becomes garbage quickly---each continuation holds pointers back up the tree, but nothing holds pointers DOWN the tree. Partial continuations therefore usually become garbage the first time they're returned through. The fact that this graph is created on the heap will allow us to implement call-with-current-continuation, a.k.a. call/cc, a very powerful control construct. (We will discuss call/cc in detail later.)