Go to the first, previous, next, last section, table of contents.

cdr-linked lists

In Lisp and Scheme, you don't typically string objects together into a list by giving each one a "next" field that points right to the next object. Instead, you create a list of pairs whose car fields hold the pointers to the objects, and whose cdr fields link the pairs together into a "spine."

There isn't really a special list data type in Scheme. A list is really just a sequence of pairs, ending with a null pointer. A null pointer is a list, too--it's a sequence of zero pairs ending in a null pointer. We sometimes talk about "the car of a list" or "the cdr of a list," but what that really means is "the car of the first pair in the list" and "the cdr of the first pair in the list."

Suppose we have a variable foo holding a pointer to a list containing the integers 22, 15, and 6. Here's one way of drawing this situation.

                    +---------+          +---------+        +---------+
     +---------+    | <PAIR>  |          | <PAIR>  |        | <PAIR>  |
 foo |    *----+--->+=========+      +-->+=========+    +-->+=========+
     +---------+    |       22|     /    |       15|   /    |        6|
                    +---------+    /     +---------+  /     +---------+
                    |    *----+---+      |    *----+-+      |    *    |
                    +---------+          +---------+        +---------+

This shows something pretty close to the way things are likely to actually represented in memory. But there's usually a better way of drawing the list, which emphasizes the fact that number values are conceptually pointers to numbers, and which corresponds to the way we usually think about lists:

     +---+    +---+---+      +---+---+      +---+---+
 bar | *-+--->| * | *-+----->| * | *-+----->| * | * |
     +---+    +-+-+---+      +-+-+---+      +-+-+---+
                |              |              |
               \|/            \|/            \|/
               22             15              6       

I've left off the header fields of objects, which are not visible to a Scheme programmer.

I've also drawn pairs in a special way, with the car and cdr fields side-by-side. Putting the fields side-by-side lets us draw the list left-to-right, with the cdr field in a convenient place for its normal use as a "next" pointer. I've drawn the integers outside the pairs, with pointers to them from the car fields, because that's the way things look at the language level.

This emphasizes the fact that lists are generally separate things from the items "in" the list.

We leave off the headers because they're a low-level detail anyway, because they're a hidden implementation detail that may vary from system to system, and because Scheme programmers immediately recognize this kind of two-box drawing of a pair.

A major advantage of Scheme's list structure is that you don't have to modify an object to put it on a list--an object can easily be in many lists at once, because a list is really just a spine of pairs that holds pointers to the items in the list. This is much cleaner than the way people are typically taught to create simple lists in most beginning programming classes. (It's also very natural in a language where all values are pointers---of course lists of objects are really just lists of pointers to objects.)

For example, you can have two lists with the same elements, or some of the same elements, but perhaps in a different order.

     +---+    +---+---+      +---+---+      +---+---+
 bar | *-+--->| * | *-+----->| * | *-+----->| * | * |
     +---+    +-+-+---+      +-+-+---+      +-+-+---+
                |              |              |
               \|/            \|/            \|/
               22             15              6       
               /|\                           /|\
                |                             |
     +---+    +-|-+---+                     +-+-+---+
 baz | *-+--->| * | *-+-------------------->| * | * |
     +---+    +---+---+                     +---+---+ 

Here I've drawn two lists, bar and baz---that is, lists that are the values of the variables bar and baz. bar holds the elements 22, 15, and 6, while baz just holds the elements 22 and 6.

Since these two lists are really just made up of pairs, and they're different pairs, we can modify one list without modifying the other, and without modifying the objects "in" the lists. For example, we can reverse the order of one of the lists without affecting the other.

(We also don't have to create a special kind of list node that has two next fields, so that something can be in two lists at a time. We can just have two separate lists of pairs, or three or four.)

Scheme has a standard way of writing a textual representation of a list. Given the pictured situation, evaluating the expression (display bar) will print (22 15 6). Evaluating the expression (display baz) will print (22 6). Notice that Scheme just writes out a pair of parentheses around the items in the list--it doesn't represent the individual pairs, but just their car values.

Dynamic typing also helps make lists useful. A list of pairs can hold any type of object, or even a mixed bag of different types of objects. So, for example, a pair list can be a list of integers, a list of lists, a list of text characters, or a list of any of the kinds of objects we haven't gotten to yet. It can also be a mixed list of integers, other lists, and whatnot. A few list routines can therefore be useful in a variety of situations--a single list search routine can search any kind of list for a particular target object, for example.

This picture shows two variable bindings, for the variables bar and foo. bar's binding holds a list (10 15 6), while foo's holds a list (22 15 6). We say that these lists share structure, i.e., part of one list is also part of the other.

                    +-------- + 
     +---------+    | <PAIR>  |
 bar |    *----+--->+=========+
     +---------+    |       10| 
                    +---------+ 
                    |    *----+-+ 
                    +---------+  \
                                  \
                    +---------+    \     +---------+        +---------+
     +---------+    | <PAIR>  |     \    | <PAIR>  |        | <PAIR>  |
 foo |    *----+--->+=========+      +-->+=========+    +-->+=========+
     +---------+    |       22|     /    |       15|   /    |        6|
                    +---------+    /     +---------+  /     +---------+
                    |    *----+---+      |    *----+-+      |    *    |
                    +---------+          +---------+        +---------+

This picture may correspond well to how things are represented in memory, but it's a little confusing.

The more common way of drawing this data structure is

     +---+    +---+---+ 
 bar | *-+--->| * | *-+------+
     +---+    +-+-+---+      |
                |            |
               \|/           |
               10            |
                            \|/
     +---+    +---+---+      +---+---+      +---+---+
 foo | *-+--->| * | *-+----->| * | *-+----->| * | * |
     +---+    +-+-+---+      +-+-+---+      +-+-+---+
                |              |              |
               \|/            \|/            \|/
               22             15              6       

Again, this emphasizes the idea that everything's a pointer--conceptually, the pairs hold pointers to the integers.

In the above picture, we can talk about "the car of foo", which really means the value in the car field of the pair pointed at by the value stored in (the binding of) foo. It's (a pointer to) 10. We would often call this "the car of the list foo."

Notice that the cdr of foo is also a list, and it's the same list as the cdr of bar---the cdrs of the first pairs in each list point to the same list.

We can say that the cdr of foo and the cdr of bar "are eq?," because the expression (eq? (cdr foo) (cdr bar)) returns true. That is, (car foo) and (cdr foo) return (pointers to) exactly the same object.


Go to the first, previous, next, last section, table of contents.