info scheme
,
but most of the (enormous amount of) information
in this info-page is intended for people who basically know the language
and want to look up some detail they've forgotten. For those who
don't basically know the language, this will have to do.
The students for whom I originally wrote these notes had spent a year learning Pascal, and were now required to learn a very different language called Scheme. The natural tendency of such students was to write in Scheme with a strong "Pascal accent": they wrote the same way they would have written in Pascal, but changed the semicolons, commas, and parentheses to make it legal Scheme. As a result, even the students who got these Pascal-programs-in-Scheme to work correctly missed the opportunity to learn the different perspective Scheme brings to programming. My main concern, therefore, was to get students to think in a manner natural to the Scheme language.
The Web page, however, has taken on a life of its own, and is apparently being used as a reference by programming teachers at a number of other schools. (If you are such a teacher, feel free to link to this page, but I would prefer that you link, not copy it so that I can easily make corrections.) Accordingly, I've revised it somewhat to reflect its new, broader audience.
Using and programming in Scheme requires a very different way of thinking than does using and programming in Pascal, C, C++, or many other "procedural" languages. The most obvious difference is that Scheme is (traditionally) interactive. When working in most procedural languages, you write a program in a text editor, get out of the editor, compile the program, run the program, find some bugs, and go back to the editor to fix them. In Scheme, you type commands one at a time and you see the responses to them immediately; the programming process consists of typing particular commands that have the effect of defining new commands. You can put such definitions into a file with a text editor and then load it as a unit, and you can mix this way of doing things freely with the more usual command-response way. Indeed, one way to load a file is to type a command which has the effect of reading in a specified file just as though you were typing it.
I've used the word "command" in the above paragraph. In fact, a better
word would be "expression": Scheme expressions are primarily thought of
as having a value (for example, the expression (+ 3 5)
has the value 8
), and only secondarily as doing
something. This is another big difference between Scheme and the
procedural languages you've seen before: in Pascal, each statement
does something -- it assigns a variable, or reads a number, or
calls a procedure -- whereas in Scheme, each expression
(like (+ 3 5)
has a value, but only a few special ones (like (define x 5)
)
do anything as a side effect. On the other hand, it means that Scheme
has a strong resemblance to the language of algebra.
Many high school and university computer science programs now teach Scheme as a first language, and I've seriously considered this for our program. For a discussion of why this might be a good idea, see the article Mastering the AP CS Curriculum Without Using C++ from the TeachScheme! Project.
for
-loop in C or Pascal.
if
if
...else
while
for
do
...while
switch
...case
...default:
goto
continue
break
return
cond
" and "do
" statements. In rare instances
a Scheme programmer may use a more specialized construct such as
"case
" or a powerful one like
"call-with-current-continuation
"
(which has no analogue in C or Pascal),
but "cond
" alone probably constitutes the vast majority of
Scheme control constructs in practice.
Furthermore, where a C programmer is stuck with the twelve constructs
above, a Scheme programmer can easily define new control constructs (see
below).
3
, 355/113
, or 3.14
;
y
, or +
, or add-course
;
"Bloch"
or "Software II"
; and
(+ 3 (* 4 x))
,
("Trina" "Ian" "Sherri" "Keith" "Peg")
,
or (blue red green (light grey))
.
Numbers, symbols, and strings are collectively referred to as atoms, so called because they cannot be broken down into smaller meaningful pieces. Lists, on the other hand, are the most common kind of complex expressions, i.e. expressions made up of smaller pieces which would have meaning by themselves. One of Scheme's strong points is the ease with which it allows the programmer to manipulate lists. Lists are typed in, and printed out, in a standard form: a pair of parentheses around a sequence of list elements, each separated by at least one space from the previous and next list elements.
Practically, a symbol looks like a variable name: it
consists of a sequence of one or more letters, numbers, and punctuation
marks (most often hyphens and underscores) without intervening space.
Philosophically, a symbol is a data object which has no inherent meaning
(as opposed to a number, which has various built-in behaviours with
respect to arithmetic). Symbols can be used as
the names of variables, or they can be used as data in their own right.
For example, if you were keeping track of birds you had seen, you might
type
(define birdlist '(crow raven pigeon starling sparrow hawk))
This defines a new symbol named "birdlist
" as a variable,
whose value is a list containing the symbols named "crow
",
"raven
", etc.
These symbols are all distinct from one another, but they have no
built-in properties; indeed, the only meaning ascribed to any of them is
that birdlist
is a variable name whose current value is
such-and-such a list. (I'll explain the apostrophe before the list of
bird names in a moment.)
These typeless variables present both advantages and disadvantages for the programmer. The obvious advantage is the saving of time figuring out and typing declaration statements. A less obvious advantage is the ability to write a function that works equally well on a variety of different types of data. (For example, in Pascal it is impossible to write a general "matrix multiplication" function that works for all sizes of matrices; if you want to work with 2x2, 3x3, 4x4, and 10x10 matrices, you need to write four versions of the same function. In Scheme, you could easily write a single matrix-multiplication function that looked at its arguments to see what size they were.) On the other hand, Scheme will silently allow you to assign a floating-point number to a variable that you thought contained a list of character-strings, thus creating a potentially hard-to-find bug, while C or Pascal would give you a compiler error message before you had a chance to even run the buggy program.
Scheme numbers, on the other hand, more closely approximate a mathematician's notion of "number". A Scheme integer is essentially unlimited in range (until you use up all the memory in your computer representing it). A Scheme fraction like 1/3 is stored exactly as a ratio of two (essentially unlimited) integers, so 1/3+1/3+1/3 is 1, as an ordinary equality test will tell you. The language also handles IEEE-standard floating-point numbers, of course, labelling them as "inexact" to distinguish them from exact integers and fractions. Scheme also supports complexes and interval arithmetic, either of which would require defining a new compound data structure in a language like C or Pascal.
Each kind of expression is evaluated in a different way:
3.14
,
the printed result is 3.14
.
x
has the value
(blue red)
, and you type x
, the printed result is
(blue red)
.
"Bloch"
,
the printed result is "Bloch"
.
define
had a chance to do its job.
For example, let's trace what happens when you type in
(+ y 4)
where y
is a variable currently holding the value 6.
(6 4)
, producing as result the number 10.
For another example, let's trace what happens when you type in
(define z (+ y 4))
where y
is still a variable holding the value 6.
z
and (+ y 4)
, but merely passes
them on to the built-in function.
(+ y 4)
), which is
evaluated as in the previous example to yield the number 10.
'bluejay
bluejay
, and simply returns
the symbol bluejay
, which is
then printed: bluejay
'bluejay
Similarly, if you type
'(+ 3 4)
the evaluator does not add 3 and 4 and return 7, but rather
returns the three-element list it was given:
(+ 3 4)
quote
, which simply takes an argument,
doesn't evaluate it, and returns it unaltered. Thus when you type
'bluejay
(quote bluejay)
.
The evaluator then treats quote
as a special form,
doesn't evaluate the argument bluejay
, and returns
the symbol bluejay
.
>
Similarly, if you type '(crow raven bluejay)
(quote (crow raven bluejay))
(crow raven bluejay)
list
function:
(list 4 5 6 7) ;Value: (4 5 6 7)Note that the
list
function takes as many arguments as you
choose to give it, and constructs a list with that many elements.
(list 5 y birdlist '(a b c)) ;Value: (5 6 (crow raven pigeon starling sparrow hawk) (a b c))In this example, some of the elements in the list were variables, and they were replaced by their values.
list
is not a
special form: it evaluates its arguments in a perfectly ordinary way
before putting them together into a list. Note also that a list can be
an element of another list.
To get individual elements from a list, you can use the built-in functions
first
, second
,...:
(first birdlist) ;Value: crow (second birdlist) ;Value: raven (third birdlist) ;Value: pigeonOnly the first few such functions are defined, and in practice most Scheme programmers only use
first
, second
, and perhaps
third
. The reason is that a list is not an array, and
should not be used for the same purposes as an array: if you really want
to get at the i-th element of a list, you're better off with a
real array. However, lists make it easy to work with "the rest of" a
list: Scheme has a built-in function
rest
that gives you "all but the first element".
(A few old-fashioned Scheme programmers use the older names "car" and "cdr" instead of "first" and "rest". The names "car" and "cdr" are something of an embarrassment: they refer to two machine instructions on a particular model of IBM computer popular in the late 1950's, standing for "contents of the address register" and "contents of the data register". Don't worry about them, but if you hear somebody talking about them, you know it's a Lisp/Scheme old-timer.)
(rest birdlist) ;Value: (raven pigeon starling sparrow hawk) (first birdlist) ;Value: crow (first (rest birdlist)) ;Value: raven (second birdlist) ;Value: raven (first (rest (rest birdlist))) ;Value: pigeon (third birdlist) ;Value: pigeon (first (rest (rest (rest birdlist)))) ;Value: starlingNote that the names "second", "third", etc. are defined only for convenience: as long as you have "first" and "rest", you can get at any list element you wish.
So "first", "rest", and their relatives
allow you to take a list apart. Scheme also provides functions to put a
list together. The simplest one is cons
, which takes two
arguments (an object and a list) and constructs a new list with the
object inserted in front of the list:
(cons 'swan birdlist) ;Value: (swan crow raven pigeon starling sparrow hawk) (cons 4 '(5 6 7)) ;Value: (4 5 6 7Note that
cons
does not change any of its arguments
(indeed, most Scheme functions don't):
birdlist ;Value: (crow raven pigeon starling sparrow hawk) (cons 'swan birdlist) ;Value: (swan crow raven pigeon starling sparrow hawk) birdlist ;Value: (crow raven pigeon starling sparrow hawk)The simplest possible list is called "the empty list" or "the null list".
() ;Value: () '() ;Value: () (cons 'lonely ()) ;Value: (lonely) (cons 'a (cons 'b ())) ;Value: (a b)Note that
cons
-ing something with the empty list
()
produces, as you would expect, a list starting with the
"something", but not containing any other elements. A list with one
element is very different from that element itself!
'lonely ;Value: lonely (list 'lonely) ;Value: (lonely) (cons 'lonely '()) ;Value: (lonely) (cons birdlist birdlist) ;Value: ((crow raven pigeon starling sparrow hawk) crow raven pigeon starling sparrow hawk)The last example illustrates that
cons
does not care
whether its first argument is itself a list; cons
treats it as
a single object to be inserted into another list. If you want to
combine two or more lists, you can use the append
function:
(append birdlist birdlist) ;Value: (crow raven pigeon starling sparrow hawk crow raven pigeon starling sparrow hawk) birdlist ;Value: (crow raven pigeon starling sparrow hawk)Note that
append
, too, does not change its arguments.
printf
and
scanf
, and you can also define your own C functions, which are
then treated the same way as C library functions. In the same way,
although Scheme provides many built-in functions like list
,
+
, first
, cons
, display
,
etc., you can equally well define your own functions.
Here's a simple example:
(define (add1 x) (+ x 1))) ;Value: add1This defines a new function named "add1" which takes one parameter, "x", and returns the value of the expression "(+ x 1)". After typing this into a Scheme interpreter, we can use "add1" freely, e.g.
(add1 4) ;Value: 5 (define y 6) ;Value: y (add1 y) ;Value: 7and even use it in other functions that you define:
(define (add3 x) (add1 (add1 (add1 x)))) ;Value: add3 (add3 y) ;Value: 9 y ;Value: 6For another example, suppose we want to build a list with three copies of something:
(define (threecopies x) (list x x x)) ;Value: threecopies (threecopies 4) ;Value: (4 4 4) (threecopies birdlist) ;Value: ((crow raven pigeon starling sparrow hawk) (crow raven pigeon starling sparrow hawk) (crow raven pigeon starling sparrow hawk))
((lambda (x) (+ x 1)) 6) ;Value: 7This is of course an unrealistic example, since any reasonable Scheme programmer would say
(+ 1 6)instead. But the function "threecopies" above provides a more realistic example. If you didn't plan to use the operation "make a list with three copies of x" in the future, you might skip defining "threecopies" and just say
((lambda (x) (list x x x)) birdlist) ;Value: ((crow raven pigeon starling sparrow hawk) (crow raven pigeon starling sparrow hawk) (crow raven pigeon starling sparrow hawk))or
((lambda (x) (list x x x)) '(a b c d e f g)) ;Value: ((a b c d e f g) (a b c d e f g) (a b c d e f g))In this case, using an unnamed function would be shorter and less error-prone than writing
(list '(a b c d e f g) '(a b c d e f g) '(a b c d e f g))
So what is this "lambda" thing, anyway? "lambda" is another special
form used for building functions. The first argument to "lambda" is an
example of how the function is to be called:
(func-name param1 param2 ...)
in parentheses, the function name and a list of its parameters.
The second argument is an expression which is the
body of the function. In fact, although we defined the "add1" function
above with the syntax
(define (add1 x) (+ x 1))some Scheme programmers prefer an alternate syntax:
(define add1 (lambda (x) (+ x 1)))This way of stating things is longer and less convenient, but makes clearer what's happening: we define a new symbol named "add1" and give it, as its value, a function which takes an argument which it calls "x" and returns x+1. Thus typing "
(add1 5)
" is
exactly equivalent to typing "((lambda (x) (+ x 1)) 5)
", since
"add1
" has the function "(lambda (x) (+ x 1))
" as its
value. Similarly, the following two function calls are exactly
equivalent:
(threecopies '(a b c d e f g)) ;Value: ((a b c d e f g) (a b c d e f g) (a b c d e f g)) ((lambda (x) (list x x x)) '(a b c d e f g)) ;Value: ((a b c d e f g) (a b c d e f g) (a b c d e f g))
if
" is a special form with three arguments.
It evaluates the first; if the first
evaluates to "#t
"
(Scheme's notation for "true"),
it evaluates and returns the second argument, and if not, it evaluates
and returns the third argument. For example,
x ;Value: 3 y ;Value: 6 (> y x) ;Value: #t (< y x) ;Value: #f (if (> y x) 'raven 'bluebird) ;Value: raven (if (< y x) 'raven 'bluebird) ;Value: bluebird
In C and Pascal, it's common to write a sequence of statements like "if... then... else if... then... else if... then... else...". The same can be done in Scheme, but the parentheses start nesting very deeply:
(if (< x y) 'x-less-than-y (if (= x y) 'x-equals-y (if (= x (+ y 1)) 'x-one-more-than-y 'x-at-least-two-more-than-y)))A more convenient, and incidentally more general, way to do this is with the "
cond
" special form. "cond
" takes one or more
arguments, each of which is a list of expressions. It evaluates the
first expression in the first list: if it's non-null, it then evaluates
the rest of the expressions in the first list and returns the value of
the last one; if it's null, on the other hand, "cond
" goes on
to the second list and does the same thing: evaluate the first
expression, see whether it's null, and so on.
In other words,
(cond (#t expr1b) (expr2a expr2b) ... (exprna exprnb))is equivalent to
expr1b
, while
(cond (#f expr1b) (expr2a expr2b) ... (exprna exprnb))is equivalent to
(cond (expr2a expr2b) ... (exprna exprnb))
So the following "cond
"
form has the same effect as the above three nested "if
"s:
(cond ((< x y) 'x-less-than-y) ((= x y) 'x-equals-y) ((= x (+ y 1)) 'x-one-more-than-y) (#t 'x-at-least-two-more-than-y))By the way, you can use the word "else" in place of exprna, producing the following slightly more readable equivalent of the above:
(cond ((< x y) 'x-less-than-y) ((= x y) 'x-equals-y) ((= x (+ y 1)) 'x-one-more-than-y) (else 'x-at-least-two-more-than-y))
Recursion, in a programming language, simply means a function calling
itself. Of course, if the function f(x)
is computed by calling
f(x)
again, the computer will go into an infinite loop, asking
the same question over and over again, so it's important that each
recursive call be on a simpler version of the same question.
Perhaps the most well-known example is the factorial function,
(define (factorial n) (if (<= n 0) 1 (* n (factorial (- n 1)))))That is, the factorial of
n
is 1 if n <= 0
, and
otherwise it can be computed by first computing the factorial of
n-1
and multiplying the result by n
. Thus
n
,
could be found easily if only we knew the answer to a simpler question,
the factorial of n-1
. This is the most common form of
recursive functions, both in Scheme and in procedural languages like C and
Pascal.
Recursive functions are generally used when the form of data the function uses is itself self-referential, and can be thought of as adding successive layers of complexity to a simple "base case". For example, the most common base case for working with natural numbers is 0; the most common base case for working with lists is the empty list.
I usually use the following step-by-step approach to writing recursive functions.
cond
with exactly as many clauses as the
total number of base cases and recurrences.
I recommend writing the following functions as exercises in recursive programming. These are not homework assignments, to be turned in for a grade, but I might put a recursion problem on your next exam.
#t
if x is an element of the list,
#f
otherwise
n
-th element of the list,
assuming n >= 1
list
that aren't eql?
to the specified element
x
list
;
note that this is not necessarily the same as (length list),
because a list might contain another list of atoms as one of its
elements. For example, (length '(a (b c) d)) = 3, but
(count-atoms '(a (b c) d)) = 4.
list1
or list2
, without
repeats (you may assume that list1
and list2
themselves don't contain any repeats). I suggest using
eql?
for comparing two objects
list1
and list2
,
without repeats (see above).
One implication of this is that in Scheme, you're not limited to the control constructs the language designer thought would be useful; you can make up your own. In Pascal, for example, the predefined control constructs are
This property is necessary if the language is to be cleanly defined or
written in itself, which is nice because you only have to learn one
language to understand what's going on. For example, perhaps the most
important function in Scheme is eval, which
takes a Scheme object representing an expression, evaluates the
expression, and returns the result. (The Scheme version of
eval takes a second "environment" parameter; we'll
discuss this later.) So one could write a Scheme interpreter
(at least, the read-eval-print loop) in a few lines of Scheme:
(define scheme (lambda () (print (eval (read) (the-environment))) (scheme)))
Suppose, for another example, you were trying to program a process with twenty steps, each depending on the previous ones, and you didn't know until run-time how many of them to perform. In Pascal:
case n of 1: step1; 2: begin step1; step2; end; 3: begin step1; step2; step3; end; ... 20: begin step1; step2; ... step20; end; end;or perhaps
if (n >= 1) then begin step1; if (n >= 2) then begin step2; if (n >= 3) then begin ... end; end; end;or maybe even
for i := 1 to n do case i of 1: step1; 2: step2; ... 20: step20; end;All of these methods are inelegant, repetitive, and prone to typographical errors, and none of them really makes clear that what's going on is "doing the first n steps of a sequence of steps". But worse, what if "doing the first n steps" turned out to be a common situation in your program? You'd need to write all this code over again each time it happened, and knowing that you'd gotten one copy of it right would give you no assurance that you had typed the next copy correctly.
In Scheme, you could abstract the idea of "doing the first n steps" into a function, do-first-n, write, test, and debug it once, and use it every time you needed to do the first n steps of something. An added advantage is that the steps themselves, as well as n, can be decided at run-time, and simply passed in to do-first-n.
(define do-first-n (lambda (n steplist) (cond ((null? steplist) ()) ((= n 0) ()) (#t (eval (car steplist) (the-environment)) (do-first-n (- n 1) (cdr steplist))))))Of course, "do-first-n" is not a terribly common control construct in most programs. But there are many other "cliches" of programming that have to be written out in full every time they're used in a procedural language, but can be abstracted in Scheme. Here are some examples. Each of these happens to have a predefined Scheme function that does it for you, but you could have written any one of them yourself in a few lines, like the above recursive definition of do-first-n:
If we're going to be passing code all over the place as function arguments, how do we specify it? Well, most code comes in the form of defined functions, so we can just pass the function name, which Scheme treats as a variable whose value is the function itself:
(define add3 (lambda (x) (+ x 3))) ADD3 (map add3 '(3 5 7 2 9 -5)) (6 8 10 5 12 -2)But in a procedural language, you're not restricted to passing arguments that happen to live in variables already; you can pass literals (like 3 or "hello there") too. In the same way, you can discuss code in a literal form, without bothering to define a new function. Since add3 was defined above as essentially a variable whose value was the function
(lambda (x) (+ x 3))
, why not just pass
the function itself?
(map (lambda (x) (+ x 3)) '(3 5 7 2 9 -5)) (6 8 10 5 12 -2)In fact, perhaps a majority of the uses of these "cliche" functions are with lambda-expressions, rather than named functions, as arguments.
cond
" and recursion. But sometimes you really want a
for
-loop or a while
-loop or a switch
statement (case
for Pascal fans).
First, let me describe something that doesn't sound like a flow-of-control structure; it's really just a way of declaring local variables. Here's an example:
(let ((x 3) (y 6) (z 'bluebird)) (list x y z z y)) ;Value: (3 6 bluebird bluebird 6) (let ((x '(a b c d e f g))) (list x x x)) ;Value: ((a b c d e f g) (a b c d e f g) (a b c d e f g)) (let ((n 5000)) (* n n n n) ;Value: 625000000000000 n ;Unbound variable: n"
let
" expects its first argument to be a list of lists: each
smaller list starts with a variable name, optionally followed by a value
to give that variable. "let
" declares all the variables
temporarily, gives them the corresponding values (if any), evaluates the
remaining arguments, returns the last one, and discards the variables it
just declared.
Note that "let
" is NOT an assignment statement!
It is not intended to change the value of an existing variable, but
rather to declare a new one; if the new one has the same name as an
existing variable, the old variable (with its old value) will still exist
after the "let
" is over.
A good way to think about this is to read the first example above
as "let x be 3, y be 6, and z be bluebird for the moment.
Now evaluate the expression
`(list x y z z y)
'. OK, you can forget about x, y, and z now."
There are several variants on "let
", with names like
"let*
", "letrec
", "fluid-let
", and "named
let
", but I don't want to go into the differences now; you can
read about them in the documentation.
The "do
" special form resembles "let
" -- it declares
and initializes some local variables, then evaluates some expressions -- but
it does all this repeatedly until a specified condition is met.
"do
" expects its first argument to be a list of lists,
each consisting of a variable name, an optional initial value, and an
optional "update expression". The second argument to "do
"
should be a list of expressions analogous to those in "cond
":
if the first one evaluates to non-null, it evaluates the rest and
returns the last one, but otherwise, "do
" goes on to evaluate
its third, fourth, fifth, etc. arguments, then replaces each
local variable with the value of its "update expression" and repeats the
whole process. For example:
(do ((x 1 (+ x 1)) (sum 0 (+ sum x))) ((> x 10) sum) (display x) (display " ")) 1 2 3 4 5 6 7 8 9 10 ;Value: 55This is equivalent to the Scheme code
(do-it 1 0)
where
"do-it
" is defined by
(define (do-it x sum) (cond ((> x 10) sum) (else (display x) (display " ") (do-it (+ x 1) (+ sum x)))))If you prefer another language, it's also roughly equivalent to the C code
{ int x,sum; for (x=1, sum=0; x <= 10; sum += x++) printf ("%d ",x); return sum; }
Last modified:
Copyright 1998 by Stephen Bloch. This document is revised and corrected from time to time, so I would much prefer that you link to it rather than making your own local copy.
Stephen Bloch / sbloch@boethius.adelphi.edu