Author: Dekai Wu
Date: Due 2009.05.03 at 23:00 by CASS
Download: http://www.cs.ust.hk/~dekai/151H/assignments/a4.tar.gz
Assignment page: http://www.cs.ust.hk/~dekai/151H/assignments/a4/html/
Course page: http://www.cs.ust.hk/~dekai/151H/
In this fourth piece of your programming project, you are assigned to maintain and extend the micro-Scheme interpreter you've built in Assignments 1, 2, and 3, specifically to add function values (by supporting lambda
) and function application (by supporting standard Scheme expressions where the first element of the list is any function value, as well as by supporting explicit apply
syntax).
For this assignment, we are giving you almost no new code. The tarball a4.tar.gz contains exactly the same files as a3.tar.gz, except for (1) these instructions and (2) a slightly extended cons.hpp
interface, in which we define an additional constructor lambda
, along with a corresponding type testing predicate procedurep
, plus corresponding accessors get_formals
and get_body
, and (3) a skeleton library.scm
file that we'll describe in Step 3 below.
A function is just another kind of value, like int, double, symbol, or cons values. It must be possible to pass function values as parameters and return values, and so forth.
To make a function value, we need a constructor that constructs a ProcedureCell
holding the newly constructed function value. You might think this constructor should be called make_procedure
(like make_int
or make_symbol
), but the historical convention is to call this constructor lambda
instead. (This is analogous to the fact that cons
is also really a constructor that you might think should have been called make_cons
instead.)
So you will add a new operator to your implementation, lambda
, that allows construction of anonymous functions of zero or more arguments, as discussed in lecture:
> (lambda (x y z) (* x y z))
#<function>
> (define multiply (lambda (x y z) (* x y z)))
()
> (define multiply-tracing (lambda (x y z) (print (quote multiplying..)) (* x y z)))
()
When a lambda
expression is evaluated, it should cause the lambda
constructor in the newly extended cons.hpp
interface to be called in order to construct a new ProcedureCell
holding the new function value. (Remember, think of the lambda
constructor just as if it were instead called make_procedure
.)
The behavior of lambda
should exactly conform to Scheme R5RS, as described in lecture. It requires two or more operands. The first operand must be a list of symbols (for instance (x y z)
in the above example), which describes the formal parameters of the function. The rest of the operands must be a sequence of expressions, describing the body of the function.
You should store the list of arguments in the formals
field of the ProcedureCell
. You should store the list of expressions in the body (for instance ((print (quote multiplying..)) (* x y z))
in the above example) in the body
field of the ProcedureCell
.
You are now in a position to add the ability to apply the functions, whether anonymous or named. You should support both the normal Scheme syntax where the first element of the list is a function value, as well as explicit apply
syntax:
> ((lambda (x y) (+ x y)) 2 3)
5
> (multiply 2 3 4)
24
> (multiply-tracing 2 3 4)
multiplying..
24
> (apply multiply (quote (2 3 4)))
24
> (define apply-to-2-3-4 (lambda (f) (apply f (quote (2 3 4)))))
()
> (apply-to-2-3-4 multiply)
24
> ((quote multiply) 2 3 4)
ERROR: cannot apply a value that is not a function
> (apply (quote multiply) (quote 2 3 4))
ERROR: cannot apply a value that is not a function
> (apply-to-2-3-4 (quote multiply))
ERROR: cannot apply a value that is not a function
To accomplish this, you'll want to implement an additional virtual function apply()
for Cell
and its derived classes. In particular, the real work of applying a function should be done by Cell* ProcedureCell::apply(Cell* const args)
where this
points to a ProcedureCell
holding the function to be applied, and args
points to a list of values to be supplied as the arguments the function is being called with. You can then use your apply()
to support both the normal Scheme syntax where the first element of the list is a function value, as well as the explicit apply
syntax.
All expressions in the body must be eagerly evaluated, in left-to-right order. This guarantees that the program:
> (define foo (lambda () (print (quote Hello)) (print (quote world)) (print (quote !!))))
()
> (foo)
Hello
world
!!
()
will print out the desired words in order. (Also notice that in this example, the lambda expression constructs a function that takes zero arguments.)
Similarly, all expressions in the argument list must be eagerly evaluated (but we leave the order of evaluation unspecified; that's your choice).
Hint: Distinguish local versus global scope! When you apply a lambda (i.e., apply a function), the parameter names become local variable names. When you are doing symbol name lookup, you need to make sure that local scope takes priority over global scope. First try to find the symbol name in the local symbol table; only if that fails, then try to find it in the global symbol table. For example:
> (define z 3)
()
> ((lambda (x) (+ x z)) 2)
5
> (define x 4)
()
> ((lambda (x) (+ x z)) 2)
5
Unlike in Assignment 3, you can no longer get away with just a single global symbol table. Since functions can call functions which can call functions (or even call recursively for an unbounded number of recursions), you will need a stack of symbol tables to keep proper track of local symbol tables. Consider, for example, the following factorial
function that takes a single operand n of type int
, and returns n! which is also of type int
:
> (define factorial (lambda (x) (if (< x 2) 1 (* x (factorial (- x 1))))))
()
> (factorial 1)
1
> (factorial 4)
24
> (factorial 10)
3628800
Whenever factorial
is called, there is a new local variable x
that must not be confused with either the global variable x
or any other instances of local variables named x
. For instance, (factorial 4)
has a local variable x
bound to 4, but when it calls (factorial 3)
there will be a new local variable x
bound to 3 that must not be confused with the local variable x
associated with (factorial 4)
.
So to avoid that kind of confusion, every time a function is called, a new local symbol table must be pushed on the stack, containing all the parameter symbol names as the local variables for that function. Whenever execution of a function is finished, its local symbol table must be popped from the stack.
Remember to review the examples in Assignments 1, 2, and 3 and make sure everything still works.
Congratulations!!
You have now implemented your own fully Turing-equivalent programming language. In principle, you can now program anything using your micro-Scheme interpreter, and never have to use C++ again! More precisely, as Wikipedia puts it: In computability theory, an abstract machine or programming language is called Turing complete, Turing equivalent, or (computationally) universal if it has a computational power equivalent to (i.e., capable of emulating) a simplified model of a programmable computer known as the universal Turing machine. Being equivalent to the universal Turing machine essentially means being able to perform any computational task – though it does not mean being able to perform such tasks efficiently, quickly, or easily... The term derives from the name of mathematician Alan Turing who introduced the model of the universal Turing machine.
This means that from now on, you can implement the rest of your micro-Scheme programming language, using only micro-Scheme. We have used C++ to bootstrap your implementation of the micro-Scheme programming language, and the rest can be done in micro-Scheme itself. You don't have to use C++ any more (well, maybe for a few special purpose convenience items). Hurrah! COMP151H is almost over, and we can start learning other interesting languages besides C++.
You'll celebrate by using your interpreter to start implementing general-purpose library functions using your micro-Scheme language (not using C++). For example, the factorial
function above is actually a real function in the standard Scheme specification, which you see you can implement in your own micro-Scheme instead of using C++. Similarly, the operators >
, <=
, >=
, =
, and abs
are all general-purpose library functions that you can implement as follows using your micro-Scheme language rather than C++, as follows (for simplicity, we're providing just the versions that take exactly two operands):
> (define > (lambda (x y) (< y x)))
()
> (> 2 5)
0
> (define >= (lambda (x y) (not (< x y))))
()
> (<= 3 2)
0
> (define <= (lambda (x y) (not (< y x))))
()
> (<= 3 3)
1
> (define = (lambda (x y) (if (< x y) 0 (not (< y x)))))
()
> (= 2 2)
1
> (= 2 3)
0
> (= 3 2)
0
> (define abs (lambda (x) (if (< x 0) (- 0 x) x)))
()
> (abs 3)
3
> (abs -3)
3
Are you getting the hang of it yet?
The required additional library function you will implement is for-each
, which does something kind of similar to the STL for_each
generic algorithm for C++ that we studied in lecture.
Your for-each
function should take two operands, where the first operand must be a function of one argument, and the second operand must be a list. It should apply the function to every element in the list. The return value of for-each
is unspecified (so you could return ()
, for example).
(for-each
<function> <list>)
For example:
> (define square (lambda (x) (print (* x x))))
()
> (for-each square (quote (2 5 9 14 256)))
4
25
81
196
65536
()
You can also consult Scheme R5RS for details of for-each
.
Please feel free to implement any other general-purpose library functions you would like, just for fun. Some suggestions: the general map
and reduce
functions and the list
function. These require you to have implemented the optional bonus support for a variable number of formal parameters for lambda
, as described below.
When you are finished, add all your Scheme library functions to the library.scm
file where we've already placed the libary functions >
, <=
, >=
, =
, abs
and factorial
to get you started. Make sure your improved version of library.scm
is included in your tarball submission.
The following is purely for fun. You are not required to do it! Only try this if you are bored or want to impress me :-)
If you wish, you may also implement Scheme's support for variable number of arguments.
> (define first (lambda args (car args)))
> (first (quote a) (quote b) (quote c))
a
> (first (quote a) (quote b) (quote c) (quote d) (quote e) (quote f))
a
The key here is that args
is a symbol rather than a list of symbols. You can consult Scheme R5RS for more details.
If you do this optional bonus, then notice it will also let you implement the general version of the for-each
function as defined in Scheme R5RS so that the first operand can be a function that takes any number of operands:
(for-each
<function> <list1> <list2> ...)
Definitely don't try this unless you have everything else in this assignment perfectly done!
let
' syntactic sugarIf you are finished with all the optional bonuses so far, you can try this. Again, the following is purely for fun, and you are not expected to do it unless you want to.
You may also wish to implement Scheme's support for the let
syntactic sugar for lambda
. This can make programming with local variables much more convenient and readable. You can also consult Scheme R5RS for more details.
The let
syntax is an alternate syntax for lambda expressions, that more closely resembles local (const) variable definitions in other languages you're familiar with, like C++. For example, the following two expressions have identical meaning:
> ((lambda (x y) (* x y)) 2 3)
6
> (let ((x 2) (y 3)) (* x y))
6
If we rewrite the latter expression with better indentation, it looks like this:
(let ((x 2)
(y 3))
(* x y))
which looks much like the roughly equivalent C++ code:
{
int x(2);
int y(3);
return x * y;
}
So in fact, lambda
not only implements functions, but also implements local variables!
Except for the modified version of cons.hpp
and the new library.scm
, all other source files in a4.tar.gz
are identical to those from a3.tar.gz
.
So you should start from your Assignment 3 implementation and extend it, replacing only the old cons.hpp
with the new one. Be careful! You still may not break any of the encapsulation rules from Assignment 3.
Remember again, the objective of this programming project is for you to train your skills, by practicing correct software engineering techniques enabling you to build, maintain, and extend a non-trivial piece of well-engineered code.
You must follow the design approach outlined in this document. Do not just implement the required functionality using a different design.
This time you must use templates. In this assignment, you are expected to make good use of the STL map
- but neatly, without messing up the polymorphic approach you built in Assignment 2.
Remember we are focusing on proper use of encapsulation. So you still should not edit the files parse.hpp
, parse.cpp
, cons.hpp
, eval.hpp
, or main.cpp
. Again, the programming assignments are mini-exercises in how multiple programmers are supposed to interact and communicate in the real world; these files are owned and maintained by the other author(s).
The tarball you turn in will need to contain your new or extended implementations of eval.cpp
, Cell.hpp
, Cell.cpp
, and library.scm
.
Depending on your approach, you may or may not need to change the Makefile
. Whether you changed it or not, always make sure you include whatever Makefile
is needed to build your program, when you submit assignment. Otherwise, the graders cannot build your program.
You must write the final version of the program on your own. Sophisticated plagiarism detection systems are in operation, and they are pretty good at catching copying! If you worked in study groups, you must also acknowledge your collaborators in the write-up for each problem, whether or not they are classmates. Other cases will be dealt with as plagiarism. Re-read the policy on the course home page, and note the University's tougher policy this year regarding cheating.
Your programming style (how clearly and how well you speak C++) is what will be graded. Correct functioning of your program is necessary but not sufficient!