Project 2: a Rudimentary C++ Interpreter

Assigned 6 Oct, due 10 Nov

In this project we'll develop an interpreter for a small piece of the C++ language. C++ is a huge language, so what we can handle in a few weeks will be a teeny tiny fraction, but it'll allow us to see how an interpreter can handle variables, type declarations, assignment, I/O, and sequence.

For every function you write, you must include a contract and a well-chosen set of test cases.

Start by downloading the following three files into the directory where you're working on this program:

You don't need to look at most of reader.ss unless you're curious, but you should look at the first 25 lines (a bunch of comments describing the output format) and the last 60 (a bunch of test cases demonstrating how to use the reader). This file provides four functions that you'll want to use: Your program should be written in the PLAI language. It should start with the lines
(require "syntax.ss") ; to get the definition of syntax-tree
(require "reader.ss") ; to get the definitions of the reading functions
Remember that in the PLAI language, you need to use test and test/exn instead of check-expect and check-error.

    Parsing

  1. Write a function parse that takes in a nested list (as described in reader.ss) and returns a syntax-tree (as described in syntax.ss). For example,

    (check-expect
      (parse '(+ 3 (* 4 5)))
      (binop-expr '+ (num-lit 3) (binop-expr '* (num-lit 4) (num-lit 5))))
    
    (check-expect
      (parse '(if (> (ident "x") 3)
                  (= (ident "y") (+ (ident "x") 2))
                  (block (int "z" (+ (ident "x") 2))
    		     (cout<< "Defined z to be ")
    		     (cout<< (ident "z")))))
      (if-else-stmt (binop-expr '> (ident "x") (num-lit 3))
                    (assign-expr (ident "y") (binop-expr '+ (ident "x") (num-lit 2)))
    		(block (list (decl-with-init 'int
    	                                     (ident "z")
    					     (binop-expr '+ (ident "x") 2))
    			     (output-stmt "Defined z to be ")
    			     (output-stmt (ident "z"))))))
    
    The parser should catch as many user errors as possible, such as unrecognized types, mismatched types, etc: anything that gets past the parser should at least have a chance of working, depending on the values of variables at run-time. Think of it as like a compiler: it can and should catch things like assigning an int value to a string variable, but it can't catch division by zero because that depends on run-time values.

  2. Expressions

  3. For this part of the assignment, you may assume there are no variables, and therefore no declaration statements, input statements, or assignment expressions. You may encounter the operators +, -, *, /, %, >, and == in any reasonable combination. You will not see the = (assignment) operator, because there are no variables to assign. Remember that the > and == operators in C++ return ints: 0 for false and 1 for true.

    Write a function eval-expr (similar to calc from project 1) that takes in an abstract syntax tree representing an expression and returns a Scheme object (or an error message) for the value. For now, you may assume there are no strings, so everything is type int. Here are three sample test cases.

    (check-expect (eval-expr (binop-expr '+ (num-lit 3) (binop-expr '* (num-lit 4) (num-lit 5))) 23)
    (check-expect (eval-expr (parse-expr '(+ 3 (* 4 5)))) 23)
    (check-expect (eval-expr (parse-expr (read-string-as-expr "3+4*5"))) 23)
    
    In the first example, we give eval-expr a syntax tree constructed by hand, and it evaluates the expression to 23.
    In the second example, we call parse-expr on a nested-list representation, such as might be produced by the reader; again, it should evaluate to 23.
    In the third example, we provide a string that might appear as part of a C++ program; read-string-as-expr converts it to nested-list form, parse-expr converts it to syntax-tree form, and eval-expr evaluates it to 23. string "3+4*5" and converts it into a syntax tree, which eval-expr then evaluates to 23.

  4. Statements

  5. Write a function exec-stmt that takes in an abstract syntax tree representing a statement and does what it says. We still have no variables, and therefore no declaration statements, input statements, or assignment expressions. And we still have no strings, only ints. However, you have to handle output statements, expression statements (just evaluate the expression and throw away the result), if and if/else statements, and compound statements (aka blocks). Remember that in a C++ if or if/else statement, 0 is treated as false, and anything else is treated as true.
    You may need to use the built-in void function, which takes no arguments, does nothing, and returns nothing. You can use (void) as the answer to a cond clause when you don't want to do anything.

    If you've gotten to this point successfully, save your work as version1.ss, then go on with version 2, which adds...

  6. Data types

  7. Write a function expr-type that takes in a syntax-tree representing an expression (which may include string literals) and returns either 'int, 'string to indicate the type of the expression, or an error message to indicate that it has no type, e.g.

    "hello" > 4
    
    because the > operator doesn't work on a string and a number.

    Note that the > and == operators should work on two strings, or on two ints, but not on one of each. Likewise, the + operator should work on two strings, or on two ints, but not on one of each.

  8. Modify your eval-expr and exec-stmt functions so they work correctly with strings as well as numbers. (The > and == operators on strings should do the appropriate string comparison; the + operator should do string concatenation as in Java.)

    Save your work as version2.ss, then go on with version 3, which adds...

  9. Variables

    Now we'll start adding variables to the language. First, we'll allow variables to appear in an expression. We still won't allow assignment or input statements yet.

  10. Modify your eval-expr and exec-stmt functions so they take in an environment as an extra parameter. The environment (which you need to define) specifies what variables are defined, with their types and values. Since we don't have declaration statements yet, you'll have to test these functions by filling in environments yourself.

  11. Modify your exec-stmt function so it handles declaration statements correctly. A declaration statement in C++, like int x = 7;, is analogous to a with-expression from Project 1; its body is the rest of the block in which the declaration statement appears. But as with the "multiple-branch with" in Project 1, you can't have the same variable declared twice in the same block. However, you can have the same variable declared twice in different blocks, even if one is inside the other:

    int x = 7;
    if (x > 4) {
    	int x = 6;
    	cout << "x is ";
    	cout << x;
    	}
    else {
    	int x = 5;
    	cout << "x is ";
    	cout << x;
    	}
    cout << "\nAfter the if/else, x is ";
    cout << x;
    
    is perfectly legal C++, and your program should handle it correctly: it should output "x is 6" on one line, and on the next, "After the if/else, x is 7".

    C++ also allows variables to be declared without an initial value, e.g. int xyz; ; the actual value is unpredictable (unlike in Java, which guarantees the "default" value). To simulate this, your program should initialize such variables to a random integer (for ints) or a gibberish string (for strings).

    Hint: You'll probably need to change the contract for exec-stmt so it not only takes in an environment, but also returns an environment.

    Save your work as version3.ss. Version 4 adds assignment and input to the language.

  12. Assignment and Input

  13. Modify your program so that variable names and types are still kept in an environment, but variable values are kept in a separate data structure called a store (as described in PLAI). Declaration statements may change the environment (by adding new variable bindings), and they can change the store (by filling in initial values, e.g. "int x = 7;"); assignment expressions can change the store but they cannot change the environment. Eventually (see References, below) it'll be possible to have two variable names refer to the same location in the store.

    Remember that in C++ (and in C and Java, for that matter), assignment is an expression, which can appear anywhere that any other expression can appear, e.g.

    if (x = 3) {
    	y = (x = 2) * 5;
    	}
    else {
    	cout << y = x = x + 1;
    	}
    

    Input statements (of the form "cin >> x;") should also change the store but not the environment.

    Hint: some functions will need to return both a value and a store, or both an environment and a store. There is a way for a Scheme function to return two things, and if you wish you can look it up and try it; if not, define a struct with two fields, and return an instance of this struct.

  14. References

    Modify your program to handle reference-variables, which behave as follows:

    int x = 7;
    int &y = x; // y refers to the same memory location as x
    x = 12;
    cout << y; // should output 12
    

    Note: I modified the reader and syntax files on Oct. 12 to handle this syntax; to get this working, you'll need to download the new versions of reader.ss and syntax.ss.

    What to Turn In

    You may want to put the parser in its own file, parser.ss. Remember, the parser should catch as many user errors as possible: anything that gets past the parser should at least have a chance of working, depending on the values of variables at run-time.

    As mentioned above, I recommend doing this assignment in discrete versions: version 1 handles expressions and statements, but only one data type, and no variables; version 2 handles data types, but still no variables; version 3 handles variables, but still no assignment or input; version 4 handles assignment and input; and version 5 handles reference variables.

    It is quite possible that you won't get all 5 versions working. If not, turn in as much as you've got, with an explanation of where you got stuck and what you've tried.


    Last modified:  Tue Oct 13 09:02:38 EDT 2009
    Stephen Bloch / bloch@adelphi.edu