Scheme Interpreter Extensions

Scheme Interpreter Extensions

There are many ways to extend our Scheme interpreter. A few suggestions are below, and the Wikipedia page for Scheme is a good place to start to learn more about the language itself.

We recommend that you submit the final version of your project before attempting any of these extensions. We are not responsible if you break the required components of your interpreter and Scheme code.

Variable Arguments

Many Scheme implementations allow procedures to take in a variable number of arguments, which get placed into a list before being bound to a parameter. (Compare to Python, where variable arguments are placed into a tuple.) This is specified by preceding the last parameter with a dot (much like Python's *):

    scm> (define (add . args) (apply + args))
    add
    scm> (add 3 7 2 1)
    13
  

Recall that a dot is used to specifiy the second component of a pair. Thus, formal lists can now be ill-formed.

In order to implement this, you will need to change the following:

Quasiquote and Unquote

Quoting prevents the interpreter from evaluating an expression. Often times, we might want to evaluate part of an expression but not the rest. For example, the following constructs a list containing the symbol a and its value:

    scm> (define a 3)
    a
    scm> `(a , a)
    (a 3)
  

The backquote (`) specifies a quasiquote, which can evaluate parts of an expression. A comma (,) is an unquote, which specifies that the next expression should be evaluated. Finally, a comma followed by an at symbol (,@) is an unquote-splicing, meaning that it evaluates the next expression, which must evaluate to a list, and then splices that list into the result:

    scm> `(a ,@ '(1 2 3) 4)
    (a 1 2 3 4)
  

Quasiquotes can be nested, and an unquoted expression should only be evaluated if it is at the same nesting level as the outermost quasiquote. The nesting level increases by one in each quasiquotation and decreases by one in each unquote/unquote-splicing.

Here are some examples from the Scheme R5RS reference manual:

    scm> `(list  ,(+ 1 2)  4)
    (list 3 4)

    scm> (let ((name 'a)) `(list ,name ',name))
    (list a (quote a))

    scm> `(( foo ,(- 10 3)) ,@(cdr '(c)) . ,(car '(cons)))
    ((foo 7) . cons)

    scm> `(a `(b ,(+ 1 2)  ,(foo ,(+ 1 3)  d)  e)  f)
    (a (quasiquote (b (unquote (+ 1 2)) (unquote (foo 4 d)) e)) f)

    scm> (let ((name1 'x)
               (name2 'y))
           `(a `(b ,,name1  ,',name2 d)  e))
    (a (quasiquote (b (unquote x) (unquote (quote y)) d)) e)
  

The tokenizer already handles quasiquotes and unquotes. However, you will need to modify scheme_read to handle them as well, like you did for normal quoting. Use the strings "quasiquote", "unquote", and "unquote-splicing", respectively.

In addition, the following changes are required:

Macro Definition

Macros allow the language itself to be extended by the user. Simple macros can be provided with the define-macro special form. This must be used like a function definition, and it creates a procedure just like define. However, this procedure has a special evaluation rule: it is applied to its arguments without first evaluating them. Then the result of this application is evaluated.

Here is a simple example:

    scm> (define-macro (when test . branch)
           (list 'if test (cons 'begin branch)))
    when
    scm> (when (< 3 4)
            (print 1)
            (print 2))
    1
    2
    okay
  

The code above defines a macro when that evaluates all the expressions in its body when the predicate expression is true. (You'll need to have implemented variable argument lists for this particular example to work.)

In order to implement define-macro, add a macro parameter to do_define_form, with a default value of False. If macro is true, raise an error if function definition shorthand is not used. Otherwise, mark the resulting procedure as a macro. (There are many ways to do this; one way is to use an instance attribute.)

In scheme_eval, add a case for "define-macro" that calls do_define_form with macro as true. Then modify the procedure handling code to check if a procedure is a macro. In that case, it should apply the procedure directly to the arguments without evaluating them first. It should then evaluate and return the result.

Mutation

Like Python, Scheme has non-local assignment. In particular, the set! special form takes in a name and another expression and rebinds that name to the value of the expression in the first frame in which the name exists. Unlike Python, this frame can be the local or global frame.

In order to implement set!, add a method to Frame that rebinds a name to a new value in the first frame in which the name is found. An error should be raised if the name does not exist in any frame. You will also need to add a do_set_form function and a case for set! in scheme_eval.

Pairs can be mutated using set-car! and set-cdr! These can be easily implemented as primitive procedures in scheme_primitives.py.

Library Code

Many standard Scheme procedures can be implemented in Scheme itself, as library code. Add a mechanism to your interpreter to load up a library file on startup (e.g. scheme_lib.scm). Then provide useful procedures in scheme_lib.scm, such as map, filter, and c*r variants up to four applications of car or cdr.

Error Handling

Currently, when an error occurs while attempting to evaluate an expression, the interpreter only prints out an error message, with no backtrace. This makes it difficult to determine the source of an error.

In order to provide a useful backtrace, start by adding names to primitive procedures and procedures defined using the special define syntax. Use default names, such as [lambda], for procedures with unknown names.

Now write a new function to handle an exception and call it from the first except clause in read_eval_print_loop. A Python exception contains information about every frame between the one that raised the exception and the one that handled it. If e is an exception, then e.__traceback__ is a traceback object that contains this information. A traceback is a recursive list of frames. Read more about traceback, frame, and code objects here.

A Python exception contains information at the Python level, but a user is interested in information at the Scheme level. So you should translate the Python-level information to Scheme-level information by extracting the latter from a frame. You can read the local variables in a frame, and you can obtain its associated code object to get the name of the Python function for that frame.

Some suggestions on what to do with a Python frame:

Here are some sample traces without the tail recursion optimization:

    scm> (define (foo x) (/ 1 x))
    foo
    scm> (define (bar x) (foo x) 3)
    bar
    scm> (define (baz x) (if (= x 0) (bar x) (baz (- x 1))))
    baz
    scm> (foo 0)
    Traceback (most recent call last):
      0     (foo 0)
      1     (/ 1 0)
    Error: division by zero
    scm> (bar 0)
    Traceback (most recent call last):
      0     (bar 0)
      1     (#begin (foo x) 3)
      2     (foo 0)
      3     (/ 1 0)
    Error: division by zero
    scm> (baz 3)
    Traceback (most recent call last):
      0     (baz 3)
      1     (baz 2)
      2     (baz 1)
      3     (baz 0)
      4     (bar 0)
      5     (#begin (foo x) 3)
      6     (foo 0)
      7     (/ 1 0)
    Error: division by zero
  

With the tail recursion optimization:

    scm> (foo 0)
    Traceback (most recent call last):
      0     (foo 0)
      1     (/ 1 0)
    Error: division by zero
    scm> (bar 0)
    Traceback (most recent call last):
      0     (bar 0)
      1     (#begin (foo x) 3)
      2     (foo 0)
      3     (/ 1 0)
    Error: division by zero
    scm> (baz 3)
    Traceback (most recent call last):
      0     (bar 0)
      1     (#begin (foo x) 3)
      2     (foo 0)
      3     (/ 1 0)
    Error: division by zero
  

Further Extensions

Feel free to implement any other features of Scheme that you want. You can read the full reference manual here. Examples include named lets, let*, letrec, do loops, strings, and vectors. (If you really want a challenge, then try to implement call-with-current-continuation, which isn't even handled correctly by STk.) How close can you get to what STk provides?