发自己的语

本文大多数参考 SICP

阴阳大法

ying-yan

一个语言的解释器,就是一个 eval 函数,就是求值函数:

  • 输入: 一个表达式 s-exp
  • 输出: 一个表达式 s-exp

这个解释器的核心就是 applyeval 之间的互递归调用。

原始类型

原始类型是一个 s-exp ,求值就是 s-exp 本身。包括:整数,浮点数,字符串。例如

1  => 1
1.2 => 1.2
"abc" => "abc"

这个规则看上去很简单,但是很重要,他是递归求值的终止条件。也就是说, eval 函数碰到这些原始类型的 s-exp ,不会递归调用 apply

根据这个规则,我们写一个函数

(define true #t)
(define false #f)
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        (else (error "Unknown expression type -- EVAL" exp))))

(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))

我们运行一下这个函数。首先我们需要安装一个 scheme 解释器,参考 https://github.com/cisco/ChezScheme 语言参考 http://www.scheme.com/tspl4/

Chez Scheme Version 9.4
Copyright 1984-2016 Cisco Systems, Inc.

> (load "eval0.scm")
> (eval 1 '())
1
> (eval 1.2 '())
1.2
> (eval "abc" '())
"abc"
> (eval (list 1 2 3) '())
Exception in error: invalid message argument (1 2 3)
Type (debug) to enter the debugger.

恭喜你,一个解释器的雏形开始了。这里有一个开发原则,“永远有一个可以工 作的版本”。我经历过两种开发模式。

  1. 写了一天的代码,从来没有编译运行过。然后花三天去调试。
  2. 写几分钟的代码,然后立即调试。循环往复,每次增加功能一点点。

第一种模式下,大多数情况是,我自己思路不很清晰,模块还没有分解到足够细 致,所以一边写代码,一遍划分模块。

第二种模式下,我已经思路很清晰了,代码模块也很清晰了,尤其是模块划分的 颗粒度足够细致,脑子里面已经有具体的迭代步骤了。这种模式下,开发效率高 一些。

可以看到,(eval (list 1 2 3) '() 的时候,出现错误。这里也有一个开 发原则,“测试你的每一行代码”。可以看到,这几个简单的测试,覆盖了刚刚写 的所有代码。

开发初期,保证代码的测试覆盖率相对容易。这个时候,最好配合自动回归测试, 保证以后代码的覆盖率。如果开发初期没有把自动化测试做好,到了开后中后期, 在回过头来想保证代码的测试覆盖率,难度就十分大了。

好了,我们继续迭代开发我们的解释器。

quote

这个规则也是一个终止条件。如果一个 s-exp 是一个 list ,第一个元素 是 quote,第二个元素任何一个 s-exp X,那么求值结果就是 X 。

(quote 1) => 1
(quote 1.2) => 1.2
(quote "abc") => "abc
(quote (1 2 3)) => (1 2 3)
(quote a-symbol) => a-symbol
(quote quote) => quote

s-exp 中有一个重要的数据类型,就是 symbol 。 quote 可以得到 symbol 本身。

(define (eval exp env)
  (cond (......
        ((quoted? exp) (text-of-quotation exp))
         ......
        (else (error "Unknown expression type -- EVAL" exp))))

我们看看 quoted?(text-of-quotation) 怎么实现的?

(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

看看运行结果

> (load "eval1.scm")
> (eval '(quote 1) '())
1
> (eval '(quote 1.2) '())
1.2
> (eval '(quote "hello") '())
"hello"
> (eval '(quote a-symbol) '())
a-symbol
> (eval '(quote quote) '())
quote

eval1.scm 的完整内容如下

(define true #t)
(define false #f)
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((quoted? exp) (text-of-quotation exp))
        (else (error "Unknown expression type -- EVAL" exp))))

(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))

(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

一个好的程序风格是,多写一些小的函数,然后组合这些函数。函数都有名字, 可以描述自己的作用。例如 text-of-quotation 。其实这个函数本质上和 cadr 没有任何区别。但是在逻辑层次上,text-of-quotation 是更加高层 的函数,cadr 是更加底层的函数。这种逻辑层次的划分,可以提高程序的可 读性。

if

支持条件表达式,输入 s-exp 是一个 list ,有四个元素:

  1. 第一个元素是 if ,表明是一个条件表达式。
  2. 第二个元素是判断条件 <C>
  3. 第三个元素是 <T> 为真的时候,s-exp 的求值结果。
  4. 第三个元素是 <F> 为假的时候,s-exp 的求值结果。

这里有一个语言设计的问题,是否有必要增加一个 boolean 的数据类型?什么 是真,什么是假?这里有很多让人难以捉摸的小细节。这个问题也是各种语言之 间争论很久的话题。为了简单,我们不在这个问题上展开讨论,我们不增加新的 数据类型,认为符号(symbol) true 是真,其他值都是假。

(define (eval exp env)
  (cond (......
        ((if? exp) (eval-if exp env))
         ......
        (else (error "Unknown expression type -- EVAL" exp))))

先看看 if? 的实现。

(define (if? exp) (tagged-list? exp 'if))

感谢逻辑分层的代码风格,if? 的实现的可读性就很好了。

我们看看关键的 eval-if 怎么实现的。

(define (eval-if exp env)
  (if (true? (eval (if-predicate exp) env))
      (eval (if-consequent exp) env)
      (eval (if-alternative exp) env)))

这里有两点需要注意

  1. 递归调用, eval-if 递归调用了 eval ,可以看到递归调用的强大之处。
  2. 求值顺序。

“求值顺序” 是一门语言的一个重大的设计问题。不同的求值顺序,导致完全不同 的效果。这里我们注意到,如果求值条件为真,那么假的分支是不做求值的。如 果求值条件为假,那么真的分支是不会求职的。这种设计叫做短路求值 (short-circuit)。 主流语言都是这么设计的。求值顺序是一个很重要的课题, 这里不展开讨论。

看看其他辅助函数如何实现的。这里再次应用了逻辑分层的代码风格。

(define (if-predicate exp) (cadr exp))
(define (if-consequent exp) (caddr exp))
(define (if-alternative exp)
  (if (not (null? (cdddr exp)))
      (cadddr exp)
      'false))
(define (true? exp)
  (eq? exp 'true))

我们看看程序的执行效果

> (eval '(if (quote true) 1 2) '())
1
> (eval '(if (quote false) 1 2) '())
2

非常好。 这里是完整的代码

(define true #t)
(define false #f)
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((quoted? exp) (text-of-quotation exp))
        ((if? exp) (eval-if exp env))
        (else (error "Unknown expression type -- EVAL" exp))))

(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))
(define (if? exp) (tagged-list? exp 'if))

(define (eval-if exp env)
  (if (true? (eval (if-predicate exp) env))
      (eval (if-consequent exp) env)
      (eval (if-alternative exp) env)))

(define (if-predicate exp) (cadr exp))
(define (if-consequent exp) (caddr exp))
(define (if-alternative exp)
  (if (not (null? (cdddr exp)))
      (cadddr exp)
      'false))
(define (true? exp)
  (eq? exp 'true))

(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

代码块求值

这个功能是一个很方便的功能。如果 <s-exp> 是下面的形式

(begin <E1> <E2> .... <En>)

那么我们对 E1 , E2 ,.... , En 分别求值,整个表达式的值就是 En 的求值结果。

(define (eval exp env)
  (cond (.......
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        ......

看看关键函数 eval-sequence 的实现

(define (eval-sequence exps env)
  (cond ((last-exp? exps) (eval (first-exp exps) env))
        (else (eval (first-exp exps) env)
              (eval-sequence (rest-exps exps) env))))

这里注意

  1. 求值顺序
  2. 递归调用 eval
  3. 中间的表达式的求值结果被丢弃了

看看其他几个辅助函数的实现

(define (begin? exp) (tagged-list? exp 'begin))
(define (begin-actions exp) (cdr exp))
(define (last-exp? seq) (null? (cdr seq)))
(define (first-exp seq) (car seq))
(define (rest-exps seq) (cdr seq))

看看程序的执行效果

> (load "eval3.scm")
> (eval '(begin 1 2) '())

下面是完整的程序代码

(define true #t)
(define false #f)
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((quoted? exp) (text-of-quotation exp))
        ((if? exp) (eval-if exp env))
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        (else (error "Unknown expression type -- EVAL" exp))))

(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))
(define (if? exp) (tagged-list? exp 'if))

(define (eval-if exp env)
  (if (true? (eval (if-predicate exp) env))
      (eval (if-consequent exp) env)
      (eval (if-alternative exp) env)))

(define (if-predicate exp) (cadr exp))
(define (if-consequent exp) (caddr exp))
(define (if-alternative exp)
  (if (not (null? (cdddr exp)))
      (cadddr exp)
      'false))
(define (true? exp)
  (eq? exp 'true))

(define (eval-sequence exps env)
  (cond ((last-exp? exps) (eval (first-exp exps) env))
        (else (eval (first-exp exps) env)
              (eval-sequence (rest-exps exps) env))))

(define (begin? exp) (tagged-list? exp 'begin))
(define (begin-actions exp) (cdr exp))
(define (last-exp? seq) (null? (cdr seq)))
(define (first-exp seq) (car seq))
(define (rest-exps seq) (cdr seq))

(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

函数的求值

什么是函数表达式,其实就是著名的 lambda 表达式。lambda 表达式格式如下

(lambda <PARAM-LIST> <FUN-BODY>)

PARAM-LIST 是一个变量列表,引入变量的作用域,也叫环境。我们在求值 FUN-BODY 的时候,就在最内层的环境开始,有里向外的搜索变量。重名的时 候,内层变量先起作用,就这个就是所谓的阴影效果 (shadow) 。

求值一个 lambda 表达式的时候,返回值就是一个 “函数” 。

程序内部,如何表达一个函数呢?这也是一个重大的语言设计问题。编译型还是 解释型?目标语言的选择,字节码,还是机器码,还是其他的中间语言?

我们这个简单的语言中,我们就用 list 来表示。

(define (make-procedure parameters body env)
   (list 'procedure parameters body env))
(define (procedure-parameters p) (cadr p))
(define (procedure-body p) (caddr p))
(define (procedure-environment p) (cadddr p))

看看执行效果

> (make-procedure '(a b) 'a '())
(procedure (a b) a ())
> (set! p1 (make-procedure '(a b) 'a '()))
> (procedure-parameters  p1)
(a b)
> (procedure-body  p1)
a
> (procedure-env  p1)
> (procedure-environment  p1)
()

可见一个函数有三个元素,参数列表,函数体,和环境。环境这个概念一直没有 详细解释,再推到下面解释。

那么,我们先实现一个 lambda 表达式的解析。

(define (eval exp env)
  (cond (.......
        ((lambda? exp)
         (make-procedure (lambda-parameters exp)
                         (lambda-body exp)
                         env))
        ......

其他几个辅助函数的实现

(define (lambda? exp) (tagged-list? exp 'lambda))
(define (lambda-parameters exp) (cadr exp))
(define (lambda-body exp) (cddr exp))
(define (make-procedure parameters body env)
  (list 'procedure parameters body env))

看看执行效果

> (lambda? '(lambda 1))
#f
> (lambda-parameters '(lambda (a b) (+ a b)))
(a b)
> (lambda-body '(lambda (a b) (+ a b)))
((+ a b))
> (make-procedure '(a b) '((+ a b)) '())
(procedure (a b) ((+ a b)) ())
> (eval '(lambda (a b) (+ a b)) '())
(procedure (a b) ((+ a b)) ())

太好了,我们可以解析 lambda 表达式了。

这里是完整的代码

(define true #t)
(define false #f)
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((quoted? exp) (text-of-quotation exp))
        ((if? exp) (eval-if exp env))
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        ((lambda? exp)
         (make-procedure (lambda-parameters exp)
                         (lambda-body exp)
                         env))
        (else (error "Unknown expression type -- EVAL" exp))))

(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))
(define (if? exp) (tagged-list? exp 'if))

(define (eval-if exp env)
  (if (true? (eval (if-predicate exp) env))
      (eval (if-consequent exp) env)
      (eval (if-alternative exp) env)))

(define (if-predicate exp) (cadr exp))
(define (if-consequent exp) (caddr exp))
(define (if-alternative exp)
  (if (not (null? (cdddr exp)))
      (cadddr exp)
      'false))
(define (true? exp)
  (eq? exp 'true))

(define (lambda? exp) (tagged-list? exp 'lambda))
(define (lambda-parameters exp) (cadr exp))
(define (lambda-body exp) (cddr exp))
(define (make-procedure parameters body env)
  (list 'procedure parameters body env))
(define (procedure-parameters p) (cadr p))
(define (procedure-body p) (caddr p))
(define (procedure-environment p) (cadddr p))

(define (eval-sequence exps env)
  (cond ((last-exp? exps) (eval (first-exp exps) env))
        (else (eval (first-exp exps) env)
              (eval-sequence (rest-exps exps) env))))

(define (begin? exp) (tagged-list? exp 'begin))
(define (begin-actions exp) (cdr exp))
(define (last-exp? seq) (null? (cdr seq)))
(define (first-exp seq) (car seq))
(define (rest-exps seq) (cdr seq))

(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

函数调用与变量

刚刚我们定义了一个函数,但是并没有什么用处,我们看看什么是函数调用 (apply)。

(<L> <A1> <A2> ... <An>)

看看怎么解析这个函数调用型的 s-exp

  • 首先,L 是一个表达式,求值结果是一个函数
  • 然后,A1 是一个表达式,求值结果是任意值
  • 之后,依次求值 A2 ... An
  • 之后,绑定 (bind) A1 ... An 的值,到 L 的环境中。
  • 最后,对 L 的函数体求值。并返回结果

这里需要有两个注意的事项

  1. 求值顺序
  2. 递归调用
  3. 变量绑定

这是一门语言最核心的部分了。求值顺序。我似乎提到求值顺序很多次了。这里 选择了大多数主流语言的设计。

参数的求值顺序也很重要,有的语言为了性能,不定义参数的求值顺序。

第二个问题就是阴阳大法的互递归调用。apply 不停地调用 eval 得到函数对象, 和参数,绑定参数,然后调用 eval 求值函数体。而 apply 的处理,正式 eval 函数中的重要组成部分。

第三个问题就是变量的作用域的问题。变量的作用域也是一个重大的语言设计问 题。不同的语言有不同的设计,展现了不同语言丰富的表达能力。现在主流语言 是“词法作用域” (lexical scope) 。我们选择的是一个简单的词法作用域的实 现。

这里也需要注意参数的求值环境是在当前环境,而函数体的求值环境,是参数绑 定之后的新环境。

什么是求值环境?求值环境就是一个变量名称到变量值的映射关系。

什么是闭包(closure) ? 闭包也是一个环境,就是在生成一个函数的时候,当时 的环境。

什么是帧 (frame) ? 就是函数调用的时候,绑定参数生成的环境。

这就是说环境是一个动态的数状结构(递归结构)。在求值一个函数的函数体的 时候,碰到变量求值,会递归搜索闭包,这样除了参数绑定产生的本地环境之外, 还需要搜索闭包。

我们先实现 frame 。

(define (make-frame variables values)
  (cons variables values))
(define (frame-variables frame) (car frame))
(define (frame-values frame) (cdr frame))

看看运行效果

> (set! f1 (make-frame '(a b c) '(1 2 3)))
> (frame-variables f1)
(a b c)
> (frame-values f1)
(1 2 3)

我们再实现 env 。

(define (extend-environment vars vals base-env)
  (if (= (length vars) (length vals))
      (cons (make-frame vars vals) base-env)
      (if (< (length vars) (length vals))
          (error "Too many arguments supplied" vars vals)
          (error "Too few arguments supplied" vars vals))))
(define (enclosing-environment env) (cdr env))
(define (first-frame env) (car env))
(define the-empty-environment '())

看看执行效果

> (set! e1 (extend-environment '(a b c) '(1 2 3) the-empty-environment)))
> e1
(((a b c) 1 2 3))
> (enclosing-environment e1)
()
> (set! e2 (extend-environment '(x y z) '(10 20 30) e1))
> e2
(((x y z) 10 20 30) ((a b c) 1 2 3))
> (enclosing-environment e2)
(((a b c) 1 2 3))

我们再实现一个变量查找的功能

(define (lookup-variable-value var env)
  (define (env-loop env)
    (define (scan vars vals)
      (cond ((null? vars)
             (env-loop (enclosing-environment env)))
            ((eq? var (car vars))
             (car vals))
            (else (scan (cdr vars) (cdr vals)))))
    (if (eq? env the-empty-environment)
        (error "Unbound variable" var)
        (let ((frame (first-frame env)))
          (scan (frame-variables frame)
                (frame-values frame)))))
  (env-loop env))

我们看看执行效果

> (lookup-variable-value 'x e2)
10
> (lookup-variable-value 'y e2)
20
> (lookup-variable-value 'a e1)
1
> (lookup-variable-value 'a e2)
1
> (lookup-variable-value 'u e2)
Exception: Unbound variable with irritant u
Type (debug) to enter the debugger.

有了这些,我们可以实现变量的求值了。

(define (eval exp env)
  (cond (.......
        ((variable? exp) (lookup-variable-value exp env))
        ......

看看执行效果

> (eval 'a e1)
1
> (eval 'b e1)
2
> (eval 'c e1)
3
> (eval 'a e2)
1
> (eval 'b e2)
2
> (eval 'c e2)
3
> (eval 'x e2)
10
> (eval 'y e2)
20
> (eval 'z e2)
30

辅助函数的实现

(define (variable? exp) (symbol? exp))

我们继续,有了这些,我们可以实现 apply 的解析了

(define (eval exp env)
  (cond (.......
        ((application? exp)
         (apply (eval (operator exp) env)
                   (list-of-values (operands exp) env)))
        ......

看看一些辅助函数的实现

(define (application? exp) (pair? exp))
(define (operator exp) (car exp))
(define (operands exp) (cdr exp))
(define (no-operands? ops) (null? ops))
(define (first-operand ops) (car ops))
(define (rest-operands ops) (cdr ops))
(define (list-of-values exps env)
  (if (no-operands? exps)
      '()
      (cons (eval (first-operand exps) env)
            (list-of-values (rest-operands exps) env))))

看看执行效果

> (eval '((lambda (a b) a) 1 2) '())
1
> (eval '((lambda (a b) b) 1 2) '())
2
>

下面是完整的代码

(define true #t)
(define false #f)
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((quoted? exp) (text-of-quotation exp))
        ((if? exp) (eval-if exp env))
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        ((lambda? exp)
         (make-procedure (lambda-parameters exp)
                         (lambda-body exp)
                         env))
        ((variable? exp) (lookup-variable-value exp env))
        ((application? exp)
         (apply (eval (operator exp) env)
                   (list-of-values (operands exp) env)))
        (else (error #f "Unknown expression type -- EVAL" exp))))

(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))
(define (if? exp) (tagged-list? exp 'if))

(define (eval-if exp env)
  (if (true? (eval (if-predicate exp) env))
      (eval (if-consequent exp) env)
      (eval (if-alternative exp) env)))

(define (if-predicate exp) (cadr exp))
(define (if-consequent exp) (caddr exp))
(define (if-alternative exp)
  (if (not (null? (cdddr exp)))
      (cadddr exp)
      'false))
(define (true? exp)
  (eq? exp 'true))

(define (lambda? exp) (tagged-list? exp 'lambda))
(define (lambda-parameters exp) (cadr exp))
(define (lambda-body exp) (cddr exp))
(define (make-procedure parameters body env)
  (list 'procedure parameters body env))
(define (procedure-parameters p) (cadr p))
(define (procedure-body p) (caddr p))
(define (procedure-environment p) (cadddr p))


(define (apply procedure arguments)
  (eval-sequence
   (procedure-body procedure)
   (extend-environment
    (procedure-parameters procedure)
    arguments
    (procedure-environment procedure))))

(define (extend-environment vars vals base-env)
  (if (= (length vars) (length vals))
      (cons (make-frame vars vals) base-env)
      (if (< (length vars) (length vals))
          (error "Too many arguments supplied" vars vals)
          (error "Too few arguments supplied" vars vals))))

(define (make-frame variables values)
  (cons variables values))
(define (frame-variables frame) (car frame))
(define (frame-values frame) (cdr frame))

(define (extend-environment vars vals base-env)
  (if (= (length vars) (length vals))
      (cons (make-frame vars vals) base-env)
      (if (< (length vars) (length vals))
          (error "Too many arguments supplied" vars vals)
          (error "Too few arguments supplied" vars vals))))
(define (enclosing-environment env) (cdr env))
(define (first-frame env) (car env))
(define the-empty-environment '())
(define (lookup-variable-value var env)
  (define (env-loop env)
    (define (scan vars vals)
      (cond ((null? vars)
             (env-loop (enclosing-environment env)))
            ((eq? var (car vars))
             (car vals))
            (else (scan (cdr vars) (cdr vals)))))
    (if (eq? env the-empty-environment)
        (error #f "Unbound variable" var)
        (let ((frame (first-frame env)))
          (scan (frame-variables frame)
                (frame-values frame)))))
  (env-loop env))

(define (variable? exp) (symbol? exp))
(define (application? exp) (pair? exp))
(define (operator exp) (car exp))
(define (operands exp) (cdr exp))
(define (no-operands? ops) (null? ops))
(define (first-operand ops) (car ops))
(define (rest-operands ops) (cdr ops))
(define (list-of-values exps env)
  (if (no-operands? exps)
      '()
      (cons (eval (first-operand exps) env)
            (list-of-values (rest-operands exps) env))))

(define (eval-sequence exps env)
  (cond ((last-exp? exps) (eval (first-exp exps) env))
        (else (eval (first-exp exps) env)
              (eval-sequence (rest-exps exps) env))))

(define (begin? exp) (tagged-list? exp 'begin))
(define (begin-actions exp) (cdr exp))
(define (last-exp? seq) (null? (cdr seq)))
(define (first-exp seq) (car seq))
(define (rest-exps seq) (cdr seq))

(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

内置函数

我们有了语言的核心,但是这个语言还没什么用,因为连基本的加减乘除都做不 了。我们利用宿主语言提供的功能,实现一些基本的内置函数。

首先我们要区分内置函数和语言自己的函数。我们用 primitive 来标记内置函数。

(define (primitive-procedure? proc)
  (tagged-list? proc 'primitive))

primitive 函数有一个属性,就是底层的函数实现。

(define (primitive-implementation proc) (cadr proc))

我们预先定义下面这些内置函数

(define primitive-procedures
  (list (list 'car car)
        (list 'cdr cdr)
        (list 'cons cons)
        (list 'null? null?)
...
        ))

我们初始化语言的初始环境

(define (primitive-procedure-names)
  (map car
       primitive-procedures))

(define (primitive-procedure-objects)
  (map (lambda (proc) (list 'primitive (cadr proc)))
  primitive-procedures))
(define (setup-environment)
  (let ((initial-env
         (extend-environment (primitive-procedure-names)
                             (primitive-procedure-objects)
                             the-empty-environment)))
    initial-env))
(define the-global-environment (setup-environment))

看看执行效果

> (load "eval6.scm")
> the-global-environment
(((car cdr cons null?)
   (primitive #<procedure car>)
   (primitive #<procedure cdr>)
   (primitive #<procedure cons>)
   (primitive #<procedure null?>)))

对应的 apply 函数需要修改,这样才能调用内置函数。

(define (apply procedure arguments)
  (cond ((primitive-procedure? procedure)
         (apply-primitive-procedure procedure arguments))
        ((compound-procedure? procedure)
         (eval-sequence
           (procedure-body procedure)
           (extend-environment
             (procedure-parameters procedure)
             arguments
             (procedure-environment procedure))))
        (else
         (error
          "Unknown procedure type -- APPLY" procedure))))

看看其他几个辅助函数的实现

(define (compound-procedure? p) (tagged-list? p 'procedure))
(define (apply-primitive-procedure proc args)
  (apply-in-underlying-scheme
   (primitive-implementation proc) args))
(define apply-in-underlying-scheme apply)

看看执行效果

> (eval 'car the-global-environment)
(primitive #<procedure car>)
> (apply-primitive-procedure (eval 'car the-global-environment) '((1 2)))
1
> (eval '(car '(1 2)) the-global-environment)
1
> (eval '(display "hello world") the-global-environment)
hello world
>

这里有一个坑,apply 已经在宿主语言中有定义了,所有我们要换一个名字, 防止名字冲突。并且用 apply-primitive-procedure 强调我们调用的是宿主 语言的 apply 函数。

(define apply-in-underlying-scheme (top-level-value 'apply (scheme-environment)))

我们的语言居然可以输出 "hello world" 了。下面是完整的代码。

(define apply-in-underlying-scheme (top-level-value 'apply (scheme-environment)))
(define true #t)
(define false #f)
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((quoted? exp) (text-of-quotation exp))
        ((if? exp) (eval-if exp env))
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        ((lambda? exp)
         (make-procedure (lambda-parameters exp)
                         (lambda-body exp)
                         env))
        ((variable? exp) (lookup-variable-value exp env))
        ((application? exp)
         (apply (eval (operator exp) env)
                   (list-of-values (operands exp) env)))
        (else (error #f "Unknown expression type -- EVAL" exp))))

(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))
(define (if? exp) (tagged-list? exp 'if))

(define (eval-if exp env)
  (if (true? (eval (if-predicate exp) env))
      (eval (if-consequent exp) env)
      (eval (if-alternative exp) env)))

(define (if-predicate exp) (cadr exp))
(define (if-consequent exp) (caddr exp))
(define (if-alternative exp)
  (if (not (null? (cdddr exp)))
      (cadddr exp)
      'false))
(define (true? exp)
  (eq? exp 'true))

(define (lambda? exp) (tagged-list? exp 'lambda))
(define (lambda-parameters exp) (cadr exp))
(define (lambda-body exp) (cddr exp))
(define (make-procedure parameters body env)
  (list 'procedure parameters body env))
(define (compound-procedure? p) (tagged-list? p 'procedure))
(define (procedure-parameters p) (cadr p))
(define (procedure-body p) (caddr p))
(define (procedure-environment p) (cadddr p))

(define (apply procedure arguments)
  (cond ((primitive-procedure? procedure)
         (apply-primitive-procedure procedure arguments))
        ((compound-procedure? procedure)
         (eval-sequence
           (procedure-body procedure)
           (extend-environment
             (procedure-parameters procedure)
             arguments
             (procedure-environment procedure))))
        (else
         (error
          "Unknown procedure type -- APPLY" procedure))))

(define (extend-environment vars vals base-env)
  (if (= (length vars) (length vals))
      (cons (make-frame vars vals) base-env)
      (if (< (length vars) (length vals))
          (error "Too many arguments supplied" vars vals)
          (error "Too few arguments supplied" vars vals))))

(define (make-frame variables values)
  (cons variables values))
(define (frame-variables frame) (car frame))
(define (frame-values frame) (cdr frame))

(define (extend-environment vars vals base-env)
  (if (= (length vars) (length vals))
      (cons (make-frame vars vals) base-env)
      (if (< (length vars) (length vals))
          (error "Too many arguments supplied" vars vals)
          (error "Too few arguments supplied" vars vals))))
(define (enclosing-environment env) (cdr env))
(define (first-frame env) (car env))
(define the-empty-environment '())
(define (lookup-variable-value var env)
  (define (env-loop env)
    (define (scan vars vals)
      (cond ((null? vars)
             (env-loop (enclosing-environment env)))
            ((eq? var (car vars))
             (car vals))
            (else (scan (cdr vars) (cdr vals)))))
    (if (eq? env the-empty-environment)
        (error #f "Unbound variable" var)
        (let ((frame (first-frame env)))
          (scan (frame-variables frame)
                (frame-values frame)))))
  (env-loop env))
(define (primitive-procedure? proc)
  (tagged-list? proc 'primitive))
(define primitive-procedures
  (list (list 'car car)
        (list 'cdr cdr)
        (list 'cons cons)
        (list 'null? null?)
        (list 'display display)
        ))
(define (primitive-implementation proc) (cadr proc))
(define (primitive-procedure-names)
  (map car
       primitive-procedures))
(define (apply-primitive-procedure proc args)
  (apply-in-underlying-scheme
   (primitive-implementation proc) args))


(define (primitive-procedure-objects)
  (map (lambda (proc) (list 'primitive (cadr proc)))
       primitive-procedures))
(define (setup-environment)
  (let ((initial-env
         (extend-environment (primitive-procedure-names)
                             (primitive-procedure-objects)
                             the-empty-environment)))
    initial-env))
(define the-global-environment (setup-environment))

(define (variable? exp) (symbol? exp))
(define (application? exp) (pair? exp))
(define (operator exp) (car exp))
(define (operands exp) (cdr exp))
(define (no-operands? ops) (null? ops))
(define (first-operand ops) (car ops))
(define (rest-operands ops) (cdr ops))
(define (list-of-values exps env)
  (if (no-operands? exps)
      '()
      (cons (eval (first-operand exps) env)
            (list-of-values (rest-operands exps) env))))

(define (eval-sequence exps env)
  (cond ((last-exp? exps) (eval (first-exp exps) env))
        (else (eval (first-exp exps) env)
              (eval-sequence (rest-exps exps) env))))

(define (begin? exp) (tagged-list? exp 'begin))
(define (begin-actions exp) (cdr exp))
(define (last-exp? seq) (null? (cdr seq)))
(define (first-exp seq) (car seq))
(define (rest-exps seq) (cdr seq))

(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

定义变量(define) 和变量赋值(set!),还有 cond

这些都不是语言最核心的部分。这里给出完整的代码。

看看效果

> (eval '(begin (define a 10) a) the-global-environment)
10
> (eval '(begin (define a 10) (display a)) the-global-environment)
10
> (eval '(begin (define a 10) (+ a 100)) the-global-environment)
110
> (eval '(begin (define a 10) (set! a 1) (+ a 100)) the-global-environment)
101

最后完整的代码

;; 和宿主语言的交互
(define apply-in-underlying-scheme (top-level-value 'apply (scheme-environment)))
(define true #t)
(define false #f)
;; 语言的核心入口
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((variable? exp) (lookup-variable-value exp env))
        ((quoted? exp) (text-of-quotation exp))
        ((assignment? exp) (eval-assignment exp env))
        ((definition? exp) (eval-definition exp env))
        ((if? exp) (eval-if exp env))
        ((lambda? exp)
         (make-procedure (lambda-parameters exp)
                         (lambda-body exp)
                         env))
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        ((cond? exp) (eval (cond->if exp) env))
        ((application? exp)
         (apply (eval (operator exp) env)
                (list-of-values (operands exp) env)))
        (else
         (error "Unknown expression type -- EVAL" exp))))

;; 简单的类型
(define (self-evaluating? exp)
  (cond ((number? exp) true)
        ((string? exp) true)
        (else false)))
;; 关于 quote
(define (quoted? exp)
  (tagged-list? exp 'quote))
(define (text-of-quotation exp) (cadr exp))

;; 关于代码块
(define (begin? exp) (tagged-list? exp 'begin))
(define (eval-sequence exps env)
  (cond ((last-exp? exps) (eval (first-exp exps) env))
        (else (eval (first-exp exps) env)
              (eval-sequence (rest-exps exps) env))))

(define (begin-actions exp) (cdr exp))
(define (last-exp? seq) (null? (cdr seq)))
(define (first-exp seq) (car seq))
(define (rest-exps seq) (cdr seq))

;; 支持 if 条件判断
(define (if? exp) (tagged-list? exp 'if))
(define (eval-if exp env)
  (if (true? (eval (if-predicate exp) env))
      (eval (if-consequent exp) env)
      (eval (if-alternative exp) env)))
(define (if-predicate exp) (cadr exp))
(define (if-consequent exp) (caddr exp))
(define (if-alternative exp)
  (if (not (null? (cdddr exp)))
      (cadddr exp)
      'false))
(define (true? exp)
  (eq? exp 'true))
;; 支持 lambda 函数定义
(define (lambda? exp) (tagged-list? exp 'lambda))
(define (lambda-parameters exp) (cadr exp))
(define (lambda-body exp) (cddr exp))
(define (make-procedure parameters body env)
  (list 'procedure parameters body env))
(define (compound-procedure? p) (tagged-list? p 'procedure))
(define (procedure-parameters p) (cadr p))
(define (procedure-body p) (caddr p))
(define (procedure-environment p) (cadddr p))
;; 关于环境的定义
(define (extend-environment vars vals base-env)
  (if (= (length vars) (length vals))
      (cons (make-frame vars vals) base-env)
      (if (< (length vars) (length vals))
          (error "Too many arguments supplied" vars vals)
          (error "Too few arguments supplied" vars vals))))
(define (make-frame variables values)
  (cons variables values))
(define (frame-variables frame) (car frame))
(define (frame-values frame) (cdr frame))
(define (extend-environment vars vals base-env)
  (if (= (length vars) (length vals))
      (cons (make-frame vars vals) base-env)
      (if (< (length vars) (length vals))
          (error "Too many arguments supplied" vars vals)
          (error "Too few arguments supplied" vars vals))))
(define (add-binding-to-frame! var val frame)
  (set-car! frame (cons var (car frame)))
  (set-cdr! frame (cons val (cdr frame))))
(define (enclosing-environment env) (cdr env))
(define (first-frame env) (car env))

(define primitive-procedures
  (list (list 'car car)
        (list 'cdr cdr)
        (list 'cons cons)
        (list 'null? null?)
        (list '+ +)
        (list '- -)
        (list '* *)
        (list '/ /)
        (list 'display display)
        ))
(define (primitive-procedure-names)
  (map car
       primitive-procedures))
(define (primitive-procedure-objects)
  (map (lambda (proc) (list 'primitive (cadr proc)))
       primitive-procedures))
(define (setup-environment)
  (let ((initial-env
         (extend-environment (primitive-procedure-names)
                             (primitive-procedure-objects)
                             the-empty-environment)))
    initial-env))
(define the-empty-environment '())
(define the-global-environment (setup-environment))

;; 关于变量
(define (variable? exp) (symbol? exp))
(define (lookup-variable-value var env)
  (define (env-loop env)
    (define (scan vars vals)
      (cond ((null? vars)
             (env-loop (enclosing-environment env)))
            ((eq? var (car vars))
             (car vals))
            (else (scan (cdr vars) (cdr vals)))))
    (if (eq? env the-empty-environment)
        (error #f "Unbound variable" var)
        (let ((frame (first-frame env)))
          (scan (frame-variables frame)
                (frame-values frame)))))
  (env-loop env))
;; 关于 apply 的相关函数
(define (application? exp) (pair? exp))
(define (apply procedure arguments)
  (cond ((primitive-procedure? procedure)
         (apply-primitive-procedure procedure arguments))
        ((compound-procedure? procedure)
         (eval-sequence
           (procedure-body procedure)
           (extend-environment
             (procedure-parameters procedure)
             arguments
             (procedure-environment procedure))))
        (else
         (error
          "Unknown procedure type -- APPLY" procedure))))
(define (primitive-procedure? proc)
  (tagged-list? proc 'primitive))
(define (primitive-implementation proc) (cadr proc))
(define (apply-primitive-procedure proc args)
  (apply-in-underlying-scheme
   (primitive-implementation proc) args))
(define (operator exp) (car exp))
(define (operands exp) (cdr exp))
(define (no-operands? ops) (null? ops))
(define (first-operand ops) (car ops))
(define (rest-operands ops) (cdr ops))
(define (list-of-values exps env)
  (if (no-operands? exps)
      '()
      (cons (eval (first-operand exps) env)
            (list-of-values (rest-operands exps) env))))

;; 关于 define
(define (definition? exp)
  (tagged-list? exp 'define))
(define (definition-variable exp)
  (if (symbol? (cadr exp))
      (cadr exp)
      (caadr exp)))
(define (definition-value exp)
  (if (symbol? (cadr exp))
      (caddr exp)
      (make-lambda (cdadr exp)   ; formal parameters
                   (cddr exp)))) ; body
(define (eval-definition exp env)
  (define-variable! (definition-variable exp)
                    (eval (definition-value exp) env)
                    env)
  'ok)
(define (define-variable! var val env)
  (let ((frame (first-frame env)))
    (define (scan vars vals)
      (cond ((null? vars)
             (add-binding-to-frame! var val frame))
            ((eq? var (car vars))
             (set-car! vals val))
            (else (scan (cdr vars) (cdr vals)))))
    (scan (frame-variables frame)
          (frame-values frame))))
;; 关于 set!
(define (assignment? exp)
  (tagged-list? exp 'set!))
(define (assignment-variable exp) (cadr exp))
(define (assignment-value exp) (caddr exp))
(define (eval-assignment exp env)
  (set-variable-value! (assignment-variable exp)
                       (eval (assignment-value exp) env)
                       env)
  'ok)
(define (set-variable-value! var val env)
  (define (env-loop env)
    (define (scan vars vals)
      (cond ((null? vars)
             (env-loop (enclosing-environment env)))
            ((eq? var (car vars))
             (set-car! vals val))
            (else (scan (cdr vars) (cdr vals)))))
    (if (eq? env the-empty-environment)
        (error "Unbound variable -- SET!" var)
        (let ((frame (first-frame env)))
          (scan (frame-variables frame)
                (frame-values frame)))))
  (env-loop env))
;; 关于 cond!
(define (cond? exp) (tagged-list? exp 'cond))
(define (cond-clauses exp) (cdr exp))
(define (cond-else-clause? clause)
  (eq? (cond-predicate clause) 'else))
(define (cond-predicate clause) (car clause))
(define (cond-actions clause) (cdr clause))
(define (cond->if exp)
  (expand-clauses (cond-clauses exp)))

(define (expand-clauses clauses)
  (if (null? clauses)
      'false                          ; no else clause
      (let ((first (car clauses))
            (rest (cdr clauses)))
        (if (cond-else-clause? first)
            (if (null? rest)
                (sequence->exp (cond-actions first))
                (error "ELSE clause isn't last -- COND->IF"
                       clauses))
            (make-if (cond-predicate first)
                     (sequence->exp (cond-actions first))
                     (expand-clauses rest))))))
;; 底层函数
(define (tagged-list? exp tag)
  (if (pair? exp)
      (eq? (car exp) tag)
      false))

其他

这个解释器,看起来简单,其实已经支持了大多数核心功能了,甚至包括闭包的 实现。一门语言还有其他一些重要的特性需要仔细设计。

  • 模块系统
  • 内存管理 gc
  • 代码生成优化
  • 调试器
  • 调优器
  • 特色功能
    • 是否支持 continuation
    • 是否支持 coroutine
    • 是否支持 thread
    • 是否支持 macro
    • 是否支持 类型推导
    • 是否支持 OO
    • 是否支持 泛型
    • 是否支持 generator/iterator
  • 标准库设计
    • Network/File IO
    • 基础数据类型库
      • List
      • Array
      • Map
      • Hash
      • Set