oblivious normalization by evaluation
At RacketCon 2018’s office hours, David Thrane Christiansen talked about “normalization by evaluation” as a prelude to his “Implementing Dependent Types” session.
In the design he sketched, the normalizer needed help from the
evaluator. I wondered if it was possible to use a standard (oblivious)
evaluator, such as Racket’s normal eval. The answer is no —
Normal forms
To be precise, we are working with the following language:
EXP | = | X | ||
| | (lambda (X) EXP) | |||
| | (EXP EXP) | |||
X | = | variable name |
((lambda (X) EXP₁) EXP₂) → EXP₁[EXP₂ / X] ; β
NF | = | NEU | ||
| | (lambda (X) NF) | |||
NEU | = | X | ||
| | (NEU NF) |
Note: In this post, whenever I say “normal form”, I always mean β-normal form; and whenever I say “has a normal form”, I mean it has a β-normal form reachable through β steps. That is, even once I introduce other kinds of steps, the idea of having a normal form is tied to β; when I talk about whether the normal form is reachable through other kinds of steps, I’ll say so explicitly.
CBN Evaluation
If we consistently apply the β rule at the leftmost, outermost place that it can be applied, except within a lambda body, then we get call-by-name (CBN) evaluation.
Here’s a simple CBN evaluator:
; evaluate : EXP → EXP |
(define (evaluate e) |
(match e |
[`(,e1 ,e2) |
(match (evaluate e1) |
[`(lambda (,x) ,e2) |
(evaluate (subst e2 x e1))] |
[e1* ; open term! |
`(,e1* ,e2)])] |
[e e])) |
Note that the evaluator allows open terms. For example, (evaluate 'x) produces 'x. Also note that the evaluator diverges when given a term like Ω. If the original term is closed, then the second inner match case is impossible, and the evaluator either returns a lambda-abstraction or it diverges.
Given the CBN evaluator above, it is straightforward to write a normalizer that takes a term and produces a normal form (if the term has one). We start by evaluating the term, then we recur into its immediate subexpressions and normalize them.
; normalize : EXP → NF |
(define (normalize e) |
(match (evaluate e) |
[`(lambda (,x) ,e) |
`(lambda (,x) ,(normalize e))] |
[`(,e1 ,e2) |
`(,(normalize e1) ,(normalize e2))] |
[x x])) |
CBV Evaluation
((lambda (X) EXP₁) VAL₂) → EXP₁[VAL₂ / X] ; βv
VAL | = | X | ||
| | (lambda (X) EXP) |
Given these two differences, is it possible to build a β-normalizer out of a CBV-evaluator?
If we apply the evaluator to this term, it diverges (that is, it fails to terminate). If we do case analysis first and apply the evaluator to the subterms, it diverges on the second. We could do deeper case analysis, but at that point, we’re just building a normalizer from scratch.
The problem is that while this example has a normal form (reachable in
one β step), that normal form is not reachable through any number of
βv steps —
Un-evaluation
Let’s refine the question. If there is a normal form reachable through βv steps, can we find it using a CBV evaluator? Racket’s eval procedure, for example?
Now we must reckon with another issue: Racket’s eval takes an encoding of a term (as an S-expression made of lists and symbols) and produces a Racket value, and Racket provides no way to convert a function value back to a term encoding. But perhaps we could build such an “un-evaluator”, at least if we’re allowed to assume that the value we’re un-evaluating corresponds to a normal form.
; un-eval : eval(NF) -> NF
The only possible value we can get from evaluating a closed normal form is an ordinary Racket function, and the only thing we can do with that is to apply it to an argument. So we should apply the function to some sort of value that will allow us to map out its behavior.
For example, suppose we apply the function to the symbol 'a
and we get back 'a. Then we know that the function
corresponds to the normal form (lambda (x) x) —
(struct probe (term) |
#:property prop:procedure |
(lambda (self arg) |
(probe `(,(probe-term self) ,(un-eval arg))))) |
That is, a probe instance supports the operation probe-term that returns an S-expression representation of the term, but it also acts as a procedure of one argument. When applied, a probe just creates a new probe whose term is an application term where the operator is the original probe’s term and the operand is a term representing the argument. We convert the argument to a term using un-eval, which we are about to define.
The un-eval function must handle two cases: ordinary Racket functions that we want to un-eval, and probes that we have introduced to explore the first kind of function. Since we can recognize probes with the probe? predicate, we can implement un-eval by case analysis:
; un-eval : eval(NF) -> NF |
(define (un-eval p) |
(if (probe? p) |
(probe-term p) |
(let ([var (gensym)]) |
`(lambda (,var) ,(un-eval (p (probe var))))))) |
That is, to un-eval a probe, we just extract its term using probe-term. To un-eval another kind of function, we first generate a fresh variable name, then we return a lambda expression with the fresh variable and a body constructed by probing the given function.
We can now see how un-eval corresponds to the NF grammar. A probe always carries a neutral term (NEU), and un-eval returns either a neutral term (case 1) or a lambda-abstraction whose body is a normal form (case 2).
> (un-eval (lambda (z) z)) |
'(lambda (x1) x1) |
> (un-eval (lambda (x) (lambda (y) (x (lambda (z) ((z y) x)))))) |
'(lambda (x1) (lambda (x2) (x1 (lambda (x3) ((x3 x2) x1))))) |
> (un-eval THREE) |
'(lambda (x1) (lambda (x2) (x1 (x1 (x1 x2))))) |
> (un-eval (fact THREE)) |
'(lambda (x1) (lambda (x2) (x1 (x1 (x1 (x1 (x1 (x1 x2)))))))) |
So perhaps we can characterize un-eval with the following statement: if EXP has a β-normal form NF reachable by a sequence of βv steps, then (un-eval (eval (quote EXP))) produces (quote NF) (modulo α-renaming).
Beyond βv
It’s worth emphasizing here that while interesting and (lambda (f) f) are equivalent in the λ-calculus (generated by β), they are not equivalent in the λᵥ-calculus (generated by βv). They must not be, because under CBV evaluation the two terms act differently! For example, (interesting (lambda (x) x)) evaluates to (lambda (x) x) and (interesting (lambda (x) (x x))) diverges.
> (un-eval interesting) |
'(lambda (x1) x1) |
((lambda (X) EXP₁) NEU₂) → EXP₁[NEU₂ / X] ; βneu
If our purpose were to faithfully follow the restrictions of βv reduction, this should be alarming. But this whole adventure started with the question of whether CBV-evaluation can simulate (or at least approximate) β-normalization, so let’s see how far we can push it.
NEU | = | X | ||
| | (NEU NF) |
What do we lose by eagerly normalizing the argument?
This example suggests a modification to our probe representation: delay the normalization of arguments until the probe’s term is needed for the body of a lambda-abstraction. Here are the updated definitions:
; A Probe is (probe (promise-of Neutral)) |
(struct probe (term-p) |
#:property prop:procedure |
(lambda (self arg) |
(probe (delay `(,(probe-term self) ,(un-eval arg)))))) |
; probe-term : Probe -> Neutral |
(define (probe-term p) (force (probe-term-p p))) |
; un-eval+ : eval(NF) -> NF |
(define (un-eval+ p) |
(if (probe? p) |
(probe-term p) |
(let ([var (gensym*)]) |
`(lambda (,var) ,(un-eval (p (probe (delay var)))))))) |
> (un-eval+ (lambda (z) z)) |
'(lambda (x1) x1) |
> (un-eval+ (lambda (x) (lambda (y) (x (lambda (z) ((z y) x)))))) |
'(lambda (x1) (lambda (x2) (x1 (lambda (x3) ((x3 x2) x1))))) |
> (un-eval+ THREE) |
'(lambda (x1) (lambda (x2) (x1 (x1 (x1 x2))))) |
> (un-eval+ (fact THREE)) |
'(lambda (x1) (lambda (x2) (x1 (x1 (x1 (x1 (x1 (x1 x2)))))))) |
> (un-eval+ weird) |
'(lambda (x1) x1) |
NVAL | = | X | ||
| | (NVAL VAL) |
((lambda (X) EXP₁) NVAL₂) → EXP₁[NVAL₂ / X] ; βnval
I conjecture (but haven’t proven) that un-eval+ satisfies the following property: If EXP has a β-normal form NF reachable by a sequence of βv and βnval steps, then (un-eval (eval (quote EXP))) produces (quote NF) (modulo α-renaming).