6 Enumerated Shapes
This section introduces enumerated shapes —
6.1 Defining Enumerated Shapes
In the previous section, we extracted the definition of CondClause from the shape of the my-cond macro, and we defined a syntax class cond-clause corresponding to the shape.
Now let’s extend CondClause with another variant that allows the clause’s result to depend on the (true) value produced by the condition. This is similar to the => clause form that Racket’s cond macro supports, but we’ll use a keyword, #:apply, to distinguish this form of clause for my-cond. Here is the updated shape definition:
;; CondClause ::= ;; | [Expr Expr] -- represents condition, result ;; | [Expr #:apply Expr] -- represents condition, function from condition to result
(define ls '((a 1) (b 2) (c 3))) (my-cond [(assoc 'b ls) #:apply cadr] [#t 0]) ; expect 2 (my-cond [(assoc 'z ls) #:apply cadr] [#t 0]) ; expect 0
(let ([condition-value (assoc 'b ls)]) (if condition-value (cadr condition-value) (my-cond [#t 0])))
We update the definition of cond-clause by adding another pattern clause for the new variant. Its second expression has a different intepretation, so we should use a different name for its pattern variable so that we don’t confuse them:
(begin-for-syntax (define-syntax-class cond-clause #:attributes (condition result) ; !! (pattern [condition:expr result:expr]) (pattern [condition:expr #:apply get-result:expr])))
There’s a problem, though. The new pattern is fine by itself, but it doesn’t fit with the existing #:attributes declaration. The second variant doesn’t have a simple result expression; it interprets its second expression differently. The syntax class, though, needs a single interface that determines what nested attributes are bound when the syntax class is used in a macro and how the nested attributes are interpreted.
empty interface (redo case analysis)
common meaning
macro behavior
code generation
AST
6.1.1 Empty Interface Strategy (Redo Case Analysis)
One option is to give the syntax class no responsibility for interpreting its terms, and simply redo the case analysis in the macro. This is the empty interface strategy. The syntax class is still useful for input validation and as internal documentation, but since it performs no interpretation, we should declare that it exports no attributes.
(begin-for-syntax (define-syntax-class cond-clause #:attributes () (pattern [condition:expr result:expr]) (pattern [condition:expr #:apply get-result:expr])))
The following version of my-cond checks the syntactic structure of all of its arguments, then expands into a private recursive helper macro my-cond*, which performs the case analysis on each clause again:
; (my-cond CondClause ...) : Expr (define-syntax my-cond (syntax-parser [(_ c:cond-clause ...) #'(my-cond* c ...)])) ; (my-cond* CondClause ...) : Expr (define-syntax my-cond* (syntax-parser [(_) #'(void)] [(_ [condition:expr result:expr] more ...) #'(if condition result (my-cond* more ...))] [(_ [condition:expr #:apply get-result:expr] more ...) #'(let ([condition-value condition]) (if condition-value (get-result condition-value) (my-cond* more ...)))]))
An advantage of this strategy is that it is nearly always a viable option. A disadvantage is that it duplicates syntax patterns, which introduces the possibility of discrepancies between the syntax class and the macro clauses. Such discrepancies can lead to problems that are difficult to catch and debug.
;; CondClause ::= ... | [Expr #:bind c:Id Expr{c}] If the condition evaluates to a true value, it is bound to the given variable name and the result expression is evaluated in the scope of that variable. The scope of the variable does not include any other clauses.Update the design of cond-clause and my-cond for the new CondClause variant using the strategy described in this section.
6.1.2 Common Meaning Interface Strategy
(begin-for-syntax (define-syntax-class cond-clause #:attributes (condition ; Expr[(U #f C)] get-result) ; Expr[C -> Any] (pattern [condition:expr result:expr] #:with get-result #'(lambda (ignore) result)) (pattern [condition:expr #:apply get-result:expr])))
; (my-cond CondClause ...) : Expr (define-syntax my-cond (syntax-parser [(_) #'(void)] [(_ c:cond-clause more ...) #'(let ([condition-value c.condition]) (if condition-value (c.get-result condition-value) (my-cond more ...)))]))
You might worry that introducing a lambda wrapper and a function call for every simple clause form will make the generated code run slower. After all, lambda requires a closure allocation, right? In this case, that is not true. The generated lambda wrappers appear directly in application position, and the Racket compiler is more than smart enough to inline those applications away. So even though the new version of the macro expands to different Racket code for simple clauses, the compiler produces exactly the same compiled code, with zero run-time overhead.
Exercise 12: See Exercise 11 for a revised definition of CondClause. Update the design of cond-clause and my-cond for the new CondClause variant using the strategy described in this section.
6.1.3 Macro Behavior Interface Strategy
If it is difficult to find a common interface for all of a syntax class’s variants based solely on their contents, an alternative is to design the interface based on the macro behavior. This is similar to shifting from “functional” style operations defined separately from a data type to “object-oriented” style methods where behavior is defined together with the type and its variants. The potential downside, of course, is that it couples the syntax class more tightly with the macro.
(begin-for-syntax (define-syntax-class cond-clause #:attributes (code) ; Expr[(-> Any) -> Any] (pattern [condition:expr result:expr] #:with code #'(lambda (fail) (if condition result (fail)))) (pattern [condition:expr #:apply get-result:expr] #:with code #'(lambda (fail) (let ([condition-value condition]) (if condition-value (get-result condition-value) (fail)))))))
; (my-cond CondClause ...) : Expr (define-syntax my-cond (syntax-parser [(_) #'(void)] [(_ c:cond-clause more ...) #'(c.code (lambda () (my-cond more ...)))]))
Again, you might worry that the use of lambda leads to run-time inefficiency, but the way this macro uses lambda is easily optimized away by the compiler.
Exercise 13: See Exercise 11 for a revised definition of CondClause. Update the design of cond-clause and my-cond for the new CondClause variant using the strategy described in this section.
6.1.4 Code Generator Interface Strategy
(begin-for-syntax (define-syntax-class cond-clause #:attributes (make-code) ; Syntax[Expr] -> Syntax[Expr] (pattern [condition:expr result:expr] #:attr make-code (lambda (fail-expr) #`(if condition result #,fail-expr))) (pattern [condition:expr #:apply get-result:expr] #:attr make-code (lambda (fail-expr) #`(let ([condition-value condition]) (if condition-value (get-result condition-value) #,fail-expr))))))
; (my-cond CondClause ...) : Expr (define-syntax my-cond (syntax-parser [(_) #'(void)] [(_ c:cond-clause more ...) ((datum c.make-code) #'(my-cond more ...))]))
The value of c.make-code is not syntax, so we cannot use it in a syntax template. We use datum to get the attribute value (a function), and apply it to syntax representing an expression handling the rest of the clauses.
; (my-cond CondClause ...) : Expr (define-syntax my-cond (syntax-parser [(_ c:cond-clause ...) (foldr (lambda (make-code rec-expr) (make-code rec-expr)) #'(void) (datum (c.make-code ...)))]))
The expression (datum (c.make-code ...)) has type (Listof (Syntax[Expr] -> Syntax[Expr])).
Exercise 14: See Exercise 11 for a revised definition of CondClause. Update the design of cond-clause and my-cond for the new CondClause variant using the strategy described in this section.
6.1.5 AST Interface Strategy
The AST strategy is a variation on the Empty Interface Strategy (Redo Case Analysis) approach, which has the macro redo the syntax class’s case analysis. But in this variation, instead of the macro doing case analysis on the syntax, the syntax class parses its terms into values in some AST datatype, and then the macro does case analysis on the AST values.
This results in a larger interface between the syntax class and the macro or macros that use it, because the interface includes the AST datatype definition. On the other hand, the syntax class does not have to specialize its interpretation based on the behavior of any specific macro. Furthermore, the case analysis performed by the macro can be simpler and faster, and errors will be easier to catch and debug, compared to discrepancies between syntax class and macro syntax patterns.
(begin-for-syntax ; A ClauseRep is an instance of one of the following: (struct clause:normal (condition ; Syntax[Expr] result)) ; Syntax[Expr] (struct clause:apply (condition ; Syntax[Expr[(U #f C)] get-result)) ; Syntax[Expr[C -> Any]] (define-syntax-class cond-clause #:attributes (ast) ; ClauseRep (pattern [condition:expr result:expr] #:attr ast (clause:normal #'condition #'result)) (pattern [condition:expr #:apply get-result:expr] #:attr ast (clause:apply #'condition #'get-result))))
The type ClauseRep represents parsed terms of the shape CondClause. I’ve given them separate names here for clarify, but in practice I often use the same name for type and shape. The context usually disambiguates them.
;; (my-cond CondClause ...) : Expr (define-syntax my-cond (syntax-parser [(_ c:cond-clause ...) (foldr (lambda (ast rec-code) (match ast [(clause:normal condition result) #`(if #,condition #,result #,rec-code)] [(clause:apply condition get-result) #`(let ([condition-value #,condition]) (if condition-value (#,get-result condition-value) #,rec-code))])) #'(void) (datum (c.ast ...)))]))
Exercise 15: See Exercise 11 for a revised definition of CondClause. Update the design of cond-clause and my-cond for the new CondClause variant using the strategy described in this section. Double-check that a #:bind-clause variable is not visible in subsequent clauses!
6.2 Designing Enumerated Syntax
When you design an enumerated syntax shape, you must avoid ambiguity; or if you cannot completely avoid it, you must manage it carefully. To elaborate, let’s consider some alternative syntaxes for cond-clauses.
For AltCondClauseV1, let’s generalize the simple form, so that the result is determined not by a single expression but by one or more body terms. And let’s indicate the second form with the identifier apply instead of the keyword #:apply.
;; AltCondClauseV1 ::= ;; | [Expr Body ...+] ;; | [Expr apply Expr]
This syntax design is bad, because there are two variants with different meanings that contain the same terms. In fact, every term that matches the second variant also matches the first.
One could argue that a programmer is unlikely to simply write apply at the beginning of a result body intending it to be evaluated as an expression. It would have no effect; its presence would be completely useless. Still, programmers regularly trip on “out of the way” inconsistencies, and it’s a better habit to keep comfortable safety margins.
Let’s change the definition slightly so that instead of apply, we use the identifier => to indicate the second clause form:
;; AltCondClauseV2 ::= ;; | [Expr Body ...+] ;; | [Expr => Expr]
This syntax design is okay; it is in fact the design Racket uses for cond clauses. There are two crucial details, though. First, the => variant must be recognized not by the symbolic content of the => identifier, but by checking whether it is a reference to Racket’s => binding. Second. Racket defines => as a macro that always signals a syntax error. So even though we can interpret => as an expression, it is never a valid expression. In practice, we only care about avoiding overlaps with the set of valid expressions, so AltCondClauseV2 is okay.
Both properties are needed to avoid ambiguity. If we checked for => as a symbol, then a programmer could define a local variable (or macro) named =>, which would then be a valid expression, so there would be overlap. And if => were a valid expression, that also creates an overlap. (That’s the problem with the apply variant in AltCondClauseV1.)
(begin-for-syntax (define-syntax-class alt-cond-clause-v2 #:attributes (code) ; Expr[(-> Any) -> Any] #:literals (=>) (pattern [condition:expr result:expr ...+] #:with code ___) (pattern [condition:expr => get-result:expr] #:with code ___)))
The solution involves two steps. First, reorder the patterns to put the => pattern first. Second, use ~! (“cut”) to commit to the first pattern after seeing =>. Here is the code:
(begin-for-syntax (define-syntax-class alt-cond-clause-v2 #:attributes (code) ; Expr[(-> Any) -> Any] #:literals (=>) (pattern [condition:expr => ~! get-result:expr] #:with code ___) (pattern [condition:expr result:expr ...+] #:with code ___)))
Without the ~!, a term like [a => b c] would be considered a valid instance of the second pattern, rather than an invalid instance of the first pattern. (Try it and see what happens!)
The complexity of overlaps with expressions is one reason that keywords were introduced into Racket. Since both the Expr shape and the expr syntax class consider themselves completely disjoint from keywords, they avoid these issues completely. (A related issue does emerge when dealing with partly-expanded code, distinguishing definitions from expressions, for example. We’ll talk about that later. FIXME-REF)