3 Basic Shapes
This section introduces the most important basic shapes for macro design.
3.1 The Expr (Expression) Shape
The Expr shape represents the intention to interpret the term as a Racket expression by putting it in an expression context. In general, a macro cannot check a term and decide whether it is a valid expression; only the Racket macro expander can do that. As a pragmatic approximation, the Expr shape and its associated expr syntax class exclude only keyword terms, like #:when, so that macros can detect and report misuses of keyword arguments.
;; (my-when Expr Expr) : Expr
(my-when (odd? 5) (printf "odd!\n")) ; expect print
(my-when (even? 5) (printf "even!\n")) ; expect no print
(define-syntax my-when
  (syntax-parser
    [(_ condition:expr result:expr)
     #'(if condition result (void))]))
> (check-equal? (with-output-to-string (lambda () (my-when (odd? 5) (printf "odd!\n")))) "odd!\n")
> (check-equal? (with-output-to-string (lambda () (my-when (even? 5) (printf "even!\n")))) "")
> (check-exn exn:fail:syntax? (lambda () (convert-syntax-error (my-when #:truth "verity"))))
Why? Which examples are rejected by the my-when macro itself, and what happens to the other examples? What difference does it make if you remove the expr syntax class annotations from the macro definition?
Exercise 2: Design a macro my-unless like my-when, except that it negates the condition.
Exercise 3: Design a macro catch-output that takes a single expression argument. The expression is evaluated, but its result is ignored; instead, the result of the macro is a string containing all of the output written by the expression. For example:
(catch-output (for ([i 10]) (printf "~s" i))) ; expect "0123456789"
3.2 The Body Shape
The Body shape is like Expr except that it indicates that the term will be used in a body context, so definitions are allowed in addition to expressions.
There is no distinct syntax class for Body; just use expr.
In practice, the Body shape is usually used with ellipses; see Compound Shapes. But we can make a version of my-when that takes a single Body term, even though it isn’t idiomatic Racket syntax. Here is the shape:
;; (my-when Expr Body) : Expr
(define-syntax my-when
  (syntax-parser
    [(_ condition:expr result-body:expr)
     #'(if condition (block result-body) (void))]))
That is, use (block ␣) to wrap a Body so it can be used in a strict Expr position. It is also common to use a (let () ␣) wrapper, but that does not work for all Body terms; it requires that the Body term ends with an expression. The block form is more flexible.
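The difference between the two wrappers can be seen with a Body term that is a definition. The following is a minimal sketch, assuming block is imported from racket/block; the names r1 and r2 are introduced here only for illustration.

```racket
#lang racket
(require racket/block
         (for-syntax racket/base syntax/parse))

;; (my-when Expr Body) : Expr
(define-syntax my-when
  (syntax-parser
    [(_ condition:expr result-body:expr)
     #'(if condition (block result-body) (void))]))

;; A Body term that is an expression: its value is the result.
(define r1 (my-when (odd? 5) 'odd))

;; A Body term that is a definition: block accepts it and produces (void).
;; With a (let () ...) wrapper, this use would be rejected, because the
;; body would not end with an expression.
(define r2 (my-when (odd? 5) (define x 1)))
```

Here r1 is the symbol odd and r2 is the void value.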
;; (#%expression Expr) : Body
Exercise 4: Check your solution to Exercise 3; does the macro also accept Body terms like the one above? That is, does the following work? If so, “fix it” (that is, make it more restrictive) using #%expression.
3.3 Proper Lexical Scoping, Part 2
; (catch-output Expr) : Expr
(define-syntax catch-output
  (syntax-parser
    [(_ e:expr)
     #'(with-output-to-string (lambda () (#%expression e)))]))
; with-output-to-string : (-> Any) -> String
(define (with-output-to-string proc)
  (let ([out (open-output-string)])
    (parameterize ((current-output-port out))
      (proc))
    (get-output-string out)))
; (catch-output Expr) : Expr
(define-syntax catch-output
  (syntax-parser
    [(_ e:expr)
     #'(let ([out (open-output-string)])
         (parameterize ((current-output-port out))
           (#%expression e))
         (get-output-string out))]))
> (let ([get-output-string (lambda (p) "pwned!")])
    (catch-output (printf "doing just fine, actually")))
"doing just fine, actually"
But what about the other direction? The macro introduces a binding of a variable named out; could this binding capture references to out in the expression given to the macro? Here is an example:
> (let ([out "Aisle 24"])
    (catch-output (printf "The exit is located at ~a." out)))
"The exit is located at Aisle 24."
The result shows that the macro’s out binding does not interfere with the use-site’s out variable. We say that the catch-output macro is “hygienic”.
A use-site binding does not capture a definition-site reference.
A definition-site binding does not capture a use-site reference.
3.4 The Id (Identifier) Shape
The Id shape contains all identifier terms.
The Id shape usually implies that the identifier will be used as the name for a variable, macro, or other sort of binding. In that case, we say the identifier is used as a binder.
Use the id syntax class for pattern variables whose shape is Id.
;; (my-and-let Id Expr Expr) : Expr
(define ls '((a 1) (b 2) (c 3)))
(my-and-let entry (assoc 'b ls) (cadr entry)) ; expect 2
(my-and-let entry (assoc 'z ls) (cadr entry)) ; expect #f
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ([x e1])
         (if x e2 #f))]))
Label the identifier so we can refer to it later. So instead of Id, we write x:Id. The label does not have to be the same as the name of the pattern variable, but it makes sense to use the same name here.
Add an environment annotation to the second Expr indicating that it’s in the scope of a variable whose name is whatever actual identifier x refers to: Expr{x}.
;; (my-and-let x:Id Expr Expr{x}) : Expr
We can check the implementation: e1 does not occur in the scope of x, and e2 does occur in the scope of x.
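One way to observe this scoping is to shadow an outer x at the use site. The following sketch repeats the let-based definition; the outer binding of x and the name scoped-result are assumptions made for illustration.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ([x e1])
         (if x e2 #f))]))

;; The first Expr is outside the scope of the macro-bound x, so (+ x 1)
;; refers to the outer x (10). The second Expr is inside that scope, so
;; (* x 2) refers to the new x (11).
(define scoped-result
  (let ([x 10])
    (my-and-let x (+ x 1) (* x 2))))
```

With this definition, scoped-result is 22.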
; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ()
         (define x e1) ; BAD
         (if x e2 #f))]))
; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ()
         (define tmp e1)
         (if tmp
             (let ([x tmp]) e2)
             #f))]))
This implementation is good (although more complicated than necessary), because e1 no longer occurs in the scope of x. But what about tmp? Because of hygiene, the definition of tmp introduced by the macro is not visible to e1. (To be clear, it would be wrong to write Expr{tmp} for the shape of the first expression.)
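The claim about tmp can be tested by using a variable named tmp at the use site. This is a small sketch based on the tmp-based definition; the use-site binding and the name tmp-result are assumptions made for illustration.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ()
         (define tmp e1)
         (if tmp
             (let ([x tmp]) e2)
             #f))]))

;; Both argument expressions mention tmp. Because of hygiene they refer
;; to the use-site tmp (5), not to the tmp defined inside the macro.
(define tmp-result
  (let ([tmp 5])
    (my-and-let x tmp (+ x tmp))))
```

Here tmp-result is 10: x is bound to the use-site tmp, and the body adds the same use-site tmp to it.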
Exercise 5: Generalize my-and-let to my-if-let, which takes an extra expression argument which is the macro’s result if the condition is false. The macro should have the following shape:
;; (my-if-let x:Id Expr Expr{x} Expr) : Expr
Double-check your solution to make sure it follows the scoping specified by the shape.
3.5 Expressions, Types, and Contracts
(my-match-pair (list 1 2 3) n ns (< n (length ns))) ; expect #t
(my-match-pair (list 'p "hello world") tag content
  (format "<~a>~a</~a>" tag (string-join content " ") tag)) ; expect "<p>hello world</p>"
;; (my-match-pair Expr x:Id xs:Id Expr{x,xs}) : Expr
(define-syntax my-match-pair
  (syntax-parser
    [(_ pair:expr x:id xs:id result:expr)
     #'(let ([pair-v pair])
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))
Note that we introduce a temporary variable (or auxiliary variable) named pair-v to avoid evaluating the pair expression twice.
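The effect of the temporary variable can be observed with an argument expression that has a side effect. In this sketch, make-pair! and calls are hypothetical names introduced only to count evaluations.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; (my-match-pair Expr x:Id xs:Id Expr{x,xs}) : Expr
(define-syntax my-match-pair
  (syntax-parser
    [(_ pair:expr x:id xs:id result:expr)
     #'(let ([pair-v pair])
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))

;; Hypothetical helper: counts how many times it is called.
(define calls 0)
(define (make-pair!)
  (set! calls (add1 calls))
  (cons 1 '(2 3)))

;; Both car and cdr are taken from pair-v, so make-pair! runs only once.
(define result (my-match-pair (make-pair!) n ns (list n ns)))
```

After this runs, calls is 1. Had the template used (car pair) and (cdr pair) directly, the helper would have been called twice.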
;; (my-match-pair Expr[(cons T1 T2)] x:Id xs:Id Expr{x:T1,xs:T2}) : Expr
;; (my-match-pair Expr[(cons T1 T2)] x:Id xs:Id Expr{x:T1,xs:T2}[R]) : Expr[R]
When I say “type” here, I’m not talking about Typed Racket or some other typed language implemented in Racket, nor do I mean that there’s a super-secret type checker hidden somewhere in Racket next to a flight simulator. By “type” I mean a semi-formal, unchecked description of expressions and macros that manipulate them. In this case, the shape declaration for my-match-pair warns the user that the first argument must produce a pair. If it doesn’t, the user has failed their obligations, and the macro may do bad things.
Of course, given human limitations, we would prefer the macro not to do bad things. Ideally, the macro definition and macro uses could be statically checked for compliance with shape declarations, but Racket does not implement such a checker for macros. (It’s complicated.) At least, though, the macro can enforce approximations of the types of expression arguments using contracts.
Use the expr/c syntax class for a pattern variable whose shape is Expr[Type] when Type has a useful contract approximation. In this example, the type (cons T1 T2) has a useful contract approximation pair?, but there is no useful contract for the type R. The expr/c syntax class takes an argument, so you cannot use the : notation; you must use ~var or #:declare instead. The argument is a syntax object representing the contract to apply to the expression. (It is #'pair? instead of pair? because the contract check is performed at run time.) In the syntax template, use the c ("contracted") attribute of the pattern variable to get the expression with a contract-checking wrapper. Here’s the contract-checked version of the macro:
; (my-match-pair Expr[(cons T1 T2)] x:Id xs:Id Expr{x:T1,xs:T2}[R]) : Expr[R]
(define-syntax my-match-pair
  (syntax-parser
    [(_ (~var pair (expr/c #'pair?)) x:id xs:id result:expr)
     #'(let ([pair-v pair.c]) ; Important: pair.c, not pair
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))
(define-syntax my-match-pair
  (syntax-parser
    [(_ pair x:id xs:id result:expr)
     #:declare pair (expr/c #'pair?)
     #'(let ([pair-v pair.c]) ; Important: pair.c, not pair
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))
> (my-match-pair 'not-a-pair n ns (void))
my-match-pair: contract violation
expected: pair?
given: 'not-a-pair
in: pair?
macro argument contract
contract from: top-level
blaming: top-level
(assuming the contract is correct)
at: eval:32:0
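Because the check happens at run time, a bad argument does not prevent the program from compiling; the violation only appears when the code runs. The following sketch illustrates this by catching the exception. The names ok and violation-caught? are assumptions made for illustration.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

(define-syntax my-match-pair
  (syntax-parser
    [(_ (~var pair (expr/c #'pair?)) x:id xs:id result:expr)
     #'(let ([pair-v pair.c])
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))

;; A pair argument satisfies the contract; the macro behaves as before.
(define ok (my-match-pair (list 1 2 3) n ns (+ n (length ns))))

;; A non-pair argument still compiles. The violation is raised when the
;; code runs, as an exn:fail:contract exception.
(define violation-caught?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (my-match-pair 'not-a-pair n ns (void))
    #f))
```

Here ok is 3 and violation-caught? is #t.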
Exercise 6: Modify the my-when macro to check that the condition expression produces a boolean value. (Note: this is not idiomatic for Racket conditional macros).
3.6 Uses of Expressions
It can use the value (or values) that the expression evaluates to. For example, the behavior of the my-when macro depends on the value that its first expression produces.
It can determine whether the expression is evaluated or when the expression is evaluated. The my-when example determines whether to evaluate its second expression. The standard delay macro is a classic example of controlling when an expression is evaluated.
It can change what dynamic context the expression is evaluated within. For example, a macro could use parameterize to evaluate the expression in a context with different values for some parameters.
It can change the static context the expression is evaluated within. Mainly, this means putting the expression in the scope of additional bindings, as we did in my-and-let and my-match-pair.
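As a small illustration of controlling when an expression is evaluated, here is a sketch of a hypothetical macro my-later that wraps its argument in a procedure. It is a simplified stand-in for illustration, not the standard delay macro (for one thing, it does not cache the result).

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; (my-later Expr) : Expr
;; Hypothetical macro: the expression runs only when the resulting
;; procedure is called.
(define-syntax my-later
  (syntax-parser
    [(_ e:expr)
     #'(lambda () e)]))

(define events '())
(define later
  (my-later (begin (set! events (cons 'ran events)) 42)))

(define events-before events) ; the expression has not run yet
(define value (later))        ; now it runs
(define events-after events)
```

Here events-before is the empty list, value is 42, and events-after is the list containing ran.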
There are some restrictions on what macros can do and should do with expressions:
A macro cannot get the value of the expression at compile time. The expression represents computation that will occur later, at run time, perhaps on different machines, perhaps many times with different values in the run-time environment. A macro can only interact with an expression’s value by producing code to process the value at run time.
A macro must not look at the contents of the expression itself. Expressions are macro-extensible, so there is no grammar to guide case analysis. Interpreting expressions is the macro expander’s business, so don’t try it yourself. The macro expander is complicated, and if you attempt to duplicate its work “just a little”, you are likely to make unjustified assumptions and get it wrong. For example, an expression consisting of a self-quoting datum is not necessarily a constant, or even free of side effects; it might have a nonstandard #%datum binding, which could give it any behavior at all. Likewise, a plain identifier is not necessarily a variable reference; it might be an identifier macro, or it might have a nonstandard #%top binding.
In later sections (FIXME-REF), we’ll talk about how to cooperate with the macro expander to do case analysis of expressions and other forms.
In general, a macro should not duplicate an argument expression. That is, the expression should occur exactly once in the macro’s expansion. Duplicating expressions leads to expanding the same code multiple times, which can lead to slow compilation and bloated compiled code. The increases to both time and code size are potentially exponential, if duplicated expressions themselves contain macros that duplicate expressions and so on.
If you need to refer to an expression’s value multiple times, bind it to a temporary variable. If you need to evaluate the same expression multiple times, then bind a temporary variable to a thunk containing the expression and then apply the thunk multiple times.
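Both techniques can be sketched with two hypothetical macros: my-square uses its argument's value twice, and my-twice evaluates its argument twice. In each expansion the argument expression occurs exactly once. The helper bump! and the variable counter are also assumptions, used only to count evaluations.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; (my-square Expr) : Expr
;; Hypothetical macro: uses the value twice, so it binds the value to a
;; temporary variable.
(define-syntax my-square
  (syntax-parser
    [(_ e:expr)
     #'(let ([v e])
         (* v v))]))

;; (my-twice Expr) : Expr
;; Hypothetical macro: evaluates the expression twice, so it binds a
;; temporary variable to a procedure and applies it twice.
(define-syntax my-twice
  (syntax-parser
    [(_ e:expr)
     #'(let ([run (lambda () e)])
         (run)
         (run))]))

(define counter 0)
(define (bump!)
  (set! counter (add1 counter))
  counter)

(define squared (my-square (bump!))) ; bump! runs once
(define twice (my-twice (bump!)))    ; bump! runs twice
```

After this runs, squared is 1, twice is 3, and counter is 3.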
One exception to this rule is if the macro knows that the expression is “simple”, like a variable reference or quoted constant, because the macro is private and all of its use sites can be inspected. We’ll discuss this case in Helper Macros and Simple Expressions.
In general, a macro should evaluate expressions in the same order that they appear (that is, “left to right”), unless it has a reason to do otherwise.
In Racket, information generally flows from left to right, and the interpretation of later terms can depend on earlier terms. For example, my-when uses the value of its first (that is, left-most) expression argument to decide whether to evaluate its second (that is, right-most) expression. It would be non-idiomatic syntax design to put the condition expression second and the result expression first.
Similarly, the scope of an identifier is generally somewhere to the right of the identifier itself. For example, in my-match-pair, the identifiers are in scope in the following expression. If we swapped my-match-pair’s expressions, so it had the shape (my-match-pair Expr{x,xs} x:Id xs:Id Expr), that would not be idiomatic.
The same principles apply to Body terms as well.