3 Basic Shapes

This section introduces the most important basic shapes for macro design.

3.1 The Expr (Expression) Shape

The Expr shape represents the intention to interpret the term as a Racket expression by putting it in an expression context. In general, a macro cannot check a term and decide whether it is a valid expression; only the Racket macro expander can do that. As a pragmatic approximation, the Expr shape and its associated expr syntax class exclude only keyword terms, like #:when, so that macros can detect and report misuses of keyword arguments.

As an example, let’s implement my-when, a simple version of Racket’s when form. It takes two expressions; the first is the condition, and the second is the result to be evaluated only if the condition is true. Here is the shape:

;; (my-when Expr Expr) : Expr

Here are some examples:

(my-when (odd? 5) (printf "odd!\n")) ; expect print
(my-when (even? 5) (printf "even!\n")) ; expect no print

Here’s the implementation:

(define-syntax my-when
  (syntax-parser
    [(_ condition:expr result:expr)
     #'(if condition result (void))]))

We use the expr syntax class to annotate pattern variables that have the Expr shape. Note that the names of the pattern variables do not include the :expr annotation, so in the syntax template we simply write condition and result.

To test the macro, we rephrase the previous examples as tests:

> (check-equal? (with-output-to-string
                  (lambda () (my-when (odd? 5) (printf "odd!\n"))))
                "odd!\n")
> (check-equal? (with-output-to-string
                  (lambda () (my-when (even? 5) (printf "even!\n"))))
                "")
> (check-exn exn:fail:syntax?
             (lambda ()
               (convert-syntax-error
                (my-when #:truth "verity"))))
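
The keyword-only exclusion performed by expr can also be observed outside of any macro. Here is a minimal sketch that runs syntax-parse at run time on quoted syntax; the helper name expr-shaped? is hypothetical and is not part of syntax/parse.

```racket
#lang racket/base
(require syntax/parse)

;; expr-shaped? : Syntax -> Boolean
;; Hypothetical helper: reports whether a term matches the expr syntax class.
(define (expr-shaped? stx)
  (syntax-parse stx
    [e:expr #t]
    [_ #f]))

(expr-shaped? #'(odd? 5))        ; an ordinary expression term is accepted
(expr-shaped? #'#:when)          ; a keyword term is rejected
(expr-shaped? #'(define ns 1))   ; accepted: expr cannot tell this is a definition
```

The last case shows why expr is only an approximation of the Expr shape: the definition is accepted by the syntax class and is rejected only later, when the macro expander processes it in an expression context.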

Exercise 1: Each of the following uses of my-when violates its declared shape:
(my-when #:true "verity")
(my-when 'ok (define ns '(1 2 3)) (length ns))
(my-when (odd? 1) (begin (define one 1) (+ one one)))
(my-when #f (+ #:one #:two))
Why? Which examples are rejected by the my-when macro itself, and what happens to the other examples? What difference does it make if you remove the expr syntax class annotations from the macro definition?

Exercise 2: Design a macro my-unless like my-when, except that it negates the condition.

Exercise 3: Design a macro catch-output that takes a single expression argument. The expression is evaluated, but its result is ignored; instead, the result of the macro is a string containing all of the output written by the expression. For example:
(catch-output (for ([i 10]) (printf "~s" i))) ; expect "0123456789"

3.2 The Body Shape

The Body shape is like Expr except that it indicates that the term will be used in a body context, so definitions are allowed in addition to expressions.

There is no distinct syntax class for Body; just use expr.

In practice, the Body shape is usually used with ellipses; see Compound Shapes. But we can make a version of my-when that takes a single Body term, even though it isn’t idiomatic Racket syntax. Here is the shape:

;; (my-when Expr Body) : Expr

Here is an example allowed by the new shape but not by the previous shape:

(define n 37)
(my-when (odd? n)
  (begin (define q (quotient n 2)) (printf "q = ~s\n" q)))

Given the new shape, the previous implementation would be wrong, since it does not place its second argument in a body context. Here is an updated implementation:

(define-syntax my-when
  (syntax-parser
    [(_ condition:expr result-body:expr)
     #'(if condition (block result-body) (void))]))

That is, use (block ␣) to wrap a Body so it can be used in a strict Expr position. It is also common to use a (let () ␣) wrapper, but that does not work for all Body terms; it requires that the Body term ends with an expression. The block form is more flexible.
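
To see the difference in flexibility, here is a small self-contained sketch of the block-based my-when. It assumes block comes from the racket/block library; the names r1, r2, and unused are only for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse)
         racket/block)

;; (my-when Expr Body) : Expr
(define-syntax my-when
  (syntax-parser
    [(_ condition:expr result-body:expr)
     #'(if condition (block result-body) (void))]))

(define n 37)

;; A Body term that ends with an expression: the result is that expression's value.
(define r1
  (my-when (odd? n)
    (begin (define q (quotient n 2)) (* q 2))))

;; A Body term that is only a definition: block accepts it and produces void.
;; With a (let () ...) wrapper this use would be a syntax error instead.
(define r2
  (my-when (odd? n)
    (define unused 'x)))
```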

Racket’s #%expression form is useful in the opposite situation. It has the following shape:

;; (#%expression Expr) : Body

That is, use (#%expression ␣) to turn a Body position into a strict Expr position.
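
As a small illustration, the following sketch wraps an expression and a definition in #%expression inside a body context. It assumes convert-syntax-error from syntax/macro-testing to turn the compile-time error into a run-time exception; the function names ok and rejected? are hypothetical.

```racket
#lang racket/base
(require syntax/macro-testing)

;; An expression wrapped in #%expression is fine in a body context.
(define (ok)
  (let ()
    (#%expression (+ 1 2))))

;; A definition wrapped in #%expression is rejected by the macro expander,
;; because the wrapped term is expanded in an expression context.
(define (rejected?)
  (with-handlers ([exn:fail:syntax? (lambda (e) #t)])
    (convert-syntax-error
     (let ()
       (#%expression (define q 1))
       'done))
    #f))
```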

Exercise 4: Check your solution to Exercise 3; does the macro also accept Body terms like the one above? That is, does the following work?
(catch-output (begin (define q (quotient n 2)) (printf "q = ~s\n" q)))
If so, “fix it” (that is, make it more restrictive) using #%expression.

3.3 Proper Lexical Scoping, Part 2

Here is one solution to Exercise 3 using with-output-to-string:

; (catch-output Expr) : Expr
(define-syntax catch-output
  (syntax-parser
    [(_ e:expr)
     #'(with-output-to-string (lambda () (#%expression e)))]))

Racket already provides with-output-to-string from the racket/port library, but if it did not, we could define it as follows:

; with-output-to-string : (-> Any) -> String
(define (with-output-to-string proc)
  (let ([out (open-output-string)])
    (parameterize ((current-output-port out))
      (proc))
    (get-output-string out)))

Here is another implementation of catch-output, which essentially inlines the definition of with-output-to-string into the macro template:

; (catch-output Expr) : Expr
(define-syntax catch-output
  (syntax-parser
    [(_ e:expr)
     #'(let ([out (open-output-string)])
         (parameterize ((current-output-port out))
           (#%expression e))
         (get-output-string out))]))

In Proper Lexical Scoping we saw that we cannot interfere with a macro’s “free variables” by shadowing them at the macro use site. For example, the following attempt to capture the macro’s reference to get-output-string fails:

> (let ([get-output-string (lambda (p) "pwned!")])
    (catch-output (printf "doing just fine, actually")))
"doing just fine, actually"

But what about the other direction? The macro introduces a binding of a variable named out; could this binding capture references to out in the expression given to the macro? Here is an example:

> (let ([out "Aisle 24"])
    (catch-output (printf "The exit is located at ~a." out)))
"The exit is located at Aisle 24."

The result shows that the macro’s out binding does not interfere with the use-site’s out variable. We say that the catch-output macro is “hygienic”.

A macro is hygienic if it follows these two lexical scoping principles:

- A use-site binding does not capture a definition-site reference.
- A definition-site binding does not capture a use-site reference.

Racket macros are hygienic by default. In FIXME-REF we will discuss a few situations when it is useful to break hygiene.
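
For contrast, here is a sketch of what the second principle rules out. The first definition writes out by hand what a naive textual substitution of the template would produce, so the inner out really does shadow the outer one; the second uses the hygienic macro. The names naive-result and hygienic-result are only for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; Hand-written "naive expansion": ordinary code, so ordinary shadowing applies.
;; The use-site reference to out now sees the string port, not "Aisle 24".
(define naive-result
  (let ([out "Aisle 24"])
    (let ([out (open-output-string)])
      (parameterize ([current-output-port out])
        (printf "The exit is located at ~a." out))
      (get-output-string out))))

;; (catch-output Expr) : Expr
(define-syntax catch-output
  (syntax-parser
    [(_ e:expr)
     #'(let ([out (open-output-string)])
         (parameterize ([current-output-port out])
           (#%expression e))
         (get-output-string out))]))

;; The macro-introduced out does not capture the use-site out.
(define hygienic-result
  (let ([out "Aisle 24"])
    (catch-output (printf "The exit is located at ~a." out))))
```

In the hand-written version the message contains the printed representation of the port rather than the aisle number, which is the capture that hygiene prevents.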

3.4 The Id (Identifier) Shape

The Id shape contains all identifier terms.

The Id shape usually implies that the identifier will be used as the name for a variable, macro, or other sort of binding. In that case, we say the identifier is used as a binder.

Use the id syntax class for pattern variables whose shape is Id.

Let’s write a macro my-and-let that acts like and with two expressions but binds the result of the first expression to the given identifier before evaluating the second expression. Here is the shape:

;; (my-and-let Id Expr Expr) : Expr

Here are some examples:

(define ls '((a 1) (b 2) (c 3)))
(my-and-let entry (assoc 'b ls) (cadr entry)) ; expect 2
(my-and-let entry (assoc 'z ls) (cadr entry)) ; expect #f

Here is an implementation:

(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ([x e1]) (if x e2 #f))]))

The main point of my-and-let, though, is that if the second expression is evaluated, it is evaluated in an environment where the identifier is bound to the value of the first expression. Let’s put that information in the shape of my-and-let. It requires two changes:

- Label the identifier so we can refer to it later. So instead of Id, we write x:Id. The label does not have to be the same as the name of the pattern variable, but it makes sense to use the same name here.
- Add an environment annotation to the second Expr indicating that it’s in the scope of a variable whose name is whatever actual identifier x refers to: Expr{x}.

Here is the updated shape for my-and-let:

;; (my-and-let x:Id Expr Expr{x}) : Expr

We can check the implementation: e1 does not occur in the scope of x, and e2 does occur in the scope of x.

Here is another implementation:

; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ()
         (define x e1)    ; BAD
         (if x e2 #f))]))

This implementation is wrong, because e1 occurs in the scope of x, but it should not.
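
The difference is observable when the first expression mentions a variable with the same name as the binder. In the sketch below, the define-based version is renamed my-and-let/bad (a hypothetical name) so that both versions can live in one module; good-use and bad-use are likewise only for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ([x e1]) (if x e2 #f))]))

;; Same shape declared, but e1 ends up in the scope of x.
(define-syntax my-and-let/bad
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ()
         (define x e1)    ; BAD
         (if x e2 #f))]))

;; e1 is (add1 n) and refers to the outer n, as the shape promises.
(define (good-use)
  (let ([n 10])
    (my-and-let n (add1 n) (* 2 n))))

;; e1 refers to the inner n, which is not yet initialized when e1 runs,
;; so calling bad-use raises an exception.
(define (bad-use)
  (let ([n 10])
    (my-and-let/bad n (add1 n) (* 2 n))))
```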

Here is another version:

; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ()
         (define tmp e1)
         (if tmp (let ([x tmp]) e2) #f))]))

This implementation is good (although more complicated than necessary), because e1 no longer occurs in the scope of x. But what about tmp? Because of hygiene, the definition of tmp introduced by the macro is not visible to e1. (To be clear, it would be wrong to write Expr{tmp} for the shape of the first expression.)
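
Here is a short check of that claim, using a use site that has its own variable named tmp. The name result is only for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; (my-and-let x:Id Expr Expr{x}) : Expr
(define-syntax my-and-let
  (syntax-parser
    [(_ x:id e1:expr e2:expr)
     #'(let ()
         (define tmp e1)
         (if tmp (let ([x tmp]) e2) #f))]))

;; Both use-site references to tmp see the use-site binding (5),
;; not the tmp that the macro defines.
(define result
  (let ([tmp 5])
    (my-and-let x (+ tmp 1) (+ x tmp))))
```

Here x is bound to 6, and the second expression adds the use-site tmp to it.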

Exercise 5: Generalize my-and-let to my-if-let, which takes an extra expression argument which is the macro’s result if the condition is false. The macro should have the following shape:
;; (my-if-let x:Id Expr Expr{x} Expr) : Expr
Double-check your solution to make sure it follows the scoping specified by the shape.

3.5 Expressions, Types, and Contracts

Let’s design the macro my-match-pair, which takes an expression to destructure, two identifiers to bind as variables, and a result expression. Here are some examples:

(my-match-pair (list 1 2 3) n ns (< n (length ns)))
; expect #t
(my-match-pair (list 'p "hello world") tag content
  (format "<~a>~a</~a>" tag (string-join content " ") tag))
; expect "<p>hello world</p>"

Here is one shape we could write for my-match-pair:

;; (my-match-pair Expr x:Id xs:Id Expr{x,xs}) : Expr

Here’s an implementation:

(define-syntax my-match-pair
  (syntax-parser
    [(_ pair:expr x:id xs:id result:expr)
     #'(let ([pair-v pair])
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))

Note that we introduce a temporary variable (or auxiliary variable) named pair-v to avoid evaluating the pair expression twice.
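
The effect of the temporary variable can be checked with a pair expression that has a side effect. This is a sketch; the counter calls and the function make-pair! are hypothetical names used only for the demonstration.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; (my-match-pair Expr x:Id xs:Id Expr{x,xs}) : Expr
(define-syntax my-match-pair
  (syntax-parser
    [(_ pair:expr x:id xs:id result:expr)
     #'(let ([pair-v pair])
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))

;; Counts how many times the pair expression is evaluated.
(define calls 0)
(define (make-pair!)
  (set! calls (add1 calls))
  (cons 'head '(tail)))

(define result
  (my-match-pair (make-pair!) x xs (list x xs)))
```

If the template used pair directly in both the car and cdr positions, make-pair! would be called twice.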

We could add more information to the shape. The macro expects the first argument to be a pair, and whatever types of values the pair contains become the types of the identifiers:

;; (my-match-pair Expr[(cons T1 T2)] x:Id xs:Id Expr{x:T1,xs:T2}) : Expr

I’ve written Expr[(cons T1 T2)] for the shape of expressions of type (cons T1 T2), where the type (cons T1 T2) is the type of all pairs (values made with the cons constructor) whose first component has type T1 and whose second component has type T2. The second expression’s environment annotation includes the types of the variables. This macro shape is polymorphic; there is an implicit forall (T1, T2) at the beginning of the declaration.

The result of the macro is the result of the second expression, so the type of the macro is the same as the type of the second expression. We could add that to the shape too:

;; (my-match-pair Expr[(cons T1 T2)] x:Id xs:Id Expr{x:T1,xs:T2}[R]) : Expr[R]

Now the second Expr has both an environment annotation and a type annotation.

When I say “type” here, I’m not talking about Typed Racket or some other typed language implemented in Racket, nor do I mean that there’s a super-secret type checker hidden somewhere in Racket next to a flight simulator. By “type” I mean a semi-formal, unchecked description of expressions and macros that manipulate them. In this case, the shape declaration for my-match-pair warns the user that the first argument must produce a pair. If it doesn’t, the user has failed their obligations, and the macro may do bad things.

Of course, given human limitations, we would prefer the macro not to do bad things. Ideally, the macro definition and macro uses could be statically checked for compliance with shape declarations, but Racket does not implement such a checker for macros. (It’s complicated.) At least, though, the macro can enforce approximations of the types of expression arguments using contracts.

Use the expr/c syntax class for a pattern variable whose shape is Expr[Type] when Type has a useful contract approximation. In this example, the type (cons T1 T2) has a useful contract approximation pair?, but there is no useful contract for the type R. The expr/c syntax class takes an argument, so you cannot use the : notation; you must use ~var or #:declare instead. The argument is a syntax object representing the contract to apply to the expression. (It is #'pair? instead of pair? because the contract check is performed at run time.) In the syntax template, use the c ("contracted") attribute of the pattern variable to get the expression with a contract-checking wrapper. Here’s the contract-checked version of the macro:

; (my-match-pair Expr[(cons T1 T2)] x:Id xs:Id Expr{x:T1,xs:T2}[R]) : Expr[R]
(define-syntax my-match-pair
  (syntax-parser
    [(_ (~var pair (expr/c #'pair?)) x:id xs:id result:expr)
     #'(let ([pair-v pair.c])   ; Important: pair.c, not pair
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))

Here’s the implementation using #:declare instead of ~var:

(define-syntax my-match-pair
  (syntax-parser
    [(_ pair x:id xs:id result:expr)
     #:declare pair (expr/c #'pair?)
     #'(let ([pair-v pair.c])   ; Important: pair.c, not pair
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))

Now calling my-match-pair raises a contract violation if the first expression does not produce a pair. For example:

> (my-match-pair 'not-a-pair n ns (void))
my-match-pair: contract violation
  expected: pair?
  given: 'not-a-pair
  in: pair?
      macro argument contract
  contract from: top-level
  blaming: top-level
   (assuming the contract is correct)
  at: eval:32:0
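
As with my-when, the interaction can be rephrased as tests. The following sketch assumes rackunit and checks both a compliant use and the contract violation.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse)
         rackunit)

;; (my-match-pair Expr[(cons T1 T2)] x:Id xs:Id Expr{x:T1,xs:T2}[R]) : Expr[R]
(define-syntax my-match-pair
  (syntax-parser
    [(_ pair x:id xs:id result:expr)
     #:declare pair (expr/c #'pair?)
     #'(let ([pair-v pair.c])   ; pair.c carries the contract check
         (let ([x (car pair-v)]
               [xs (cdr pair-v)])
           result))]))

;; A pair satisfies the contract, so the macro behaves as before.
(check-equal? (my-match-pair (cons 1 2) a b (+ a b)) 3)

;; A non-pair is caught by the contract at run time.
(check-exn exn:fail:contract?
           (lambda () (my-match-pair 'not-a-pair n ns (void))))
```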

Exercise 6: Modify the my-when macro to check that the condition expression produces a boolean value. (Note: this is not idiomatic for Racket conditional macros.)

3.6 Uses of Expressions

In general, what can a macro do with an expression (Expr)?

- It can use the value (or values) that the expression evaluates to. For example, the behavior of the my-when macro depends on the value that its first expression produces.
- It can determine whether the expression is evaluated or when the expression is evaluated. The my-when example determines whether to evaluate its second expression. The standard delay macro is a classic example of controlling when an expression is evaluated.
- It can change what dynamic context the expression is evaluated within. For example, a macro could use parameterize to evaluate the expression in a context with different values for some parameters.
- It can change the static context the expression is evaluated within. Mainly, this means putting the expression in the scope of additional bindings, as we did in my-and-let and my-match-pair.

There are some restrictions on what macros can do and should do with expressions:

- A macro cannot get the value of the expression at compile time. The expression represents computation that will occur later, at run time, perhaps on different machines, perhaps many times with different values in the run-time environment. A macro can only interact with an expression’s value by producing code to process the value at run time.
- A macro must not look at the contents of the expression itself. Expressions are macro-extensible, so there is no grammar to guide case analysis. Interpreting expressions is the macro expander’s business, so don’t try it yourself. The macro expander is complicated, and if you attempt to duplicate its work “just a little”, you are likely to make unjustified assumptions and get it wrong. For example, an expression consisting of a self-quoting datum is not necessarily a constant, or even free of side effects; it might have a nonstandard #%datum binding, which could give it any behavior at all. Likewise, a plain identifier is not necessarily a variable reference; it might be an identifier macro, or it might have a nonstandard #%top binding.
  In later sections (FIXME-REF), we’ll talk about how to cooperate with the macro expander to do case analysis of expressions and other forms.
- In general, a macro should not duplicate an argument expression. That is, the expression should occur exactly once in the macro’s expansion. Duplicating expressions leads to expanding the same code multiple times, which can lead to slow compilation and bloated compiled code. The increases to both time and code size are potentially exponential, if duplicated expressions themselves contain macros that duplicate expressions and so on.
  If you need to refer to an expression’s value multiple times, bind it to a temporary variable. If you need to evaluate the same expression multiple times, then bind a temporary variable to a thunk containing the expression and then apply the thunk multiple times.
  One exception to this rule is if the macro knows that the expression is “simple”, like a variable reference or quoted constant, because the macro is private and all of its use sites can be inspected. We’ll discuss this case in Helper Macros and Simple Expressions.
- In general, a macro should evaluate expressions in the same order that they appear (that is, “left to right”), unless it has a reason to do otherwise.
  In Racket, information generally flows from left to right, and the interpretation of later terms can depend on earlier terms. For example, my-when uses the value of its first (that is, left-most) expression argument to decide whether to evaluate its second (that is, right-most) expression. It would be non-idiomatic syntax design to put the condition expression second and the result expression first.
  Similarly, the scope of an identifier is generally somewhere to the right of the identifier itself. For example, in my-match-pair, the identifiers are in scope in the following expression. If we swapped my-match-pair’s expressions, so it had the shape (my-match-pair Expr{x,xs} x:Id xs:Id Expr), that would not be idiomatic.

The same principles apply to Body terms as well.
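
The advice about evaluating an expression several times without duplicating it can be sketched as follows. The macro do-twice is hypothetical and exists only to illustrate the thunk technique; the expression e occurs exactly once in the expansion.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; (do-twice Expr) : Expr
;; Evaluates the expression twice and returns the second result.
(define-syntax do-twice
  (syntax-parser
    [(_ e:expr)
     #'(let ([thunk (lambda () (#%expression e))])
         (thunk)
         (thunk))]))

(define counter 0)

;; The expression runs twice, but its code is expanded and compiled once.
(define result
  (do-twice (begin (set! counter (add1 counter)) counter)))
```

Writing (begin e e) in the template instead would give the same run-time behavior here, but it would expand e twice, which is the duplication the rule above warns against.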
