1 Basic Macrology

Ryan Culpepper
and Claire Alvis

This chapter introduces simple Racket macros and gives some advice for writing them.

1.1 Your First Macro

Suppose we wanted a feature, assert, that takes an expression and evaluates it, raising an error that includes the expression text if it does not evaluate to a true value. The result of the assert expression itself is (void).

Clearly, assert cannot be a function; a function cannot access the text of its arguments. It must be a macro.

We can use define-syntax-rule to define simple macros. A define-syntax-rule definition consists of two parts: a pattern and a template. The pattern describes what uses of the macro look like, and the template is used to construct the term that the macro is rewritten to. The pattern must start with the macro name; other identifiers occurring in the pattern are called pattern variables, and the terms that they match from the macro use are substituted into the template when the macro is rewritten. The pattern variables are essentially the “arguments” of the macro.

The first step is to write down an example of what a use of the macro should look like and what code that macro use should correspond to. Sometimes you can get away with a broad sketch of an example; other times—especially if you get stuck—it is helpful to be concrete.

The assert macro should be used like this:

(assert expression)

For example, say we have a list ls, and we wish to check whether the length of the list is greater than or equal to one:

(assert (>= (length ls) 1))

This use should behave as if we had written

(unless (>= (length ls) 1)
  (error 'assert "assertion failed: (>= (length ls) 1)"))

We can’t (yet) actually make the macro create the exact string above, but we can instead make it generate code that produces the right error at run time, with the help of the quote form and error’s built-in formatting capabilities:

(unless (>= (length ls) 1)
  (error 'assert "assertion failed: ~s" (quote (>= (length ls) 1))))

Lesson: Don’t fixate on the exact code you first write down for the macro’s example expansion. Often, you must change it slightly to make it easier for the macro to produce.

Lesson: It’s often simpler to produce an expression that does a computation at run time than to do the computation at compile time.

So we write the macro as follows:

(define-syntax-rule (assert expr)
  (unless expr
    (error 'assert "assertion failed: ~s" (quote expr))))

Whenever the macro expander encounters a use of the macro, like this:

(assert (>= (length ls) 1))

it substitutes the argument (>= (length ls) 1) for every occurrence of the pattern variable expr in the macro definition’s template—including the occurrence within the quote form:

(unless (>= (length ls) 1)
  (error 'assert "assertion failed: ~s"
         '(>= (length ls) 1)))
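
To see the expansion at work, here is a small self-contained sketch of the macro inside a module. The names ls and msg are illustrative only, and the with-handlers wrapper is just one way to inspect the error message without halting the program.

```racket
#lang racket/base

(define-syntax-rule (assert expr)
  (unless expr
    (error 'assert "assertion failed: ~s" (quote expr))))

(define ls '(1 2 3))

;; The condition holds, so this use returns (void) silently.
(assert (>= (length ls) 1))

;; The condition fails; capture the message instead of stopping.
(define msg
  (with-handlers ([exn:fail? exn-message])
    (assert (>= (length ls) 5))))
;; msg is "assert: assertion failed: (>= (length ls) 5)"
```

The message contains the text of the argument expression, which a function could not have recovered from its argument's value.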

Exercise 1: Write a macro noisy-v1 that takes an expression expr and prints "evaluating expr\n" before evaluating the expression. The result of the macro should be the result of the expression. (Hint: use begin.)

Exercise 2: Write a macro noisy-v2 that takes an expression expr and prints "evaluating expr ..." before evaluating the expression and "done\n" afterwards. The result of the macro should be the result of the expression. (Hint: use begin0.)

1.2 Basic Macro Facts

A macro is a rewrite rule attached to an identifier, called the macro name, and it only rewrites terms matching its pattern, which must be a parenthesized term starting with the macro name. A macro cannot be used to define arbitrary rewriting rules over existing syntactic forms; for example, the following transformation is not a macro:

(if (not e1) e2 e3) ⇒ (if e1 e3 e2)

On the other hand, such a transformation could be implemented in the if syntactic form when it is defined. Later in this guide, we will discuss how to extend the power of macros to more general transformations.

Not every term in a program matching a macro’s pattern is rewritten (“expanded”). Macros are rewritten only in certain contexts, called expansion contexts—essentially, contexts where expressions or definitions may appear. For example, if assert is the macro defined above, then the following occurrences of assert are not uses of the macro:

(let ((assert (> 1 2))) ___)
(cond [assert (odd? 4)] [else ___])
'(assert #f)

The first occurrence of assert is in a let-binding; assert is interpreted as a variable name to bind to the value of (> 1 2). In the second line, the cond form treats assert and (odd? 4) as separate expressions, and the use of assert by itself is a syntax error (the use does not match assert’s pattern). In the final example, the assert occurs as part of a quoted constant.
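
Two of these non-uses can be checked directly. The sketch below repeats the assert definition so that it runs on its own; shadowed and quoted are hypothetical names chosen for illustration.

```racket
#lang racket/base

(define-syntax-rule (assert expr)
  (unless expr
    (error 'assert "assertion failed: ~s" (quote expr))))

;; Binding position: here assert is an ordinary variable name,
;; and the body refers to that variable, not to the macro.
(define shadowed
  (let ([assert (> 1 2)])
    assert))                   ; => #f

;; Quoted constant: the list is data and is never expanded.
(define quoted '(assert #f))   ; a two-element list
```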

Macros are expanded “outermost-first,” in contrast to nested function calls, which are evaluated “innermost-first.” The outermost-first expansion order is necessary because the macro expander only knows the expansion contexts of primitive syntactic forms; it must expand away the outer macros so that it knows what inner terms need to be expanded.

1.3 Auxiliary Variables and Hygiene

Suppose we want a macro my-or2 that expects two expressions e1 and e2. If e1 produces a true value, it returns that value; otherwise, it returns the value of e2. Here is a first attempt at defining my-or2:

(define-syntax-rule (my-or2 e1 e2)
  (if e1 e1 e2))

If e1 is a simple expression like #t, this definition will work just fine, but if e1 is a complex expression involving macros, the macro expander must expand it twice. If it evaluates to a true value, the evaluation happens twice. Worse, if it contains side-effects, the side-effects are performed twice.

Lesson: A macro template should contain at most one reference to an expression argument.

So instead we should evaluate e1 first and save the result in a temporary variable x:

(define-syntax-rule (my-or2 e1 e2)
  (let ([x e1])
    (if x x e2)))

If we try this macro, it seems to work as we expect:

> (my-or2 #f #f)
#f
> (my-or2 #t #f)
#t
> (my-or2 5 7)
5
> (my-or2 #t (/ 1 0))
#t

Notice that in the final example, my-or2 returns #t without evaluating (/ 1 0), which would have raised an error. In other words, my-or2 “short-circuits” the evaluation of its second argument.
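
The difference between the two definitions can be observed with a side-effecting argument. In this sketch the first attempt is renamed my-or2-v0 so both versions can coexist; that name, the counter, and tick! are assumptions made for the example.

```racket
#lang racket/base

;; First attempt: e1 appears twice in the template.
(define-syntax-rule (my-or2-v0 e1 e2)
  (if e1 e1 e2))

;; Revised version: e1 is evaluated once and its value is saved.
(define-syntax-rule (my-or2 e1 e2)
  (let ([x e1])
    (if x x e2)))

(define count 0)
(define (tick!)
  (set! count (add1 count))
  count)

(define v0-result (my-or2-v0 (tick!) 'unused)) ; (tick!) runs twice
(define v0-count count)                        ; => 2

(set! count 0)
(define v1-result (my-or2 (tick!) 'unused))    ; (tick!) runs once
(define v1-count count)                        ; => 1
```

With my-or2-v0 the result is the value of the second call to tick!, not the value that was tested, which is a second reason to avoid duplicating the argument.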

One cause for concern is the use of x as an auxiliary variable. Might this use of x interfere with a use of x in the expressions we give to my-or2?

For example, consider this use of the macro:

> (let ([x 5])
    (my-or2 (even? x) (odd? x)))
#t

That makes sense; 5 is certainly either even or odd.

If we expand my-or2 by hand, however, we might expect to get the following:

> (let ([x 5])
    (let ([x (even? x)])
      (if x x (odd? x))))
odd?: contract violation
  expected: integer
  given: #f

But when we use the macro, it—surprisingly—behaves exactly as we wanted!

> (let ([x 5])
    (my-or2 (even? x) (odd? x)))
#t

That is, the occurrence of x in (odd? x) refers to the binding of x around the use of the macro, not the binding introduced by its expansion.

This property of Racket macros is called hygiene. Instead of the “naive” expansion we wrote above, the Racket macro expander actually produces something like the following:

(let ([x 5])
  (let ([x_1 (even? x)])
    (if x_1 x_1 (odd? x))))

The macro expander distinguishes identifiers introduced by a macro and keeps them “separate” from identifiers given to the macro in its arguments. If an introduced identifier is used in a binding position, it does not capture identifiers of the same name in the macro’s arguments.

Similarly, references introduced by the macro are not captured by bindings of the same name in the context of the macro’s use. In the example above, the references to the let and if syntactic forms refer to the let and if bindings in scope at the macro definition site, regardless of what those names mean at the macro use site. You might not have thought of let and if as names that could be shadowed, but Racket uses the same binding rules for both variables and names of syntactic forms.

We will discuss the actual mechanism the macro expander uses to enforce hygiene later in the guide.

Lesson: An identifier that is part of a macro template neither captures references in a macro argument, if the identifier is used as a binder, nor is it captured by bindings in the environment where the macro is used, if the identifier is used as a reference.
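
Both halves of this lesson can be checked with my-or2. The sketch below is only an illustration: no-capture and no-shadowing are hypothetical names, and the local rebinding of if is deliberately perverse.

```racket
#lang racket/base

(define-syntax-rule (my-or2 e1 e2)
  (let ([x e1])
    (if x x e2)))

;; The x introduced by the template does not capture the x
;; that appears in the macro's arguments.
(define no-capture
  (let ([x 5])
    (my-or2 (even? x) (odd? x))))   ; => #t

;; The if in the template still means the if that was in scope
;; where my-or2 was defined, even though if is rebound here.
(define no-shadowing
  (let ([if (lambda (a b c) 'shadowed)])
    (my-or2 #f 'ok)))               ; => 'ok
```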

1.4 Binding Forms

One of the most powerful and unique capabilities of macros is the ability to create new binding forms—macros that evaluate expressions in an environment extended with additional bindings.

A binding form accepts identifiers to bind, in addition to expressions, as arguments. (Hygiene prevents only identifiers present as literals in the macro template from binding references in macro arguments—binders that come from macro arguments do bind references in other macro arguments.)

For example, suppose we want a macro andlet1, which takes an identifier (binder) and two expressions. The first expression is evaluated in the environment of the macro use, without extensions. If it produces a true value, the second is evaluated in that environment extended with the identifier bound to the value of the first expression. In other words, the scope of the identifier is the second expression.

(define-syntax-rule (andlet1 var e1 e2)
  (let ([var e1])
    (if var e2 #f)))

By inspecting the macro template, we can see that e2 is in the scope of the let-binding of var, and e1 is not.

Note that the macro does not check that the var argument is an identifier—such syntax validation is not possible with define-syntax-rule. If the macro is given something else, it produces an invalid let expression, and let will signal a syntax error. We will discuss how to do syntax validation later in this guide.
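
A few uses make the scoping concrete. This is a sketch only; alist, found, missing, and e1-scope are names invented for the example.

```racket
#lang racket/base

(define-syntax-rule (andlet1 var e1 e2)
  (let ([var e1])
    (if var e2 #f)))

(define alist '((1 . apple) (2 . pear)))

;; p is bound to the pair that assoc finds, and e2 can use it.
(define found (andlet1 p (assoc 1 alist) (cdr p)))    ; => 'apple

;; assoc returns #f, so e2 is never evaluated.
(define missing (andlet1 p (assoc 3 alist) (cdr p)))  ; => #f

;; e1 sees the outer p; only e2 sees the new binding.
(define e1-scope
  (let ([p 'outer])
    (andlet1 p (eq? p 'outer) p)))                    ; => #t
```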

Exercise 3 [solution]: Write a macro iflet that takes an identifier and three expressions. If the first expression (the condition) evaluates to a true value, that value is bound to the identifier and the second expression (the “then branch”) is evaluated in its scope; otherwise, the third expression is evaluated outside the scope of the identifier.
(define alist '((1 . apple) (2 . pear)))
(equal? (iflet x (assoc 1 alist) (cdr x) 'none) 'apple)
(equal? (let ([x 'plum]) (iflet x (assoc 3 alist) (cdr x) x)) 'plum)

Lesson: When designing a binding form, write tests that check that expressions have the right variables in scope—and don’t have the wrong variables in scope.

1.5 Changing an Expression’s Dynamic Context

So far, we’ve seen a few things that a macro can do with an expression argument. It can use its value (as in assert); it can turn it into a datum using quote; it can extend the environment the expression is evaluated in; and it can decide whether or not to evaluate it (as in the short-circuiting my-or2).

Another thing a macro can do is affect the dynamic context an expression is evaluated in. For now, we’ll use parameters as the primary example of dynamic context, but others include threads, continuation marks, and dynamic-wind.

For example, consider a macro that evaluates its argument expression, throws away the value, and returns a string representing all of the output generated during the expression’s evaluation.

(define-syntax-rule (capture-output expr)
  (let ([out (open-output-string)])
    (parameterize ((current-output-port out))
      expr
      (get-output-string out))))

> (printf "hello world!")
hello world!
> (capture-output (printf "hello world!"))
"hello world!"

Exercise 4: Write a macro forever that takes an expression and evaluates it repeatedly in a loop.

Exercise 5: Write a macro handle that takes two expressions. It should evaluate the first expression, and if it returns a value, the result of the macro use is that value. If the first expression raises an exception, it evaluates the second expression and returns its result. Use with-handlers.
(equal? (handle 5 6) 5)
(equal? (handle (/ 1 0) 'whoops) 'whoops)

1.6 Keep Macros Simple

Recall capture-output from Changing an Expression’s Dynamic Context. The only reason it needs to be a macro at all is to delay the evaluation of its argument until it can place it in the proper context. Another way to implement capture-output is for the macro itself to only delay the evaluation of the expression—by turning it into a procedure—and rely on a function that implements the dynamic behavior.

(define-syntax-rule (capture-output e)
  (capture-output-fun (lambda () e)))

; capture-output-fun : (-> Any) -> String
(define (capture-output-fun thunk)
  (let ([out (open-output-string)])
    (parameterize ((current-output-port out))
      (thunk)
      (get-output-string out))))

The benefit of factoring out the dynamic part is twofold: it minimizes the size of the expanded code, and it allows you to test the helper function capture-output-fun.

Lesson: Keep the code introduced by a macro to a minimum. Use helper functions to implement complex dynamic behavior.
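
One practical consequence is that the helper can be exercised as an ordinary function, with no macro involved. The sketch below assumes the definitions given in this section; direct and via-macro are hypothetical names.

```racket
#lang racket/base

;; capture-output-fun : (-> Any) -> String
(define (capture-output-fun thunk)
  (let ([out (open-output-string)])
    (parameterize ([current-output-port out])
      (thunk)
      (get-output-string out))))

(define-syntax-rule (capture-output e)
  (capture-output-fun (lambda () e)))

;; Calling the helper directly with a hand-written thunk.
(define direct (capture-output-fun (lambda () (display "hi"))))

;; Going through the macro gives the same result.
(define via-macro (capture-output (display "hi")))
```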

Exercise 6: Rewrite the handle and forever macros so that the dynamic behavior is implemented by a function.

Exercise 7 [solution]: Rewrite the andlet1 macro so that the dynamic behavior is implemented by a function. What happens to the identifier argument? (Note: this macro is so simple that there is no benefit to creating a separate function to handle it. Do it anyway; it’s an instructive example.)

Exercise 8: Write a macro test that behaves similarly to test-equal? from rackunit. It takes three expressions: a test name (string) expression, an “actual” expression, and an “expected” expression. If the expressions evaluate to equal? values, then it prints nothing and returns (void); if the expressions evaluate to values that are not equal, then it prints "test test-name failed\n"; and if either expression raises an error, it prints "test test-name failed\n".

Exercise 9: Every macro you’ve written (or rewritten) in this section produces a function applied to one or more expressions with lambda wrapped around them. In which cases is the macro worthwhile, and in which cases would it be better to define a function instead and let users write the lambdas themselves?

Lesson: Not every macro that can be written needs to be written.

1.7 Ellipsis Patterns and Templates

Let us continue with the capture-output macro, and let us suppose that we want to extend it to take multiple expressions, which should be evaluated in order, and the output of all of them captured and combined.

The macro system we are using gives us a convenient way to represent a sequence of arbitrarily many expressions. We indicate that a macro accepts an arbitrary number of arguments with a ... in the pattern after the pattern variable. Then, in the template, we use ... after the code that contains the pattern variable—in this case, just the pattern variable itself.

(define-syntax-rule (capture-output e ...)
  (capture-output-fun (lambda () e ...)))

> (capture-output
    (displayln "I am the eggman")
    (displayln "They are the eggmen")
    (displayln "I am the walrus"))
"I am the eggman\nThey are the eggmen\nI am the walrus\n"

Unfortunately, this implementation breaks when capture-output is called with no arguments.

> (capture-output)
eval:17:0: lambda: bad syntax
in: (lambda ())

When capture-output is called with no arguments, it produces a lambda with an empty body, which is illegal. One way to fix this is to insert a final expression within the lambda:

(define-syntax-rule (capture-output e ...)
  (capture-output-fun (lambda () e ... (void))))

> (capture-output)
""

Alternatively, the macro-writer might require capture-output be called with at least one argument. Then, the (capture-output) call will give a more meaningful error message, as seen below.

(define-syntax-rule (capture-output e1 e2 ...)
  (capture-output-fun (lambda () e1 e2 ...)))

> (capture-output)
eval:23:0: capture-output: use does not match pattern:
(capture-output e1 e2 ...)
in: (capture-output)

Yet another option is to wrap each expression individually and pass a list of functions to the auxiliary function:

(define-syntax-rule (capture-output e ...)
  (capture-output-fun (list (lambda () e) ...)))

(define (capture-output-fun thunks)
  (let ([out (open-output-string)])
    (parameterize ((current-output-port out))
      (for ([thunk (in-list thunks)]) (thunk))
      (get-output-string out))))

Lesson: With ellipses, keep the empty case in mind, and make sure its expansion is legal.
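
As a quick check of the empty case, here is a sketch using the list-of-thunks variant, where zero arguments expand to (capture-output-fun (list)). The names none and two are illustrative.

```racket
#lang racket/base

(define (capture-output-fun thunks)
  (let ([out (open-output-string)])
    (parameterize ([current-output-port out])
      (for ([thunk (in-list thunks)]) (thunk))
      (get-output-string out))))

;; Each expression is wrapped in its own thunk.
(define-syntax-rule (capture-output e ...)
  (capture-output-fun (list (lambda () e) ...)))

(define none (capture-output))                             ; => ""
(define two  (capture-output (display "a") (display "b"))) ; => "ab"
```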

Exercise 10: Write my-and and my-or macros that use ellipses to take any number of expressions. (Do not use Racket’s and or or forms in your solution.)

1.8 Ellipses with Complex Patterns

Consider a simplified version of Racket’s let form with the following syntax:

syntax
(my-let ([var-id rhs-expr] ...) body-expr)

The my-let form evaluates all of the rhs-exprs, binds them to the var-ids, and returns the value of the body-expr evaluated in the scope of all of the var-ids.

We can use ellipses after a complex pattern, not just after a simple pattern variable, as long as the components of the pattern are treated uniformly in the template:

(define-syntax-rule (my-let ([var-id rhs-expr] ...) body-expr)
  ((lambda (var-id ...) body-expr) rhs-expr ...))
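
The following sketch shows how the two ellipsis templates line up for a concrete use, and confirms that the right-hand sides are evaluated outside the scope of the new bindings. The names sum and parallel are illustrative.

```racket
#lang racket/base

(define-syntax-rule (my-let ([var-id rhs-expr] ...) body-expr)
  ((lambda (var-id ...) body-expr) rhs-expr ...))

;; Expands to ((lambda (a b) (+ a b)) 1 2).
(define sum (my-let ([a 1] [b 2]) (+ a b)))   ; => 3

;; The a in the second right-hand side is the outer a,
;; because every rhs-expr sits outside the lambda.
(define parallel
  (let ([a 10])
    (my-let ([a 1] [b a]) (list a b))))       ; => '(1 10)
```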

We can also implement my-letrec, which has the same syntax as my-let but evaluates all of the right-hand side expressions in the scope of all of the bound variables.

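(define-syntax-rule (my-letrec ([var-id rhs-expr] ...) body-expr)
  (my-let ([var-id #f] ...)
    ; my-let accepts exactly one body expression, so group the
    ; assignments and the body with begin
    (begin
      (set! var-id rhs-expr) ...
      body-expr)))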
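
Mutually recursive procedures are the usual test for a letrec-like form. In the sketch below, the assignments and the body are grouped with begin so that the template supplies the single body expression that my-let's pattern expects; ev?, od?, and result are names made up for the example.

```racket
#lang racket/base

(define-syntax-rule (my-let ([var-id rhs-expr] ...) body-expr)
  ((lambda (var-id ...) body-expr) rhs-expr ...))

(define-syntax-rule (my-letrec ([var-id rhs-expr] ...) body-expr)
  (my-let ([var-id #f] ...)
    (begin
      (set! var-id rhs-expr) ...
      body-expr)))

;; Each lambda refers to the other variable, which is in scope
;; because the set! forms run inside my-let's body.
(define result
  (my-letrec ([ev? (lambda (n) (if (zero? n) #t (od? (sub1 n))))]
              [od? (lambda (n) (if (zero? n) #f (ev? (sub1 n))))])
    (ev? 10)))   ; => #t
```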

Exercise 11: Recall that define-syntax-rule does no validation; it may be given var-id arguments that aren’t identifiers. Explore what happens when you misuse the macro. Find two expressions—misuses of my-let—that cause different syntax errors to be reported by lambda. Find a misuse of my-let that produces a syntactically valid expression that raises a run-time error when evaluated. Find a misuse of my-let that runs without error.

Exercise 12: Unlike let and letrec, let* cannot be implemented using define-syntax-rule and ellipses. Why not? Think about this before reading the next section.

Exercise 13 [solution]: Write a macro my-cond-v0, which has the pattern (my-cond-v0 [question-expr answer-expr] ...) and acts like Racket’s cond form. Hint: if the dynamic representation of an expression is a procedure, what is the dynamic representation of a my-cond-v0 clause?

1.9 Recursive Macros

Consider Racket’s let* form. We cannot implement such a macro using define-syntax-rule, because handling sequences of terms requires ellipses, and ellipses require that the components of the repeated pattern are handled uniformly in the template. The scope of each identifier bound by let* includes the body as well as every following right-hand side expression. In other words, the scope of the bound variables is non-uniform; alternatively, the environment of the right-hand side expressions is non-uniform. So we cannot implement let* with ellipses (uniform treatment) unless we already have a target form that implements that non-uniform binding structure. (If we are allowed to expand into let*, then of course implementing my-let* using define-syntax-rule is trivial.)

What would a plausible expansion of my-let* look like, then? Here’s one:

(my-let* ([x 1] [y (f x)] [z (g x y)]) (h x y z))
⇒
(let ([x 1])
  (let ([y (f x)])
    (let ([z (g x y)])
      (h x y z))))

Clearly, my-let* has a nice recursive structure. We could implement it if we were able to define recursive macros that could expand using different templates given different input patterns.

Such recursive macros can be defined with syntax-rules. A macro definition has the following form:

(define-syntax macro-id
  (syntax-rules (literal-id ...)
    [pattern template] ...))

The macro’s clauses are tried in order, and the macro is rewritten using the template that corresponds to the first pattern that matches the macro use. We do not need the literal-id list yet; for now it will be empty.

Here is the definition of my-let*:

(define-syntax my-let*
  (syntax-rules ()
    [(my-let* () body-expr)
     body-expr]
    [(my-let* ([id rhs-expr] binding ...) body-expr)
     (let ([id rhs-expr]) (my-let* (binding ...) body-expr))]))

Inspect the macro definition and confirm that in each case, the scope of one of the bound identifiers consists of the following right-hand side expressions and the body expression.
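
A short sketch confirms the sequential scoping. Each recursive step peels off one binding and wraps the rest in a let; chained and rebound are illustrative names.

```racket
#lang racket/base

(define-syntax my-let*
  (syntax-rules ()
    [(my-let* () body-expr)
     body-expr]
    [(my-let* ([id rhs-expr] binding ...) body-expr)
     (let ([id rhs-expr]) (my-let* (binding ...) body-expr))]))

;; Later right-hand sides can refer to earlier variables.
(define chained
  (my-let* ([x 1] [y (+ x 1)] [z (* x y 3)])
    (list x y z)))                          ; => '(1 2 6)

;; Rebinding the same name is allowed; the inner let shadows the outer.
(define rebound (my-let* ([x 1] [x (+ x 1)]) x))   ; => 2
```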

Exercise 14: Rewrite my-and and my-or as recursive macros.

1.10 Matching Literal Identifiers

Now let’s consider a simplified version of Racket’s cond form. Here’s the syntax:

syntax
(my-cond clause ... maybe-else-clause)

clause = [test-expr answer-expr]

maybe-else-clause =
| [else answer-expr]

The my-cond form tries each clause in order; it evaluates the test-expr and if the result is a true value, it returns the value of the corresponding answer-expr. If no test-expr evaluates to a true value, the result is the answer-expr of the else clause, if it is present, or (void) otherwise.

Note the two kinds of clauses: only the last clause of the my-cond expression can be an else clause. The empty line in the definition of the maybe-else-clause nonterminal means that the term might be absent. So our recursive macro must have two base cases.

We can recognize else by including it in the macro’s literals list; then uses of else in a pattern are not pattern variables, but instead match only other occurrences of that identifier.

(define-syntax my-cond
  (syntax-rules (else)
    [(my-cond)
     (void)]
    [(my-cond [else answer-expr])
     answer-expr]
    [(my-cond [question-expr answer-expr] clause ...)
     (if question-expr
         answer-expr
         (my-cond clause ...))]))
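
Here is a sketch that exercises all three clauses of the definition, including the case with no else clause. The function classify and the name fell-through are assumptions made for illustration.

```racket
#lang racket/base

(define-syntax my-cond
  (syntax-rules (else)
    [(my-cond)
     (void)]
    [(my-cond [else answer-expr])
     answer-expr]
    [(my-cond [question-expr answer-expr] clause ...)
     (if question-expr
         answer-expr
         (my-cond clause ...))]))

(define (classify n)
  (my-cond [(< n 0) 'negative]
           [(= n 0) 'zero]
           [else 'positive]))

;; No test succeeds and there is no else clause, so the
;; recursion bottoms out in the (my-cond) case.
(define fell-through (my-cond [#f 'never]))   ; => (void)
```

Because the else pattern comes before the general clause pattern, a final [else ...] clause is matched by the second rule rather than being treated as a test expression.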

Exercise 15: Extend my-cond with => clauses as in Racket’s cond form. Test your macro thoroughly to make sure you put the macro’s patterns in the right order. Try the clauses in a bad order and discover what happens when you use the macro.

Exercise 16: Extend my-cond so that normal clauses can have arbitrarily many answer-expr expressions. If there are no answer expressions, then the value of the question-expr is returned if it is a true value. Again, test to make sure the macro’s clauses are in the right order.

1.11 Helper Macros and Private Variables

Consider the Racket case form. Let’s write a macro my-case, which has similar syntax and behavior. Here’s the syntax of my-case:

syntax
(my-case val-expr clause ... maybe-else-clause)

clause = [(datum ...) result-expr]

maybe-else-clause =
| [else result-expr]

The my-case form evaluates val-expr and finds the first clause with a datum that is equal? to the value of val-expr; the macro’s result is the value of that clause’s result-expr. If no clause has a matching datum, the result is the else clause’s result-expr, if present, or (void) otherwise.

Here’s a first (bad) attempt at writing my-case. Take a minute and look at it before you continue reading. See if you can find the problem (or problems).

(define-syntax my-case-v0
  (syntax-rules (else)
    [(my-case-v0 val)
     (void)]
    [(my-case-v0 val [else result-expr])
     result-expr]
    [(my-case-v0 val [(datum ...) result-expr] clause ...)
     (if (member val '(datum ...))
         result-expr
         (my-case-v0 val clause ...))]))

What’s wrong with this macro?

The problem is that the first argument, val, is an expression, and the final template of the macro contains multiple references to it. The expression is duplicated. One solution would be to create a let-bound variable in the third template. That would be adequate to fix this issue.

There’s another, although less disastrous, peculiarity about this macro. If the my-case-v0 expression has no clauses (or none besides an else clause), then the value expression is not evaluated at all! But we expect that expression to always be evaluated; it is a “strict” subexpression of my-case.

In cases like these, it is useful to evaluate the strict expressions once, at the “beginning” of the macro, and store their values in private variables. If there are multiple strict expressions, syntax ergonomics suggests they should be evaluated in order. If there is validation to be done on the strict expression arguments (if a particular type is expected, for example), it should also be done at this time. The rest of the macro’s work can be done by a helper macro.

A private variable is a variable created by a macro and passed to its helper macros but not exposed to the user. In particular, it can be duplicated freely, because an expression known to be a variable cannot have side effects. In addition, if neither the macro nor any of its helpers mutates the variable, then it can be trusted to maintain its value, and not be concurrently mutated by another thread.

Here’s a fixed version of the macro. I use the suffix -pv for private variable.

(define-syntax-rule (my-case val-expr clause ...)
  (let ([v val-expr])
    (my-case* v clause ...)))

(define-syntax my-case*
  (syntax-rules (else)
    [(my-case* val-pv)
     (void)]
    [(my-case* val-pv [else result-expr])
     result-expr]
    [(my-case* val-pv [(datum ...) result-expr] clause ...)
     (if (member val-pv '(datum ...))
         result-expr
         (my-case* val-pv clause ...))]))

Note that the only way to get a private variable is to create one or to get one from a trusted source. If the helper macro, my-case*, were exported from its module, for example, then it could no longer trust that its val-pv argument was in fact a private variable.
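
The sketch below checks that the value expression is evaluated exactly once per use, even when there are no clauses. The helper recurs on my-case* so that the private variable is passed along unchanged; evaluations, next-value!, and the result names are hypothetical.

```racket
#lang racket/base

(define-syntax my-case*
  (syntax-rules (else)
    [(my-case* val-pv)
     (void)]
    [(my-case* val-pv [else result-expr])
     result-expr]
    [(my-case* val-pv [(datum ...) result-expr] clause ...)
     (if (member val-pv '(datum ...))
         result-expr
         (my-case* val-pv clause ...))]))

(define-syntax-rule (my-case val-expr clause ...)
  (let ([v val-expr])
    (my-case* v clause ...)))

;; A value expression with a visible side effect.
(define evaluations 0)
(define (next-value!)
  (set! evaluations (add1 evaluations))
  3)

(define matched
  (my-case (next-value!) [(1 2) 'low] [(3 4) 'mid] [else 'high]))
(define after-match evaluations)      ; => 1

;; With no clauses the value expression is still evaluated.
(define no-clauses (my-case (next-value!)))
(define after-empty evaluations)      ; => 2
```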

Exercise 17 (★) [solution]: Write a macro minimatch1 with the following syntax:
syntax
(minimatch1 val-expr mm-pattern result-expr)

mm-pattern = variable-id
| (quote datum)
| (cons first-mm-pattern rest-mm-pattern)
The minimatch1 macro should act like match restricted to a single clause and restricted to the pattern grammar above. The result-expr should be evaluated in the scope of all of the variable-ids in the mm-pattern. If the value produced by val-expr does not match the mm-pattern, raise an error.

Exercise 18 (★) [solution]: Write a macro minimatch with the following syntax:
syntax
(minimatch val-expr clause ...)

clause = [mm-pattern result-expr]

mm-pattern = variable-id
| (quote datum)
| (cons first-mm-pattern rest-mm-pattern)
The minimatch macro should act like match restricted to the pattern grammar above. Each mm-pattern is tried in order until one matches; then the corresponding result-expr is evaluated in the scope of all of the mm-pattern’s variable-ids. If the value produced by val-expr does not match any mm-pattern, raise an error.
Hint: Implementing pattern matching generally involves recurring through the structure of the pattern while keeping track of what to do on success as well as what to do on failure. Write the macro (and its helpers) using expressions to represent the success and failure actions.

1.12 Basic Macrology Review

Use define-syntax-rule to define simple pattern-based macros. Use syntax-rules to discriminate between different patterns and write recursive macros. Both systems implement proper lexical scoping, or hygiene. Keep in mind that neither system does syntax validation; we discuss validation in the next section.

When designing a macro, start by writing an example use and the expected expansion. You may need to adjust the expansion to make it easier for a macro to produce. Use helper functions to implement complex run-time behavior, and use helper macros to implement complex syntactic transformations.

For every expression argument to a macro, consider that expression’s static and dynamic treatment. A macro can put an expression in the scope of variable bindings; an expression’s environment is its primary form of static context. Dynamic treatment includes determining whether an expression is evaluated, the number of times it is evaluated, and the order in which it is evaluated. A macro can also evaluate an expression in a modified dynamic context, for example within an exception handler or with different parameter values installed via parameterize.
