2 Terms and Shapes

This section introduces terminology for talking about the pieces of Racket programs and their interpretation. In particular, it introduces the idea of shapes, which we will use as the specification language that drives macro design and organizes implementation strategies.

2.1 Terms

Consider the following Racket code:

(define (map f xs)
  (cond [(pair? xs)
         (cons (f (car xs)) (map f (cdr xs)))]
        [(null? xs) '()]))

The code is a tree of terms. A term is, roughly, an atom or a parenthesized group of terms. So all of the following are terms:
  • define, map, xs, pair? More specifically, these are identifier terms.

  • (pair? xs), (f (car xs)), [(null? xs) '()] More specifically, these are list terms.

  • '() This is also a list term, because it is read as (quote ()).

The following is not a term:
  • map f That’s two terms.

The following are also terms that occur in the program above, even though it might not be immediately apparent:
  • (f xs) Because (map f xs) is the same as (map . (f xs)), which is also the same as (map . (f . (xs . ()))).

  • quote Because it’s a subterm of '(), which is the same as (quote ()).
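These equivalences can be checked directly on quoted data. The following is only a small sketch; the names plain, dotted, full, tail, and read-quoted are made up for illustration.

```racket
#lang racket/base

;; A list term is built from nested pairs, so these three spellings
;; are read as the same datum.
(define plain  '(map f xs))
(define dotted '(map . (f xs)))
(define full   '(map . (f . (xs . ()))))

(define same? (and (equal? plain dotted) (equal? plain full)))

;; The tail of the outermost pair is the term (f xs).
(define tail (cdr plain))

;; The reader turns '() into the two-element list (quote ()),
;; so quote is a subterm of it.
(define read-quoted (read (open-input-string "'()")))
```

Here same? is #t, tail is the list (f xs), and read-quoted is a list whose first element is the symbol quote.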

Here are some other terms that don’t appear in the program above:
  • #t, 5, #e1e3, "racket-lang.org", #(1 2 3), #s(point 3 4), #:unless, #rx"[01]+" A boolean term, two number terms, a string term, and so on.

Racket represents terms using syntax objects, a kind of value.

It will be helpful to keep the two levels separate (term vs value representation), but that’s hard, because we don’t have enough distinct terms (err, I mean words) to name everything. In some cases, the context should either make the usage clear or make the distinction moot. In some cases, I’ll disambiguate by saying, for example, identifier term vs identifier value.
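To make the value level concrete, here is a minimal sketch that builds syntax objects for two of the terms above and inspects them. The names id-stx and list-stx are invented for this example; the functions used are the standard ones from racket/base.

```racket
#lang racket/base

;; #'term produces a syntax object (a value) representing the term.
(define id-stx   #'xs)          ; an identifier value
(define list-stx #'(pair? xs))  ; a syntax object for a list term

;; Both are syntax objects, but only the first is an identifier.
(define both-syntax? (and (syntax? id-stx) (syntax? list-stx)))
(define id-is-identifier?   (identifier? id-stx))
(define list-is-identifier? (identifier? list-stx))

;; An identifier value wraps a symbol; it is not itself a symbol.
(define wrapped-symbol (syntax-e id-stx))

;; syntax->datum strips the syntax-object wrappers all the way down.
(define stripped (syntax->datum list-stx))

;; The subterms of a list term are represented by syntax objects too.
(define parts-are-syntax? (andmap syntax? (syntax->list list-stx)))
```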

2.2 Interpretations of Terms

What is an expression?

The concept of “expression” doesn’t simply refer to some subset of terms. Any term can be an expression, given the right context. And a term might be an expression when used in one place but not when used in another. Is the identifier f an expression? In the example code above, the first occurrence of f is not an expression, but the second and third occurrences are expressions. It depends on context — that is, where the term appears in the code. The term f isn’t an expression when it occurs in the function definition’s formal parameter list, but it is an expression when it occurs in operator position of an application. What is an “application”? Well, it’s a kind of expression — and so we have to keep looking outward to figure out what’s going on.

Here’s the reasoning for the second and third occurrences of f being expressions: The example is a use of define, and the rule for define is that the body is an expression (that’s an oversimplification, actually). The body is a cond expression, and a cond expression’s arguments are “clauses”, which are not expressions themselves, but consist of two expressions grouped together (again, oversimplified). The second expression of the first cond clause is a function call to cons, so its arguments are expressions. The first argument, (f (car xs)), is a function call (because f is bound as a variable), so the f in operator position is an expression. The second argument, (map f (cdr xs)), is also a function call, so its argument f is an expression too. And that’s how we know, starting from the top.

Of course, if we wrap quote around the whole thing, then all of that reasoning is invalidated, because the argument of a quote expression is not interpreted as a definition or expression.
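The sketch below puts the two situations side by side. I’ve renamed the function to my-map (a name chosen only for this example) so that it doesn’t shadow Racket’s own map, and as-code and as-data are likewise made-up names.

```racket
#lang racket/base

;; A renamed copy of the example, with each occurrence of f annotated.
(define (my-map f xs)                  ; 1st f: formal parameter, not an expression
  (cond [(pair? xs)
         (cons (f (car xs))            ; 2nd f: operator position, an expression
               (my-map f (cdr xs)))]   ; 3rd f: function argument, an expression
        [(null? xs) '()]))

;; Here the term is interpreted as a definition, so my-map can be called.
(define as-code (my-map add1 '(1 2 3)))

;; Wrapped in quote, the same term is only data: nothing is defined,
;; and no occurrence of f is an expression.
(define as-data
  '(define (my-map f xs)
     (cond [(pair? xs)
            (cons (f (car xs)) (my-map f (cdr xs)))]
           [(null? xs) '()])))
```

In the quoted version, (car as-data) is just the symbol define.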

So “expression” doesn’t refer to a subset of terms (decidable or not). But that doesn’t mean that it isn’t an important concept. Rather, “expression” describes an interpretation or intended usage of a term. Here are names for the main interpretations that are handled by Racket’s macro expander:
  • expression or expression term Used in an expression position, like the test of an if or an argument to a function.

  • body term Used as one element of a lambda body, let body, etc. A “body” is also called an “internal definition context”.

  • module-level term Used as one element of a module body or submodule body.

  • top-level term Used at the top level, for example at the REPL or in a call to eval.

The word form is used to identify a variant of expression, module-level term, etc. The concept of “variant” usually coincides with the leading identifier of the term. For example: if is an expression form; provide is a module-level form but not a top-level form, whereas require is allowed both as a module-level form and as a top-level form.

The word form can also refer to the entire term, as in “(require racket/list) is a module-level form”.

The word definition refers to a subset of the body forms, roughly. In fact, we could say that a body term is either an expression or a definition.
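Here is a small module, offered only as an illustrative sketch, with comments marking which interpretation applies to each term. The function second-or-default and the name top-level-result are invented for this example.

```racket
#lang racket/base

(require racket/list)                    ; module-level form

(define (second-or-default xs default)   ; module-level term (a definition)
  (define n (length xs))                 ; body term (an internal definition)
  (if (>= n 2)                           ; body term that is an expression;
      (second xs)                        ;   (>= n 2), (second xs), and default
      default))                          ;   are in expression positions

(provide second-or-default)              ; module-level form, not a top-level form

;; The argument to eval is interpreted as a top-level term
;; in the namespace created by make-base-namespace.
(define top-level-result
  (eval '(+ 1 2) (make-base-namespace)))
```

With these definitions, (second-or-default '(a b c) 'none) produces b, and (second-or-default '(a) 'none) produces none.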

2.3 Shapes

When we design a macro, the intended interpretation of an argument can be as important as, or more important than, the set of terms allowed for that argument. To usefully describe macros and the ways they treat their arguments, we need to talk about both of these aspects. We’ll do that with a semi-formal description language of shapes.

A shape has two aspects:
  • the set of terms belonging to the shape, and

  • the interpretation or intended usage of the terms of that shape.

Different basic shapes place different degrees of emphasis on these two aspects.

A shape is not the same thing as a syntax pattern, although there is generally a correspondence between shapes and patterns. In particular, we’ll implement basic shapes using syntax classes. A syntax class checks terms for membership in the shape’s set of terms, and it can compute attributes related to the interpretation of the shape. But a syntax class cannot always check every aspect of a shape’s interpretation; for example, a syntax class cannot verify that we use a term in an expression position in the code that we generate. That obligation stays with the macro writer.
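As a rough preview, here is a minimal sketch of that division of labor using syntax/parse. The syntax class binding-pair and the macro my-let are hypothetical names invented for this example; they are not the definitions developed in later sections.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

(begin-for-syntax
  ;; A hypothetical shape: a variable paired with an expression.
  ;; The syntax class checks membership (an identifier followed by a
  ;; term accepted by expr) and exposes var and rhs as attributes.
  (define-syntax-class binding-pair
    #:attributes (var rhs)
    (pattern [var:id rhs:expr])))

;; A simplified let. The syntax class cannot check how rhs is used;
;; it is this template that places each rhs in an expression position
;; (as an argument of the generated function call).
(define-syntax (my-let stx)
  (syntax-parse stx
    [(_ (b:binding-pair ...) body:expr)
     #'((lambda (b.var ...) body) b.rhs ...)]))

(define result (my-let ([x 1] [y 2]) (+ x y)))
```

If the template instead put b.rhs inside a quote, the syntax class would still accept the same terms, but the macro would no longer respect the interpretation that the shape promises.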

The following sections introduce different shapes and show how they affect the design and implementation of macros that use them.