define vs attach
Racket’s macro system gives macros two ways of associating static (or compile-time) information with a particular name. One way is to define the name as the static information; the other is to attach the information to the already-defined name.
Much of the power of Racket macros comes from their ability to create and manipulate novel kinds of static information [MTWT]. For example, a struct definition associates the struct name with a static record containing the names (identifiers) of the struct’s functions. One client macros is match, which uses this information to transform a struct pattern into code that uses the struct’s predicate and accessors to recognize and destructure instances.
(struct point (x y)) |
(match v |
[(point x y) ....] |
....) |
⇒ |
(if (point? v) |
(let ([x (point-x v)] [y (point-y v)]) ....) |
(match v ....)) |
Another client is struct-out:
(struct point (x y)) |
(provide (struct-out point)) |
⇒ |
(provide point point? point-x point-y struct:point) |
Structs provide an interface to their static information (racket/struct-info), so programmers can create additional client macros.
The define approach
Racket enables this kind of association through an extension to define-syntax that allows the right-hand side to be any (compile-time) value, not just a macro transformer. In fact, in Racket any name bound via define-syntax to a function is treated as a macro. Likewise, a struct type is just a name bound via define-syntax to a static struct record. The code that struct produces to define point looks something like this:
(struct point (x y)) |
⇒ |
(begin |
(define-values (make-point point? point-x point-y) ....) |
(define-syntax point |
(make-struct-info .... #'make-point #'point? |
(list #'point-x #'point-y) |
...))) |
The code is similar for signatures and match expanders and sequence syntaxes and so on. Client macros fetch static information using the syntax-local-value function.
The define approach seems sensible as long as a name has a
single kind of information associated with it, but what if we want a
name to carry multiple kinds of static information? For example,
suppose we have an ORM library with a macro that defines struct types
that also carry information about their persistent storage. Or what if
we want a name to carry static information but also act as a value?
For example, a struct name carries the static information discussed
above, but it also acts as a constructor—
I will refer to an independent kind of information attachable to a name as a sense of information or of meaning. Note that a name having multiple senses of information is different from what some languages call “multiple namespaces.” Racket has a single “namespace” in that sense: all of the information attached to a name has the same scope. Consider the following code:
The lambda expression’s binding of point shadows not only the constructor meaning of point but also its struct-name meaning (used by match and struct-out).
The way a name can carry multiple senses of information is if its (compile-time) value is appropriate for representing each kind of information. Whether that is possible depends on the representations chosen for each sense. If one sense of information is represented as a vector and another as a list, then there is no way for one name to encompass both. But if both kinds of information are represented via generic interfaces (or, at a lower level, struct type properties), then we can simply create a new struct type that implements all of the required interfaces.
In particular, a name’s behavior as an expression is controlled by the
prop:procedure interface. So a struct name, for example, has
a compile-time value (created by make-struct-info) that
implements both a specialized struct-type information interface and
the prop:procedure interface. The first interface serves
clients such as match and struct-out, and the second
interface allows struct names to be used as constructors—
The attach approach
The problem with the definition approach, however, is that it relies
on all senses of a name to be known and available when the name is
defined. No more can be added later. Consider again the example of
match patterns. There are a number of built-in pattern
forms—
If the set of names to give new meaning to is small, fixed, and known
when the client macro is defined—
In general, the code to attach information to the sense sense of an identifier name will look something like the following:
(begin-for-syntax |
(sense-attach! #'name sense-information)) |
where sense-attach! is backed by a free-identifier table. The begin-for-syntax form is necessary to update the information in the correct phase and to make the information “persistent” by performing it every time the enclosing module is visited, not just once when the enclosing module is compiled [YWIW].
Using an identifier table relaxes the constraints imposed by the
definition-based mechanism: new meanings can be attached to a name
after its definition, and a name can carry many senses of meaning
without conflicts—
Typed Racket [LL], for example, uses the attach approach
for its type environment—
For Typed Racket, one benefit of attach over define is that the information survives macro expansion. Typed Racket works by fully expanding a module body, then analyzing (type checking) and transforming the code (to protect exports from untyped contexts, for example). Fully expanded code contains no macros or other syntax bindings, though, so typed variables cannot be represented as expression macros, and if they were bound to some other kinds of static information, they wouldn’t be usable as variables. Storing type information in a separate static table solves the problem.
There are drawbacks to the attach approach, though. It buys
flexibility in the form of (limited) mutation, and that has some
standard costs. The attach approach separates the definition of
a name from the point where some kind of static information is added
to the name, and there are no inherent constraints on when that
information can be added—
A bad answer would be to pick an arbitrary answer (perhaps the
“latest” information attached for that sense). That corresponds to
blindly updating the table, no matter if it already contained an entry
for the name in question. A better answer would be to raise an error
if multiple sites try to attach information to the same name. That
would be a “fail early” approach to conflicts. Another answer might
be to change the state associated with the name to some
“incompatible” indicator and only signal an error if that sense of
the name is actually used in the program—
In many cases, the best approach is to avoid the problem by strictly limiting the ways in which information can be attached to names. For example, a library might internally represent information via a compile-time table, so that it can attach meaning to a few existing names, but only export a form that follows a define-like discipline.
(define-sense name ....sense-info....) |
⇒ |
(begin (define-syntax name |
(lambda (stx) ....meaning-as-expr....)) |
(begin-for-syntax |
(sense-attach! #'name ....sense-info....))) |
Or it might mark the general attach facility as for power users or extension builders only.
Both define and attach approaches work with local bindings too, although the code produced for attach must be adapted: local bindings have no need for cross-module persistence, and begin-for-syntax cannot be used except at module level anyway. The attaching form should produce the following code instead:
(define-syntaxes () |
(begin |
(sense-attach! #'name ....sense-info....) |
(values))) |
In fact, this is the essense of what begin-for-syntax means for expressions anyway, so the same code can be used for both module contexts and local (definition) contexts.
Bibliography
[LL] | Sam Tobin-Hochstadt and Vincent St-Amour and Ryan Culpepper and Matthew Flatt and Matthias Felleisen, “Languages as libraries,” Programming Language Design and Implementation (PLDI), 2011. http://www.ccs.neu.edu/racket/pubs/pldi11-thacff.pdf | |
[MTWT] | Matthew Flatt, Ryan Culpepper, David Darais, and Robert Bruce Findler, “Macros that work together,” Journal of Functional Programming 22, 2012. http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=8573399 | |
[YWIW] | Matthew Flatt, “Composable and compilable macros: you want it when?,” International Conference on Functional Programming (ICFP), 2002. http://www.cs.utah.edu/plt/publications/macromod.pdf |