August 28, 2010 9:44 PM CDT by psilord in category Lisp

Hooked on Phonics

I've been learning about reader macros (along with set-macro-character and get-macro-character) in Lisp recently. A reader macro is a function associated with a character, known as a macro character, in the readtable; the association may be made at load time or at runtime. (A dispatching macro character, such as #, is the special case where a second character selects the function.)

When invoked, the reader macro function is passed the stream and the character which invoked it. At this point the reader macro can consume however many characters it wants (which could be enormous!) and implement a full lexical analysis and parsing algorithm for a totally different language embedded into Lisp. However, it must return either no values or a single lisp object--which could be a list of other objects. Also, note that a reader macro should not have side effects other than on the input stream, because a front end to the reader (a rubout handler for standard input, say) may back up and call it several times on the same portion of the stream.
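Here is a minimal sketch of that contract. The names read-bracketed-list, skip-rest-of-line, and demo-reader-contract, and the choice of [ and % as macro characters, are made up for illustration and are not part of the code later in this post. One function consumes everything up to a closing bracket and returns a single object; the other consumes the rest of the line and returns no values. All changes are made to a private copy of the standard readtable.

```lisp
;; Hypothetical example names; nothing here is used elsewhere in the post.

(defun read-bracketed-list (stream ch)
  "Consume forms up to the closing ] and return ONE object: a (list ...) form."
  (declare (ignore ch))
  ;; The final T is recursive-p: we are being called from inside the reader.
  (cons 'list (read-delimited-list #\] stream t)))

(defun skip-rest-of-line (stream ch)
  "Consume the rest of the line and return NO values, like a comment."
  (declare (ignore ch))
  (read-line stream nil nil t)
  (values))

(defun demo-reader-contract ()
  ;; Work on a private copy so the global readtable is left untouched.
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character #\[ #'read-bracketed-list)
    ;; Give ] the same behavior as ) so that "3]" is not read as one token.
    (set-macro-character #\] (get-macro-character #\)))
    (set-macro-character #\% #'skip-rest-of-line)
    (values (read-from-string "[1 2 3]")
            (read-from-string (format nil "% ignored text~%42")))))

;; (demo-reader-contract) returns two values: (LIST 1 2 3) and 42
```

Because skip-rest-of-line returns no values, the reader simply carries on and hands back the next object it finds, which is how the second call yields 42.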

Examples of reader macro use would be to write a reader macro to convert a pure SQL statement into a lisp form, or convert Perl-like regular expressions into CL-PPCRE calls (such as described in the book Let over Lambda). I'm curious about them because I want to write a heavily annotating lisp tokenizer which would have much of the behavior of a lisp reader, but keeps track of everything it does and where the tokens came from.

I found while writing the code for this post that knowing the above somehow wasn't good enough. I screwed up the understanding of reader macros pretty badly for a while. It could be because I'm likely stupid, but I'm going to believe that it is because of the Absinthe.

I've included the small examples I wrote so one can see how the reader macro works. After this, I'll describe the situation which really tripped me up for a while.

;;;; You are free to use, modify, and redistribute this
;;;; code. Attribution to me, Peter Keller (psilord@cs.wisc.edu), is
;;;; appreciated, but not required.  There is no warranty with this
;;;; code.

(defun reader-iota-0 (str ch)
  "(reader-iota-0 str ch)

  A reader-macro for forms like <ch><integer> which will return 
  a list of integers from 0 to (1- <integer>). If not quoted this list
  will be evaluated."

  (declare (ignorable ch))
  ;; The final T is recursive-p, since we are called from inside the reader.
  (let ((int (read str t nil t)))
    (loop for x from 0 below int collect x)))

(defun reader-iota-1 (str ch)
  "(reader-iota-1 str ch)

  A reader-macro for forms like <ch><integer> which will return 
  a list of integers from 1 to <integer>. If not quoted this list
  will be evaluated."

  (declare (ignorable ch))
  ;; The final T is recursive-p, since we are called from inside the reader.
  (let ((int (read str t nil t)))
    (loop for x from 1 to int collect x)))

(defmacro with-macro-character ((ch func) &body body)
  "(with-macro-character (ch func) body)

  Bind the reader macro func to ch and execute the body in this
  environment.  Restore the original reader-macros when this form is
  done."

  (let ((c (gensym))
        (f (gensym))
        (o (gensym)))
    `(let ((,c ,ch)
           (,f ,func))
       (let ((,o (get-macro-character ,c)))
         (set-macro-character ,c ,f)
         (unwind-protect
              (progn ,@body)
           (set-macro-character ,c ,o))))))

(defmacro with-macro-characters (pairs &body body)
  "(with-macro-characters ((ch1 func1) (ch2 func2) ...) body)

  Bind the reader macro func1 to ch1, and so on, and execute the body
  in this environment. Restore the original reader-macros when this
  form is done."

  (if (null pairs)
      `(progn ,@body)
      (let ((pair (car pairs)))
        ;; Validate at macroexpansion time, before with-macro-character
        ;; tries to destructure the pair.
        (unless (and (consp pair) (= (length pair) 2))
          (error "with-macro-characters: ~A must be a pair of a character and a reader-macro-function" pair))
        `(with-macro-character ,pair
           (with-macro-characters ,(cdr pairs)
             ,@body)))))

(defun try-it ()
  (with-macro-characters
      ((#\! #'reader-iota-0)
       (#\@ #'reader-iota-1))
    (concatenate 'list
                 '(first)
                 (read-from-string "!10")
                 '(second)
                 (read-from-string "@10"))))

;; This next modification will take effect for the rest of the source
;; file including anything loaded after this since I'm doing it at the
;; toplevel. We use eval-when to ensure that the compiler's readtable
;; is modified while reading the source file.
;;
;; Note reversed assignment as opposed to the function try-it!
(eval-when (:compile-toplevel :load-toplevel :execute)
  (set-macro-character #\@ #'reader-iota-0)
  (set-macro-character #\! #'reader-iota-1))

(defun foo ()
  (let ((a '!10))
    (let ((b '@10))
      (concatenate 'list '(first) a '(second) b))))

The main thing I really learned was that a reader macro is only in effect while the lisp reader is active. That sounds completely obvious, right? I mean, it is even in the name of the damn thing, isn't it? However, there is one place where it is easy to misunderstand it. Lemme explain where it works and why first.

When you load a lisp file, load reads and executes each toplevel form in order. If one of them is a toplevel call to set-macro-character, it will be executed before the next form is read. The rest of the code in the lisp file, including anything loaded after that form, will have the reader macro available for use, since the toplevel set-macro-character form modified the readtable that the lisp reader is consulting while it loads the code. The function foo is a good example of this load time behavior since it occurs after the toplevel association of the reader macro functions.

Now, suppose you use the reader macro in a function, like try-it above. Here the readtable is only modified at runtime, and that only matters to code which invokes the lisp reader at runtime, say with READ, READ-PRESERVING-WHITESPACE, READ-DELIMITED-LIST, or READ-FROM-STRING. (READ-CHAR and READ-SEQUENCE fetch raw characters and never look at the macro characters in the readtable.) In try-it we invoke the reader explicitly to convert the special syntax into lisp objects.
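A small sketch of runtime reading, independent of the functions above: read-doubled and read-at-runtime are hypothetical names, and the input is assumed to be a number after the ! character. The text being read is ordinary string data until READ or READ-FROM-STRING is called on it, which is the only moment the readtable change matters.

```lisp
;; Hypothetical names for illustration only.

(defun read-doubled (stream ch)
  "Reader macro function: read the next object (assumed to be a number)
   and return twice its value."
  (declare (ignore ch))
  (* 2 (read stream t nil t)))

(defun read-at-runtime ()
  ;; Confine the change to a copy of the standard readtable.
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character #\! #'read-doubled)
    (list
     ;; The reader is invoked on a string...
     (read-from-string "!21")
     ;; ...or on any character stream, here a string stream.
     (with-input-from-string (in "(1 !4 3)")
       (read in)))))

;; (read-at-runtime) returns (42 (1 8 3))
```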

Here is the place where it doesn't intuitively work. Suppose I have this function which utilizes the reader macros defined above. It looks similar to the function foo in that you use the macro characters after you adjust the readtable, but it behaves very differently:

(defun wont-work ()
    (set-macro-character #\! #'reader-iota-0)
    (set-macro-character #\@ #'reader-iota-1)

    (let ((ret (concatenate 'list '(first) '!10 '(second) '@10)))

      (set-macro-character #\! nil)
      (set-macro-character #\@ nil)
      ret))

This won't work at all the way the layout of the code suggests. Depending on your implementation, you'll get a complaint when compiling it or an error when calling it, similar to:

The value !10 is not of type SEQUENCE.

The reason why is simply that by the time this function is executing, it has already been completely processed by a lisp reader and '!10 and '@10 are symbols at execution time. No uses of the lisp reader occur in the function wont-work so changing the readtable at runtime here is meaningless. This took me longer than it should have to fully understand. There is much more to reader macros than I have mentioned here and I'm still reading the docs some more...
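To see the difference directly, here is a sketch (read-iota and what-the-reader-sees are hypothetical names) that hands the same three characters to the reader twice: once with the standard readtable, and once after ! has been made a macro character in a private copy.

```lisp
;; Hypothetical names for illustration only.

(defun read-iota (stream ch)
  "Reader macro function: read an integer N and return the list 0 .. N-1."
  (declare (ignore ch))
  (let ((n (read stream t nil t)))
    (loop for x from 0 below n collect x)))

(defun what-the-reader-sees ()
  (let ((before
          ;; Standard syntax: ! is an ordinary constituent, so the text
          ;; "!3" is read as a symbol named "!3".
          (with-standard-io-syntax
            (read-from-string "!3")))
        (after
          ;; Same text, read after the readtable has been changed.
          (let ((*readtable* (copy-readtable nil)))
            (set-macro-character #\! #'read-iota)
            (read-from-string "!3"))))
    (values (symbolp before) after)))

;; (what-the-reader-sees) returns two values: T and (0 1 2)
```

The body of wont-work went through the first kind of read when the file was loaded, long before its calls to set-macro-character ever ran.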

I will caution that messing with reader macros, especially in a REPL, is often dangerous. If you mess it up or pick a bad character to be a macro character, say #\- or something along those lines, you'll get a pile of esoteric errors out of the corrupted lisp reader and it basically becomes useless. In these scenarios--unlike life--just exit the image, restart it, and try again.
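One way to lower the risk, sketched here under the assumption that you only need the new syntax inside one dynamic extent, is to bind *readtable* to a fresh copy and experiment on that. The names with-scratch-readtable and risky-experiment are hypothetical. (copy-readtable nil) copies the standard readtable, so the same call can also be used with setf to get standard syntax back in a REPL, provided the reader is still healthy enough to read that form.

```lisp
;; Hypothetical helper names for illustration only.

(defmacro with-scratch-readtable (&body body)
  "Run BODY with *readtable* bound to a fresh copy of the standard
   readtable, so any set-macro-character calls inside are thrown away
   when BODY exits."
  `(let ((*readtable* (copy-readtable nil)))
     ,@body))

(defun risky-experiment ()
  (with-scratch-readtable
    ;; A deliberately bad choice of macro character: minus.
    (set-macro-character #\- (lambda (stream ch)
                               (declare (ignore stream ch))
                               :minus))
    (read-from-string "-")))

;; (risky-experiment) returns :MINUS, and afterwards the readtable you
;; started with is unchanged: (get-macro-character #\-) gives the same
;; answer as before the call.
```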

End of Line.
