< Previous | Next >

Angry Unix Programmer

Segmentation fault
Debugger process has died
Sanity breaking.
August 28, 2010 9:44 PM CDT by psilord in category Lisp

Hooked on Phonics

I've been learning about reader-macros (along with set-macro-character and get-macro-character) in Lisp recently. Reader macros are functions associated with characters--known as dispatching macro characters, in the readtable where the association may happen at load or runtime.

When invoked, the reader macro function is passed the stream and the character which invoked it. At this point the reader-macro can consume however many characters it wants (which could be enormous!) and implement a full lexical analysis and parsing algorithm for a totally different language embedded into Lisp. However, it must return either no value or a single lisp object--which could a list of other objects. Also, note that a reader macro may not have side effects because it is implementation defined if it'll get called a few times on the same portion of the stream due to things like backspace being used for standard input.

Examples of reader macro use would be to write a reader macro to convert a pure SQL statement into a lisp form, or convert Perl-like regular expressions into PPCRE calls (such as described in the book Let over Lambda). I'm curious about them because I want to write a heavily annotating lisp tokenizer which would have much of the behavior of a lisp reader, but keeps track of everything it does and where the tokens came from.

I found while writing the code for this post that knowing the above somehow wasn't good enough. I screwed up the understanding of reader macros pretty badly for a while. It could be because I'm likely stupid, but I'm going to believe that it is because of the Absinthe.

I've included the small examples I wrote so one can see how the reader macro works. After this, I'll describe the situation which really tripped me up for a while.

;;;; You are free to use, modify, and redistribute this
;;;; code. Attribution to me, Peter Keller (psilord@cs.wisc.edu), is
;;;; appreciated, but not required.  There is no warranty with this
;;;; code.

(defun reader-iota-0 (str ch)
  "(reader-iota-0 str ch)

  A reader-macro for forms like <ch><integer> which will return 
  a list of integers from 0 to (1- <integer>). If not quoted this list
  will be evaluated."

  (declare (ignorable ch))
  (let ((int (read str)))
    (loop for x from 0 to (1- int) collect x)))

(defun reader-iota-1 (str ch)
  "(reader-iota-1 str ch)

  A reader-macro for forms like <ch><integer> which will return 
  a list of integers from 1 to <integer>. If not quoted this list
  will be evaluated."

  (declare (ignorable ch))
  (let ((int (read str)))
    (loop for x from 1 to int collect x)))

(defmacro with-macro-character ((ch func) &body body)
  "(with-macro-character (ch func) body)

  Bind the reader macro func to ch and execute the body in this
  environment.  Restore the original reader-macros when this form is
  done."

  (let ((c (gensym))
        (f (gensym))
        (o (gensym)))
    `(let ((,c ,ch)
           (,f ,func))
       (let ((,o (get-macro-character ,c)))
         (set-macro-character ,c ,f)
         (unwind-protect
              (progn ,@body)
           (set-macro-character ,c ,o))))))

(defmacro with-macro-characters (pairs &body body)
  "(with-macro-characters ((ch1 func1) (ch1 func2) ...) body)

  Bind the reader macro func1 to ch1, and so on, and execute the body
  in this environment. Restore the original reader-macros when this
  form is done."

  (if (null pairs)
      `(progn ,@body)
      `(if (oddp (length ',(car pairs)))
           (error "with-macro-characters: ~A must be a pair of a character and a reader-macro-function" ',(car pairs))
           (with-macro-character ,(car pairs)
             (with-macro-characters ,(cdr pairs)
               ,@body)))))

(defun try-it ()
  (with-macro-characters
      ((#\! #'reader-iota-0)
       (#\@ #'reader-iota-1))
    (concatenate 'list
                 '(first)
                 (read-from-string "!10")
                 '(second)
                 (read-from-string "@10"))))

;; This next modification will take affect for the rest of the source
;; file including anything loaded after this since I'm doing it at the
;; toplevel. We use eval-when to ensure that the compiler's readtable
;; is modified while reading the source file.
;;
;; Note reversed assignment as opposed to the function try-it!
(eval-when (:compile-toplevel :load-toplevel :execute)
  (set-macro-character #\@ #'reader-iota-0)
  (set-macro-character #\! #'reader-iota-1))

(defun foo ()
  (let ((a '!10))
    (let ((b '@10))
      (concatenate 'list '(first) a '(second) b))))

The main thing I really learned was that a reader macro only is in effect when the lisp reader is active. That sounds completely obvious right--I mean, it is even in the name of the damn thing isn't it? However, there is one place where it is easy to misunderstand it. Lemme explain where it works and why first.

When you load a lisp file, load executes each toplevel form as it encounters it in order. If there happens to be a toplevel call to set-macro-character, it will be executed. The rest of the code in the lisp file, including anything loaded after that form, will have the reader macro available for use since the toplevel set-macro-character form adjusted the currently executing lisp reader which is in the process of loading the code. The function foo is a good example of this load time behavior since it occurs after the toplevel association of the reader macro functions.

Now, suppose you use the reader macro in a function, like try-it above. At this point, the readtable is only modified at runtime. The lisp reader is only affected when it is reading, suppose with READ, READ-CHAR, READ-SEQUENCE, or READ-FROM-STRING. We invoke the reader explicitly to convert the special form into lisp.

Here is the place where it doesn't intuitively work. Suppose I have this function which utilizes the reader macros defined above. It looks similar to the function foo in that you use the dispatch macro characters after you adjust the readtable, but it behaves very differently:

(defun wont-work ()
    (set-macro-character #\! #'reader-iota-0)
    (set-macro-character #\@ #'reader-iota-1)

    (let ((ret (concatenate 'list '(first) '!10 '(second) '@10)))

      (set-macro-character #\! nil)
      (set-macro-character #\@ nil)
      ret))

This won't work at all with the implication lexically suggested in the code. In fact, it won't even compile because you'll get an error similar to:

The value !10 is not of type SEQUENCE.

The reason why is simply that by the time this function is executing, it has already been completely processed by a lisp reader and '!10 and '@10 are symbols at execution time. No uses of the lisp reader occur in the function wont-work so changing the readtable at runtime here is meaningless. This took me longer than it should have to fully understand. There is much more to reader macros than I have mentioned here and I'm still reading the docs some more...

I will caution that messing with read macros, especially in a REPL, is often dangerous. If you mess it up or pick a bad character to be a dispatch macro character, say #\- or something along those lines, you'll get a pile of esoteric errors out of the corrupted lisp reader and it basically becomes useless. In these scenarios--unlike life, just exit the image, restart it, and try again.

End of Line.

July 16, 2010 12:26 AM CDT by psilord in category Unknown Kadath

Zebra Crossing

Meh. I'm feeling black and white.

End of Line.

July 13, 2010 11:03 PM CDT by psilord in category Lisp

It Was Fun

On a whim one night, I made up a very simple language called LISP-LITE which is really a subset of C in lisp syntax form. Then I wrote a Common Lisp translator for the sexp representation of it to C. None of this is new and the technique is as old as dirt, but it was fun and had high hack value. One day I might extend it into the inside of a Common Lisp compiler in order to author C in the middle of Lisp while sharing lexical scope and whatnot. Or maybe I'll just drink some more booze and listen to more Death Metal. You never know...

Edit: I fixed the code to actually produce compilable C and emit the semicolons where they should have been emitted in the translator.

;; A late night hack cause I was bored and curious.
;; This code is under the Apache v2.0 license.
;; Peter Keller (psilord@cs.wisc.edu)
;;
;; A brain dead translator from a lisp-like language (described below) to C.
;; See Structure and Interpretation of Computer Programs for a very full
;; explanation of this idea. Lisp in Small Pieces builds entire compilers
;; based upon this idea. Homoiconicity is what makes lisp so appealing to me,
;; far more than most other languages.

;; LISP-LITE is defined thusly:
;; cfunc defines a function which must include types in the param list.
;; All function names are translated to lower case.
;; (cfunc name ((type1 var1) (type2 var2) ...) ...)
;;
;; creturn does a c return on an expression
;; (creturn expr)
;;
;; cassign assigns the value of an expression to a symbol
;; (cassign symbol value)
;;
;; clet allows one to declare new variables for later use.
;; (clet ((type1 var1) (type2 var2) ..) ...)
;;
;; cif allows a choice to be made
;; (cif cond-expr true-expr false-expr)
;;
;; cliteral allows emitting of here-doc style lisp code.
;; (cliteral "#include <stdio.h>")
;;
;; Operators + - * / < > <= >= MUST take two arguments only
;;
;; No closures or anything cool like that, just a very simple translator

;; Simple emitter function.
(defmacro emit (fmt &rest args)
  `(format t ,fmt ,@args))

;; Translate a form, and all subforms, into C.
(defun xlate (form)
  (cond
    ;; In LISP-LITE, I'm loosly defining a number. The long and the
    ;; short of it is, don't use rationals, complex numbers, etc.
    ((numberp form)
     (emit "~A" form))

    ;; All variables are lower case. You should only use valid C tokens.
    ((symbolp form)
     (emit "~(~A~)" form))

    ((stringp form)
     (emit "\"~A\"" form))

    ((equal (car form) 'cfunc)
     (apply #'xlate-cfunc (cdr form)))

    ((equal (car form) 'creturn)
     (apply #'xlate-creturn (cdr form)))

    ((intersection '(+ - * / < > == <= >=) (list (car form)))
     (xlate-op form))

    ((equal (car form) 'cassign)
     (xlate-cassign (cdr form)))

    ((equal (car form) 'clet)
     (apply #'xlate-clet (cdr form)))

    ((equal (car form) 'cfuncall)
     (apply #'xlate-cfuncall (cdr form)))

    ((equal (car form) 'cif)
     (apply #'xlate-cif (cdr form)))

    ;; Dump a here-doc-like string out which is C code.
    ;; Used when I need boilerplate code.
    ((equal (car form) 'cliteral)
     (apply #'xlate-cliteral (cdr form)))

    (t
     (emit "PARSE ERROR: ~A~%" (car form)))))

(defun xlate-cfunc (return-type name args &rest body)

  ;; emit function header
  (emit "/* Translating function ~(~A~) */~%" name)
  (emit "~(~A~) ~(~A~)" return-type name)

  ;; emit arguments, slather love on format.
  (emit "(~{~{~(~A~) ~(~A~)~}~^, ~})~%" args)

  (emit "{~%")
  ;; emit all body expressions
  (dolist (form body)
    (xlate form)
    (emit ";~%"))
  (emit "}~%"))

(defun xlate-creturn (form)
  (emit "return ")
  (xlate form))

;; (+ 1 2) === 1 + 2
(defun xlate-op (form)
  (emit "(")
  (xlate (cadr form))
  (emit " ~A " (car form))
  (xlate (caddr form))
  (emit ")"))

;; (cassign x 10) === x = 10
(defun xlate-cassign (form)
  (emit "~(~A~) = " (car form))
  (xlate (cadr form)))

(defun xlate-clet (bindings &rest body)
  ;; in a new scope
  (emit "{ /* new scope for clet */~%")

  ;; emit declarations
  (dolist (b bindings)
    (emit "~(~A~) ~(~A~);~%" (car b) (cadr b)))

  ;; emit all body expressions
  (dolist (form body)
    (xlate form)
    (emit ";~%"))

  ;; close scope
  (emit "}~%"))

(defun xlate-cfuncall (name &rest params)
  (emit "~(~A~)(" name)

  ;; This is just to get the commas right cause my format-fu isn't up to
  ;; snuff...
  (dotimes (idx (length params))
    (xlate (nth idx params))
    (when (/= idx (1- (length params)))
      (emit ", ")))

  (emit ")"))

;; We only allow one expression, like lisp, need cprogn equivalent to have a
;; translation into C.
(defun xlate-cif (conditional true false)
  (emit "if (")
  (xlate conditional)
  (emit ") {~%")
  (xlate true)
  (emit ";~%")
  (emit "} else {~%")
  (xlate false)
  (emit ";~%")
  (emit "}~%"))

(defun xlate-cliteral (form)
  (emit "/* Boilerplate code */~%")
  (emit "~A~%" form))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; This is a set of macros to test xlate.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Translate exactly one form.
(defmacro xlate-test (form)
  (let ((f (gensym)))
    `(let ((,f ,form))
       (terpri)
       (emit "--------------------------------------------------~%")
       (emit "TESTING LISP-LITE FORM~%")
       (emit "--------------------------------------------------~%")
       (emit "~S~%" ,f)
       (emit "--------------------------------------------------~%")
       (emit "TRANSLATION TO C~%")
       (emit "--------------------------------------------------~%")
       (xlate ,f))))

;; Translate a list of forms.
(defmacro xlate-forms (forms)
  (let ((fs (gensym)))
    `(let ((,fs ,forms))
       (terpri)
       (emit "--------------------------------------------------~%")
       (emit "TESTING LISP-LITE FORM~%")
       (emit "--------------------------------------------------~%")
       (emit "~S~%" ,fs)
       (emit "--------------------------------------------------~%")
       (emit "TRANSLATION TO C~%")
       (emit "--------------------------------------------------~%")
       (mapc #'xlate ,fs))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Try it out
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Simple function translation
(xlate-test
 '(cfunc int thingy ((int x) (float c))
   (creturn (+ x c))))

;; Does it do operator precedence correctly?
(xlate-test
 '(cfunc int thingy ((int x) (float c))
   (creturn (+ (- 1 3) (* x c)))))

;; assign the value (+ c 10) to x
(xlate-test
 '(cfunc int thingy ((int x) (float c))
   (cassign x (+ c 10))
   (creturn (+ (- 1 3) (* x c)))))

;; Open a new scope and use the variables.
(xlate-test
 '(cfunc int thingy ((int x) (float c))
   (clet ((int a) (float b))
    (cassign a 10)
    (cassign b 20)
    (creturn (+ x (+ c (* a b)))))))

;; And now for something completely different:
;; A real piece of lisp-lite code translated into real C!
;; Convert all blocks of lisp-lite into C
;; Fibonacci Function
(xlate-forms
 '(
   ;; Form 1 is a chunk of boiler plate C
   (cliteral
    "#include <stdio.h>
#include <stdlib.h>")

   ;; Form 2 is the Fibonacci function.
   (cfunc int fib ((int n))
    (cif (< n 2)
     (creturn n)
     (creturn
      (+ (cfuncall fib (- n 1))
         (cfuncall fib (- n 2))))))

   ;; Form 3 is the main function.
   ;; Play loosely with lisp-identifiers versus C identifiers for the type
   ;; of argv.
   (cfunc int main ((int argc) (char** argv))
    ;; Would need to handle string output a little better to handle
    ;; escaped characters.
    (cfuncall printf "output: %d\\n" (cfuncall fib 10))
    (creturn 0))))

And here is the output.

--------------------------------------------------
TESTING LISP-LITE FORM
--------------------------------------------------
(CFUNC INT THINGY ((INT X) (FLOAT C)) (CRETURN (+ X C)))
--------------------------------------------------
TRANSLATION TO C
--------------------------------------------------
/* Translating function thingy */
int thingy(int x, float c)
{
return (x + c);
}

--------------------------------------------------
TESTING LISP-LITE FORM
--------------------------------------------------
(CFUNC INT THINGY ((INT X) (FLOAT C)) (CRETURN (+ (- 1 3) (* X C))))
--------------------------------------------------
TRANSLATION TO C
--------------------------------------------------
/* Translating function thingy */
int thingy(int x, float c)
{
return ((1 - 3) + (x * c));
}

--------------------------------------------------
TESTING LISP-LITE FORM
--------------------------------------------------
(CFUNC INT THINGY ((INT X) (FLOAT C)) (CASSIGN X (+ C 10))
 (CRETURN (+ (- 1 3) (* X C))))
--------------------------------------------------
TRANSLATION TO C
--------------------------------------------------
/* Translating function thingy */
int thingy(int x, float c)
{
x = (c + 10);
return ((1 - 3) + (x * c));
}

--------------------------------------------------
TESTING LISP-LITE FORM
--------------------------------------------------
(CFUNC INT THINGY ((INT X) (FLOAT C))
 (CLET ((INT A) (FLOAT B)) (CASSIGN A 10) (CASSIGN B 20)
  (CRETURN (+ X (+ C (* A B))))))
--------------------------------------------------
TRANSLATION TO C
--------------------------------------------------
/* Translating function thingy */
int thingy(int x, float c)
{
{ /* new scope for clet */
int a;
float b;
a = 10;
b = 20;
return (x + (c + (a * b)));
}
;
}

--------------------------------------------------
TESTING LISP-LITE FORM
--------------------------------------------------
((CLITERAL "#include <stdio.h>
#include <stdlib.h>")
 (CFUNC INT FIB ((INT N))
  (CIF (< N 2) (CRETURN N)
   (CRETURN (+ (CFUNCALL FIB (- N 1)) (CFUNCALL FIB (- N 2))))))
 (CFUNC INT MAIN ((INT ARGC) (CHAR** ARGV))
  (CFUNCALL PRINTF "output: %d\\n" (CFUNCALL FIB 10)) (CRETURN 0)))
--------------------------------------------------
TRANSLATION TO C
--------------------------------------------------
/* Boilerplate code */
#include <stdio.h>
#include <stdlib.h>
/* Translating function fib */
int fib(int n)
{
if ((n < 2)) {
return n;
} else {
return (fib((n - 1)) + fib((n - 2)));
}
;
}
/* Translating function main */
int main(int argc, char** argv)
{
printf("output: %d\n", fib(10));
return 0;
}

I could have made the LISP-LITE language even closer to a subset of lisp (so close, that they could be indistinguishable), but I didn't because I was curious about the LISP-LITE syntax itself and translating it into C. I wasn't concerned about preserving the semantics of Common Lisp for this example.

Ack! I spilled my Vodka! Oh, no... never mind... it was just my dreams.

End of Line.

May 25, 2010 11:35 PM CDT by psilord in category Lisp

Stumbling Through History

I was noticing some Lisp code I was writing was getting slower and slower the more I added to it. Normally, this means one is writing idiotic codes and while that may be the problem here, I am a firm believer of denial and looked the other way.

Instead, I profiled my code using some off the shelf directions in the SBCL manual.

What I found was intriguing. Basically, I was spending inordinate amounts of time deep inside some internal function of FORMAT. This was quite surprising to me because while I was printing debugging stuff out for sure, there just wasn't enough output text to justify such profiling numbers.

After some serious thought, I traced the cause to this macro call I had written:

(defmacro emit (s fmt &rest args)                                       ; Wrong!
  `(when *debug-output*
     (format ,s (concatenate 'string "DEBUG: " ,fmt) ,@args)
     (finish-output)))

The intention of this macro was that I want to see the output when there is a valid stream, and not to see it when there is a nil stream, or the *debug-output* special variable was nil. I didn't want to keep writing the template of checking the *debug-output* variable, so I wrote the macro. This was like the tenth function I wrote when I started my project and didn't think anything of it...

It turns out after a while of writing that I had hundreds of these calls, all using a nil stream, in my source codes. I'd set the stream to t or nil in each individual emit invocation depending upon what I was testing or debugging to see output or not. Can you begin to see what is wrong? It turns out that I was calling format with a nil stream (often in tight loops) and simply throwing away the prettily formatted string right into the garbage collector. I didn't see the output and it cost me an arm and a leg to not see any output!

What I really wanted was that when nil is passed to the emit function, the debugging output expression should vanish from the code stream. When I mean vanish, I mean not even a function call there to cost me something. A good way to make something vanish is have it become a (progn) expression which the compiler will throw out since it does nothing.

Starting from that thought, here is my altered emit macro:

(defmacro emit (s fmt &rest args)
  (if (null s)
      `(progn)
      `(when *debug-output*
         (format ,s (concatenate 'string "DEBUG: " ,fmt) ,@args)
         (finish-output))))

Now, I can actually enforce the vanishing of the debugging output (and even the computation cost of the arguments!). I put the new macro into my codes, recompiled, reprofiled, and was very pleased.

EDIT: Some weeks later, after reading more about what macros are legally able to return, it dawns upon me that (progn) evaluates to NIL and I can use this to my benefit. Since the returned NIL from a macro expansion is precisely the same NIL that a (progn) could return. This means I can do this new version instead which is slightly cleaner.

(defmacro emit (s fmt &rest args)
  (unless (null s)
    `(when *debug-output*
       (format ,s (concatenate 'string "DEBUG: " ,fmt) ,@args)
       (finish-output))))

Upon reflection, it is a little sad that the defmacro must return a value since it changes the return value of this kind of code:

;; The initial code
* (let ((a 42)) 
    a)
42

;; And I put in a debugging line to print out a...
* (let ((a 42)) 
    a 
    (emit nil "a = ~A~%" a))
NIL

Oops! I'll need to remember this until I figure out how to make the (emit ...) call completely disappear. I suppose I could use conditional reading with #+ and #-, but it seems an dirty solution to type that for hundreds of locations...

End of Line.

May 22, 2010 8:45 PM CDT by psilord in category Lisp

Stay to the Left

I've been learning lisp by writing a lot of it lately and I've run into the first time I needed a real defmacro to do some work for me. The project I'm working on uses a lot of DEFSTRUCTs and so sometimes I have to manhandle a pile of structures in one piece of code. I hate typing the accessors, so I use WITH-SLOTS a lot.

Usually, it looks like this (a contrived example that isn't far from the truth):

(with-slots (a b c) x
  (with-slots (d e f) y
    (with-slots (g h i) z
      (setf a d
            e h))))

After a while, I even hated typing all of that too and how it made my code inch towards the right. I'm an old hacker; I like 80 columns for my code! Being in the process of reading "On Lisp" by Paul Graham and "Let Over Lambda" by Doug Hoyte, I figured some miniscule portion of that awesome knowledge might have wedged itself between my ears as I felt it rushing past. I thought for a bit and came up with this:

;; You are free to use, modify, and redistribute this macro. Attribution to
;; me, Peter Keller (psilord@cs.wisc.edu), is appreciated, but not required.
;; There is no warranty with this code.

(defmacro multiple-with-slots (sbinds &body body)
  (if (null sbinds)
      `(progn ,@body)
      `(with-slots ,(car sbinds) ,(cadr sbinds)
         (multiple-with-slots ,(cddr sbinds) ,@body))))

Yup, that's right, it is a recursive macro! It is probably overengineered and a real lisp professional would chuckle a bit and mention something in the ANSI spec I managed to overlook which does exactly what I wanted already--not to mention the N ways it is probably wrong. However, it is the first nontrivial macro I've written and so it will hold a special place in my heart.

You use it like this:

* ;; In a lisp REPL

* (defstruct xx a b c)
* (defstruct yy d e f)
* (setf x (make-xx :a 1 :b 2 :c 3))
* (setf y (make-yy :d 4 :e 5 :f 6))

* (multiple-with-slots
      ((a b c) x
       (d e f) y)
    (rotatef a d))

* x

#S(XX :A 4 :B 2 :C 3)
* y

#S(YY :D 1 :E 5 :F 6)

What I very much like about this is that the slot names are generalized variables which allows manipulation by SETF, ROTATEF and all manner of things like that automatically work. No different than WITH-SLOTS, but I didn't lose the behavior either, and that was important to me.

I may rewrite it a bit later to take advantage of grouping functions to rip apart the sbinds list, but for now, it is quite useful to me.

Note: It sorta sucks that everyone these days has to explicitly state some licensing aspect about their code they publish. I guess the trials and tribulations of software lawsuits have taken their toll.

Meh.

End of Line.

April 21, 2010 9:40 PM CDT by psilord in category Lisp

Coloring Inside the Lines

The topic of parenthesis paralysis arose in some circles I frequent in relation to Lisp. This is a semi-mythical notion that somehow the use of parentheses in Lisp contributes to cognitive difficulty or the seizing of thought. It is usually used as an excuse not to learn Lisp or make fun of it.

Related to this idea, a person by the name of Alessio Stalla hacked the Lisp reader to try and present a syntax that is more "newbie friendly". The actual hack is pretty cool.

His altering of the syntax of lisp made it look like what follows. I have some fundamental disagreements about the syntax--in particular the if expression consequence and alternate both being grouped by the { } characters associated with the if form. But that conversation is for another day...

with-output-to-string (str) {
  print #\c str;
  if (> foo 45) {
    progn { print 3; print 4; }
    print 5;
  }
}

This generated some discussion and I decided to extend a post I had done in Usenet into a blog entry.

I've hacked scheme for a long time and lisp for a little while. I believe there is such a thing as parenthesis-paralysis. After some careful introspection upon it over the years, I've come to a belief about how it arises and presents.

It appears to me to happen because of two concepts:

  1. Indention style
  2. Destructuring bind, as performed by a human reading code.

Indention style, especially in lisp, is extremely important and emacs/slime is the only thing that culturally "does it correctly out of the box". It does it correctly, because the cultural indention style of Lisp appears to be exactly defined as what SLIME produces!

Most editor's notion of auto-indention is that when you hit return, you align with the first non-whitespace character on the previous line OR you just indent X columns. Combine this with a regular Joe's idea of indention being TAB, and you have a disastrous situation. For example suppose my tab stops are set to 4 in the following example. Every place you see a period is where I hit tab:

(let ((thing 42)
.   (stuff 99)
.   (qux 100))
    
.   (format t "Hi~%"))

Looks like crap doesn't it?

In other block structured languages, a simple tab is sufficient for code readability, e.g., C:

void foo(void)
{
.   int thing = 42;
.   int stuff = 99;
.   int qux = 100

.   printf("Hi\n");
}

Notice, from a mental model of indention, I did exactly the same behavior. I indented the bindings, left a space, then indented the body. People use the same mental model of indention for similar languages. This realization is the key to why people entering lisp for the first time have such a hard time about it.

Let's take a look at Alessio Stalla's precipitating example. If your tab stop is set to 2, then this is exactly how you would type it using a regular editor's default notion of indention and the TAB key.

with-output-to-string (str) {
. print #\c str;
. if (> foo 45) {
. . progn { print 3; print 4; }
. . print 5;
. }
}

We can hypothesize that the author's mental model of code indention appears to follow to the default behavior of their editor and that they write in block structured imperative languages more often than not. I believe the author of the code view this indention model as "correct". I put that in quotes, since it is purely subjective.

In C, C++, PHP, perl, Java, Python, Fortran 90, etc, etc, etc, applying the same mental indention model is just fine and leads to pretty readable code. Specifically in python, it is enshrined.

In Lisp (and somewhat scheme), the default behavior of the editor and/or the mental model of indention is completely wrong. Why? Reason number 2.

Humans use visual whitespace and alignments of similar boundary markers to block off semantically related entities. This helps them to destructure the forms into semantically meaningful sections or elements.

The correctly indented lisp form for the above let form:

(let ((thing 42)
      (stuff 99)
      (qux 100))

  (format t "Hi~%"))

You can see how the parentheses of the binding forms are aligned with each other. The ( which "sticks out" on the left is the demarcation of the beginning of the binding list. The fact the (format ...) is indented to the left of the bindings, but the right of the let states the bindings are over and tells you this is the body of the let.

Another one, more complex in lisp (done with the mental model of Algol-style indention):

(do ((x 0 (1+ x))
.   (y 10 (1- y))
.   (z 0 (1+ z)))
.   ((zerop y) (+ x z))

.   (format t "Hi!~%"))

That is terrible, since the test condition vanishes into the step forms. Get rid of the newline in the middle and it is a train wreck. The problem is that this is how editors want to do it by default! I've seen the modern generation of programmers lately. They use pico, notepad, gedit, or kde/gnome stuff (mere toy code editors to me) to write their code and they don't even realize there are others to choose from that are much better. The editors all do this ham fisted auto-indent behavior and can't really be taught otherwise. I can't imagine this while writing lisp. I would go insane. (Arguably, I may be insane, but that's for the courts to figure out.)

Here is the correctly indented do form. Notice how the "nubs" of the left side parentheses on the do visual block allow the eye to disambiguate the step and test forms.

(do ((x 0 (1+ x))
     (y 10 (1- y))
     (z 0 (1+ z)))
    ((zerop y) (+ x z))

  (format t "Hi~%"))

The crux of the indention issue is that culturally, lisp doesn't indent at the same quanta of indention as other languages over the various semantic forms of the language. Sometimes you need 1/4 of an indention, or 1/2 of one, or a one space difference. Programmers, when presented with such a problem, spend their time laboriously aligning (instead of code writing) parenthesis (often mixing tabs and spaces) so everything looks right--because their editor doesn't do it for them! This is the first part of the paralysis. If they don't know what looks right to begin with, they simply self converge to a self-consistent form of indention and write all their code in it.

The paralysis continues when, not having been grounded in the cultural indention style of lisp or not having an editor that performs it automatically, you start reading other people's code. If you get unlucky and read a collection of people's code that never learned the correct cultural style (each having their own style), then the destructuring process gets amazingly difficult to do. You end up reading character by character, parenthesis by parenthesis, trying to figure out where semantic forms start and end. The most powerful visual/spatial recognizer in your head (like 1/4 of your brain matter is devoted to it!) sits completely idle!

I hacked my emacs/SLIME syntax coloring to be very detailed and show many different classes of semantic ideas: comments, ANSI functions, macros, special forms, type names, special variables, constants, etc, etc, etc to all be color coordinated with each other using color theory to give me insight into what I've written--and whether I'm writing it correctly!

The most important change I did was to make the parenthesis color nearly the same as the background color. In a sense they vanish and I'm left with the spatial blocks as an indention structure, which works just fine. I can always resolve a specific parenthesis from the background if I have to to disambiguate a form. Usually, if I screwed something up, the indention level of my code base goes wrong and it is pretty easy to find where I missed a parenthesis.

Oddly, lisp is the only language for which I do this, all others are just white text on a black background and I prefer it that way. Syntax/Semantics highlighting is distracting to me in non lisp languages. It is a requirement for lisp!

After determined use of the colorizing/indention system slime and my hacks provide me, I occasionally find that looking at non-colorized, but correctly indented Lisp, is hard. The parentheses make a visual heaviness that is hard to ignore; I almost read it as ALL CAPS style yelling. If I'm looking at non-colorized and non-culturally indented lisp, I revert to the "hunt and peck" method of trying to destructure the semantic forms in my head. It gets frustrating very quickly.

I honestly believe parenthesis-paralysis is real. The right editor will make the paralysis completely disappear, but the wrong editor will accentuate it.

As a treat (maybe I'm using an esoteric definition of the word...), here is some horrendously written emacs lisp which augments the Lisp syntax highlighter in a certain color theme to in fact delineate various portions of the language to me. It is by no means the whole of what I want out of a Lisp IDE, but is a tiny step that saves me some heartache.

I searched through emacs' color themes for a long time, and mostly came to the conclusion that most of those theme designers had never opened, let alone been in the same state as, a book on color theory or color perception. Bold claims? Sure, but I'm an idiot writing a blog, and that should be normal fare on the Internet. This is why I don't have a comment section.

What I did was started with the idea of warm colors being near to the viewer, cool colors being far. I also took into consideration complimentary colors, color intensity, and a Western view of the general meaning of bare colors. Each axis: warm to cool, color to compliment, and low to high intensity, cultural meaning of color, has a reason to exist in the syntax highlighting. Once you are aware of these meanings it becomes much easier to visually process the code and get a handle of the dependencies in the code. This design isn't finished yet, and over time I'm evolving it as I see what I actually care about when writing Lisp.

Here are some rules about the things I wanted to emphasize or ensure I could easily separate on my screen. These rules are not the complete set I wish to do in my source, just the ones I was generally able to do given the tools I had. Also, things like special forms and macros get the same color, because I haven't found a reason to separate them.

Once one has the concrete plan about what one cares about in the Lisp code, you go and start assigning colors. Emphasizing means higher intensity, and separating means putting the concepts into color compliment domains. Choosing colors on the grey/brown intensity scale mean you only care about emphasis in between them, but they don't exist as compliments to other things. A complimentary concept is like code and comments or special variables and non-special variables. Use of warm and cool colors can give a sense of depth in the code allowing once to inspect code quickly for important things, which should be up front.

So, given the semantic plan and color knowledge, I went and assigned colors. This is what I made.

Screenshot 1: Some code out of the MD5 implementation by Pierre R. Mai. Notice how the coloring regularizes the text and makes it easy for your eye to wander over the color blocks inspecting what is present.

Colorized Text

Screenshot 2: Some code out of the MIDI library implementation by Mathieu Chabanne, Camille Constant, Emmanuel Necibar, Stephanie Recco, Robert Strandh, David Lewis, Marcus Pearce, and Christophe Rhodes. If I were looking at this code, the special variable *time* would be most important to me. Good thing it is a high intensity color.

Colorized Text

This code was generated by me scraping the Common Lisp Hyperspec web pages, munging the symbols and whatnot into the form you see below, and then insufficiently reading the emacs lisp programming manual. It is built on top of the comidia color theme, but I suspect if you change the colors can be customized to another theme. There are some errors in the highlighting like 1+ doesn't get recognized probably because it looks like a +1+ constant in terms of the regexes and I haven't figure out exactly how to fix it. There are other limitations and omissions, like ,@*foo* not being highlighted correctly that I need to figure out as well. I fix them when they annoy me enough.

There are some tasks like highlighting certain grouping parentheses when the point is not in the extent of them that I have no idea how to do. Some of the complex highlighting I want which is more semantics based, e.g., any symbol in a condition spot in a handler-case is colored a specific color but only if it actually exists in the source, and symbols used in functional places should be colored color A, but if used in another namespace are colored according to that namespace, is impossible to do in emacs since the language structure is not fully realized in the editor. Because this simplistic highlighter is semanticly ignorant, if you have a habit of choosing common lisp symbols and using them in other namespaces, this syntax highlighter is probably going to overly upset you.

As for me, I use the non-terminal mode of emacs and I don't intersect namespace boundaries on the same symbol. I don't care if this doesn't work in terminal mode since I will most likely never use that mode. If you decide to use it, good luck.

Frankly, the most useful thing out of this code is the 978 common lisp symbols that have been categorized for your use in a different piece of code.

All the code I wrote in this blog post is under the *spins the wheel* Modified BSD License.

;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; set up a color theme that is good enough for now.
;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(when (require 'color-theme nil 'noerror)
  (color-theme-comidia))

;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Set up some highlighting rules for all known ANSI common lisp functions and
;; macro names, special operators, etc. This assumes color-theme-comidia.
;; This is always tinkerable....
;; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defface ansi-lisp-boolean
  '((t (:foreground "white")))
  "Color all of the ANSI Common Lisp Booleans this specific color")

(defface ansi-lisp-constant
  '((t (:foreground "orchid1")))
  "Color all of the ANSI Common Lisp Constants this specific color")

(defface ansi-lisp-declaration
  '((t (:foreground "indianred2")))
  "Color all of the ANSI Common Lisp Declarations this specific color")

(defface ansi-lisp-condition-type
  '((t (:foreground "indianred2")))
  "Color all of the ANSI Common Lisp Condition Types this specific color")

(defface ansi-lisp-function
  '((t (:foreground "pale green")))
  "Color all of the ANSI Common Lisp Functions this specific color")

(defface ansi-lisp-generic-function
  '((t (:foreground "cyan"))) ; same as the comida color-theme which I use.
  "Color all of the ANSI Common Lisp Generic Functions this specific color")

(defface ansi-lisp-macro
  '((t (:foreground "cyan"))) ; same as comida color-theme which I use.
  "Color all of the ANSI Common Lisp Macros this specific color")

(defface ansi-lisp-special-operator
  '((t (:foreground "cyan"))) ; same as macros and as the comida color-theme.
  "Color all of the ANSI Common Lisp Special Operators this specific color")

(defface ansi-lisp-type
  '((t (:foreground "red")))
  "Color all of the ANSI Common Lisp Types this specific color")

(defface ansi-lisp-unknown
  '((t (:foreground "red")))
  "Color all of the ANSI Common Lisp  this specific color")

(defface ansi-lisp-global-variable
  '((t (:foreground "yellow2")))
  "Color all of the ANSI Common Lisp Globals this specific color")

(defface ansi-lisp-expression
  '((t (:foreground "indianred1")))
  "Color all of the ANSI Common Lisp Expressions (like declare) this specific color")

(defface ansi-lisp-parenthesis
  '((t (:foreground "#4d4d3d")))  ; grey25, but with more yellow in it
  "Color all of the ANSI Common Lisp Parenthesis")

(defface ansi-lisp-numbers
  '((t (:foreground "orchid1")))
  "Color all of the ANSI Common Lisp Numbers")

;;; These are functions which produce the individual entries for each kind of
;;; symbol name according to what face they should have.

(defun ansi-boolean (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-boolean))

(defun ansi-constant (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-constant))

(defun ansi-declaration (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-declaration))

(defun ansi-condition-type (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-condition-type))

(defun ansi-function (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-function))

(defun ansi-generic-function (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-generic-function))

(defun ansi-macro (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-macro))

(defun ansi-special-operator (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-special-operator))

(defun ansi-type (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-type))

(defun ansi-unknown (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-unknown))

(defun ansi-global-variable (x)
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-global-variable))

(defun ansi-expression (x)
                                        ; used for things like declare
  (list
   (concatenate
    'string "\\<\\(" (symbol-name x) "\\)\\>") 0 ''ansi-lisp-expression))

(when t ;; sometimes I need to disable this whole thing....
  (add-hook 'lisp-mode-hook
            (lambda ()
              (font-lock-add-keywords
               nil
               (append
                ;; Conventional Constant Variables
                '(("\\<\\([+][^ +]*[+]\\)\\>" 0 #'ansi-lisp-constant))

                ;; Conventional Global Variables, including ANSI ones
                '(("\\<\\([*][^ *]*[*]\\)\\>" 0 #'ansi-lisp-global-variable))

                ;; Lisp Numbers, simple ones, just integers
                '(("\\<\\([+-]?[0-9]+\\)\\>" 0 #'ansi-lisp-numbers))

                ;; I'm a psycho and want my parentheis color to be controlled.
                '(("\\([()]\\)" 0 #'ansi-lisp-parenthesis))

                ;; These are often important to see, but I don't know how to
                ;; highlight the matching parenthesis with it
                ;;'(("\\([#][']\\)" 0 'ansi-lisp-boolean))

                (mapcar #'ansi-boolean
                        '(nil t))

                (mapcar #'ansi-constant '(array-dimension-limit
                        array-rank-limit array-total-size-limit
                        boole-1 boole-2 boole-and boole-andc1
                        boole-andc2 boole-c1 boole-c2 boole-clr
                        boole-eqv boole-ior boole-nand boole-nor
                        boole-orc1 boole-orc2 boole-set boole-xor
                        call-arguments-limit char-code-limit
                        double-float-epsilon
                        double-float-negative-epsilon
                        internal-time-units-per-second
                        lambda-list-keywords lambda-parameters-limit
                        least-negative-double-float
                        least-negative-long-float
                        least-negative-normalized-double-float
                        least-negative-normalized-long-float
                        least-negative-normalized-short-float
                        least-negative-normalized-single-float
                        least-negative-short-float
                        least-negative-single-float
                        least-positive-double-float
                        least-positive-long-float
                        least-positive-normalized-double-float
                        least-positive-normalized-long-float
                        least-positive-normalized-short-float
                        least-positive-normalized-single-float
                        least-positive-short-float
                        least-positive-single-float long-float-epsilon
                        long-float-negative-epsilon
                        most-negative-double-float
                        most-negative-fixnum most-negative-long-float
                        most-negative-short-float
                        most-negative-single-float
                        most-positive-double-float
                        most-positive-fixnum most-positive-long-float
                        most-positive-short-float
                        most-positive-single-float
                        multiple-values-limit pi short-float-epsilon
                        short-float-negative-epsilon
                        single-float-epsilon
                        single-float-negative-epsilon))

                (mapcar #'ansi-declaration '(type compilation-speed
                        debug declaration dynamic-extent ftype
                        ignorable ignore inline notinline optimize
                        safety space special speed))

                (mapcar #'ansi-condition-type '(arithmetic-error
                        cell-error condition control-error
                        division-by-zero end-of-file file-error
                        floating-point-inexact
                        floating-point-invalid-operation
                        floating-point-overflow
                        floating-point-underflow package-error
                        parse-error print-not-readable program-error
                        reader-error serious-condition
                        simple-condition simple-error
                        simple-type-error simple-warning
                        storage-condition stream-error style-warning
                        type-error unbound-slot unbound-variable
                        undefined-function warning))

                (mapcar #'ansi-function '(- / \* \+ abort and atom bit
                        character complex cons continue eql error
                        float list logical-pathname member mod
                        muffle-warning not null pathname rational
                        store-value string use-value values vector /=
                        1- 1\+ < <= = > >= abs acons acos acosh adjoin
                        adjust-array adjustable-array-p alpha-char-p
                        alphanumericp append apply apropos
                        apropos-list aref arithmetic-error-operands
                        arithmetic-error-operation array-dimension
                        array-dimensions array-displacement
                        array-element-type array-has-fill-pointer-p
                        array-in-bounds-p array-rank
                        array-row-major-index array-total-size arrayp
                        ash asin asinh assoc assoc-if assoc-if-not
                        atan atanh bit-and bit-andc1 bit-andc2 bit-eqv
                        bit-ior bit-nand bit-nor bit-not bit-orc1
                        bit-orc2 bit-vector-p bit-xor boole
                        both-case-p boundp break
                        broadcast-stream-streams butlast byte
                        byte-position byte-size caaaar caaadr caaar
                        caadar caaddr caadr caar cadaar cadadr cadar
                        caddar cadddr caddr cadr call-next-method car
                        cdaaar cdaadr cdaar cdadar cdaddr cdadr cdar
                        cddaar cddadr cddar cdddar cddddr cdddr cddr
                        cdr ceiling cell-error-name cerror char
                        char-code char-downcase char-equal
                        char-greaterp char-int char-lessp char-name
                        char-not-equal char-not-greaterp
                        char-not-lessp char-upcase char/= char< char<=
                        char= char> char>= characterp cis class-of
                        clear-input clear-output close clrhash
                        code-char coerce compile compile-file
                        compile-file-pathname compiled-function-p
                        compiler-macro compiler-macro-function
                        complement complexp compute-restarts
                        concatenate concatenated-stream-streams
                        conjugate consp constantly constantp
                        copy-alist copy-list copy-pprint-dispatch
                        copy-readtable copy-seq copy-structure
                        copy-symbol copy-tree cos cosh count count-if
                        count-if-not decode-float
                        decode-universal-time delete delete-duplicates
                        delete-file delete-if delete-if-not
                        delete-package denominator deposit-field
                        describe describe-object digit-char
                        digit-char-p directory directory-namestring
                        disassemble documentation dpb dribble
                        echo-stream-input-stream
                        echo-stream-output-stream ed eighth elt
                        encode-universal-time endp enough-namestring
                        ensure-directories-exist
                        ensure-generic-function eq equal equalp eval
                        evenp every exp export expt fboundp fceiling
                        fdefinition ffloor fifth file-author
                        file-error-pathname file-length
                        file-namestring file-position
                        file-string-length file-write-date fill
                        fill-pointer find find-all-symbols find-class
                        find-if find-if-not find-package find-restart
                        find-symbol finish-output first float-digits
                        float-precision float-radix float-sign floatp
                        floor fmakunbound force-output format fourth
                        fresh-line fround ftruncate funcall
                        function-lambda-expression functionp gcd
                        gensym gentemp get get-decoded-time
                        get-dispatch-macro-character
                        get-internal-real-time get-internal-run-time
                        get-macro-character get-output-stream-string
                        get-properties get-setf-expansion
                        get-universal-time getf gethash graphic-char-p
                        hash-table-count hash-table-p
                        hash-table-rehash-size
                        hash-table-rehash-threshold hash-table-size
                        hash-table-test host-namestring identity
                        imagpart import input-stream-p inspect
                        integer-decode-float integer-length integerp
                        interactive-stream-p intern intersection
                        invalid-method-error invoke-debugger
                        invoke-restart invoke-restart-interactively
                        isqrt keywordp last lcm ldb ldb-test ldiff
                        length lisp-implementation-type
                        lisp-implementation-version list-all-packages
                        list-length list\* listen listp load
                        load-logical-pathname-translations log logand
                        logandc1 logandc2 logbitp logcount logeqv
                        logical-pathname-translations logior lognand
                        lognor lognot logorc1 logorc2 logtest logxor
                        long-site-name lower-case-p machine-instance
                        machine-type machine-version macro-function
                        macroexpand macroexpand-1 make-array
                        make-broadcast-stream make-concatenated-stream
                        make-condition make-dispatch-macro-character
                        make-echo-stream make-hash-table make-list
                        make-load-form-saving-slots make-package
                        make-pathname make-random-state make-sequence
                        make-string make-string-input-stream
                        make-string-output-stream make-symbol
                        make-synonym-stream make-two-way-stream
                        makunbound map map-into mapc mapcan mapcar
                        mapcon maphash mapl maplist mask-field max
                        member-if member-if-not merge merge-pathnames
                        method-combination-error min minusp mismatch
                        name-char namestring nbutlast nconc
                        next-method-p nintersection ninth notany
                        notevery nreconc nreverse nset-difference
                        nset-exclusive-or nstring-capitalize
                        nstring-downcase nstring-upcase nsublis nsubst
                        nsubst-if nsubst-if-not nsubstitute
                        nsubstitute-if nsubstitute-if-not nth nthcdr
                        numberp numerator nunion oddp open
                        open-stream-p output-stream-p
                        package-error-package package-name
                        package-nicknames package-shadowing-symbols
                        package-use-list package-used-by-list packagep
                        pairlis parse-integer parse-namestring
                        pathname-device pathname-directory
                        pathname-host pathname-match-p pathname-name
                        pathname-type pathname-version pathnamep
                        peek-char phase plusp position position-if
                        position-if-not pprint pprint-dispatch
                        pprint-fill pprint-indent pprint-linear
                        pprint-newline pprint-tab pprint-tabular prin1
                        prin1-to-string princ princ-to-string print
                        print-not-readable-object print-object
                        probe-file proclaim provide random
                        random-state-p rassoc rassoc-if rassoc-if-not
                        rationalize rationalp read read-byte read-char
                        read-char-no-hang read-delimited-list
                        read-from-string read-line
                        read-preserving-whitespace read-sequence
                        readtable-case readtablep realp realpart
                        reduce rem remhash remove remove-duplicates
                        remove-if remove-if-not remprop rename-file
                        rename-package replace require rest
                        restart-name revappend reverse room round
                        row-major-aref rplaca rplacd sbit scale-float
                        schar search second set set-difference
                        set-dispatch-macro-character set-exclusive-or
                        set-macro-character set-pprint-dispatch
                        set-syntax-from-char seventh shadow
                        shadowing-import short-site-name signal signum
                        simple-bit-vector-p
                        simple-condition-format-arguments
                        simple-condition-format-control
                        simple-string-p simple-vector-p sin sinh sixth
                        sleep slot-boundp slot-exists-p
                        slot-makunbound slot-value software-type
                        software-version some sort special-operator-p
                        sqrt stable-sort standard-char-p
                        stream-element-type stream-error-stream
                        stream-external-format streamp
                        string-capitalize string-downcase string-equal
                        string-greaterp string-left-trim string-lessp
                        string-not-equal string-not-greaterp
                        string-not-lessp string-right-trim string-trim
                        string-upcase string/= string< string<=
                        string= string> string>= stringp structure
                        sublis subseq subsetp subst subst-if
                        subst-if-not substitute substitute-if
                        substitute-if-not subtypep svref sxhash
                        symbol-function symbol-name symbol-package
                        symbol-plist symbol-value symbolp
                        synonym-stream-symbol tailp tan tanh tenth
                        terpri third translate-logical-pathname
                        translate-pathname tree-equal truename
                        truncate two-way-stream-input-stream
                        two-way-stream-output-stream type-error-datum
                        type-error-expected-type type-of typep
                        unbound-slot-instance unexport unintern union
                        unread-char unuse-package
                        upgraded-array-element-type
                        upgraded-complex-part-type upper-case-p
                        use-package user-homedir-pathname values-list
                        variable vector-pop vector-push
                        vector-push-extend vectorp warn
                        wild-pathname-p write write-byte write-char
                        write-line write-sequence write-string
                        write-to-string y-or-n-p yes-or-no-p zerop))

                (mapcar #'ansi-generic-function '(add-method
                        allocate-instance change-class class-name
                        compute-applicable-methods find-method
                        function-keywords initialize-instance
                        make-instance make-instances-obsolete
                        make-load-form method-qualifiers
                        no-applicable-method no-next-method
                        reinitialize-instance remove-method
                        shared-initialize slot-missing slot-unbound
                        update-instance-for-different-class
                        update-instance-for-redefined-class))

                (mapcar #'ansi-macro '(lambda or setf assert
                        call-method case ccase check-type cond
                        ctypecase decf declaim defclass defconstant
                        defgeneric define-compiler-macro
                        define-condition define-method-combination
                        define-modify-macro define-setf-expander
                        define-symbol-macro defmacro defmethod
                        defpackage defparameter defsetf defstruct
                        deftype defun defvar destructuring-bind do
                        do-all-symbols do-external-symbols do-symbols
                        do\* dolist dotimes ecase etypecase formatter
                        handler-bind handler-case ignore-errors
                        in-package incf loop loop-finish make-method
                        multiple-value-bind multiple-value-list
                        multiple-value-setq nth-value otherwise pop
                        pprint-exit-if-list-exhausted
                        pprint-logical-block pprint-pop
                        print-unreadable-object prog prog1 prog2
                        prog\* psetf psetq push pushnew remf
                        restart-bind restart-case return rotatef
                        shiftf step time trace typecase unless untrace
                        when with-accessors with-compilation-unit
                        with-condition-restarts
                        with-hash-table-iterator
                        with-input-from-string with-open-file
                        with-open-stream with-output-to-string
                        with-package-iterator with-simple-restart
                        with-slots with-standard-io-syntax))

                (mapcar #'ansi-special-operator '(function block catch
                        eval-when flet go if labels let let\*
                        load-time-value locally macrolet
                        multiple-value-call multiple-value-prog1 progn
                        progv quote return-from setq symbol-macrolet
                        tagbody the throw unwind-protect))

                (mapcar #'ansi-type '(array base-char base-string
                        bignum bit-vector boolean broadcast-stream
                        built-in-class class compiled-function
                        concatenated-stream double-float echo-stream
                        extended-char file-stream fixnum
                        generic-function hash-table integer keyword
                        long-float method number package random-state
                        ratio readtable real restart satisfies
                        sequence short-float signed-byte simple-array
                        simple-base-string simple-bit-vector
                        simple-string simple-vector single-float
                        standard-char standard-class
                        standard-generic-function standard-method
                        standard-object stream string-stream
                        structure-class structure-object symbol
                        synonym-stream two-way-stream unsigned-byte))

                (mapcar #'ansi-unknown
                        '(method-combination))

                (mapcar #'ansi-global-variable '(// /// \*\* \*\*\*
                        \*break-on-signals\* \*compile-file-pathname\*
                        \*compile-file-truename\* \*compile-print\*
                        \*compile-verbose\* \*debug-io\*
                        \*debugger-hook\*
                        \*default-pathname-defaults\* \*error-output\*
                        \*features\* \*gensym-counter\*
                        \*load-pathname\* \*load-print\*
                        \*load-truename\* \*load-verbose\*
                        \*macroexpand-hook\* \*modules\* \*package\*
                        \*print-array\* \*print-base\* \*print-case\*
                        \*print-circle\* \*print-escape\*
                        \*print-gensym\* \*print-length\*
                        \*print-level\* \*print-lines\*
                        \*print-miser-width\*
                        \*print-pprint-dispatch\* \*print-pretty\*
                        \*print-radix\* \*print-readably\*
                        \*print-right-margin\* \*query-io\*
                        \*random-state\* \*read-base\*
                        \*read-default-float-format\* \*read-eval\*
                        \*read-suppress\* \*readtable\*
                        \*standard-input\* \*standard-output\*
                        \*terminal-io\* \*trace-output\* \+\+ \+\+\+))

                (mapcar #'ansi-expression
                        '(declare))

                )))))

End of Line.

April 2, 2010 12:10 PM CDT by psilord in category Lisp

Network Programming in Common Lisp with IOLib

The other day some months ago, I had a hankering to write some distributed computing frameworks in the 10K to 50K machine range in Common Lisp. I have some applications in mind for such things.

Of course, the first thing someone thinks about when designing such a system is "Should I mix my Coca Cola with Bacardi 151 or instead this rubbing alcohol I have here for cleaning my electronics?". Ok, the second thing someone thinks about is "How are all of the agents going to robustly talk to each other and handle network boundary cases?"

This means Network Programming. Calm down, there are books for that. What I needed was a nice solution in Common Lisp. I looked around and of course each implementation of Common Lisp has its own socket API, but I wanted something more portable between implementations. I didn't care so much about portability between operating systems, though. Linux (and a bottle of pills) is just fine for me.

I settled, like a wet cough in the lungs, on IOLib. This Common Lisp library can do a lot of things: IPV4, IPV6, UDP, DNS resolving, and most importantly, multiplex nonblocking I/O with multiple clients. The last thing I wanted to do was reinvent an I/O event multiplexer....

I dove into the manual, and uh, whoops, yeah, about that. At the time of this blog post, the manual was a bit thin. It suffered from the same affliction about 99.9% of other open source manuals do, an out of date API list, and not much else. Such is the nature of open source documentation. It is like being pissed off the sky is blue.

This time, instead of complaining about it, I did something about it. Open source, if you squint, also means contributing documentation as well.

After conversing with the author of IOLib (Stelian Ionescu) for extended periods of time while he disentangled the stupid I was spewing from my pie-hole, I produced a TUTORIAL for IOLib! This tutorial shows client/server examples for blocking I/O and nonblocking I/O with the multiplexer. The example source codes are available in the current release of IOLib available from the aforementioned site. Sorry about the formatting, I can't bring myself to htmlify it yet (it took a couple months to write and I just don't want to look at it for a while), maybe in some future revision of it I will. Of course, once the tutorial gets incorporated into a more formal discovery system on the IOLib website, I'll provide a link at the top of the tutorial's specific page to point to it so someone always find the most current one.

It is also newbie friendly to Common Lisp coders (mostly, because, um, I'm a newbie to Common Lisp at the moment so I wrote for my audience). I'm quite certain I've left important crap I shouldn't have out of the tutorial, so if you find something you just don't understand, let me know at the email address in the tutorial and I'll clarify it.

As I got to really know IOLib, I found that it was a really useful library. It sits at just the right abstraction level for the things for which I need it and turns out to be pretty simple to use and quite fast. The author is friendly, knowledgeable, and willing to accept contributions. Try it out!

End of Line.

January 8, 2010 12:21 AM CST by psilord in category Lisp

Absinthe and Lisp

Suppose one wants to write a helluva a lot of code in Common Lisp while drinking Absinthe and wishing his life went a different way in some respects.

Now stop supposing.

There are quite a few different implementations of Common Lisp out there. I'm pretty sure they have serious advocates and detailed pros and cons and yadda yadda yadda.... Know what? I don't care. I picked SBCL simply because some nice guy on a mailing list somewhere said a tool called clbuild will help me keep it and a lot of common usage CL libraries up to date and that Emacs' SLIME mode uses it natively. Good enough for me, but initially I'm using vim with Limp instead of Emacs for now.

clbuild is a funny thing. It is pretty damned useful. If you wanted to start from scratch and check it out, this is what you'd do. I'd recommend messing about in a /tmp directory for a while before making a real install so you can learn what happens. Some of this follows the clbuild page, but there is a little more detail one needs to know almost right away that I'll specify. Here is what you initially do:

Linux > cd /tmp
Linux > darcs get http://common-lisp.net/project/clbuild/clbuild
Linux > cd ./clbuild
Linux > chmod +x clbuild
Linux > ./clbuild check

At this point, I would recommend you actually find all of the source control programs clbuild wants and install them into your linux distro. It'll just make the next steps easier because various CL libraries use various source control programs. Let's keep going, first we'll get all of the stable projects, then we'll get and compile sbcl.

Linux > ./clbuild update --main-projects 
Linux > ./clbuild update sbcl
Linux > ./clbuild compile-implementation sbcl

At this point, you're almost done, now comes the fiddly parts. For example, sbcl requires a core file that it unexecs into memory and it has to be told where that core file is. If you do this:

Linux > ./clbuild lisp

then clbuild will know how to invoke sbcl so that it finds the correct core file. If, however, you want to run sbcl on the command line, you should write a little shell script that invokes the right sbcl with the right core file using the --core argument. Due to the semantics of the SBCL command line, this script is slightly harder to write than one thinks, but if you zen the SBCL manual for command line arguments a bit you'll figure it out.

Ok, now, what about other libraries in the wnnp section of the projects? To install these, and I recommend installing them individually as needed, you'd do (for example, iolib):

Linux > ./clbuild install iolib

Here are where things get tricky. The install will go fine, but if you try to (require 'iolib) into your running sbcl--which means sbcl will compile it on the fly, you'll get an error about an unreadable character. It turns out that the iolib library has UTF-8 encoded into its source and it appears the default of sbcl is to accept only 7-bit ASCII input. What to do?

In my case, I haven't started hacking on lisp yet, so I don't have any lisp projects in my .sbcl/systems/ directory to intermix with the sbcl just compiled from clbuild. So, we're going to copy the file clbuild.conf.default to clbuild.conf and comment out the first and second unsets in the new file in order to utilize my fledgling $HOME/.sbclrc file.

Then add this entry into your $HOME/.sbclrc file:

(setf sb-impl::*default-external-format* :utf-8)

This has the effect of telling sbcl upon start up that I wanted a utf-8 reader. After these steps were done it loaded iolib just fine. Since I'm new to Common Lisp and sbcl, this took a bit to figure out.

You probably want a few more things in your .sbclrc file as well, here is what I have in mine:

;; Allow reading of utf-8
(setf sb-impl::*default-external-format* :utf-8)

;; Allow higher precision with floats
(setf *read-default-float-format* 'double-float)

;; Tell ASDF where clbuilds central repository is.
(require :asdf)
(setf asdf:*central-registry* 
	'(#p"/home/psilord/content/code/lisp/clbuild/systems/"))

;; If a fasl was stale, try to recompile and load (once).
;; This fixes the situation when I recompile sbcl with a new revision.
(defmethod asdf:perform :around ((o asdf:load-op)
  (c asdf:cl-source-file))
  (handler-case (call-next-method o c)
                ;; If a fasl was stale, try to recompile and load (once).
                (sb-ext:invalid-fasl ()
                                     (asdf:perform (make-instance 'asdf:compile-op) c)
                                     (call-next-method))))

If you're an emacs user, you can probably do:

Linux > ./clbuild slime

and get a copy of emacs up and running with slime automatically set up and working in it. If you do this method DO NOT follow the setup directions concerning your .emacs file in the SLIME manual. There is some kind of conflict I wasn't arsed to figure out which causes an insane keymapping in your emacs session.

While Limp for vim is actually pretty good (except for what happens when you want to run a personally installed copy of SBCL...), it deviates a little bit in the cultural indention rules for Common Lisp and every person to whom I show my code comments on it and wants to correct it. So, I'm learning emacs for the sole purpose of writing Common Lisp in it just so it'll do the indenting correctly. Emacs was a bit of a bother in the beginning since the non-modal editing commands actually cause more keys to be typed while doing command-like things instead of editing-like things, but I'll get over it.

At this point, you can write some Common Lisp and see things work. I'd recommend going over to this blog post to learn how to use ASDF which describes how your source is organized and is quite useful and commonly applied. I'd also recommend carefully understanding how Common Lisp's packaging system works and how you move into, move out of, and use packages.

Three very useful books are "ANSI Common Lisp" by Paul Graham, which I luckily owned, "Practical Common Lisp", and "Successful Lisp". They are all beginner-ish books in Common Lisp. They overlap somewhat, but in the places they don't it is very instructive.

If you happen to not know much about functional programming, then you should read the Structure and Interpretation of Computer Programs. This wonderful book will give you the knowledge you need. It happens to be written for Scheme, which is a dialect of Lisp, but all of the main ideas of functional programming still apply, there are just some fiddly bits with #' (sharp-quote), argument evaluation order semantics, a very different standard library of functions, and macros of which one needs to be aware.

*sigh* my bottle has an end, my pain doesn't.

End of Line.

July 14, 2009 10:36 PM CDT by psilord in category Useless Ideas

Bits and Pieces

Apparently, a while ago I had gotten very interested in the implementations of chess programs that used the bitboard representation. I stumbled across a pedagogical tome I had been writing about it and decided that it was filled with enough goodness--even though it is incomplete, that I should put it up.

If you happen to know about the bitboard representation for chess, drop me a line and if you can contribute various knowledge I need, I'll finish the pages. I'm at psilord@cs.wisc.edu.

You can find the chess pages here. The CSS is horked for them right now and it looks like a rotting weasel bloating in the sun. I'd say that I'm sorry, but I'd be lying. It is more a fact that the CSS for those pages was designed before the CSS for the blog and they don't play nice together yet. I'll maybe fix it in the future.

I'm putting my email address online you say? Well, it doesn't stop the several hundred pieces of spam I get each day without it being on the web anywhere, so who cares. Putting a human encrypted email address online these days in the hopes one won't get spam is the paragon of denial.

End of Line.

June 21, 2009 2:19 AM CDT by psilord in category Useless Ideas

Assumption Inference

Today I was thinking about the old adage of the average number of arguments passed to a function is four.

Why four? Why not two or six or seventeen? What is special about that number that programs written by human beings exhibit such a behavior? How is the selection of that average number related to how humans process information, or is it related at all?

Which brings me to an analogous topic... Human short term memory can hold 7 symbols on average....

Oops. It turns out that is an urban legend and masterfully destroys the original blog post I had made using that reasoning. Well, that's what you get when you poorly skim the Internet looking for something. Sure the Internet is filled with piles of knowledge which can elevate you to PhD status, help the sick and the poor, oh, and also porn. Lots of porn. And poor quality videos about idiots doing stupid things and getting seriously hurt. Anyhow, the indisputable and eternal paragon of information retrieval that is Wikipedia had an article about working memory which instead I will skim and then about which I shall make bold and unsubstantiated claims.

Programming is a dance between the long term and working memory. The long term memory contains the library that retrieves what API calls, algorithms, and data structures one needs to perform the manipulation in question. The working memory fundamentally holds the storage of the current data flow graph of only a few hops before and after the direct line being authored at any given moment. If one subscribed to certain working memory models, then possibly there is some kind of episodic memory that hackers use to replay (or on the other side of the coin: plan) how they are going to manipulate some variables or other programming concepts that once they complete authoring, create another episodic set and then implement that.

I hypothesize that the reason why there are so few parameters to functions is that there are often significant out of band pieces of information surrounding the implementation of the function that present a situation where the hacker forgets about them and the constraints they force upon the solution due to working memory overload. I'd say there are explicit and implicit pieces of information about functions (and individual code lines in general, but for now we'll just talk about functions--and Unix system functions at that!). A constraint is something which must be kept true/false and is either a first class mental symbol as in the passed in pointer must be non-NULL, or something which is inferred from other constraints, such as if I fill up /tmp while write()ing a file, then malloc() will fail on an older Solaris machine--because a hidden constraint of the virtual memory system is bound to the free space in /tmp.

The difference between explicit and implicit hinges upon inherent documentation and/or commonality versus hidden side effect and/or uniqueness. Inherent documentation would be something like seeing the function prototype on a man page and being presented with the direct arguments to the function, or perfect recall from long term memory. A hidden side effect would be something like the function returning a pointer to a static buffer that a later call to the function will alter. A unique fact would be something like read() of very small chunks don't update the a_time on a file in NFS if client side attribute caching is enabled (which it often is).

As the number and temporal sequencing of mental symbols increases around the call to a function, the more chance there is for erroneous code to be written. This is an important thing to understand and may explain why certain functions have an error prone nature about them in certain contexts. Regardless of how the working memory of a human works, one can hypothesize that only a certain number of symbolic pieces of information: function prototypes, variables, their spatial relationship on screen, the explanation of someone describing how to do it to you, how the use of an idiom makes you feel, etc., can be kept in attentive memory at any give time. One can also make a reasonable assumption that this set of symbols and their temporal relation kept in the attention centers has an average across different human beings, and that average is probably a small number or small length of time (seconds at best). It seems, though, that if episodic memory is utilized, the information can be held in stasis much longer.

"I don't believe the bilious feces that you are projectile vomiting into the Internet and would like examples", you say? [Actually, someone didn't believe them and convinced me of a rewrite. I almost got lucky because I don't have a comment system, but unfortunately Alan knew my email address.]

I've analyzed some common functions out of libc to show the sets of the mental symbols needed to use them correctly which were then grouped into explicit and implicit sets by, check this out, subjective observation. How's that for scientific? My observation is that common mental symbols between functions are ranked explicit and first with implicit and last being the unique mental symbols associated with the function that are only remembered from the time the man page is read to the act of authoring it into the code. Implicit mental symbols would be the ones most strongly held in the working memory system.

I predict that functions whose explicit and implicit data flow symbols seemingly pass a magic number of "too damn much" cause the function to suffer from erroneous application. I'm assuming that in general the explicit stuff is recalled from long term memory and the implicit stuff picked from a man page about 0-2 seconds before writing the piece of code that it affects.