02 May 2008

ObjLisp

Lately, I've been poking at Lisp again. I never really liked it, mainly because I found the amazing syntax painful (yes, amazing but painful). While I realized the possibilities that the syntax presented, I found myself fighting to produce syntactically valid code. Part of this problem was the fact that I didn't quite understand the underlying structure of a Lisp list, so I didn't really understand just how powerful lisp syntax was, and part of that was that it's just damned confusing. :)

For those people that haven't ``gotten'' it yet. Lisp lists are composed of pairs, so the list 1 2 3 is actually (1 . (2 . (3 . nil))). First part is car, second part is cdr, so when you do (cdr 1 2 3) you're just getting (2 . (3 . nil)) back. The pairs are called cons cells, cons pairs, and a number of other things. Knowing this about lisp makes understanding how to manipulate Lisp code a lot easier.

I decided, having had this lisp epiphany that I would like to produce a Lisp runtime that is object-oriented. Specifically, that the entire system is implemented as a set of objects (from cons cells to primatives). I decided to do this implementation on top of David Chisnall's libobjc_tr,
which is an object runtime designed to fit nicely into most languages object model[1]. Object's map onto Lisp well in some ways, and are a little less obvious in others. An important part of my design is that everything is done by message passing and so I began to consider some simple Lisp expressions and figure out how to best map them onto a set of objects and messages.

To understand some of the ideas lets consider some examples:

1. (+ 1 2 3)
2. (+ "hello," " " "world")

Traditionally + only works on number values, but I of course wanted it to be able to work regardless of the type of the object as long as the object implemented a + method. In consideration of this, I've dealt with evaluation of a list in a particular way to facilitate this. When an eval message is sent to a list, car of the list is passed to car of cdr with cdr of cdr as the list of arguments. To be more clear:

1.+(2 3)

where + is car, 1 is cdr.car, and (2 3) is cdr.cdr.

From here the + method of the integer object handles adding it to the contents of cdr.

So far, I have an implementation of what I've described (though currently message semantics rely on generic functions instead of treating the first argument as a method name, because I just realized this is a better solution). I'm also currently working on an array object on which a hash
table object can be built, which will be used to implement ivars and method lookups in the language instead of hacking around the problem as I'm currently doing. Once I've got the array going, I'm planning to move onto probably the most important part of the implementation: functions.

With that in mind I'm going to address what I imagine will be the function definition and calling semantics.

3. (set f (lambda (x) (* x x)))

First, some information. When the parser (by which I mean the semantic actions a parser would use that I've implemented to make it easier for me to avoid writing a parser in the short term) encounters an id, it creates a ref object. The ref object when evaluated lookups up what its value should be. The ref object also supports a the set method, which when passed, sets the value of cdr of arguments to the ref object in the current scope (the specific semantics of which I haven't quite worked out, but scope will be lexical). So, when you say f.set(lambda ...) you're setting the (probably) nil value of f to the lambda defined in the arguments. So then, how do we deal with the lambda definition itself:

4. (lambda (x) (* x x))

Cons object accepts a message lambda, with arguments that consist of the function body. We then create a function object with the cons object that received the initial message as the arguments list, and the following arguments (the cdr) as the function body.

Here comes the juicy and unclear bit:

5. (f 1)

This won't work. Indeed it breaks the semantics of the language as we've defined so far. The correct way to do this would be something like:

6. (f lobby 1)

but that doesn't really work with the hope of being fairly compatible with standard lisp semantics (at least to some degree). Another idea is this:

lobby is object, which all methods delegate to, so that, calls to f on int would fall through to object. This wouldn't work in cases where you do something like define a function '+', since the int version would be used first.

So this is the problem I face when trying to cons up how plain old functions will work. Any thoughts on a good way to deal with this would be awesome, and I'll keep you informed as I work on it more. Also, anyone that wants the code can just drop me a comment or an email and I can hook you up (or put it in a public repository somewhere).

[1] http://etoileos.com/news/archive/2007/11/10/2313/