02 May 2008

ObjLisp

Lately, I've been poking at Lisp again. I never really liked it, mainly because I found the amazing syntax painful (yes, amazing but painful). While I realized the possibilities that the syntax presented, I found myself fighting to produce syntactically valid code. Part of this problem was the fact that I didn't quite understand the underlying structure of a Lisp list, so I didn't really understand just how powerful lisp syntax was, and part of that was that it's just damned confusing. :)

For those people that haven't ``gotten'' it yet. Lisp lists are composed of pairs, so the list 1 2 3 is actually (1 . (2 . (3 . nil))). First part is car, second part is cdr, so when you do (cdr 1 2 3) you're just getting (2 . (3 . nil)) back. The pairs are called cons cells, cons pairs, and a number of other things. Knowing this about lisp makes understanding how to manipulate Lisp code a lot easier.

I decided, having had this lisp epiphany that I would like to produce a Lisp runtime that is object-oriented. Specifically, that the entire system is implemented as a set of objects (from cons cells to primatives). I decided to do this implementation on top of David Chisnall's libobjc_tr,
which is an object runtime designed to fit nicely into most languages object model[1]. Object's map onto Lisp well in some ways, and are a little less obvious in others. An important part of my design is that everything is done by message passing and so I began to consider some simple Lisp expressions and figure out how to best map them onto a set of objects and messages.

To understand some of the ideas lets consider some examples:

1. (+ 1 2 3)
2. (+ "hello," " " "world")

Traditionally + only works on number values, but I of course wanted it to be able to work regardless of the type of the object as long as the object implemented a + method. In consideration of this, I've dealt with evaluation of a list in a particular way to facilitate this. When an eval message is sent to a list, car of the list is passed to car of cdr with cdr of cdr as the list of arguments. To be more clear:

1.+(2 3)

where + is car, 1 is cdr.car, and (2 3) is cdr.cdr.

From here the + method of the integer object handles adding it to the contents of cdr.

So far, I have an implementation of what I've described (though currently message semantics rely on generic functions instead of treating the first argument as a method name, because I just realized this is a better solution). I'm also currently working on an array object on which a hash
table object can be built, which will be used to implement ivars and method lookups in the language instead of hacking around the problem as I'm currently doing. Once I've got the array going, I'm planning to move onto probably the most important part of the implementation: functions.

With that in mind I'm going to address what I imagine will be the function definition and calling semantics.

3. (set f (lambda (x) (* x x)))

First, some information. When the parser (by which I mean the semantic actions a parser would use that I've implemented to make it easier for me to avoid writing a parser in the short term) encounters an id, it creates a ref object. The ref object when evaluated lookups up what its value should be. The ref object also supports a the set method, which when passed, sets the value of cdr of arguments to the ref object in the current scope (the specific semantics of which I haven't quite worked out, but scope will be lexical). So, when you say f.set(lambda ...) you're setting the (probably) nil value of f to the lambda defined in the arguments. So then, how do we deal with the lambda definition itself:

4. (lambda (x) (* x x))

Cons object accepts a message lambda, with arguments that consist of the function body. We then create a function object with the cons object that received the initial message as the arguments list, and the following arguments (the cdr) as the function body.

Here comes the juicy and unclear bit:

5. (f 1)

This won't work. Indeed it breaks the semantics of the language as we've defined so far. The correct way to do this would be something like:

6. (f lobby 1)

but that doesn't really work with the hope of being fairly compatible with standard lisp semantics (at least to some degree). Another idea is this:

lobby is object, which all methods delegate to, so that, calls to f on int would fall through to object. This wouldn't work in cases where you do something like define a function '+', since the int version would be used first.

So this is the problem I face when trying to cons up how plain old functions will work. Any thoughts on a good way to deal with this would be awesome, and I'll keep you informed as I work on it more. Also, anyone that wants the code can just drop me a comment or an email and I can hook you up (or put it in a public repository somewhere).

[1] http://etoileos.com/news/archive/2007/11/10/2313/

7 comments:

Peter Eddy said...

Why not just use CLOS? It's better than the single dispatch that you seem to be talking about, and it's already there.

alexbl said...

CLOS isn't object oriented at the core. It's an object system built on top of Common Lisp. As for multiple dispatch, my model supports first argument polymorphism and as for other arguments that's just syntactic sugar that could be replicated with a macro.

phaylon said...

I'm not sure OO in its pure sense will be so nice to use. while it would work wonderfully for user-defined data, I think the core would suffer.

I was thinking for a while about building a Scheme that's completely based on generics. Which is (imu) a fancy way for doing multidispatch-functions.

The problem with the (method-message obj args ...) syntax that I see with core are things like lambda, which you mentioned. Lambda doesn't just accept (lambda (x y) (+ x y)), it also can do (lambda args (+ (car args) (cadr args))).

Peter Eddy said...

CLOS is part of the definition of Common Lisp, you can't have Common Lisp without CLOS.

So I'm wondering, what is it that you want to do that you think you can't do with CLOS?

alexbl said...

Robert: Dealing with lambda isn't such a big problem, we just attach a lambda message to the ref object. The really big problem is dealing with things like the f(x) example I gave. I personally am not sure it's such a bad thing to force people to attach things to objects for that component nature, but on the other hand it's not quite as lisp-ish.

Peter: CLOS can't do things like (+ "blah" "bleh") == "blahbleh". Further, since I'm using libobjc_tr, it's possible to make calls from Objective-C or another language using the libobjc_tr runtime to objects in the Lisp. Further, it's a thought exercise in how one can create an object-oriented language with lisp semantics.

Peter Eddy said...

I see. Just want to point out that if you're OK with using some name other than +, you can do string concatenation in CLOS like so:

(defgeneric ++ (x y))

(defmethod ++ ((x string) (y string))
(concatenate 'string x y))

(defmethod ++ ((x number) (y number))
(+ x y))

(++ "blah" "bleh")
=> "blahbleh"

(++ 1 2)
-> 3

alexbl said...

Peter: Indeed, CLOS can do that, but CLOS can't change the low-level behavior and primitives of Lisp, because CLOS is built ON TOP of Common Lisp, and not as part of it.