The Power of Lisp: Writing a Procedural Dialect
Recently I've been having a lot of fun with my lisp dialect.
The interesting thing is that the basic model of how functions and macros work, was finalised within the first few days of me working on it; most of my time was spent on writing and debugging the garbage collector, adding new types, adding new builtins, and implementing tail call optimisation. And of course extending the format of user-defined functions and macros to allow for documentation and an associated name.
But the basics of how a function or a macro work is quite simple, and yet despite its simplicity it is quite powerful.
Transforming code
Macro application takes in its arguments as-is, and substitutes for the associated parameter name anywhere it is found in its body. This substituted body is then evaluated.
For example, take the following two macros:
(defm += (var val) (:= var (+ var val)))
(defm ++ (var) (+= var 1))
Now in the code (++ i) will expand to the macro application of ++ with i bound to var, and substituted into its body, giving (+= i 1). Now when that is evaluated, it binds i to var and 1 to val, giving (:= i (+ i 1)). Its body is then evaluated (:= is a builtin macro, implemented in C, and + is a builtin function).
What about something like (+= n (println "The answer is:" 42))? Now n is bound to var, and (println "The answer is:" 42) to val, with the macro expanding to (:= n (+ n (println "The answer is:" 42))), which will display The answer is: 42 to the terminal, before adding 42 to n and assigning the result to n.
See if you can work out how (defm while (cond body) (if cond (do body (while cond body)))) works.
answer
Given some while loop, let's say (while (< i 10) (++ i)), it gets expanded to (if (< i 10) (do (++ i) (while (< i 10) (++ i)))).
if will only evaluate its then-branch when the condition evaluates to true; in that case the body is run first (in this case (++ i)), before the while macro is expanded.
Something you might notice is that the while macro at the end of the if is the very same as the while which produced the if! So, as long as the condition is true, while will produce and call itself, implementing iteration through recursion.
Now, you might already be starting to see how powerful these macros can be. But how powerful are they exactly? Certainly, I can define some amount of new syntax1 like the while loop. But could I do something like transform this functional language into a procedural one?
defm is itself a macro, with (defm name (params) body) expanding approximately to (def name (quote (params) (name "") T body)).Procedural Code in a Functional Language?2
As it turns out, the answer is yes.
The easy part is actually to write the new procedural syntax, taking a bit less than 100 lines to create let (an alias for def for style points), and fun to define a procedural function, with procedural functions coming with their own full environment, automatically allowing multiple expressions for its body without needing an explicit do, and requiring that the return value be specified with a return expression, rather than returning the last value in its body, as well as supporting for loops with break!
This allows me to take lisp code like this:
(defn fib (n) (tmpfn (a b n)
(if n (rec b (+ a b) (- n 1)) a)
(0 1 n)
)
And write it in a more familiar style as follows:
(fun fib (n)
(let a 0)
(let b 0)
(for (let i 0) (< i n) (++ i)
(let tmp b)
(:= b (+ a b))
(:= a tmp)
)
(return a)
)
This gets expanded to the following monstrosity (cleaned up a bit for your viewing pleasure), if you're interested:
(defn fib (n)
"a procedural function"
(.fun.extract-val (env-new (or
()
(do (let a 0) ())
(do (let b 0) ())
(.fun.for.transform (env-new (do
(def .fun.for.rec (\ ()
(if (< i n) (or
(do (let tmp b) ())
(do (:= b (+ a b)) ())
(do (:= a tmp) ())
(do (++ i) ())
(.fun.for.rec)
))
))
(let i 0)
(.fun.for.rec)
)))
(list a)
)))
)
Here I employ a few interesting tricks to make everything work. You might notice most of the expressions are wrapped in (do ... ()), but the (return a) transformed into a (list a). This is so that I can implement the body with or, which will return the first truthy value; then I make all non-return expressions give back (), which is falsy, so that execution might continue, and I wrap the retuned value in a list, as any non-empty list is truthy (allowing me to return falsy values such as 0 or F). Then .fun.extract-val will unwrap the value out of the list, or leave the empty list as-is.
For the for loop I do something similar. First, I must transform it into a recursive function (as looping is not natively supported by rholisp), and I do the same transformations to the expressions in its body. Except that (break) gets transformed into a simple T, also breaking out of the loop, but not with a value wrapped in a list. .fun.for.transform then just turns T into () and leaves other values untouched, preventing breaking out of the loop from returning from the function.
As simple as that! Well, simple in concept, the implementation took a few hours, but it wasn't too bad.
And Procedural Scripts as Well
But now comes the next step. I don't want the ability to write procedural code in my functional language. I want to be able to write procedural files: only variable and function definitions allowed top-level, and only the procedural style. The program logic then sits in main, which is called implicitly.
Surely this cannot be done, at least without fiddling with the interpreter source, or doing some sort of halfway bootstrapping, right?
Well, as it turns out, it can. I can't quite define a macro which prevents non-procedural (top level) code going forward, but what I can do is write a program to run procedural scripts; and then run the code only with eval, using the existing interpreter plus macros, rather than having to implement my own syntax.
This program is a bit longer was harder for me to write, taking a day or two, but conceptually it isn't that complicated.
First, it reads the file specified by the command line argument, before parsing it into a list of sequential lisp values and storing the parsed program into a variable (.run-procedural.prog).
I then define and run a temporary function, which will, as long as more values remain in the program, check if the value is allowed top-level, run it if it is, and display an error otherwise.
Lastly, with (exit (main args)) I run the main function and exit with the status code specified by its result.
Now, in order to check whether an expression is allowed top level, one must simply check that it is a non-empty list (that is, a function call or macro application), and that the first element of this expression (the function or macro) is in the list of allowed names (currently being include, let, and fun). Simple.
Unless, that is, you want to allow arbitrary macros that expand to one of the allowed expressions. Now it becomes a fun programming challenge.
So, if the name does not match, the first thing to do is to fetch its value, and check if its a callable and a macro3. If not, well, then its not allowed. If so, we must now expand the macro, and for this we need code that 1) understands the structure of a macro, and 2) can then expand the macro, one step at a time and without evaluating the result.4
while macro, a macro could expand an indefinite amount of times, so we have to artificially impose a limit on how many times a macro may expand before we by default assume it isn't allowedThe structure of a user-defined macro is not too complicated: Once we know it is a macro, we know that the second element is the parameter list, and the last is the body. From there we just have to pair the parameters up with the arguments5 and call the subs-with builtin with the parameter-argument bindings and the body of the macro. This will then perform substitution using the interpreter's substitution function, the same function as is used to actually implement user-defined macros.
Do This and That and That as Well
And so we have a functional procedural dialect!6
A program written in the procedural dialect is available here, which you can compare to a program written the usual way.
As a quick comparison, here's a truth machine program, first implemented functionally, the procedurally:(defn truth-machine (inp)
(if (do (print-val inp) inp) ; print the input, then check if it is truthy (input must be printed at least once)
(truth-machine 1) ; loop if input is truthy
)
)
(pstr "Enter either 0 or 1:")
(truth-machine (scd (parse (readline stdin)))) ; assume the user entered a valid lisp value
(fun truth-machine (inp)
(print-val inp) ; print the input
(if inp (truth-machine 1)) ; loop if the input is truthy (== 1)
)
(fun main (args)
(pstr "Enter either 0 or 1:")
(truth-machine (scd (parse (readline stdin)))) ; assume the user entered a valid lisp value
(return 0)
)
Okay, maybe a truth machine program doesn't give the greatest demonstration of the differences, but go and check how the fibonacci program is implemented in the procedural program compared to functionally in util.lisp.