sysp: Systems Lisp compiling to C with homoiconic macros, refcounted memory, Hindley-Milner type inference
29 points by veqq
In the first example I see an explicit else keyword. I haven't seen this in other lisp-likes, and it seems redundant given the more common (if cond true-branch false-branch) idiom. Any idea as to why this is implemented the way it is?
It's just a personal preference. I know that in a lot of Lisps (like Arc), people try to aim for as few tokens as possible, but I like having it explicit. That's also why I have the commas in the function declarations. It's still supposed to look like a normal Lisp, so the else isn't mandatory.
It looks like it just skips over else if you use it: https://github.com/karans4/sysp/blob/1521ee6dae988c2aef5e8c84f2c72ebacd6759f8/sysp.lisp#L1048
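Roughly, a sketch of that trick (hypothetical normalize-if below in plain Common Lisp, not the actual code behind the link) is just to discard a literal else symbol if it's there:

    ;; Hypothetical sketch, not the function from sysp.lisp: both
    ;; (if test then else alt) and (if test then alt) normalize to the
    ;; same three-part form by dropping an optional literal ELSE.
    (defun normalize-if (form)
      (destructuring-bind (head test then &rest tail) form
        (when (eq (first tail) 'else)   ; skip the ELSE token if present
          (setf tail (rest tail)))
        (list head test then (first tail))))

    ;; (normalize-if '(if (< x 0) -1 else 1))  =>  (IF (< X 0) -1 1)
    ;; (normalize-if '(if (< x 0) -1 1))       =>  (IF (< X 0) -1 1)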
I’m kinda confused as to the use of the term “homoiconic macros” in the title. Homoiconicity is a property of the language, not the macro system.
My understanding of "homoiconic" is that it is a marketing term that is used to point at some language feature you are proud of. I observe that it means different things for different authors. The term was invented by Calvin Mooers to describe Trac, but the meanings given to the word have certainly changed since the Lisp community adopted it to market Lisp.
In this case, the macro system is also described as "real homoiconic metaprogramming with gensym". Here, "real" is also used as a marketing term in conjunction with "homoiconic". Since "gensym" seems to be a key part of the goodness that is being promoted here, I infer that "real homoiconic" macros are traditional Lisp macros, in contradistinction, I assume, to Scheme-style hygienic macros. Many people consider Scheme to be homoiconic, but in this context I'm not sure if it is, since the Scheme standard doesn't specify a "real homoiconic" macro system.
I define homoiconicity as a combination of
So I would expect a homoiconic macro system to rely on these features, i.e. Lisp-style lightweight AST manipulation, probably not hygienic; not textual macros like C's, not pattern-based macros like Scheme's or Rust's macros-by-example, and not heavyweight AST(ish) manipulation like Rust proc macros.
It’s a longstanding argument whether Scheme is homoiconic or not because a name in Scheme source is not just a symbol, it’s bound to a lexical scope. As a result Scheme bends or breaks homoiconicity, depending on which side of the argument you are on. And the reason Scheme departs from Lisp in this respect is to support hygienic macros, so there’s evidently some connection between the way a macro system is designed and whether its host language is considered homoiconic.
You might find Shriram Krishnamurthi's idea of replacing "homoiconic" with "bicameral" interesting: https://parentheticallyspeaking.org/articles/bicameral-not-homoiconic/
If I'm being completely honest, I've been writing code in various lisps for a few years at least, and the one thing I can't wrap my head around is scheme's hygienic macros. I didn't know about the debate about whether scheme's macros are homoiconic. That's very interesting, I'll have to look into it.
In general, I've been thinking about this "C with parentheses" idea for a few years. At the very least, supporting Lisp-style macros would let you automate away the tedium of writing so much boilerplate. I like to keep things as simple as possible, and trying to manipulate an AST in languages other than Lisp seems like too much work. The only other language I've seen that lets you reshape the language at such a fundamental level, so easily, is FORTH.
But yeah, you're correct about the way I'm using the term. In most other languages you try not to change the language too much, so the macro system is pretty much an afterthought; but the fact that you can write Lisp code in Lisp is a fundamental part of the language and is used all the time. Thus the macros (which are Lisp functions that take in code and output code) are homoiconic.
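To make that concrete (this is plain Common Lisp, and I'm assuming sysp's defmacro/gensym work the same way): a macro is just a function from list structure to list structure, and gensym keeps the lists you build from capturing the caller's variables.

    ;; A traditional unhygienic macro: it receives its arguments as raw
    ;; lists and symbols and returns a new list for the compiler to expand.
    (defmacro swap! (a b)
      (let ((tmp (gensym "TMP")))       ; fresh uninterned symbol, can't collide
        `(let ((,tmp ,a))
           (setf ,a ,b)
           (setf ,b ,tmp))))

    ;; (macroexpand-1 '(swap! x y))
    ;; => (LET ((#:TMP42 X)) (SETF X Y) (SETF Y #:TMP42))  ; gensym name varies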
I didn't know about the debate about whether scheme's macros are homoiconic. That's very interesting, I'll have to look into it.
I did some searchengineering and I found this example:
https://www.jucs.org/jucs_16_2/embedding_hygiene_compatible_macros/jucs_16_02_0271_0295_costanza.pdf
Discussing hygienic macros, it says:
However, more involved macros become more complex, due to the fact that the latter approach differentiates between surface syntax, which is still represented as s-expressions, and internal representation of source code in terms of syntax objects. This leads to a system in which the 'homoiconicity' of traditional Lisp macros is lost and, in some cases, code fragments have to be manually mapped between the different representations in macro definitions, for example for the purpose of breaking macro hygiene.
Perhaps the debate was more active in the past around the time of R5RS and its syntax-rules. These days Scheme seems to be accepted as homoiconic without caveats.
C with parentheses
Some related work:
And then you can use a Pratt parser to close the loop and write the code in a C-like syntax. An example of this sort of syntactic sugar is at the bottom of the paper.
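For a rough idea of how small that loop-closing step can be (a toy Common Lisp sketch, not the paper's code; only binary operators and parens, and the tokens are assumed to already be split into a list):

    (defparameter *bp* '((+ . 10) (- . 10) (* . 20) (/ . 20)))

    (defun parse-expr (tokens &optional (min-bp 0))
      "Parse TOKENS (a flat list like (1 + 2 * 3)) into an s-expression.
    Returns (values expression remaining-tokens)."
      (multiple-value-bind (lhs rest) (parse-primary tokens)
        (loop
          (let* ((op (first rest))
                 (bp (cdr (assoc op *bp*))))
            (when (or (null bp) (< bp min-bp))
              (return (values lhs rest)))
            (multiple-value-bind (rhs rest2) (parse-expr (rest rest) (1+ bp))
              (setf lhs (list op lhs rhs)
                    rest rest2))))))

    (defun parse-primary (tokens)
      ;; A number or symbol stands for itself; |(| opens a grouped subexpression.
      (if (eq (first tokens) '|(|)
          (multiple-value-bind (expr rest) (parse-expr (rest tokens))
            (values expr (rest rest)))   ; drop the closing |)|
          (values (first tokens) (rest tokens))))

    ;; (parse-expr '(1 + 2 * 3))          => (+ 1 (* 2 3))
    ;; (parse-expr '(|(| 1 + 2 |)| * 3))  => (* (+ 1 2) 3)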
This is a much more complex definition of homoiconicity than the one I am familiar with, which is that code is represented using primitive data structures in the language. I'm surprised that some would litigate that Scheme is not homoiconic; wrapped syntax objects are certainly a distinct data type, but the syntax for creating them is the same as for creating data literals.
This is a much more complex definition of homoiconicity than the one I am familiar with, which is that code is represented using primitive data structures in the language
Yeah :-) I think it needs a fairly elaborate definition because there are plenty of other languages that have an AST library and quasiquoting (e.g., Haskell, Rust), but they don't get described as "homoiconic". If a language lacks any of the things I listed then it wouldn't be homoiconic, and simplicity is a crucial part of it. The primitive data structures in Haskell and Rust are much more elaborate than Lisp's, and so are their ASTs. And it has to be simple to get the right interplay between read and print, and between source and data.
I actually think that Rust and Haskell are easy to rule out under the definition I'm used to, because their source code can't represent arbitrary data at the top level. For example, a struct literal is not a valid Rust item.
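For contrast, here's what that looks like from the Lisp side (plain Common Lisp, nothing sysp-specific): the reader hands back the same list structure whether the text is "code" or "data", and bare data literals are legal top-level forms.

    ;; Source text read back in is ordinary list structure...
    (defparameter *form* (read-from-string "(defun square (x) (* x x))"))
    (first *form*)   ; => DEFUN  -- just a symbol at the head of a list
    (third *form*)   ; => (X)    -- the lambda list is itself a list
    (eval *form*)    ; defines SQUARE from that data
    (square 7)       ; => 49

    ;; ...and arbitrary data literals are themselves valid top-level forms,
    ;; which a struct literal in Rust is not.
    #(1 2 3)
    '(:just "some" data)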