Clojure on Fennel part one: Persistent Data Structures
19 points by veqq
19 points by veqq
I've been hoping someone would do this for fennel. It's interesting that the performance differences are so stark, especially compared to Java vs Clojure's performance differences, which is roughly x1.25-x2. Is this difference because Lua's tables are so much faster than Java's Hashmaps? Or that the ClojureFnl impl is slower than Clojure's PersistentHashMaps?
Well, there are a lot of things to consider.
First of all, I think Clojure's HAMT implementation targets JVM's JIT compiler or at least is written in such a way that it can heavily benefit from it (branch prediction, cache coherence). PUC Lua is fast, but not that fast, and LuaJIT requires writing code in a way that will allow squeezing most of the performance, but it's hard to do that. And I can't write low-level code to meet cache and branch prediction requirements, because I have to stay in the Lua land. Making these data structures as a C library would probably work much better.
Secondly, I had to implement hashing in pure Lua, even for strings - in JVM, that's an inbuilt functionality, highly optimized. Lua doesn't expose its hashing function.
Finally, there are always problems with comparing tables in Lua - there is no standard way of comparing two tables, and currently I'm working on optimizing my equality function (already brought down Vector benchmark from 1000x to 700x slowdown on LuaJIT just by doing that). The worst case is O(n^2) for tables that use other tables as keys, and there's no way around it, but it should be quite rare. So there are probably more places I could improve performance. But it won't be as fast, I'm pretty sure.
All in all, I wouldn't say the performance is that bad, but yes, Lua tables are very fast, because they're very simple underneath. I believe Java collections are more nuanced and are probably slower than plain Lua tables (on LuaJIT, that is). But per-operation times for my HAMT implementation are quite smol still.
On a side note, my benchmark process may be flawed (probably is). I did a similar benchmark for Clojure and got a difference of 40x for PHAMT vs regular Java hash map in favor of the latter. So I'll have to work on my benchmark more before I can say anything more than what is already in the article.
It sounds like probably if you targeted LuaJIT specifically you could use the FFI to structure things in terms of arrays and avoid the overhead of tables, plus do all the hashing natively too. I bet in this case you could get very close to the Clojure version; at that point probably the only place you'd be behind would be the GC. (LuaJIT's generational GC is still in very early stages IIRC.) But of course it's hard to justify giving up the broader compatibility.
Yes, that was one of the options.
AFAIK, there was a library that adds similar FFI capabilities to PUC Lua, like ones found in the LuaJIT CFFI module, but they're not fully compatible, and honestly, I don't want to go that route.
I already have two C dependencies for ClojureFnl, better not to add more than that.
Edit: and still, value semantics would take a huge toll on performance.