edn.c: A fast, zero-copy EDN (Extensible Data Notation) reader written in C11 with SIMD acceleration
52 points by delaguardo
This looks great. I wish EDN would get a proper specification and conforming parsers/generators for Clojure on the JVM; it's very implementation-specific right now.
I had a similar feeling, but reading https://github.com/edn-format/edn ... while it's fairly casual I have a hard time reading this and thinking "this is underspecified" (the same cannot be said of YAML! At least the EDN spec says "this is UTF-8")
But I haven't dug into this too much beyond thinking "edn might be nice to use" for some project and then just kinda forgetting about it. Just bit surprised cuz I feel like I looked this up recently and was unsatisfied and now I feel like it's fine. Maybe it was me looking at the YAML "spec" which tempered my expectations
The spec isn't too bad (missing small things like "how to write infinity"), but the built-in library clojure.edn has a number of weird behaviors that are copied directly from Clojure's reader/printer (for example, EdnReader.java is more liberal with numbers).
I wonder how much of this kind of explicit SIMD intrinsic code is actually necessary given that you can just write SIMD-amenable regular code and the GCC/LLVM vectorizer will try to emit optimal vector code for the target platform?
I was actually learning as I went :) From what I know so far, you can definitely write regular code and the compiler will optimize it, but only if you have predictably sized data buffers. For a parser like this that's not always the case, so I decided to play it safe. But I wouldn't say I'm 100% sure about this; if you know a better way, please let me know.
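To make the tradeoff concrete, here is roughly what the hand-written intrinsics style looks like: a 16-byte-at-a-time scan for a closing quote, assuming an x86-64 target with SSE2. The function name find_quote and its structure are my own sketch, not code from the library.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Scan for the first '"' byte in 16-byte SSE2 blocks; the remaining
   tail falls back to a scalar loop. Returns len if no quote is found. */
static size_t find_quote(const char *s, size_t len) {
    const __m128i quote = _mm_set1_epi8('"');
    size_t i = 0;
    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(s + i));
        /* One bit per byte: set where the byte equals '"'. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, quote));
        if (mask)
            return i + (size_t)__builtin_ctz(mask);
    }
    for (; i < len; i++)
        if (s[i] == '"')
            return i;
    return len;
}
```

The scalar tail loop is exactly the "unpredictable buffer size" problem mentioned above: the vector path only covers full 16-byte blocks, and you handle the remainder by hand.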
The annoying thing about trusting the compiler to optimize is that you can't trust the compiler to optimize, because there's no actual specification or promise for when it optimizes.
You can add automated tests that ensure the optimization you expect is actually happening, lest a new point release of the compiler silently decide to not do the optimization you were relying on. In addition to the blunt instrument of just generating assembly and grepping it, there are compiler vectorization reports that may help. (GCC and LLVM have different methods for this.)
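For instance, a simple reduction loop like the one below is a reliable autovectorization target, and the comments show real report flags you can use to confirm what the compiler actually did. The function itself is just an illustration, not code from the project.

```c
/* Build with vectorization reports to see whether the loop vectorized:
 *   gcc   -O3 -fopt-info-vec-optimized sum.c   (or -fopt-info-vec-missed)
 *   clang -O3 -Rpass=loop-vectorize sum.c      (or -Rpass-missed=loop-vectorize)
 */
#include <stddef.h>

long sum_bytes(const unsigned char *p, size_t n) {
    long total = 0;
    /* A plain reduction with no loop-carried control flow: both GCC
       and LLVM vectorize this at -O3. */
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

Grepping the report output (or the generated assembly for packed instructions) in CI is a cheap guard against a compiler upgrade silently dropping the optimization.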
For a different approach to the problem, take a look at ispc, a SIMD-oriented dialect of C.
> The annoying thing about trusting the compiler to optimize is that you can't trust the compiler to optimize, because there's no actual specification or promise for when it optimizes.
One thing that worked well for me is sprinkling in #pragma omp simd, which gets the compiler to warn when it wasn't able to vectorize the loop.
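A minimal sketch of that approach (the dot function is my own illustration, and the flag behavior is my understanding, worth verifying on your toolchain): compile with -fopenmp-simd so the pragma is honored, and pair it with the compilers' missed-optimization reports to surface loops where the requested vectorization failed.

```c
/* Build with: gcc -O2 -fopenmp-simd dot.c   or   clang -O2 -fopenmp-simd dot.c
 * The pragma asks for vectorization explicitly; -fopt-info-vec-missed (GCC)
 * or -Rpass-missed=loop-vectorize (Clang) then reports when it didn't happen.
 * Without -fopenmp-simd the pragma is simply ignored and the code still works. */
#include <stddef.h>

float dot(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    #pragma omp simd reduction(+:acc)
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}
```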
Ah, I was also recently learning more about SIMD, that's why I ask! Because I had an interesting experience of (1) realizing I could optimize a certain procedure with vector intrinsics; (2) rewriting it with Zig's explicit vector stuff; (3) having to change certain things to make it possible like padding a certain thing to four bytes etc; and then (4) realizing that the LLVM vectorizer could generate very good code if I only did step 3!
In other words, I could tee it up for the compiler by making my code vectorizable. I was excited by this, so I tried compiling for obscure architectures like LoongArch and WASM64+SIMD to see it generate different vector opcodes I'd never heard of...
So I recommend taking some appropriate part of the code and writing it in a merely vectorizable way to see what your compiler can do!
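In the same spirit, a branch-free classification loop is the kind of shape both vectorizers handle well. This whitespace counter (comma counts as whitespace in EDN) is a hypothetical illustration, not code from the project:

```c
#include <stddef.h>

/* Branch-free byte classification: each == compiles to a compare/select
   rather than a branch, so GCC and LLVM can turn the loop body into
   packed compares at -O3. */
size_t count_ws(const unsigned char *s, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* EDN treats comma as whitespace, same as space/tab/newline. */
        unsigned is_ws = (s[i] == ' ') | (s[i] == '\t') |
                         (s[i] == '\n') | (s[i] == ',');
        count += is_ws;
    }
    return count;
}
```

Writing the same logic with an early if/continue per character would give the vectorizer a much harder time; computing a 0/1 value unconditionally keeps the loop body straight-line.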
As far as I know compilers can’t yet autovectorize this kind of lexical analysis, so the state of the art is still to write them by hand with vector intrinsics.
Interesting. Any idea what's preventing them? I've had LLVM thoroughly autovectorize string buffer code—after I rearranged it just a bit to make it actually possible (avoiding bounds checks, etc).
The tricky part is spreading the lexer state rightwards across the vector, and altering the state according to the characters in left-to-right order.
That makes sense, thanks. I see this code also does stuff like gated dynamic array growth that you'd have to deal with in a way that doesn't cause branching on every iteration, etc.
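A tiny example of that loop-carried state problem, as I understand it (simplified: no escape handling, names are mine): whether byte i is "inside a string" depends on every byte before it, which is exactly the dependency an autovectorizer refuses to break. Hand-written SIMD lexers work around it with tricks like a prefix-XOR over the quote mask.

```c
#include <stddef.h>

/* Count bytes that sit inside string literals. The in_string flag is a
   loop-carried dependency: iteration i needs the state left by i-1, so
   the compiler cannot process 16 bytes independently in one vector op. */
size_t bytes_inside_strings(const char *s, size_t n) {
    int in_string = 0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i] == '"') {
            in_string = !in_string;  /* state flips at each quote */
            continue;
        }
        count += (size_t)in_string;
    }
    return count;
}
```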
Is this for jank? I am eagerly awaiting the first alpha. That project has a lot of promise
I kept jank in mind during development and I would be glad to know what they think about this reader. But compatibility with jank wasn't the goal from the beginning.
So if we need fast edn in e.g. Python, it's all about FFI'ing the hell out of the API? Or is there something more clever coming?
From my experiments, Cython was a bit better from an ergonomics and performance point of view, but I'm still testing alternative approaches. I will add it later, the same way as the wasm bindings.