Fil-C
59 points by snej
I think this stuff is really cool and establishes a new sort of "runtime environment" for C programs. Just kinda annoying that the social media posts around this are like "you don't need to rewrite in rust because this exists!"
Like... I dunno, I don't want to be writing C even if the memory safety stuff is resolved.
Didn't realize this was C++ capable though, which feels a bit more doable. I could see writing a greenfield C++ program from the start with this.
EDIT: also... would have been neat if this could have been built off of CompCert. Would be a nice target for verifying that Fil-C does what it says! Might be hard to prove though
I feel like this has come up a lot here?
Mea culpa, I probably missed it because it gets tagged with [c], which I filter, despite also being relevant to C++. Sometimes I find lobste.rs’ filter rules annoying.
I missed the previous times this got posted, so I appreciate you posting it again. Looks very interesting.
Everything else aside, I find that a curious choice. As someone who hardly uses C and has used some C++ in the past, 50% of the submissions feel like they are dealing with tooling and concepts that apply to both (for me).
Yeah, I just turned off that filter. I think I added it because I was getting annoyed by people being Wrong On The Internet about the supposed virtues of C.
One particularly interesting thing about this is that if you used Fil-C as the basis of a garbage-collected language like Lisp, then you could suddenly access all C libraries that work under Fil-C in a seamless way.
Previously, foreign function interfaces for GC'ed languages were painful, but with this you could e.g. use some C hashtable library directly, and everything would just work, with automatic GC (and safety) also for the C parts.
It looks like a lot of C libraries require only minimal modifications to work under Fil-C, and some even no modifications at all. The author has even demonstrated a full X11 desktop running in Fil-C.
P.S. One thing I would love to see for this use case would be a way to tag pointers so that the dynamic language could look at a location in memory and know whether it's one of its own native objects (which follow a known memory layout), or a foreign C object which has to be treated as opaque. It looks like there are some bits available in the internal pointer representation (FILC_SPECIAL_TYPE_*), so this might work.
One thing I would love to see for this use case would be a way to tag pointers
That’s one way to think about what Capabilities are: unforgeable pointers.
Fil-C is a technically impressive feat of deep and creative engineering, for sure. Also, I like the edgy puns.
On the practical side, I don't really understand what it solves. From what I've seen in the author's presentations, it turns memory unsafety into an instant denial of service, while also bringing a significant performance penalty. It does not seem practical to run any long-lived programs dealing with untrusted input on top of it (even zsh kept crashing during basic usage in one of the author's presentations). It also has constraints, such as no interop with libraries not compiled with Fil-C.
Personally I anticipate that this will be the only way to run legacy C code in some production environments within the next decade - because memory unsafe code will be banned as too risky.
A program cleanly exiting in response to malicious input is a lot better than a program giving up all the secrets it has access to.
I agree, and I see the security benefits here. What I don't get is what combination of performance and practical considerations makes compiling and running under Fil-C worthwhile.
Huge legacy codebases (justified by "we have too much C to rewrite") could be a good target. On the other hand, Fil-C would just expose their existing bugs immediately. For me, it's more of a sanitizer. It still does not run unsafe code safely; it just adds light compilation checks and mostly makes it crash at run time. Legacy code would still require nontrivial changes just to compile and run reliably.
Performance sensitive code (justified by "C is fast") would take a hit and a GC would negate performance and latency predictability.
So, the use cases here seem to be 1) an explicit ban on unsafe code, which is practically unlikely, or 2) legacy codebases that are miraculously bug-free enough to run safely (which would be a nice proof that they are bug-free, but again, highly unlikely).
It still does not run unsafe code safely ... it mostly makes it crash at run time.
It does! Crashing is a safe behavior, because it's defined. The program does not execute in an undefined state, e.g. executing code supplied by an attacker (RCE).
The simplest way to explain this is with some analogies:
Assertions downgrade catastrophic correctness/safety bugs into liveness bugs
From https://lobste.rs/s/w1rq9r/tigerbeetle_coding_style_guide#c_7wht7b
Likewise
Are you claiming that CHERI isn't memory safe too? By your definition it wouldn't be
Likewise:
More on this same topic ... I've been saying this for 5+ years, and it seems like people are finally convinced:
https://lobste.rs/s/xnyrve/memory_safety_features_zig#c_1fehz5
https://news.ycombinator.com/item?id=35836307
https://news.ycombinator.com/item?id=35834513
https://news.ycombinator.com/item?id=21832009
When seg faults happen, they are a safe behavior. On the other hand, if a seg fault does NOT happen, it doesn't mean there's no unsafe behavior. CHERI and Fil-C change that, which makes them safe.
(That is: if the claims of CHERI and Fil-C are true, then using them does result in memory safety)
My feeling is that most people who claim that crashes are unsafe have never debugged a real memory safety error.
e.g. the kind that goes undetected for years. I have debugged exactly one in my career, and it was painful.
I actually didn't solve it -- a coworker did.
On the other hand, I have debugged and fixed thousands of crashes.
( I am going to hammer this home a bit, not because of this specific message, but because I've been saying the same thing for 5 years ... and I anticipate I'll again need to link back to this comment )
Consider the well-known case of the murdered Saudi journalist:
https://en.wikipedia.org/wiki/Jamal_Khashoggi
https://en.wikipedia.org/wiki/Pegasus_(spyware)
He was murdered due to a zero-click exploit in iOS. That is, they sent him a text message, which exploited a memory safety bug, and his location was revealed.
If you were Jamal Khashoggi, would you rather that iMessage crash when it receives the malicious payload?
Or would you rather your phone execute code supplied by the attacker?
This is why it does not make sense to label crashes as safety bugs.
Crashes and seg faults are what cause safety. They prevent undefined code execution.
And this definition was well established decades ago by computer scientist Luca Cardelli, which I pointed out in the linked comments above.
He specifically called out "the bugs that may go unnoticed for a long period of time" -- e.g. iOS zero-click text message exploits.
You're technically correct, the best kind of correct!
I agree that Fil-C runs code as long as it's safe, and this is my point: it cannot turn unsafe code into safe code, it can just prevent it from executing, making buggy but working code paths in C programs unavailable. Is a safe but dead program useful? Personally, I'd rather rewrite it in Rust.
it cannot turn unsafe code into safe code
I disagree. If CHERI and Fil-C work as claimed, then they turn C into a memory safe language.
They give additional semantics to C
This is true in theory -- using the definitions I gave above
And it's also true in practice -- a journalist using a phone protected with these technologies would not be murdered (using that attack)
(In other words, if you have learned/acquired an informal definition of the word "safe" from Rust, then you should replace that in your head with a definition that was established before Rust existed. This definition is more precise, and more useful in practice.
Obviously, you may still prefer to write Rust code because it avoids more crashes, or for many other reasons. But that's a separate issue from memory safety. )
In other words, if you have learned/acquired an informal definition of the word "safe" from Rust, then you should replace that in your head with a definition that was established before Rust existed. This definition is more precise, and more useful in practice.
As near as I can tell Fil-C and Rust entirely agree on the definition of safe code - the absence of undefined behavior - i.e. memory safety.
They achieve it through very different means. Fil-C takes the approach of using conservative implicit allocations, garbage collection, complex runtime checks, and the like. Rust takes the approach of conservative static analysis. Calling a Fil-C function from rust (if Fil-C exposed a ffi to enable this, which it doesn't seem to?) would be "safe" (but you couldn't transparently pass a rust pointer into Fil-C as a pointer because there are different representations).
Fil-C and Rust entirely agree on the definition of safe code
Well show me where the people I've been arguing with for 5 years got their definition of "safe" (links above)
Pretty sure it's from Rust, even though I'm not aware that Rust ever officially documented their definition of "safe". Probably because Rust doesn't have a spec, as other languages do
If they did document it, I'd definitely like to see it!
Last time, people also argued that it isn't a common misconception ... when it came up yet again right here
(The misconception being: if a program crashes at runtime, it doesn't necessarily mean anything unsafe happened. It does mean there was a bug.)
Well show me where the people I've been arguing with for 5 years got their definition of "safe" (links above)
They probably just misunderstood something? Expecting everyone on the internet to always be correct is... uh... questionable. And your first link is in fact two well known rust developers agreeing that abrupt crashes in the form of guaranteed segfaults are safe.
Rust has always defined abruptly terminating the program to be safe - there have always been things in rust that will terminate the program without any warning (e.g. a second panic occurring while unwinding from a panic - which can easily happen from things that look entirely innocent, like arithmetic operations or something that allocates).
Pretty sure it's from Rust, even though I'm not aware that Rust ever officially documented their definition of "safe".
Documented here. I'm pretty sure this has been documented for a long, long, time though I don't really care to do the archeology to prove it. "Safe" has since I've started programming rust (significantly pre 1.0) always meant that it is impossible for the programmer to trigger undefined behavior.
Probably because Rust doesn't have a spec, like C and C++ do
It does in fact have a spec now. Here's the spec's definition of unsafe.
Is that a spec for Ferrocene, or is it a spec for Rust? (honest question)
Anyway, the next time someone says something like "Fil-C or X isn't safe because the program can still crash", do you agree that's incorrect?
Started as a spec for Ferrocene, but was donated to and adopted by Rust since then.
Yeah, that would be an incorrect statement.
Yeah, I only see it in the "we have too much C to rewrite" case. Performance sensitive code in such an environment would have to be rewritten into another language.
And realistically I expect most code isn't running into memory safety bugs most of the time. And most code generally stops working not that long after hitting memory safety bugs anyways.
As a sanitizer I don't think it actually works that well. It allows things that are undefined behaviour in the C standard; it just defines the behaviour. E.g. you can return a pointer to a local variable in Fil-C and it "just works" (allocates it on the heap implicitly).
Of course I'm overindexing on the specific tool too much to some extent, I'm not so much predicting Fil-C in particular will be important as some tool of this kind...
also code that was running on the hardware of a decade ago is going to get a significant performance boost just by running on something newer, so you have a bunch of spare cycles to work with
Memory unsafety is always an instant denial of service: if an attacker can trigger a DoS under strict bounds checking and similar, then they can also just cause it without that; moreover, instead of a DoS they can target data leaks or RCE.
I find it confusing to believe that an attacker who can find a memory safety error that terminates the program under strict safety enforcement cannot accomplish at least that much absent such enforcement. We've already established that the attacker was able to find and engineer scenarios in which they can terminate the process at will; why do people think such an attacker is unable to do that without the enforcement?
Memory unsafety is always an instant denial of service
That's not true: I wish Heartbleed was immediately visible, and was not sending random pieces of memory from millions of servers for months. How exactly was this a denial of service in any way?
It's a popular misconception about memory unsafety that it's always a loud segfault. No, segfaults are just a tip of the iceberg of silent memory corruption and heisenbugs.
Thank you for making my point, which you missed: if you have a memory access error, the best case is a DoS. That is: an attacker finds the bug and causes a DoS. Alternatively, they find the bug and discover that it is not a DoS and does not need to be.
Instead of the dreaded DoS, you get a memory fault that is vastly more powerful.
The Heartbleed example is a case where memory unsafety isn't a DoS or memory fault; it's just silently incorrect (undefined, in this case) behavior that happens to allow an exploit to occur. The argument of Rust/Fil-C/CHERI here is that it turns memory unsafety cases (which may not need to be DoS) into DoS, so that they don't silently allow bad things.
This could be great for running unit tests in a CI, or even long running system tests to catch bugs in C code.
I guess you’d have to turn off optimizations in your “release”, non-Fil-C build, then?
(The assumption being that while Fil-C will catch lots of bugs, the optimized release build is not the same code as what Fil-C ran.)
So it doesn't do a static check somehow and float up all possible errors at compile time via some kind of proof?
It literally just adds runtime checks, slowing everything down?
Useless then, IMHO.
If you want to stick to C/C++ at all, a solution should be worked on that can prove the code correct according to some model. If that is impossible, then the language must be restricted capability-wise until it can be.
I think there’s an avenue here for doing all development builds in this to squash more bugs. Similar to using assertions, but compiling them out for release.
Just this alone would be very useful. The question of Fil-C in production is one of threat model / performance trade off. But also Fil-C is young. Maybe there’s wild improvements to be made re: performance.
That doesn’t seem like a good idea, given that it turns some undefined (and potentially exploitable) behavior into defined behavior?
I mean, this is not a new practice. There are some teams that have assertions enabled only in development, and then compile them out “for speed” for production. If you have good test coverage, competent QA processes, etc, etc then maybe it’s ok?
Other languages are also doing this. Zig has dev and release builds. In Zig, it’s even pretty standard to use a different memory allocator in dev / testing in an attempt to squash bugs. Keep in mind that Zig isn’t much more memory safe than C/C++.
A co-worker just showed me this. I've never heard of it before, and the home page sounds a bit snake-oily, but I thought it was worth a post. Anyone used it?
FWIW the co-worker's take was "it’s Linux only so I guess it’s no better than an address sanitizer for our purposes" (i.e. our C++ codebase has to run on many platforms, and we already use Clang's ASan for development & testing, which also catches memory errors.)
The difference with asan is that Fil-C is designed to be completely airtight, while asan is more best-effort and not designed for use in production.
I haven't used it, but as others have mentioned, we have had stories about it, with good comments! :)
https://lobste.rs/s/1utowa/fil_c_manifesto_garbage_memory_safety_out
https://lobste.rs/s/p0pozh/fil_c
https://lobste.rs/s/q7b1gm/fil_s_unbelievable_garbage_collector
It's not the same guarantee.
ASan: "I don't see a place where this code does something unsafe, so you can run it in production without safety checks."
Fil-C: "The code tried to do something unsafe in production, so I killed it."
Haven't used it personally, but members of my circle speak well of it. We see a lot of engaging commentary on Fil-C here and elsewhere, with the author making frequent appearances.
I think there's a vast toolchain of C programs that don't require high performance, but could do with some memory safety (think cybersecurity CLI.)