liballocs: Meta-level run-time services for Unix processes... a.k.a. dragging Unix into the 1980s
13 points by mccd
I'm clearly not Smalltalk-brained enough to understand what the author is going for.
It feels like this is a bit of support code that lets you query allocation info, and the author tells us that this will allow us to realize all our wildest dreams, but we have to draw the rest of the owl. But I have no idea how to do that.
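For what it's worth, "query allocation info" looks roughly like the sketch below. A caveat: the header name and the __liballocs_get_* functions are assumptions recalled from liballocs' public header rather than checked against the current tree, and the runtime has to be loaded (via its LD_PRELOAD library or its toolchain wrappers, as I understand it) for the queries to return anything.

```c
/* A minimal sketch of querying liballocs at run time. The
 * __liballocs_get_* names are recalled from the project's public
 * header and may not match the current API exactly. */
#include <stdio.h>
#include <stdlib.h>
#include <liballocs.h>          /* assumed header name */

struct point { double x, y; };

int main(void)
{
    struct point *p = malloc(sizeof *p);  /* liballocs interposes on malloc */

    /* Ask the runtime which allocation this pointer falls inside. */
    void *base         = __liballocs_get_alloc_base(p);   /* assumed name */
    unsigned long size = __liballocs_get_alloc_size(p);   /* assumed name */
    struct uniqtype *t = __liballocs_get_alloc_type(p);   /* assumed name */

    printf("base=%p size=%lu type metadata %s\n",
           base, size, t ? "found" : "missing");
    free(p);
    return 0;
}
```

The interesting part is what sits behind that lookup: the metadata comes from debug info and instrumented allocators, not from anything the program declared, which is why arbitrary code can be queried this way.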
There's an example of Node interop with C at around 27:20 in this video:
https://youtube.com/watch?v=LwicN2u6Dro
The talk itself might also help in understanding the motivation and aims.
The original research paper linked below also contains concrete applications.
Previously, on Lobsters, we discussed the notion of substrate-dependent computing from the same author.
so long as you have debugging information
This will be the core cost. Depending on the language and compiler, you're looking at between 2x and 10x the size of the code for debugging info. These costs are the main reason the Windows ecosystem stopped shipping PDBs with the code decades ago.
As you think about interop, the costs grow. Take C# Native AOT, for example. If you compile all of .NET to native code, you're looking at a binary hundreds of megabytes in size; the same goes for any "batteries-included" language. To get down to a reasonable size, you have to trim out the things you don't need. But if your native interop surface includes arbitrary introspection, you need to somehow limit the set of things that can be introspected on in order to avoid compiling everything.
But once you start writing code to explicitly decide which code is visible through interop, you've lost the point of arbitrary reflection in the first place.
This will be the core cost. Depending on the language and compiler, you're looking at between 2x and 10x the size of the code for debugging info.
(I've had this conversation with Stephen before.) It also relies on fairly accurate debug information. And, today, debug information is always best-effort: if an optimisation fails to preserve debug info, that's not ideal, but it's accepted.
Most of the debug info that this uses is in the form of function signatures, which tend to be quite stable, but even a 1% error rate is problematic for this kind of use case.
To get down to a reasonable size, you have to trim out the things you don't need. But if your native interop surface includes arbitrary introspection, you need to somehow limit the set of things that can be introspected on in order to avoid compiling everything.
This is the reason that Objective-C strongly encourages dynamic linking. You basically can't do any dead-code elimination in Objective-C, because common idioms include code that does string processing and feeds the result into introspection, or that iterates over the set of methods / instance variables in a class (see the sketch below). And this is mostly fine if most of your code is in a few libraries that are dynamically linked into every process. It's much less desirable if you do want to do static linking and every program comes with a few hundred MiBs of code.
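The same failure mode is easy to reproduce in plain C with dlsym: once the looked-up name is computed at run time, no link-time analysis can prove any exported handler dead. The handler names here are made up for illustration; build with `cc -rdynamic demo.c -ldl` so the executable's own symbols are visible to dlsym.

```c
#define _GNU_SOURCE          /* RTLD_DEFAULT is a glibc/POSIX.1-2008 extension */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical handlers: which one runs is decided by a string built
 * at run time, so the linker must keep all of them. */
void handle_red(void)  { puts("red");  }
void handle_blue(void) { puts("blue"); }

int main(int argc, char **argv)
{
    char name[64];
    snprintf(name, sizeof name, "handle_%s", argc > 1 ? argv[1] : "red");

    /* Look the symbol up by its computed name, much as Objective-C's
     * string-to-selector idioms do. */
    void (*fn)(void) = (void (*)(void))dlsym(RTLD_DEFAULT, name);
    if (fn) fn();
    else fprintf(stderr, "no handler named %s\n", name);
    return 0;
}
```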
It's much less desirable if you do want to do static linking and every program comes with a few hundred MiBs of code.
I think it's worse than that. It's also only fine if you have many processes to amortize the cost over. If you're shipping single-application computers (a.k.a. Docker containers), you're paying for lots of things you will provably never use.