You don't need free lists

34 points by gingerBill


tobin_baker

It's trivial to make a hierarchical bitmap concurrent if you have a single writer (set bits top-down, clear bits bottom-up). For multiple writers, a lock-free stack is much simpler (I recommend avoiding Pop() in favor of PopAll() if possible so you don't need to deal with ABA).
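The PopAll() trick is worth spelling out: detaching the whole list with a single atomic exchange never re-reads a head pointer that might have been recycled, so the ABA problem can't arise. A minimal C11 sketch (the `Node`/`Stack` names are mine, not from the comment):

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct Node { struct Node *next; } Node;

typedef struct { _Atomic(Node *) head; } Stack;

/* Push with a CAS loop, as usual for a Treiber-style stack. */
void push(Stack *s, Node *n) {
    Node *old = atomic_load_explicit(&s->head, memory_order_relaxed);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak_explicit(
        &s->head, &old, n, memory_order_release, memory_order_relaxed));
}

/* PopAll detaches the entire list in one exchange. Unlike a Pop() that
   CASes head to head->next, this never dereferences a stale pointer,
   so no ABA handling (tags, hazard pointers) is needed. */
Node *pop_all(Stack *s) {
    return atomic_exchange_explicit(&s->head, NULL, memory_order_acquire);
}
```

The consumer then walks the detached list privately, which is a natural fit for batch-freeing slots back into an allocator.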

HackerFoo

It’s O(1) when log N is a constant. (I’m tired of this type of abuse of big-O notation. It’s okay to have higher big-O if you have an idea of what N will be.)
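To make the "log N is a constant" point concrete, here is a sketch of a two-level bitmap of the kind the article describes (the names and the 4096-slot size are mine): one 64-bit summary word over 64 leaf words means any of 4096 slots is found with exactly two find-first-set instructions, regardless of occupancy. `__builtin_ctzll` is a GCC/Clang builtin.

```c
#include <stdint.h>

enum { LEAVES = 64, SLOTS = LEAVES * 64 };

typedef struct {
    uint64_t summary;        /* bit i set => leaf[i] has a free slot */
    uint64_t leaf[LEAVES];   /* bit j set => that slot is free      */
} Bitmap2;

void bitmap_init(Bitmap2 *b) {
    b->summary = ~0ull;
    for (int i = 0; i < LEAVES; i++) b->leaf[i] = ~0ull;
}

/* Returns a free slot index, or -1 if full. Two ctz ops, always. */
int bitmap_alloc(Bitmap2 *b) {
    if (!b->summary) return -1;
    int i = __builtin_ctzll(b->summary);    /* first leaf with space  */
    int j = __builtin_ctzll(b->leaf[i]);    /* first free slot in it  */
    b->leaf[i] &= b->leaf[i] - 1;           /* clear lowest set bit   */
    if (!b->leaf[i]) b->summary &= ~(1ull << i);
    return i * 64 + j;
}

void bitmap_free(Bitmap2 *b, int slot) {
    int i = slot / 64, j = slot % 64;
    b->leaf[i] |= 1ull << j;
    b->summary |= 1ull << i;
}
```

So the depth is log64(N), which is 2 here by construction; whether you call that O(1) or O(log N) with a fixed N is exactly the notational quibble above.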

olliej

One problem with free lists is they’re extremely susceptible to buffer overflow attacks - it’s a large component of why they’re not used in new allocators - essentially you don’t want the metadata that defines the heap to live in the same place as the objects in the heap.

The cache benefits of object reuse are something I’ve never been too sure about. You obviously get an immediate win during post-alloc initialization, but if you end up with sparse allocation of related objects (as the post says: functionally random allocation), longer-term performance seems likely to suffer. On high-end CPUs, though, the load predictors and such may be able to compensate for that.

@marginalia also pointed out the risk of false sharing, which I think is a good way to cause problems, but I think the bigger problem is cross-core conflict rather than local cache inefficiency - basically, if you’re writing to an object on one core and it shares a cache line with unrelated objects, you can start forcing cross-core synchronization. Default TSO on x86 hits this, and all atomic ops have this problem (e.g. a core doesn’t need to synchronize with other cores if no other core has a view of the relevant bytes).
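The cross-core conflict described above is usually mitigated by padding: two counters packed into one 64-byte cache line will ping-pong that line between cores on every write, while aligning each to its own line avoids it. A sketch (this illustrates the commenter's point; it is not from the article, and the 64-byte line size is an assumption about the target CPU):

```c
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

/* Both fields land in one cache line: writes by two threads to a and b
   force the line to bounce between cores (false sharing). */
struct packed_counters {
    atomic_long a;
    atomic_long b;
};

/* Each field aligned to its own 64-byte line: independent writes stay
   core-local until another core actually reads the bytes. */
struct padded_counters {
    alignas(64) atomic_long a;
    alignas(64) atomic_long b;
};
```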

accelbread

For memory-constrained devices, I often just use a sentinel value, usually 0 or -1, and search the array for it to find an empty slot. The lists are also small, though, so a bitmap wouldn’t have much more overhead. It might make sense for larger structures, but I’m unsure it does for arrays of small non-zero handles/fds. It would also mean freeing is more than just one atomic assignment.
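For reference, the sentinel scheme described above might look like this (the names, the table size, and the choice of 0 as the sentinel are mine): allocation is a linear scan for the sentinel, and freeing is a single store.

```c
#include <string.h>

enum { NSLOTS = 16, FREE_SLOT = 0 };

typedef struct { int handle[NSLOTS]; } HandleTable;

/* h must be non-zero, since 0 is the free-slot sentinel.
   Returns the slot index, or -1 if the table is full. */
int table_insert(HandleTable *t, int h) {
    for (int i = 0; i < NSLOTS; i++) {
        if (t->handle[i] == FREE_SLOT) {   /* linear scan for sentinel */
            t->handle[i] = h;
            return i;
        }
    }
    return -1;
}

/* Freeing is just one store of the sentinel. */
void table_remove(HandleTable *t, int i) {
    t->handle[i] = FREE_SLOT;
}
```

Zero-initializing the array (e.g. with memset or static storage) marks every slot free, which is part of the appeal on small devices.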

marginalia

> significantly better slot allocation strategy with improved memory locality

There's some unstated assumption here I'm not following. Wouldn't it be an undesirable property if different allocations were to share cache lines, because of false sharing and so on?