A generic dynamic array in C that stores no capacity and needs no struct
12 points by alurm
12 points by alurm
[edit: I forgot the most obvious problem: this has the same problem as arrays vs pointers in C++, IntVec tells you you’re looking at an array but a pointer type doesn’t tell you if you an array or a pointer]
There seems to be an assumption that realloc grows at power of two rates, which is probably true on the whole, but I really wish people would simply stop using realloc: it wastes space if you don’t need, if you don’t grow at the correct rate you end up with unnecessary copies (in addition to excessive memory use).
There are APIs you can use to get the appropriate allocation size for a given requested capacity, and then use that as your internal buffer capacity. That gives you guaranteed minimum wastage and copying.
Now I think the reason this particular allocator uses 2^n allocation sizes specifically to allow it to go directly from current element count to capacity, but of course you can do via the above allocation size APIs - though the overhead of that call might matter enough for their use cases.
I really wish people would simply stop using realloc [...] There are APIs you can use to get the appropriate allocation size for a given requested capacity, and then use that as your internal buffer capacity. That gives you guaranteed minimum wastage and copying.
Per glibc's malloc_usable_size(3):
writing to the excess memory without first calling
realloc(3)to resize the allocation is not supported
freebsd/jemalloc's has a similar warning:
The malloc_usable_size() function is not a mechanism for in-place realloc(); rather it is provided solely as a tool for introspection purposes. Any discrepancy between the requested allocation size and the size reported by malloc_usable_size() should not be depended on, since such behavior is entirely implementation-dependent.
mimalloc does not really document its malloc_usable_size however mi_usable_size hints at the same thing:
The returned size can be used to call mi_expand successfully.
mi_expand is a realloc which guarantees working in-place (it returns null otherwise).
The mimalloc docs do seem to guide the user more towards LBYL and calling mi_good_size before allocating, but apparently that's a darwin-ism (darwin also has malloc_size which corresponds to glibc's malloc_usable_size)
You've misunderstood what I was saying. I am not saying: perform an allocation and use malloc_size to find out how big the allocation is and then use the excess memory in the block, because indeed that is unsound.
I am saying: use malloc_good_size to get the actual appropriate allocation size for the capacity you want, and then malloc that much: you don't waste space due to an inaccurate capacity, and you don't do excess copies due to a realloc that copies unused space.
Worth noting that a few years ago compilers quietly changed the semantics of malloc_usable_size() to be unusable size: previously it was OK to use the reserved size without an explicit realloc(). The change was related to compilers starting to track dynamic object sizes for higher fortification levels. previously
Calling something like malloc_usable_size() for every vec_push() would be a performance risk; overhead varies by implementation. To achieve the higher utilization you're going for, I would either explicitly store the capacity or use a denser sequence of size classes. FWIW, jemalloc uses a high performance algorithm for computing size classes that would work here, but the implementation complexity would be hard to justify for a hacky data structure like this one. A more useful hack might be to cram [0..2] integers directly into ints and avoid allocating an array for those sizes. That said, if I were implementing anything remotely like this data structure, I would avoid the 2-element outer data structure and put the size in the first element of the allocated array. (Better though: just use a structure with a size field and a flexible array of elements.) Managing compound data structures that have both stack and heap lifetime concerns is fraught with peril.
malloc_usable_size is very fast for basically every malloc implementation in common use; for glibc and other allocators that store size metadata in-line with the allocation it's literally a single load, and even segmented metadata allocators like mimalloc afaik would compile down to a handful of instructions (masking the chunk up to the page to access its metadata).
The performance argument isn't referring to the cost of the function implementation, it is the cost of any call at all vs an inline bound computation and check.
Right I (if I were making such a data structure) would have recorded that capacity, as for any significantly sized array the capacity record would not be significant, and for any small array the total size would be insignificant, unless you have a large number of them, but in that case there are typically better layouts, however this implementation could be intended specifically to serve such use cases (very large numbers of very small mutable arrays)
If I had data where I had to support that last use case I would probably try to include a bulk change operation that limited the number of checks and regrowth cycles as much as possible.
i like to either use a giant demand paged array or have explicit resize, but have it map/unmap pages in a reserved address space
this requires a hard-coded maximum size, and it takes up address space, so you can't use it for every little array. but it's good for ones that you expect to grow, and nothing moves