Optimal Bounds for Open Addressing Without Reordering

5 points by cyplo


davmac

Having spent a bit of time dissecting the paper recently, my take is that it’s not as exciting as it might first sound. It does present a technique with improved complexity for hash table lookups but that complexity is expressed in terms of the inverse of the “sparsity” (expressed via the delta symbol within the paper) of the hashtable, not in terms of the number of elements in the table. I.e. intuitively a more densely populated backing array would lead to worse lookup times, but they have a technique which is O(1).

Practical use of hash tables though generally puts a bound on the load factor; if the load factor would increase beyond a certain bound, then the array size is doubled (thus halving the load) and items are rehashed, which has an amortized O(1) cost. With the load factor bounded by a constant in this way, the inverse of the sparsity is therefore also bounded by a constant, and therefore the lookup time is effectively O(1) anyway.

There’s also no allowance for dynamic hash tables (where items are added and removed over time). It’s more about building a hash table as a one-off that is then used for a large number of lookups, which will remain fast even the table is heavily loaded.

So, if my understanding is correct, the technique they present has limited significance. It might allow, for example, constructing more densely populated (and therefore less memory-consuming) hash tables which are then used for a large number of lookups, although there are known existing techniques which are probably on the same level anyway (it’s possible to spend a little extra effort during item insertion to reduce the average lookup time, i.e. the “reordering” that the paper title alludes to). It’s not as exciting as “we can now improve the performance of all hash tables” for example. I’m still reading through the 2nd theorem though, so I guess there might be something slightly more interesting in that.

cyplo

New hash tables just dropped.

fanf

previously

snej

A page into section 2 my eyes started glazing over at all the math notation. I hope someone writes an explainer for this paper!