Tiny Pointers

17 points by MatheusRich


This paper introduces a new data-structural object that we call the tiny pointer. In many applications, traditional log n-bit pointers can be replaced with o(log n)-bit tiny pointers at the cost of only a constantfactor time overhead. We develop a comprehensive theory of tiny pointers, and give optimal constructions for both fixed-size tiny pointers (i.e., settings in which all of the tiny pointers must be the same size) and variable-size tiny pointers (i.e., settings in which the average tiny-pointer size must be small, but some tiny pointers can be larger). If a tiny pointer references an element in an array filled to load factor 1−1/k, then the optimal tiny-pointer size is Θ(log log log n+log k) bits in the fixed-size case, and Θ(log k) expected bits in the variable-size case. Our tiny-pointer constructions also require us to revisit several classic problems having to do with balls and bins; these results may be of independent interest.

Using tiny pointers, we revisit five classic data-structure problems. We show that:

• A data structure storing n v-bit values for n keys with constant-time modifications/queries can be implemented to take space nv +O(n log(r) n) bits, for any constant r > 0, as long as the user stores a tiny pointer of expected size O(1) with each key—here, log(r) n is the r-th iterated logarithm.

• Any binary search tree can be made succinct with constant-factor time overhead, and can even be made to be within O(n) bits of optimal if we allow for O(log∗ n)-time modifications—this holds even for rotation-based trees such as the splay tree and the red-black tree.

• Any fixed-capacity key-value dictionary can be made stable (i.e., items do not move once inserted) with constant-time overhead and 1 + o(1) space overhead.

• Any key-value dictionary that requires uniform-size values can be made to support arbitrary-size values with constant-time overhead and with an additional space consumption of log(r) n+O(log j) bits per j-bit value for an arbitrary constant r > 0 of our choice. • Given an external-memory array A of size (1 + ε)n containing a dynamic set of up to n key-value pairs, it is possible to maintain an internal-memory stash of size O(n log ε−1) bits so that the location of any key-value pair in A can be computed in constant time (and with no IOs). These are all well studied and classic problems, and in each case tiny pointers allow for us to take a natural space-inefficient solution that uses pointers and make it space-efficient for free

fanf

This paper is prior work leading to this recent link