modified van Emde Boas Tree
the fastest successor/predecessor for integers set
A modified van Emde Boas data structure that achieves time complexity for successor/predecessor/insert/search/delete in O(lg w) (with high probability), where w is the bits of the word used to store the integer. Space complexity is O(n), where n is how many numbers are stored. Thus, the constrain is that lg n <= w, since any number must fit in a word.
If w is not too high, then this is optimal. In case w is too high, fusion trees are the best. However, dynamic versions of fusion tree only achieve expected O(lg n/lg w).
When lg w >= sqrt(lg(n)) is better to use fusion trees. However, for dinamyc fusion trees only expected O(lg n/lg w) has been achieved.
For example, in a static case, with a 32-bit word, w = 32 (as this current implementation that uses scala's Int which is 32-bit signed integer), static van Emde Boas should be used if n >~ 2^25 ~ 33 millon . Otherwise, static fusion trees should be preferred. In the same scenario, static, for 64-bit word, w = 64, this structure is better than fusion trees if n >~ 68 billon . You're likely to have memory problems with such big numbers, and you may want to switch to a distributed system. Due to its recursive nature, I imagine this structure is easy to implement in a distributed way.
The current implementation, handles scala's Int which is a signed 32-bit number. However, th
time complexity (with high probability)
- search: O(lg w)
- successor: O(lg w)
- predecessor: O(lg w)
- delete: O(lg w)
- insert: O(lg w)
where n = 2^w is the maximum number that can be stored.
space complexity
- O(n)
may TODO
- Create performance tests
- Measure memory consumption againt data size (n) and bit (w)
- Create a complete immutable version
- Improve usage of generators for performance tests
- Do performance tests for Immutable Version
- Compare performance between mutable and immutable
- Write predecessor
- Add delete
- Allow any ordering
- Augment the data structure
- Compare to fusion tree
- Compare to
O(lg n)successor/predecessor of trees - Present performance results
features
- immutable or mutable
- dynamic
- fully tested
- with benchmarks code