Static Predecessor Problem
Problem Definition
- desire a data structure representing a set
of items with a query of the form that requires low space and fast query times. - Static simply implies the set doesn't change, dynamic would mean we support insertion and deletion
- Ex:
- Static: store sorted numbers and do binary search
- Dynamic: Balanced BST (query =
)
Word RAM Model
- items are integers in
where is the word size and the universe thus has size . - We also assume that all pointers fit in a world i.e.
and - All normal operations such as +, -, / (int), x, | , &, >>, << can be done in constant time
By choosing the better of vEB trees or Fusion trees we can achieve query times
Van Emde Boas Tree (vEB)
-
Dynamic
-
Update and Query in
-
Naive version uses
space but can be made with randomisation -
Basic Idea: divide and conquer with bit manipulation
-
vEB trees are defined recursively: If
denotes a a vED tree of size u, then each stores - the minimum elements (V.min)
- the maximum elements (V.max)
- a top
subtree (V.summary) bottom subtrees (V.cluster) (a size array)
We express every element
The query can be defined in the following manner:
pred(V, x = <c, i>):
if x > v.max: return V.max
else if V.cluster[c].min < x:
return pred(V.cluster[c], i)
else:
c' = pred(V.summary, c)
return V.cluster[c'].max
Query time:
We define insertion as follows:
insert(V, x = <c, i>):
if V is empty:
V.min = x;
return;
if x < V.min: swap(x, V.min)
if x > V.max: V.max = x
if V.cluster[c].min == null:
insert(V.summary, c)
insert(V.cluster[c], i)
Insertion time:
: the case in which we make two recursive calls i.e. cluster is empty, however the immediate call returns immediately (first if branch), thus is actually is:
Space:
, ( ) for the summary + for the clusters and some constant for min and max - This reduces to
❌ POOR ❌ - We can however exploit the fact that most clusters are empty and use hashing to map a cluster id
to a pointer to the corresponding non-empty cluster. This results in a space complexity of
Y-Fast tries
- Use a bit array of length
where bits are set to 1 if the element is in the set and 0 otherwise. This uses time ❌ - We can build a perfect binary time where each internal node stores the logical OR of its children. We store all the 1s in a doubly linked list.
- To get the predecessor of a 1 we go backwards in the linked list
- To get the predecessor of a 0 we go up the tree and find the nearest 1, we then follow the 1's to find the nearest 1 in the other branch.
- if we get the ancestor from the right branch (going to the left) we find the maximum which gives the predecessor
- if we get the ancestor from the left branch (going to the right) we find the minimum which gives the successor and since all the 1's are stored in a double linked list we just go one step back to find the predecessor.
- This uses
time to find a ancestor that is a 1.
- We use the fact that all paths to the root are monotone (0 for a while and then all 1s) and thus we can do a binary search to find where it changes from a 0 to 1. If we store the tree as an array(root at 0, node
has left child at and right child at ), we can find the -th ancestor simply by right shifting the index by k bits. Thus we can find the ancestor in
x-fast trie: We can improve the space usage by storing the 1's in a hash table; where the key is the index of a particular node. Thus when we search we simply check whether a key exists. This takes
y-fast trie: We use indirection. If we have a x-fast trie on
Fusion Trees
- Static for now
- Query in
- Linear space