Trees
RedBlack Tree
1. Red Rule: A red child must have a black parent
2. Black Rule: All paths to external nodes pass through the
same number of black nodes.
3. All the leaves are black, and the sky is grey.
Rotations are terminal cases. They happen at most once per fixup.
If we have a series of insert/delete operations for which the insertion point
is known, the amortized cost of each action is O(1).
Height: log n ≤ h ≤ 2 log n
Limit of rotations: 2 per insert.
Bound of ratios between two branches L, R: S(R) ≤ (S(L))²
Completely isomorphic to 2-4 Trees.
BTree
d defines the minimum number of keys on a node. Height: h ≈ log_d n
1. Every node has at most d children and at least d/2 children (root excluded).
2. The root has at least 2 children if it isn't a leaf.
3. A non-leaf node with k children contains k − 1 keys.
4. On B+ trees, leaves appear at the same level.
5. Nodes at each level form linked lists.
d is optimized for HDD/cache block size.
Insert: Add at the insertion point. If the node gets too large, split. O(log_d n) ≤ O(log n)
Split: The middle key of the node (lower median) moves up into the parent node. O(d)
Delete: If the key is not in a leaf, swap with its succ/pred. Delete, and deal with a short node v:
1. If v is the root, discard it; terminate.
2. If v has a non-short sibling, steal from it; terminate.
3. Fuse v with its sibling; repeat with v ← p[v].
Traversals
Traverse(t):
  if t == null then return
  → print(t)        // pre-order
  Traverse(t.left)
  → (OR) print(t)   // in-order
  Traverse(t.right)
  → (OR) print(t)   // post-order
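The three orders differ only in where the visit happens relative to the recursive calls; a minimal Python sketch (the Node class is a hypothetical stand-in for illustration):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def traverse(t, order, out):
    """Collect keys of t into out in pre-, in-, or post-order."""
    if t is None:
        return
    if order == "pre":
        out.append(t.key)          # visit before both recursive calls
    traverse(t.left, order, out)
    if order == "in":
        out.append(t.key)          # visit between the recursive calls
    traverse(t.right, order, out)
    if order == "post":
        out.append(t.key)          # visit after both recursive calls

# small BST holding 1..3; in-order yields sorted order
root = Node(2, Node(1), Node(3))
```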
Heaps
              Binary     Binomial   Fibonacci
findMin       Θ(1)       Θ(1)       Θ(1)
deleteMin     Θ(log n)   Θ(log n)   O(log n)
insert        Θ(log n)   O(log n)   Θ(1)
decreaseKey   Θ(log n)   Θ(log n)   Θ(1)
meld          Θ(n)       Θ(log n)   Θ(1)
Binary
Melding: If the heap is represented by an array, link the two arrays together and Heapify (bottom-up). O(n).
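A sketch of the array meld using Python's heapq, whose heapify is the O(n) bottom-up build:

```python
import heapq

def meld(h1, h2):
    """Meld two array-based binary min-heaps in O(n)."""
    merged = h1 + h2          # link the two arrays together
    heapq.heapify(merged)     # bottom-up heapify: O(n) total
    return merged
```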
Binomial
Melding: Unify trees by rank like binary summation. O(log n)
Fibonacci Heap
Maximum degree: D(n) ≤ log_φ n; φ = (1 + √5)/2
Minimum size of a tree of degree k: s_k ≥ F_{k+2}
Marking: Every node which lost one child is marked.
Cascading Cut: Cut every marked node climbing upwards. Keeps amortized O(log n) time for deleteMin. Otherwise O(√n).
Proof of the φ^k bound:
1. All subtrees of a junction j, sorted by order of insertion, are of degree D[s_i] ≥ i − 2. (Proof: when x's i-th subtree was added, since D[x] was i − 1, so was the subtree's degree. Since then it could lose only one child, so it is at least i − 2.)
2. F_{k+2} = 1 + Σ_{i=0}^{k} F_i;  F_{k+2} ≥ φ^k
3. If x is a node and k = deg[x], then S_x ≥ F_{k+2} ≥ φ^k. (Proof: assume induction after the base cases; then s_k = 2 + Σ_{i=2}^{k} s_{i−2} ≥ 2 + Σ_{i=2}^{k} F_i = 1 + Σ_{i=0}^{k} F_i = F_{k+2}.)
Structures
Median Heap: one min-heap and one max-heap with ∀x ∈ min, y ∈ max: x > y; the median is then at the root of one of the heaps.
Sorting
Comparables
QuickSort (O(n log n) expected, O(n²) worst; in-place partition)
BubbleSort
HeapSort (O(n log n), no aux)
InsertionSort
MergeSort (O(n log n), aux)
SelectionSort (O(n²), in-place): Traverse n slots keeping score of the maximum. Swap it with A[n]. Repeat with A[n − 1].
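The SelectionSort description above, as a short Python sketch:

```python
def selection_sort(A):
    """In-place selection sort, O(n^2): repeatedly move the max to the end."""
    for end in range(len(A) - 1, 0, -1):
        # traverse A[0..end] keeping score of the maximum
        max_i = 0
        for i in range(1, end + 1):
            if A[i] > A[max_i]:
                max_i = i
        A[max_i], A[end] = A[end], A[max_i]   # swap it with A[end]
    return A
```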
Linear Time
BucketSort Θ(n): If the range is known, make the appropriate number of buckets, then:
1. Scatter: Go over the original array, putting each object in its bucket.
2. Sort each non-empty bucket (recursively or otherwise).
3. Gather: Visit the buckets in order and put all elements back into the original array.
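The three steps can be sketched as follows (the bucket count and the [lo, hi) range parameters are illustrative assumptions):

```python
def bucket_sort(A, lo, hi, n_buckets=10):
    """Bucket sort for values known to lie in the range [lo, hi)."""
    buckets = [[] for _ in range(n_buckets)]
    width = (hi - lo) / n_buckets
    for x in A:                      # 1. Scatter into the right bucket
        buckets[min(int((x - lo) / width), n_buckets - 1)].append(x)
    for b in buckets:                # 2. Sort each non-empty bucket
        b.sort()
    out = []
    for b in buckets:                # 3. Gather in bucket order
        out.extend(b)
    return out
```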
CountSort Θ(n):
1. Given an array A bounded in the discrete range C, initialize an array of that size.
2. Passing through A, increment every occurrence of a number i in its proper slot in C.
3. Passing through C, add the number represented by i into A a total of C[i] times.
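A sketch of CountSort, assuming integer keys in the range [0, C):

```python
def count_sort(A, C):
    """Counting sort for integers in the discrete range [0, C)."""
    counts = [0] * C                 # 1. array sized by the range
    for x in A:                      # 2. count each occurrence in its slot
        counts[x] += 1
    out = []
    for i, c in enumerate(counts):   # 3. emit i a total of counts[i] times
        out.extend([i] * c)
    return out
```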
RadixSort Θ(n):
1. Take the least significant digit.
2. Group the keys based on that digit, but otherwise keep the original order of keys. (This is what makes LSD radix sort a stable sort.)
3. Repeat the grouping process with each more significant digit.
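A sketch of LSD RadixSort for non-negative integers (base 10 assumed):

```python
def radix_sort(A, base=10):
    """LSD radix sort; the per-digit grouping preserves order, so it is stable."""
    if not A:
        return A
    digits = len(str(max(A)))
    for d in range(digits):                      # least significant digit first
        buckets = [[] for _ in range(base)]
        for x in A:                              # stable grouping by digit d
            buckets[(x // base**d) % base].append(x)
        A = [x for b in buckets for x in b]
    return A
```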
Selection
QuickSelect: O(n) expected, O(n²) worst
5-tuple Select
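A sketch of QuickSelect with a random pivot (the three-way partition uses auxiliary lists for brevity; an in-place partition avoids them):

```python
import random

def quickselect(A, k):
    """Return the k-th smallest (0-indexed) element of A; O(n) expected."""
    pivot = random.choice(A)
    lt = [x for x in A if x < pivot]
    eq = [x for x in A if x == pivot]
    gt = [x for x in A if x > pivot]
    if k < len(lt):                       # answer lies in the smaller part
        return quickselect(lt, k)
    if k < len(lt) + len(eq):             # answer is the pivot itself
        return pivot
    return quickselect(gt, k - len(lt) - len(eq))
```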
Hashing
Universal Family: a family of mappings H, where every h ∈ H is h : U → [m]. H is universal iff ∀k₁ ≠ k₂ ∈ U: Pr_{h∈H}[h(k₁) = h(k₂)] ≤ 1/m
Example: If U = [p] = {0, 1, ..., p − 1} then H_{p,m} = {h_{a,b} | 1 ≤ a ≤ p − 1; 0 ≤ b ≤ p − 1}, and every hash function is h_{a,b}(k) = ((ak + b) mod p) mod m
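The example family can be sketched as follows (p = 101 and m = 10 are illustrative choices; p must be a prime larger than any key):

```python
import random

def make_hash(p, m):
    """Draw a random h_{a,b}(k) = ((a*k + b) mod p) mod m from H_{p,m}."""
    a = random.randint(1, p - 1)
    b = random.randint(0, p - 1)
    return lambda k: ((a * k + b) % p) % m

h = make_hash(p=101, m=10)   # keys drawn from U = [101]
```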
Linear Probing: searching in incremental order through the table.
Open Addressing: Use h₁(x) to hash and h₂(x) to permute. No pointers.
Open Hashing:
Perfect Hash: When one function clashes, try another. O(∞).
Load Factor α: The expected length of a collision chain. When n keys are stored in m slots, α = n/m.
Methods: Modular, Multiplicative, Additive, Tabular (byte-wise additive)
Performance (Chaining)
                                E[X]        Worst Case
Successful Search/Del           ½(1 + α)    n
Failed Search/Verified Insert   1 + α       n
Probing
Linear: h(k,i) = (h(k) + i) mod m
Quadratic: h(k,i) = (h(k) + c₁i + c₂i²) mod m
Double: h(k,i) = (h₁(k) + i·h₂(k)) mod m
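The three probe sequences as Python generators (h, h1, h2 stand for arbitrary hash functions supplied by the caller):

```python
def linear_probe(h, k, m):
    """Yield the linear probe sequence (h(k) + i) mod m."""
    for i in range(m):
        yield (h(k) + i) % m

def quadratic_probe(h, k, m, c1=1, c2=1):
    """Yield (h(k) + c1*i + c2*i^2) mod m."""
    for i in range(m):
        yield (h(k) + c1 * i + c2 * i * i) % m

def double_probe(h1, h2, k, m):
    """Yield (h1(k) + i*h2(k)) mod m."""
    for i in range(m):
        yield (h1(k) + i * h2(k)) % m
```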
E[X]            Unsuc. Search           Succ. Search
Uni. Probing    1/(1 − α)               (1/α)·ln(1/(1 − α))
Lin. Probing    ½(1 + (1/(1 − α))²)     ½(1 + 1/(1 − α))
So Linear Probing is slightly worse but better for cache.
Collision Expectancy: P[X ≤ 2E[X]] ≥ ½. So:
1. if m = n then E[|Col|] < n/2
2. if m = n² then E[|Col|] < ½
And with 2, by Markov, there are no collisions with probability ≥ ½.
Two-Level Hashing
Computing the number of collisions per level: Σ_{i=0}^{n−1} (n_i choose 2) = |Col|
1. Choose m = n and h such that |Col| < n.
2. Store the n_i elements hashed to i in a small table of size n_i² using a perfect hash function h_i.
Random algorithm for constructing a perfect two-level hash table:
1. Choose a random h from H(n) and compute the number of collisions. If there are more than n collisions, repeat.
2. For each cell i, if n_i > 1, choose a random hash function from H(n_i²). If there are any collisions, repeat.
Expected construction time: O(n)
Worst-case search time: O(1)
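The construction above can be sketched as follows (make_hash, the prime p, and the retry loops are illustrative; membership lookup shown for simplicity):

```python
import random

def make_hash(p, m):
    """Random h_{a,b} from a universal family; p prime, keys < p."""
    a, b = random.randint(1, p - 1), random.randint(0, p - 1)
    return lambda k: ((a * k + b) % p) % m

def build_two_level(keys, p=10**9 + 7):
    """Two-level perfect hashing sketch: expected O(n) construction."""
    n = len(keys)
    while True:                               # level 1: m = n, retry if |Col| >= n
        h = make_hash(p, n)
        cells = [[] for _ in range(n)]
        for k in keys:
            cells[h(k)].append(k)
        if sum(len(c) * (len(c) - 1) // 2 for c in cells) < n:
            break
    tables = []
    for cell in cells:                        # level 2: size n_i^2, no collisions allowed
        while True:
            hi = make_hash(p, max(1, len(cell) ** 2))
            small = {}
            if all(small.setdefault(hi(k), k) == k for k in cell):
                tables.append((hi, small))
                break
    return h, tables

def lookup(structure, k):
    """O(1) worst-case membership test: two hash evaluations."""
    h, tables = structure
    hi, small = tables[h(k)]
    return small.get(hi(k)) == k
```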
Union-Find
MakeSet(x)   Union(x,y)   Find(x)
O(1)         O(1)         O(α(k))
Union by Rank: The larger tree remains the master tree in every union.
Path Compression: every find operation first finds the master root, then repeats its walk to change the subroots.
Recursion
Master Theorem: for T(n) = aT(n/b) + f(n); a ≥ 1, b > 1, ε > 0:
1. T(n) = Θ(n^{log_b a})                if f(n) = O(n^{log_b a − ε})
2. T(n) = Θ(n^{log_b a} · log^{k+1} n)  if f(n) = Θ(n^{log_b a} · log^k n); k ≥ 0
3. T(n) = Θ(f(n))                       if f(n) = Ω(n^{log_b a + ε}) and a·f(n/b) ≤ c·f(n) for some c < 1
Building a recursion tree: build one tree for running times (in T(αn)) and one for f(n).
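A worked instance: the merge-sort recurrence T(n) = 2T(n/2) + n is case 2 with k = 0, so T(n) = Θ(n log n); with the (assumed) base case T(1) = 1, the exact value on powers of two is T(2^k) = 2^k(k + 1). A quick check:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Merge-sort recurrence T(n) = 2T(n/2) + n with T(1) = 1."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

# on powers of two, T(2**k) == 2**k * (k + 1), i.e. Theta(n log n)
```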
Orders of Growth
f = O(g): limsup_{x→∞} f/g < ∞
f = o(g): f/g → 0 as x → ∞
f = Θ(g): lim_{x→∞} f/g ∈ R⁺
f = Ω(g): liminf_{x→∞} f/g > 0
f = ω(g): f/g → ∞ as x → ∞
Amortized Analysis
Potential Method: Set Φ to examine a parameter on data structure D_i, where i indexes the state of the structure. If c_i is the actual cost of action i, then ĉ_i = c_i + Φ(D_i) − Φ(D_{i−1}).
The total amortized cost will then be Σ_{i=1}^{n} ĉ_i = Σ_{i=1}^{n} (c_i + Φ(D_i) − Φ(D_{i−1})) = Σ_{i=1}^{n} c_i + Φ(D_n) − Φ(D_0)
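A classic instance of the method is the binary counter with Φ = number of 1-bits: each increment's amortized cost is at most 2 bit flips, so n increments flip at most 2n bits in total. A sketch:

```python
def increment(bits):
    """Increment a little-endian bit list; return the number of bit flips."""
    flips = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0                  # each cleared 1 pays with stored potential
        flips += 1
        i += 1
    if i == len(bits):
        bits.append(1)
    else:
        bits[i] = 1
    return flips + 1                 # plus the single 0 -> 1 flip

bits, total = [], 0
for _ in range(100):
    total += increment(bits)
# total actual cost stays under the amortized bound 2n = 200
```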
Deterministic algorithm:
Always predictable.
Stirling's Approximation: n! ∼ √(2πn)·(n/e)^n ⇒ log n! ∼ n log n − n
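A quick numeric check of the log form, using math.lgamma(n + 1) for the exact ln n!:

```python
import math

n = 1000
exact = math.lgamma(n + 1)           # ln(n!) computed exactly
approx = n * math.log(n) - n         # Stirling's leading terms
ratio = approx / exact               # approaches 1 as n grows
```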