Let L be the in-order list of keys from the original binary search tree T. Construct a new binary search tree T' by calling reconstruct(L):
We need to show that T' is equivalent to T and that T' has minimum height.reconstruct(L) if size(L) = 0, return nil m <- floor(size(L)/2) n <- new node n->key = L[m] m->left = reconstruct(L[0 .. m - 1]) m->right = reconstruct(L[m + 1 .. size(L) - 1]) return m
To show equivalence, let L be the in-order list of keys (k1, k2, ..., kn) from a binary search tree. search(k) returns true if k is in L and false otherwise; minimum() returns k1 and maximum() returns kn; predecessor(ki) returns ki - 1 if i > 1 successor(ki) returns ki + 1 if i < n. From these facts comes the observation that that two binary search trees are equivalent if they both produce the same in-order list of keys.
If size(L) < 2, reconstruct(L) returns a tree that produces L as its in-order list.
Assume that the in-order list produced by the tree returned by reconstruct(L) is identical to L when size(L) < i for some i >= 1. Consider the in-order list produced by the tree T' returned by reconstruct(L) when size(L) = i.
With m = floor(size(L)/2), the left subtree of T' was built with a recursive call to reconfigure() with a list containing less than i keys. Because reconfigure() doesn't change subtrees once created, the assumption holds and left subtree of T' produces the in-order list of keys L[1..m - 1]. A similar argument holds for the right subtree of T'. Because both subtrees of T' produce matching in-order lists, the in-order traversal of T' matches the list L, and T' is equivalent to the original tree T.
Now for minimal height. A binary tree of n nodes has minimum height when it can't be reconfigured as a binary tree with the same number of nodes and a smaller height. An almost complete binary tree of height h contains a minimum of 2h nodes (because a complete binary tree of height h - 1 contains 2h - 1 + 1 - 1 nodes), so if a binary tree of height h contains at least 2h nodes, it must be minimal.
We need to show that reconstruct(), when called with a list of length l, builds a binary tree of height h such that 2h <= l.
If the list contains a single key, then reconstruct() returns a one-node tree of height 0, and 20 = 1. Assume that, for any list of length l no more than i >= 1, reconstruct() returns a binary tree of height h such that 2h <= l. Now consider what happens when reconstruct() is called with a list of length i + 1.
There are two cases to consider. If i + 1 is odd, then reconstruct() forms two subtrees of floor((i + 1)/2) nodes and of height h. Because floor((i + 1)/2) no greater than i, the assumption holds and 2h <= floor((i + 1)/2). From this we get
That is, the assumption holds when i + 1 is odd.
2h + 2h <= floor((i + 1)/2) + floor((i + 1)/2) 2h + 1 <= 2floor((i + 1)/2) < 2floor((i + 1)/2) + 1 < i + 1
When i + 1 is even, one of the two subtrees formed by reconfigure() will have one node less than the other; without loss of generality, let the left subtree be the smaller of the two subtrees. The number of nodes in the right subtree, nr, is less than i + 1 and the assumption holds to give 2h <= nr where h is the height of the right subtree. The right subtree has one more node than the left subtree, which gives
and the assumption holds when i + 1 is even.
2h + 2h <= nr + nr 2h + 1 <= i + 1
The assumption holds when i + 1 is odd or even, and so holds always and reconfigure() produces minimal binary trees.
p[x]
and
p[p[x]]
without checking to see if they're defined. Explain why the
algorithm can safely do this. Remember to consider all possible cases for
x in your explanation.
At first glance it seems that this question is easily answered by considering
the levels in a red-black tree at which node x
could be. If x
is at
level two or higher, then its grandparent - which will be at level 0 or higher
- exists and p[p[x]]
is defined. If x
is at level 0, then x
is
the root and the loop terminates without accessing p[p[x]]
.
The case when x
is at level 1 seems equally as easy as the other cases:
using the property that the root of a red-black tree is always black, x
has a parent - the root - and the root is black, so the test color[p[x]] =
red
fails and p[p[x]]
is never accessed.
Unfortunately, the black-root assumption is not as obvious as it seems. There
are two ways x
could be at level 1: it could have been inserted as a leaf
at level 1, or it could have been inserted at a higher level and pushed up the
tree towards the root. If x
was inserted at level 1, then the black-root
assumption holds because the tree is otherwise unchanged. In pushing x
up
the tree, however, note that lines 9 and 15 in RB-Insert() set x
's
grandparent's color to red. To show that RB-Insert() is correct, we have to
show that x
can never be pushed to level 1 with a red root.
Because RB-Insert() reaches no higher in the tree than the grandparent, the
root's color doesn't change when x
is at level 3 or higher; moving x
up to level 1, if it ever happens, preserves the black-root assumption, causing
the loop to terminate without accessing p[p[x]]
. If x
is at level 2,
its grandparent is the root, which may be set to red; the analysis of this case
has to be more detailed.
There are three cases to consider when x
is at level 2 (there are actually
six cases, but the other three are symmetric with these three). Following the
book's terminology, in case 1, lines 7 to 10 in RB-Insert(), the root is set
to red, but x
is pushed to the root, so the loop terminates without
accessing p[p[x]]
. In case 3, lines 14 to 16 in RB-Insert(), the root
is set to red, but is right-rotated out of the root, being replaced by x
's
parent, which is colored black. x
is at level 1 and the black-root
assumption holds; the loop terminates without accessing p[p[x]]
. In case
2, x
is set to its parent, then left-rotated into case 3, which operates
correctly by the previous argument.
The root can have at most 2t - 1 keys, which leads to 2t children, each of which can have at most 2t - 1 keys, and so on. The total number of keys in this case is
Any non-root node must have at least t - 1 keys, leading to t children, each of which must have at least t - 1 keys, and so on. The total number of keys in this case is
sum(i = 0 to h, (2t)i(2t - 1)) = (2t - 1)sum(i = 0 to h, (2t)i) = (2t - 1)(((2t)h + 1 - 1)/(2t - 1)) = (2t)h + 1 - 1
The root can have a minimum of one key, leading to two subtrees, and the number of keys in this case is 2(th - 1) + 1 = 2th - 1.
sum(i = 0 to h - 1, ti(t - 1)) = (t - 1)sum(i = 0 to h - 1, ti) = (t - 1)((th - 1 + 1 - 1)/(t - 1)) = th - 1
The easy cases first: if the tree is empty or contains one node, then inserting and deleting a node leaves the tree unchanged. If RB-Insert() doesn't execute the while loop, then it has inserted a red leaf node, and RB-Delete() deletes red leaf nodes without further changes to the tree, leaving the tree in its original state.
Now it gets harder. On the first iteration through the RB-Insert() while loop, case 1 leaves the new node a red leaf, as does case three. Case 2, however, rotates the new node into the interior of the tree, and from there case 3 rotates the new node one level higher into the tree and changes its color to black. Note also that all three cases make changes to the tree in excess of adding a new node.
The second and third insertion cases terminate the while loop; only the first case leads to further iterations. Each iteration of case 1 moves up to the grandparent of the current node (node x in RB-Insert()), so all three insertion cases leave the inserted node in the state it had after the first iteration.
If RB-Insert() makes one or more iterations through the while loop and end with the inserted node being a red leaf, then RB-Delete() will just delete the inserted node without making any other changes, the the final tree will be different from the original tree because of the changes made by the three RB-Insert() cases.
The last case to consider occurs when the inserted node is black and rotated up into the tree interior by RB-Insert() cases 2 and 3. In this case, the new node will have two children, the left its parent and the right its grandparent. Because the new node has two children, it will be replaced by either the predecessor in the left child or successor in the right child when deleted. If the predecessor is used, the final tree is different because the grandparent is in a different place in the original and final trees. Replacing the inserted node by its successor has the same problem, except when the right child is a single node. Further, the right child of the inserted node is red, so it can adopt the extra 1 for black height held by the deleted node. If RB-Insert() made one iteration of the while loop when inserting the node, and so didn't change any more of the tree, the final tree will be the same as the initial tree.
This page last modified on 23 February 2000.