Data Structures & Algorithms
CS 305 & 503

Quiz 3, 1 December 2010


  1. Binary-tree delete() can be implemented using rotations or value hoisting. Which implementation is more efficient in terms of nodes touched? Justify your answer.


    Asymptotically, if delete() searches for the value to be deleted, it makes no difference whether rotations or hoisting is used; the search takes O(n) node touches. If delete() is called with a pointer to the node to be deleted, so no searching is required, it still makes no difference whether rotations or hoisting is used, because both dive down to a leaf, which requires O(n) node touches.

    Counting actual node touches, hoisting a value touches two nodes at each level (either once on the way down and once on the way up, or once when its value is hoisted to the parent and again when its child's value is hoisted into it). Rotating a node n touches four nodes: n and n's child, parent, and grandparent. Over the same path through a tree, hoisting performs half the node touches that rotating performs.
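
    The two-touches-per-level count for hoisting can be sketched as follows (a minimal Python sketch assuming delete() already holds a pointer to the node; Node and hoist_delete are illustrative names, not code from class):

```python
class Node:
    """A minimal binary-tree node (illustrative only)."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def hoist_delete(node):
    """Delete node's value by hoisting child values up the path to a leaf.

    Each hoist touches two nodes, the node written into and the child
    read from, matching the two-touches-per-level count above.
    Returns the total number of node touches.
    """
    touches = 0
    while node.left is not None or node.right is not None:
        child = node.left if node.left is not None else node.right
        node.value = child.value        # touch node (write) and child (read)
        touches += 2
        if child.left is None and child.right is None:
            # The child is a leaf; unlink it and stop.
            if node.left is child:
                node.left = None
            else:
                node.right = None
            break
        node = child
    return touches

# Deleting from the root of a three-node spine hoists twice,
# touching two nodes per level: 4 touches in all.
print(hoist_delete(Node(1, Node(2, Node(3)))))   # 4
```

    A rotation-based delete over the same two-level path would touch roughly four nodes per rotation, about twice the count.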

    The answers, approximately verbatim, and in no particular order:

    1. Value hoisting is more efficient than using rotations to implement a binary tree delete() because when we want to delete a root node it is easy1 to fill the root node by using a leaf value and it is efficiently2 done by using value hoisting as this implementation touches less number of child-tree nodes.3

      1: “Easy” means what?

      2: “Efficiently” means what?

      3: Hoisting touches fewer nodes than what? And how many nodes does it touch?

    2. Value hoisting should be more efficient in terms of nodes touched.

      Because binary-tree property4 is every children value should all smaller or all greater than their parents.5 That means if using rotations, implementation should go through each layer between the leaf node and the delete node. Value hoisting should just operate the delete node and the node6 should be hoisted.7

      4: What's the binary-tree property?

      5: Ah, you mean the binary-search-tree property, kind of. What does that have to do with this question?

      6: Do you mean the node to be deleted? Or some other node?

      7: Hoisted where?

    3. Value hoisting is more efficient in terms of nodes touched. In a standard binary tree where you don't need to preserve any property. Value hoisting requires that you traverse down to a leaf node in order to find a value to hoist. This traversal at worst touches n nodes, where n is the number of nodes in the tree.

      However, using rotation you need to touch multiple nodes for each rotation. First you need to touch the node which you are rotating and also one of its children to promote to its parent. You then also have to move any orphaned children to a new parent. If you don't rotate intelligently based on the balance of the children, then you will have at worst n rotations that each touch multiple nodes. This isn't asymptotically worse through.8

      8: Why so?

    4. Using rotations is more efficient because we just find a leaf node and copy the value of a leaf to become a node we need to remove.9 It's easy way to do it.

      9: Then how does hoisting work?

    5. Value hoisting is more efficient.

      Using rotations, a worst-case delete from the root rotates the root to leaf. Every time rotate same same node was touched two times.10

      Using hoisting, worst-case delete from root, first find one leaf. Nodes touched only are from root to leaf one time.11 Then switch value from leaf to root. Null out the leaf's parent. Left/right.

      10: Do constants like 2 matter?

      11: How many nodes is that?

    6. Value hoisting would be more efficient. This is because the deletion of the unwanted data can be replaced by its child, being it has children.12 If not, it can be just deleted.

      Rotations would require more steps.13

      12: What happens to the child?

      13: How many more steps?

    7. Value hoisting implementation should be implemented on a binary tree to use the delete() function. In this case, less nodes are being touched,14 which helps to keep the balance in the tree.15 It follows O(1) order.16 But in case of rotation more values might need to be moved, which causes an unbalance in the tree.

      14: Fewer nodes are being touched than what? What's the alternative?

      15: Why can you assume balance for this question?

      16: What is O(1)? It can't be deleting, can it?

    8. Value hoisting of nodes in a tree is O(1) in number of operations.17

      Rotating nodes is O(n) in number of operators. Due to the unbalanced structure of a binary tree you could touch all nodes.

      Value hoisting being O(1) is more efficient.

      17: What are operations? Is that what you're supposed to be counting?

    9. Binary-tree delete()

      Using rotations

      1. You rotate the subtree18 such that the node to be deleted becomes leaf of the new node.
      2. Delete the leaf node.19
      Using hoisting
      1. You copy immediate descendant value to the node to be deleted.
      2. Repeat step 1 till you reach leaf.
      3. Delete leaf node.19
      According to Binary-tree delete is more efficient in terms of nodes touched.20

      18: Which subtree?

      19: Ok, but how many nodes does that touch?

      20: By what accounting?

    10. Using rotations is more efficient in terms of nodes touched when performing delete() on a binary tree. Using rotations would give O(log n) behavior because you are only touching half the nodes in a particular subtree.21 However, value hoisting requires you to dive down and find a leaf node,22 which results in O(n) or linear behavior.

      21: Why half the nodes?

      22: Don't rotations have the same requirement?

  2. An AVL-- tree is an AVL tree that is not a binary search tree. Given an AVL-- tree with n nodes, describe the expected asymptotic performance of the operations add(), remove(), and find() on the AVL-- tree in terms of node touches.


    Because an AVL-- tree is balanced, add() is still O(log n), either because new values are still added at the leaves, or because new values are added at the root and may require a full trip down the tree to restore balance. (We didn't cover this detail in class, and you weren't expected to know it.) An AVL-- tree isn't a binary search tree, which means find() has O(n) worst-case performance because it may have to search the entire tree. remove() also has O(n) worst-case performance, unless the remove() argument points directly to the value in the tree, in which case it has O(log n) worst-case performance.
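
    The O(n) bound on find() falls out of the missing ordering; a minimal sketch (the Node tuple and find are illustrative names, not code from class):

```python
from collections import namedtuple

# A balanced but unordered tree node; no search-tree property holds.
Node = namedtuple("Node", ["value", "left", "right"])

def find(node, value):
    """Search an AVL-- tree for value.

    With no ordering to exploit, both subtrees must be searched,
    so an unsuccessful search touches all n nodes: O(n).
    """
    if node is None:
        return None
    if node.value == value:
        return node
    return find(node.left, value) or find(node.right, value)

# Balanced, but 5 sits in the LEFT subtree even though 5 > 3.
root = Node(3, Node(5, None, None), Node(1, None, None))
print(find(root, 5) is not None)   # True
print(find(root, 4) is not None)   # False, after touching every node
```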

    The answers, approximately verbatim, and in no particular order:

    1. add() is constant, O(1), since you can add nodes to the root of the tree.1 You would just touch the existing root to make it a child of the new root.

      remove() is linear, O(n), since you have to first find the node that contains the value to remove (assuming it exists), then move it down to a leaf node to remove it.

      find() is linear, O(n), since in the worst case the value doesn't exist, you touch every node trying to find it before you can conclude that it doesn't exist.

      1: An AVL-- tree is still a height-balanced tree, and that has to be maintained.

    2. The asymptotic performance of the operations add(), remove(), and find() on the AVL tree2 is O(log n) or logarithmic behavior. For the operation add(), you can add the value as a leaf, and use node rotations to make the tree well balanced. For the operation remove(), you also have to use node rotations to make the value you want to remove a leaf, while keeping the tree balanced. In the worst case, these operations will touch half of the nodes in the tree.3,4

      2: An AVL-- tree.

      3: How could they touch half the nodes in a tree?

      4: And what about find()?

    3. In an AVL tree5 with n nodes:

      For find(): it has linear worst case performance O(n) as it has to search all the nodes in the tree to find the value.

      For remove(): It has a linear worst-case performance O(n) as it has to find the value and then delete the value from that particular node.

      For add(): It has a linear-average case6 performance log(n) as it has to find the root node and add the new value to it.

      5: An AVL-- tree.

      6: Worst-case performance.

    4. add(),
      add new node to leaf node, then rotate the node based on heap value.7 Worst-case rotate to the root. Every time it rotate half the value in the tree was gone. So take log(n).
      remove(),
      Worst-case remove from root. Rotate to the leaf, take log(n)
      find(),
      worst case node in the leaf, take log(n)

      7: Where does a heap get involved? Are you thinking of a treap?

    5. To add values into an AVL-- tree, we just add a new value to a leaf node and make sure any two sibling subtrees differ in height by at most 1.8

      To find a node, we need to go through each node on the tree. It costs O(n) in worst case.

      To remove a node, we meed to find a leaf and copy the value of the leaf node to become a node we want to rename.9

      8: And how much does that cost?

      9: Cost?

    6. An AVL tree is a balanced tree. Asymptotic performance of operations 10
      add()=O(log n)
      remove()=O(log n)
      find()=O(log n)

      10: Why is this so?

    7. To add() a value depending on the structure of the AVL-- tree we would start at end of the tree (leaf nodes) and if it's a heap max11 we would implement a method of rotations, where the value is added.

      To remove() a node the operation would be quite different.12 The node that is to be removed would be replaced by the existing node that came after it.13

      The find() node would be traversing starting from

      11: What do heaps have to do with anything?

      12: Different from what?

      13: And what happens to the children of the node being used as a replacement? And what's the performance?

    8. An AVL tree14 is constantly in balance by at least height 1.15 Find operations are of O(log n) because of its structure.16 Add and remove operations are of O(1) because of the immediate rebalancing that takes place after these operations because an AVL is not in any particular order.17 Add and remove operations will only ever touch a constant number of nodes.18

      14: An AVL-- tree.

      15: You mean the difference in children heights is at most one, right?

      16: The structure of what? The AVL-- tree?

      17: How does that result in O(1) performance?

      18: Why is that?

    9. AVL-- tree [picture] 19,20

      19: Constant? How is that possible?

      20: Why?

    10. add() implementation in an AVL-- tree has a asymptotic performance of O(n log n). To add a value, since it is not a binary search tree, we have to find a node where we can add the value following all the tree property.21 Since it is not a binary starch tree, we can't say which side the value should be placed (left/right).22

      remove() implementation has a asymptotic performance of O(n). We can copy the removing value to the leaf and null the value of parent node.23

      find() implementation has an asymptotic performance of O(n log n), since it has to search both subtrees as it is not a binary search tree. So, we don't know which side has a larger value, that doubles the work.24

      21: What tree property? Do you mean balance?

      22: True, but so what?

      23: Why is that O(n)?

      24: How does doubling the work result in O(n log n)?

  3. Traverse a non-empty binary search tree, printing the value at each node. Which traversal prints the values in descending (that is, largest-to-smallest) order? Justify your answer.


      Descending order requires that larger values be printed before smaller values. The traversal would be in-order, but with the right subtree traversed before the left subtree. In terms of the notation given in class, this traversal is RNL instead of the more usual LNR version of the in-order traversal.
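
      The RNL traversal fits in a few lines of Python (rnl is an illustrative name; the Node tuple stands in for whatever node class the tree uses):

```python
from collections import namedtuple

Node = namedtuple("Node", ["value", "left", "right"])

def rnl(node, out=None):
    """In-order traversal, right subtree first: values emerge descending."""
    if out is None:
        out = []
    if node is not None:
        rnl(node.right, out)     # larger values first
        out.append(node.value)   # then the node itself
        rnl(node.left, out)      # smaller values last
    return out

# The BST with root 4 and children 2 and 6 prints 6, 4, 2 under RNL.
root = Node(4, Node(2, None, None), Node(6, None, None))
print(rnl(root))   # [6, 4, 2]
```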

      The answers, approximately verbatim, and in no particular order:

      1. Using post-order will print the values in descending order because every value in the left subtree will be less than the value of the root and the values in the right subtree is greater than the value of the root in non-empty binary search tree.1

        Post-order: right, left, root.

        1: Except, where does the root value get printed?

      2. Pre-Order Traversal (RNL)2 prints the values in descending order. Firstly, all the right side child node values of the binary search tree are printed and then the root node value is printed and then all the left child node values are printed.

        2: RNL isn't a pre-order traversal.

      3. An in-order traversal will print the values of a non-empty binary search tree in descending order. An in-order traversal starts at the left child, then processes the root, and finally the right child (it could also start at the right child and finish with the left child). In a binary search tree, the right child is at least the value if its parent, and the left child is less than the value of its parent. Therefore, in this case an in-order traversal will start with the right child (the largest), then the root (the middle value), and finally the left child (the smallest).3

        3: This answer is too long; you only need the last two sentences.

      4. RNL traversal. [picture]
      5. In-order, RNL. Each node's left children's values must be no grater than its value, and its right child's value must be at least its value. This is the binary search property and applies recursively throughout the entire tree. Therefore if you always look at the rightmost children first, you will get the largest values. Then you look at the nodes for the “middle” values and finally the left children for the least values.

        This is known as in-order traversal, even through it is RNL, because the node is accessed in between the children.

      6. The in-order traversal with right node4 first, then root, then left node, i.e., RNL. Traversing prints node from largest to smallest value in a binary search tree.

        Binary search tree follow Binary Search principle, in which the right node value is greater than root value, and left node value is smaller than root value. It is followed in every subtree of the main binary search tree.

        Hence traversing in order RNL we get values in descending order.

        4: Child.

      7. Binary search tree character is every left value is less than the parent node and the right node is grater than the parent node.5

        The post-order traversal will print the values in descending order,6 because the post-order start from the left to right.7

        5: Every value in the right subtree is at least the value at the root node.

        6: Where does the value at the root get printed in this scheme?

        7: But didn't you just identify the left subtree as holding smaller values and the right subtree as holding larger values?

      8. To print from largest to smallest values by traversing using O(log n).8 This is because it can use a ???9 structure where it is consistent finding next highest value.

        8: How can that be? You have to print every value, and that requires touching every node.

        9: Can't read it.

      9. Assuming the tree has a node order of larger than node is right subtree and smaller than node in left subtree.10

        You would start in the right-most node of tree and subtrees than the root then the left node, traversing from right to left.11

        [picture]

        Value in node # 1 is defined to be less than value in node # 2. Value in node # 2 is defined to be less than value in node # 3.

        10: Good.

        11: Do you mean start in the right subtree, then the root, then the left subtree?

      10. In this we would us RNL order as we now that in binary search tree the values on the right of the root node are greater or equal to the values of root node and the values of left node is less than the root node value. So if we print the node in RNL order, the value would be in descending order.

  4. The array-based binary tree implementation discussed in class uses space exponential in the number of tree levels. Describe another array-based binary-tree implementation that uses asymptotically less space. Do not worry about making the implementation dynamic.


      Use three arrays value, left, and right, each of length n. value[i] contains the value stored at the i-th node; left[i] contains the index of the root of i-th node's left subtree; and right[i] contains the index of the root of i-th node's right subtree. You can add refinements, such as free lists, but that doesn't have anything to do with the space used.
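
      A small worked instance of the three-array representation (the array names follow the description above; the preorder walk is only there to show the links are usable):

```python
# Three parallel arrays of length n = 5; node i lives at index i,
# and -1 marks a missing child.  Tree:
#         a
#       /   \
#      b     c
#     / \
#    d   e
value = ['a', 'b', 'c', 'd', 'e']
left  = [ 1,   3,  -1,  -1,  -1]
right = [ 2,   4,  -1,  -1,  -1]

def preorder(i, out):
    """Walk the tree rooted at index i, collecting values in preorder."""
    if i != -1:
        out.append(value[i])
        preorder(left[i], out)
        preorder(right[i], out)
    return out

print(preorder(0, []))   # ['a', 'b', 'd', 'e', 'c']
```

      Three length-n arrays use O(n) space, versus the roughly 2^levels slots the natural-mapping representation must reserve.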

      The answers, approximately verbatim, and in no particular order:

      1. Another array-based binary-tree with less space would be assembling an array with levels.1 That is, creating an array with asymptotically 2^(i+1) - 1.2

        1: What does that mean? And how is it done?

        2: But isn't that exponential space?

      2. Use array-based binary-search tree.3 The binary-search tree will start with the half of the array.4 That will use less space.5

        3: How does being a binary-search tree influence the space required? And how is it array-based?

        4: What does it mean to start with half the array?

        5: Why?

      3. Use two array based implementation.6

        Start with one array, store the value of root node.

        Use another array, store the index of root's child.7

        Find the position use the index and store the value root's child in the first array and keep going until store all the nodes in the tree.

        6: Will two arrays be enough?

        7: How many children might a node have?

      4. An array-based binary tree implementation can be done in linear space, in the number of nodes in the tree.

        If each element of the array is a structure that contains a value (which is that nodes' value), and the index of the array for each child, then the array wouldn't need to reserve space for empty nodes.8 New nodes could be added to the end of the array, and removing a node would require moving each node after it (in terms of position in the array) over to fill the gap.9 Traversing would require you start at the root and follow each child index.

        The major difference between the two implementations is the method described in class always reserves enough space for every child in every level of the tree. Each level the children grow in number by a product of 2. In this implementation space is reserved only for existing non-empty children.

        8: True, but it would have empty space for new nodes, wouldn't it?

        9: Why do you need to move the nodes? Doesn't this mean you'd have to adjust the child indices in the nodes? And anyway, the question asks about space, not time.

      5. Heap is another array-based binary-tree implementation that uses less space10 because the root node in heap has value at least or most.11

        10: Doesn't a heap use the same array-based implementation?

        11: What does that have to do with space?

      6. A treap can be used in this case. It is also an array-based structure,12 which also has the property of binary-search tree.13 The treap uses less space as compared to an array-based binary tree, and it is more balanced.

        12: Is it?

        13: Why is having the binary-search property important to saving space?

      7. A balanced-binary tree will have a linear rise in the exponential values14 which is 2^i,15 where i is the linear height of the nodes. [picture]

        14: A linear rise in the exponential values? What could that possibly mean?

        15: What is 2^i? It can't be the linear rise.

        • The natural mapping array-based binary-tree implementation uses asymptotically less space.16
        • It stores root value from index i = 1 for easy access/mapping of the elements in the tree.
        • It allocates space for max. no. of node from the given binary tree.
        • E.g., [picture]
        • No swapping/bubbling up/down required.17
        • Mapping/searching elements is constant.18
        • Exponential space is reduced to finite value.19

        16: What is the natural mapping?

        17: True, but no swapping or bubbling was required in the space-inefficient array representation either.

        18: Constant over what?

        19: Exponential space is a finite value, albeit possibly big.

      8. The array-based binary tree that uses space exponential in the number of tree levels is a heap.20 Another array-based binary tree that uses asymptotically less space is a binary search tree.21 Because you do not have to deal with the idea of levels, you do not have to leave extra space in the array.

        20: A heap implemented as an array, yes.

        21: Asymptotically less space than what? And why is that so?

      9. This would require more information be placed in the node of each element in the array. Data stored in array would to have pointers to the left or right children as well as the data itself.

        This implementation would add overhead involved with storage in respect to linked sequence implementation,22 and be finite in size,23 but would not incur memory management overhead,24 with only O(n) space required per node.25

        22: What is the linked-sequence implementation?

        23: Aren't all implementations finite in size?

        24: Why not?

        25: Why isn't it constant space per node?

  5. Given an empty heap and a non-empty set of values, characterize the least amount of work needed to put all the values into the heap using the heap add() operation. Justify your characterization. Your characterization should be in terms of an asymptotic estimate of the number of values moved in the heap.


      Because each value has to be moved into the heap, the work can't be less than O(n). If the values are added in an order consistent with the heap (ascending order for a min-heap, descending order for a max-heap), each value is added in its proper location within the heap and never needs to be moved again, so O(n) total moves suffices.
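
      The in-order-add argument can be checked with a small min-heap sketch (heap_add and its move counter are illustrative, not the class implementation):

```python
def heap_add(heap, value):
    """Add value to an array-based min-heap; return the number of
    values moved while bubbling up (two per swap)."""
    heap.append(value)
    i = len(heap) - 1
    moved = 0
    while i > 0 and heap[(i - 1) // 2] > heap[i]:
        parent = (i - 1) // 2
        heap[i], heap[parent] = heap[parent], heap[i]
        moved += 2          # a swap moves two values
        i = parent
    return moved

# Ascending input matches min-heap order: no value ever moves again,
# so n adds do O(n) total work.
heap = []
print(sum(heap_add(heap, v) for v in [1, 2, 3, 4, 5]))       # 0

# Descending input is the worst case: every add bubbles to the root.
heap = []
print(sum(heap_add(heap, v) for v in [5, 4, 3, 2, 1]) > 0)   # True
```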

      The answers, approximately verbatim, and in no particular order:

      1. The least amount of work is O(1) because when the set of values are in order (from smallest to largest or from the largest to the smallest), we can insert the values one by one. That will take the least amount of work.1

        1: True, but is that a total of O(1) or O(n) work?

      2. The cost of adding value into heap will be O(log n) because the value of root node in heap is smallest or highest.2 If we put a value into an empty heap, it costs O(1) because this is the first node in the heap. If we put a value in to non-empty heap, we need to compare a new value to root node and move the old values if needed to make sure it's a max heap or min heap only.3

        2: What in particular about the root value makes the insertion cost O(log n)?

        3: And what's the cost of that?

      3. For adding one value to the heap tree it requires log n operations and for adding n to the heap tree it requires n log n operations.4

        4: But is this the best possible case?

      4. Assume heap is min-heap.

        The least amount of work is putting all the values in increasing order.

        add(), first we add new value node to the leaf. If it's less than parent's value, swap the value.

        If the input value order are increasing, we may not to swap the value.5

        5: Ok, but what's the asymptotic estimate?

      5. The least amount of work needed to put all the values into the heap would be if the set of values happens to be ordered.

        Each add operation would be constant time O(1) since it would not need to move any nodes. It could simply add at the root or at a leaf, depending on the order of the values. If it added at the root it would only need to move the existing root to a child of the new root.

        Overall since each add would be constant, but you would need to call add for each element (n times), the estimation would be O(1)O(n) which is O(n).

      6. Heap ADT follows heap property (max or min) where the root is greater than ??? ???6 children in case of max heap.

        In add() operation, asymptotic estimate
        1. You issue element in heap, O(1)7
        2. Bubble up the value, O(log n).

        In order to characterize a better asymptotic estimate for this operation, we should try making heap add() like binary tree add() which asymptotically O(1) constant time.8

        ∴ characterize can be either
        1. Shuffling in order to get a better sequence which requires less bubbling up.
        2. Give sorted sequence to heap add() , which requires no bubbling up/down. Hence constant-time add() operation.9

        6: Can't read them.

        7: What is issuing an element in a heap? Do you mean adding?

        8: But does it make sense to treat a heap add as if it were a binary-tree add? Don't the two trees have different requirements?

        9: What's the answer? What's the least amount of work needed to add values to a heap?

      7. We would use a min-heap to put all values in heap. The characteristics of the min heap that is the subtree value are more than it in parent value would help in a way that if we used add() operation. The value could be added to the heap in an order.10 Once all values are added, the only part left is to do rotation.11 We can rotate the values in a way that the parent value is always less than child node value. This way the less value would be moved in a heap maintaining it asymptotic estimated value.12

        10: Which order? Determined how?

        11: Why would this be necessary?

        12: Ok, but what's the asymptotic performance estimate?

      8. If you are given an empty heap, and a non-empty set of values, an asymptotic estimate of the number of values moved in the heap is O(n). This is true because in order to put all the values into the heap, every value has to be touched.13 There are “n” number of values, therefore O(n) is the asymptotic estimate.14

        13: That's true, but isn't it also true that, once added to the heap, a new value may need to be bubbled up, possibly requiring O(log n) more value movements?

        14: Is this always true, or only for special cases?

      9. An asymptotic estimate of adding values to a heap would be O(n), due to the unbalanced structure of the heap.15 An add to the heap would force the heap to reorder itself possibly touching all values within the heap especially if a spine was present.16

        15: Is that true? Are heaps unbalanced?

        16: That may be true, but the question's asking about the best possible case, which does the least amount of work.

      10. Since we are using a heap to set our values we have the option of using a heap max or a heap min. For a heap max we will start by adding values that are the largest of values first and work our way down to the least amount of values.17

        17: Ok, and what's the resulting asymptotic performance estimate?


    14. expected add() take O(1) operation.
      remove() take O(n) operation.
      find() take O(n) operation.


    This page last modified on 8 December 2010.