CS 305-503 Quiz 3

CS 305-503, Data Structures & Algorithms

Quiz 3, 16 December 2008

Explain why it is more important for a binary search tree to be balanced than it is for a binary tree to be balanced.

Balance indicates that siblings don’t differ in height by too much. As a global property, balance translates into the property that the children of any node each contains about the same amount of data as its sibling.

Binary search trees gain their advantage over binary trees via the binary search tree property, which lets algorithms make accurate predictions about where data might, and might not, be in a binary search tree.

The answers, approximately verbatim, in no particular order, and with comments in italic:
- Because a binary search tree needs to be balanced so that the order imposed on it is meaningful. If we have a tree that is not balanced, searching that tree becomes more expensive because there are more levels to traverse. The worst case behavior begins to show when the tree is not balanced.
  The meaning of the order among values in a binary search tree is independent of the tree’s balance. Also, why it is that lack of balance leads to traversing more levels?
- Binary search trees are more important to have balance because of the traversal that would need to be done through it. If one side was weighed too much (i.e., having too many children nodes), the traversal would not function correctly. One side would be filled with numbers/elements larger or smaller than the starting node, throwing off the whole balance of the tree.
  Traversal correctness is independent of tree balance; efficiency may be effected, but correctness isn’t. Also, by definition, the children of a root in an binary search tree have values that are larger or smaller than the root node, depending on the child (left or right); this has nothing to do with balance.
- When using binary search trees balance is necessary to ensure that the computer does not take unnecessary time searching through the wrong side of the root node. Even if one side is unbalanced and shorter than the other, the computer should still take more time searching through the other side.
  Balance isn’t going to help if the tree search is looking in the wrong place.
- A binary search tree needs to be balanced because the more unbalanced it becomes the more it behaves like a list. And a list costs more work to traverse than a binary search tree. A binary search tree effectively splits the amount you have to traverse in half.
  A balanced binary search tree effectively splits the amount traversed in half, but why? What is it about balance that gives binary trees this property?
- In a BST if looking for a value you know you search right for bigger and left for smaller than parent node. If a BST is not balanced searching for that value will take longer. A binary tree takes a long time to search anyway b/c it’s not laid out well. Therefore balance is important for accurate searching for BST’s.
  Why does searching an unbalanced BST take longer? Why isn’t a binary tree not laid out well? And once again, balance has nothing to do with a search’s accuracy, only with its efficiency.
- When using a binary search tree, our goal is to make use of its order log n behavior even though it has a order n worst case behavior. Being balanced enables the log n behavior.
  In a binary tree even if the tree is not balanced, it can still function. But for the way we would like the bst to function with order log n behavior, the balanced tree helps.
  Why does being balanced enable log n behavior?
- It is more important for a binary search tree to be balanced because it is easier to obey the binary search tree property, which makes it easier to traverse through the tree. The property states that all values to the left are less than the root and all values to the right are more than the root. Having a balanced tree makes it more efficient to search. For example, if the data we are looking for is less than the root traverse to the left, and if it’s greater traverse to the right.
  A binary tree either does or does not obey the binary-search tree property independent of whether or not the tree is or is not balanced.
- A binary search tree must follow the binary search property. Also, a binary search tree is easier to traverse. It is more important for a binary search tree to be balanced because that is what makes the traversal easier while still obeying the binary search tree property. (Binary search tree has 3 properties:
  1. Everything to the left of root is ≤ root value.
  2. Everything to the right of root is ≥ root value.
  3. tree is balanced.
  )
  A binary search tree may or may not be balanced, but it must obey the binary search property. Also, why does balance make the traversal easier (which we’ll take to mean log n cost)?
- For all honesty, an unbalanced binary tree or binary search tree kinda turns into a really bad stack. But I digress, it also makes searching much more difficult; certain algorithms won’t work as well, or at all. Imagine one that will return false once it reaches a null node.
  Search trees are meant to be more organized + searchable!
  What does balance have to do with any of this?
- A binary search tree will rely on the elements being correctly ordered and balanced; if the tree is not balanced, it may search the wrong branch of the tree and not find the intended value, or may expect the value to be in a child node where it should be but is not.
  Once again, balance has nothing to do with search correctness.
An implementation of a binary-tree search works correctly when looking for values in either a binary-search tree or a heap. Explain what’s wrong with the binary search.

The key point is that the binary search works in a heap, which obeys the heap property, not the binary-search property. Because the heap property ensures only that the parent’s value is the extreme of all its children’s values, the binary search must traverse both siblings of a parent independent of the search value, which is what’s wrong. A more efficient binary-search implementation would use the binary-search property to determine which sibling the data should be in and ignore the other sibling, making it inapplicable to a heap.

The answers, approximately verbatim, in no particular order, and with comments in italic:
- The binary-tree search is searching every value, it isn’t following the rule that BST’s live by. If a value is greater look to the right otherwise look to the left. It’s just searching every value in the BST or heap, it should only work on the BST b/c different rules apply to heaps than BST’s.
  Heaps aren’t searchable in the same manner as BST’s, you need a heapsort.
  How are the rules different? Also, heapsort’s another matter.
- The binary search would deem to be not effective when the tree is not organized correctly or incomplete. Based on whether the search is performed with NLR, LNR, or LRN, the binary search will act differently and causes different problems based on the situation at hand. If the trees during the search are incorrectly ordered or incomplete, the search will also have skewed judgment of which is the initial node and the correct parents and children for each side.
  Certainly if the tree’s incorrect you can’t except the search to work correctly, but there’s no justification for assuming incorrect trees.
- If an implementation of a binary tree search works correctly only on a binary-search tree or a heap, it only searches according to the value of the root. As the BST and the heap follow similar properties, the implementation searches based on whether the data being searched is greater or less than the root.
  Is it possible to search a heap based on whether the data being searched is greater or less than the root?
- A binary search tree is arranged so the left child is no greater than its parent and the right child is no less than its parent. A heap is arranged in either a max heap or a min heap. The search has the pattern LHR, LRH, and HRL. The binary search cannot perform that function and search effectively.
  What function is it that a binary search cannot perform effectively? And why can’t the function be performed correctly?
- A binary search only works well when the data is distributed well. If the data can not be halved continually, we lose the “log n” in our big O and start looking more like n². We are depending on the order imposed by the heap property in order to give us an n log n search.
  “Distributed well” means what?
- A binary search tree is ordered; a heap isn’t ordered. It shouldn’t work in the heap. Heaps are not trees. It’s also probably taking too long too.
  What is it that shouldn’t work in the heap? Certainly not the search, which the question states does work on heaps. And heaps are trees.
- There are 3 types of binary search:
  1. LNR
  2. NLR
  3. LRN
  bts uses the order log n property for searching other than in worst case behavior. In a heap, the values must be arranged in a certain manner depending on whether it is a min heap or a max heap. So the NLR is best for heap. But if the implementation works correctly for binary search tree then proper comparison is not getting done using the binary search.
  The binary search property doesn’t have anything to do with the log n property; that’s part of the problem with binary trees left hanging for a later course. Why is it that if the search works correctly on binary search trees, then the search isn’t using the proper comparisons?
- When using a binary search tree to implement a binary-tree, the values are already in tree form so the search just fits right into play. Using a binary search with a BTS would just slow up the computer.
  And for heaps?
- The binary-tree search should not work correctly with a binary-search tree or a heap as these are differently structured.
  The binary-tree search should return the correct result when used with a binary tree, not with a heap.
  It’s usually not a good idea to argue with the question; if it tells you that something is true, you should believe it and work from there.
Given the letters ‘A’ through ‘Z’, arrange them in an array such that a binary search for ‘A’ uses the largest number of comparisons among all the letters in the array. Your answer should support a binary search for any character, but it should force the maximum number of comparisons for the letter ‘A’.

The idea is to locate ‘A’ as far away from middles as possible; that is, put ‘A’ at one end of the array or the other. The letter arrangement A through Z will do the trick; the reverse order would also work with suitable comparison adjustments.

The answers, approximately verbatim and in no particular order:
- Make the array start in the middle of the alphabet like @ Mor N. [?]
  So array[0] = ‘M’
  array[1] = ‘N’
  array[2] = ‘L’ …
  and array[25] = ‘A’.
  It would take the longest to search for A in this manner. Taking (i - 1)/2 to find ‘A’ would take the longest amount of time.
  The letter arrangement isn’t clear, nor is the meaning of i. Does a binary search for any character work?
- A through Z at indices 0 through 25 respectively.
- Z through A at indices 25 through 50 respectively, values at indices 0 through 24 unspecified.
  Z is the root node. Searching for A imposes a cost of O(n²) worst case for searching a tree. The tree is essentially a linked list.
- If we can use an arrangement of a maxheap but with the worst case behavior, i.e. with no right side children, then a maximum number of comparisons can be done if the binary search starts from the node first and travels all the way down to find “A”. The “NLR” method of searching should be allowed to satisfy this condition.
  How can a max heap arrangement have no right children?
- LSMFDBACEGIKM at indices 0 through 12 and randomize the rest.
  It’s not clear how a binary search for ‘Z’ works in this scheme
- A through Z at indices 0 through 25 respectively.
- LKIJGHEFCDABMZXYVWTURSPQNO at indices 0 through 25 respectively.
  It seems that ‘A’ is on the wrong side of ‘E’.
- BCDEFGHIJKLMNOPQRSTUVWXYZA at indices 0 through 25 respectively.
  How does a binary search find ‘A’ with this arrangement?
- ZYXWVUTSRQPONAML (additional other letters) B.
  How does a binary search find ‘A’ with this arrangement?
- ‘A’ should be in the middle of the array so the first and last values get checked each time before being compared to ‘A’.
  But if ‘A’ goes in the middle, won’t it be found first?
Explain the difficulty (or perhaps the unfortunate property) that arises when using an array to implement a binary tree that’s not complete.

See Nyhof, page 656.

The answers, approximately verbatim and in no particular order:
- Most array implementations are static, even though there are dynamic implementations available. A binary tree that is not complete continually needs to add more nodes and or children. After searching the limits of an array it would become extremely difficult to accommodate new elements.
- When using an array to implement a binary tree that’s not complete, there is a difficulty because this may cause an unbalanced tree.
- When using an array with an incomplete tree, the chance of hitting a node without any data in it is grater. Junk values and places in memory become more apparent, but with fail-safes incomplete nodes can be avoided.
  How does using an array effect whether or not a node has data?
- It quickly unbalances the binary tree, or the incomplete binary tree becomes very difficult to search due to the array’s structure. It’s not even organized so that the computer could do a quicksort and have a better ‘guess’ or the tree will have a syntax error.
  How does using an array unbalance the tree? How does the array’s structure make the tree difficult to search?
- The array would contain empty values for some nodes. The space would be wasted and in this case, the cost of using an array would be too costly. In a sense, a stack or a linked list would work smoother if the unfortunate chance of an incomplete binary tree was to occur.
  If you need to use a binary tree, how would a stack or a linked list be smoother? And what does “smoother” mean?
- It is difficult to compensate for a branch in the tree if that particular level is not full, i.e. with one blank node, the level is using 3 indices for the array instead of the more natural four. The next level must be considered to start after the blank node.
  The indexing scheme for an array-based binary tree depends only on the node (parent or child) index: 2i + 1' or 2(i + 1)' to go from parent i to child or i/2 to go from child i to parent. How does the level enter into it?
- The unfortunate property is that you are allocating space for elements that do not yet exist. An array based implementation maximum has the storage overhead of n where n is the number of elements that can fit into the tree. Using a list with pointers, you would only need to store (level at the tree*2) unused pointers.
- You wouldn’t know if it was intentionally left out or a mistake.
  Some children that should be there won’t be found in the array when you try to find them in 2i + 1 and 2(i + 1).
  Also if not balanced properly children will look like (in the array) that they belong to different parent nodes instead of their actual parent nodes.
- By using an array to implement a binary tree that is not complete, it can waste unused space if the array is static and is larger than the binary tree. It also can make the binary tree unbalanced causing more work which defeats the purpose of making it a binary tree in the first place.
- If an array is used to implement a binary tree that is not complete, the tree will be unbalanced.

This page last modified on 25 November 2008.