CS 509, Advanced Programming II

Fall 2002 - Test 4


  1. Insertion sort and selection sort are both O(n^2) algorithms. However, selection sort is usually considered more efficient than insertion sort. Present an asymptotic analysis to explain why.

    Insertion sort:

    for i = 1; i < n; i++
      x = a[i]
      for j = i - 1; j >= 0 and a[j] > x; j--
        a[j + 1] = a[j]
      a[j + 1] = x

    Selection sort:

    for i = n - 1; i > 0; i--
      j = i
      for k = i - 1; k >= 0; k--
        if a[k] > a[j]
          j = k
      t = a[i]
      a[i] = a[j]
      a[j] = t

    Hint: assume that an element of a is a very large object occupying kilobytes of storage.


    Both algorithms use nested loops, which explains the O(n^2) behavior. Looking more closely at the inner loops, insertion sort, in the worst case, assigns one array element to another on every iteration. Selection sort, on the other hand, just assigns an index value in the inner loop and swaps array elements once per outer-loop iteration.

    The hint gives us everything else we need: because array elements are large, moving them is expensive, and the asymptotic analysis should count the number of element moves:

    Insertion sort:

    for i = 1; i < n; i++
      O(1)
      for j = i - 1; j >= 0 and a[j] > x; j--
        O(1)
      O(1)

    Selection sort:

    for i = n - 1; i > 0; i--
      for k = i - 1; k >= 0; k--
        if a[k] > a[j]
      O(1)
      O(1)
      O(1)
    

    The analysis has to make some assumptions about the cost of comparing two array elements. One reasonable but arguable assumption is that comparisons don't involve moving anything, so they don't count at all. Another reasonable but arguable assumption is that a comparison may, in the worst case, look at all the bytes in an element, and so does cost something even though nothing moves. Because moving is always more expensive than comparing, the analysis goes with the first assumption: compares don't count. When combined with the sequence rule, this gives

    Insertion sort:

    for i = 1; i < n; i++
      O(1)
      for j = i - 1; j >= 0; j--
        O(1)
      O(1)

    Selection sort:

    for i = n - 1; i > 0; i--
      for k = i - 1; k >= 0; k--
      O(1)
    

    as the next analysis step. The inner-loop iteration count is bounded by O(n) and the inner-loop guards involve no array elements, which leaves

    Insertion sort:

    for i = 1; i < n; i++
      O(1)
      O(n)
      O(1)

    Selection sort:

    for i = n - 1; i > 0; i--
      O(1)
    

    as the next step. One more turn of the analysis crank for the outer loops, coupled with the sequence rule, produces the final results

    Insertion sort: O(n^2)

    Selection sort: O(n)
    

    and selection sort has asymptotically better behavior than insertion sort with respect to array-element movement. Some people answered this without presenting an asymptotic analysis; they didn't do well. Almost everybody else did an analysis showing why both algorithms are O(n^2), which is not what the question asked for. A few people then went on to describe why selection sort is more efficient, but didn't back their discussion with an asymptotic analysis.
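    The move-counting conclusion can be checked concretely. The sketch below (not part of the test) uses a hypothetical element type Big that counts every copy made of it, standing in for the "very large object" in the hint; the sort routines follow the pseudocode above. On a reversed array, selection sort makes exactly 3(n - 1) element moves while insertion sort makes on the order of n^2.

    ```cpp
    #include <cassert>
    #include <vector>

    // Illustrative stand-in for a "very large" element: counts copies.
    struct Big {
        int key = 0;
        static inline int moves = 0;       // total element copies made
        Big() = default;
        Big(int k) : key(k) {}
        Big(const Big& o) : key(o.key) { ++moves; }
        Big& operator=(const Big& o) { key = o.key; ++moves; return *this; }
    };

    void insertion_sort(std::vector<Big>& a) {
        for (int i = 1; i < (int)a.size(); i++) {
            Big x = a[i];                          // one move
            int j = i - 1;
            while (j >= 0 && a[j].key > x.key) {   // shift: one move each
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = x;                          // one move
        }
    }

    void selection_sort(std::vector<Big>& a) {
        for (int i = (int)a.size() - 1; i > 0; i--) {
            int j = i;
            for (int k = i - 1; k >= 0; k--)       // index assignments only
                if (a[k].key > a[j].key) j = k;
            Big t = a[i];                          // three moves per swap
            a[i] = a[j];
            a[j] = t;
        }
    }

    int main() {
        const int n = 100;
        std::vector<Big> v1, v2;
        for (int i = n; i > 0; i--) {              // reversed input
            v1.push_back(Big(i));
            v2.push_back(Big(i));
        }

        Big::moves = 0;
        insertion_sort(v1);
        int ins = Big::moves;

        Big::moves = 0;
        selection_sort(v2);
        int sel = Big::moves;

        for (int i = 0; i + 1 < n; i++)
            assert(v1[i].key <= v1[i + 1].key);    // actually sorted

        assert(sel == 3 * (n - 1));                // O(n) moves
        assert(ins > n * n / 2);                   // O(n^2) moves
    }
    ```

    The counter deliberately ignores key comparisons, matching the analysis's assumption that only element moves cost anything.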


  2. The std::equal(b, e, b2) generic algorithm returns true if the values in the sequence starting at b2 are element-by-element equal to the values given in the iterator range (b, e) and returns false otherwise.

    The search(b, e, b2, e2) generic algorithm returns true if the sequence of values indicated by the iterator range (b2, e2) appears somewhere in the sequence of values indicated by the iterator range (b, e) and returns false otherwise.


    Assuming that both generic algorithms are implemented using the weakest possible iterator kinds, can both algorithms be implemented with the same kind of iterator? Explain.


    The easiest way to answer this question is to look at the implementation of each of the algorithms:
    equal(b, e, b2)
    
      while b != e
        if *b++ != *b2++ return false
    
      return true
    
    search(b, e, b2, e2)
    
      while b != e
        i = b++
        j = b2
        found = true
        while i != e and j != e2
          if *i++ != *j++
            found = false
            break
        if found and j == e2
          return true
    
      return false
    
    equal() uses r-value *, != and ++ on its iterators, operations that input iterators support, so it requires only input iterators. search() uses all these operations and iterator assignment too (i = b, j = b2), which means its iterators must be at least forward iterators. Because the weakest iterator kinds required by the two algorithms differ, they can't be implemented using the same weakest iterator kind.
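    The difference shows up with the real STL algorithms too. In the sketch below (my example, not from the test), std::equal accepts a stream-backed input iterator for its second range, while std::search works over container iterators, which are at least forward:

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <iterator>
    #include <sstream>
    #include <vector>

    int main() {
        // equal() needs only input iterators: a one-pass stream iterator
        // is an acceptable second operand.
        std::istringstream in("1 2 3");
        std::vector<int> v{1, 2, 3};
        assert(std::equal(v.begin(), v.end(),
                          std::istream_iterator<int>(in)));

        // search() must revisit positions, so its ranges need at least
        // forward iterators; vector iterators more than qualify.
        std::vector<int> hay{5, 1, 2, 3, 9}, needle{2, 3};
        auto where = std::search(hay.begin(), hay.end(),
                                 needle.begin(), needle.end());
        assert(where != hay.end() && *where == 2);
    }
    ```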

    It turns out you can answer this question "yes" with the following implementation of search() using equal():

    search(b, e, b2, e2)
    
      while b != e
        if equal(b2, e2, b)
          return true
        b++
    
      return false
    

    This code duplicates iterators with the copy constructor rather than assignment. (It assumes the sequence starting at b is long enough for equal() to walk safely; a length check would fix that.) A couple of people mentioned this could be done, but didn't give details.
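    A working C++ version of this idea (a sketch of mine, with search_with_equal as an illustrative name) slides a copy of b along the haystack and lets std::equal compare the needle against each position; the distance check keeps equal() from reading past e:

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <iterator>
    #include <vector>

    // search() built on equal(): iterators must be at least forward so
    // that the copies of b handed to std::equal stay valid.
    template <class Fwd1, class Fwd2>
    bool search_with_equal(Fwd1 b, Fwd1 e, Fwd2 b2, Fwd2 e2) {
        const auto m = std::distance(b2, e2);      // needle length
        while (std::distance(b, e) >= m) {         // room left for a match?
            if (std::equal(b2, e2, b))             // needle vs haystack at b
                return true;
            ++b;
        }
        return false;
    }

    int main() {
        std::vector<int> hay{5, 1, 2, 3, 9}, needle{2, 3}, missing{3, 2};
        assert(search_with_equal(hay.begin(), hay.end(),
                                 needle.begin(), needle.end()));
        assert(!search_with_equal(hay.begin(), hay.end(),
                                  missing.begin(), missing.end()));
    }
    ```

    The repeated std::distance calls make this O(n) extra work per step for forward iterators; that's fine for a sketch whose point is the iterator requirement, not speed.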


  3. Describe a test case in which the correct answer must have v2 equal to 0 in the min_3sort() return value; that is, describe a test case in which the wrong answer is any answer where v2 is not equal to 0.


    For v2 = 0 to be the only correct answer, iv2 must hold the majority of the numbers congruent to 0 modulo 3, because if iv2 doesn't hold the majority, it takes fewer moves to have some other vector hold all the numbers congruent to 0 modulo 3. With this analysis, the simplest test case is

    iv1: [ 1 ]

    iv2: [ 0 ]

    iv3: [ 2 ]

    which requires no moves.

    Almost everybody got this correct, but many answers were far more complicated than they needed to be.


  4. Generally, a container will re-implement a generic algorithm when the container's iterator kind is too weak to be used by the algorithm. For example, the list container implements a sort() member function because the sort() generic algorithm requires random-access iterators and list iterators are only bidirectional.

    The map container implements the find() member function, which has the same behavior as the find() generic algorithm. However, the find() generic algorithm requires only input iterators, while the map iterators are bidirectional. Why does map implement find() as a member function?


    Map re-implements find() because it can do a more efficient job of looking for values than can the generic std::find(). std::find() requires input iterators, which means it must use a linear search to look for values, because input iterators can only move forward one step at a time (++). Further, the sequence of values defined by the iterator range is unconstrained, so no ordering can be exploited.

    The map value sequence is stored in sorted order, which means that something more efficient than a linear search can be used to look for specific values. By using a tree traversal, the member function find() can locate a value (or not) in O(log n) time, rather than the O(n) time it takes for a linear search.
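    The two lookups can be put side by side. In the sketch below (my example), the member find() walks the underlying search tree by key, while the generic algorithm (std::find_if here, since map elements are pairs) must scan the sorted sequence linearly using only ++, != and * on the iterators:

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <map>

    int main() {
        std::map<int, char> m{{1, 'a'}, {2, 'b'}, {3, 'c'}};

        // Member find(): O(log n) tree search keyed on the key.
        auto hit = m.find(2);
        assert(hit != m.end() && hit->second == 'b');
        assert(m.find(9) == m.end());

        // Generic algorithm: an O(n) linear scan over the same elements,
        // blind to the fact that they are stored in sorted order.
        auto scan = std::find_if(m.begin(), m.end(),
                                 [](const auto& p) { return p.first == 2; });
        assert(scan != m.end() && scan->second == 'b');
    }
    ```

    Both calls find the same element; only the member function can exploit the tree structure to do it in logarithmic time.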

    Way too many people got this wrong for a variety of reasons. Some got distracted by the bidirectional iterators, which had something to do with the answer but not everything.



This page last modified on 5 November 2002.