-
A colleague of yours, knowing that good hash functions are random, writes a
hash function that, among other operations, multiplies the key value by a
number randomly chosen in the range 1 to n, where n is the size of
the scatter table (this multiplication takes place before the key value is
reduced mod n). Is this a good idea? Justify your answer.
It's not a good idea. A hash function's most important characteristic is that
it be a function, that is, if h(k) = v, then h(k) always equals
v. If a hash function multiplies in a freshly chosen random number each time it
computes v, successive values computed for the same k will almost
certainly differ, and a key stored in one slot may never be found again.
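To see why consistency matters, here is a minimal Python sketch (the table size, multiplier, and function names are arbitrary choices for illustration, not part of the question): the colleague's scheme hashes the same key to different slots on different calls, while a multiplier fixed once per table does not.

```python
import random

TABLE_SIZE = 101  # n, the scatter-table size; an arbitrary prime for the example

def bad_hash(key):
    """The colleague's scheme: a fresh random multiplier on every call,
    so h(k) is not a function; repeated calls disagree."""
    return (key * random.randint(1, TABLE_SIZE)) % TABLE_SIZE

def good_hash(key, multiplier=31):
    """A multiplier chosen once (say, at table creation) keeps h a function."""
    return (key * multiplier) % TABLE_SIZE

# Under the colleague's scheme the same key lands in many different slots,
# so a lookup after an insert will usually probe the wrong slot.
inconsistent_slots = {bad_hash(12345) for _ in range(20)}
consistent_slots = {good_hash(12345) for _ in range(20)}
```

Because 12345 is relatively prime to 101, every distinct multiplier sends the key to a distinct slot, which is exactly the problem.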
The answers, approximately verbatim,
and in no particular order:
-
No. A good hash function — rather, a valid hash function
— must produce the same hash code for similar input data. If
a random multiplier is used then it would not produce the same value every time
for the same key. By “similar input data” I mean data which is
considered to have the same value.
-
No, this is not a good idea. Good hash functions require random and widely
spread index.1 If multiplies a random number, the key value will be changed, and
the mod n will also be changed.
-
Yes. Given that the other operations the hash function performs are
generating a completely random hash value2 of the item. The multiplication of the number if
randomly selected each time a hash is performed will cause the hash to be
offset from the others. This offset will help avoid collisions when inserting
items into the hash data structure, as well as cut down on grouping.
-
Yes. As we know the two different values collide when they produce the same
hash index,3 so
multiplying the key value by a number randomly chosen in the range 1 to n
is a good idea. This will create unique keys for values.4
-
Yes, since the properties of a good hash function are to be random, uniform and
wide spread. ???5 the hash function following all
these properties is a good hash function.6
-
It is not a good idea to multiply the key value by a random number from 1 to
n because it will create a pattern that can be detected rather easily if
the size of the scatter table is small.7 Also, this will not cause the hash
function to be uniform because it could cause the values to be packed together,
thereby creating a cluster.8
-
This would not be a good idea if it's going to take place before mod n because
it can change the key value with the wrong random number9 before reducing it to mod n.
-
Yes, this is a good idea. This helps because a key value is also referred by
another index in the array.10 Another benefit to this idea is that the multiplied
value is within the range of scatter table, which would to avoid the problem of
going into an infinite loop.11
-
No, we use hash function to create index for key value. If we use (k*(1
to n)) % n there is a good chance k = bn,12 which means multiply the number randomly chosen does not
change.13 The index key values with k will land in the same index. A good hash function requires widely spread
index.
-
Bubble sort can be modified to stop early when it detects that the array that
it's working on has become sorted. Can insertion sort be modified in a similar
way so that it too will stop when it detects a sorted array? Justify your
answer. Note that the modifications don't explicitly check for a sorted array,
but rather detect some characteristic of the sort's usual operation and
conclude that the array's sorted or not.
Insertion sort can't be modified to take advantage of sorted array segments as
easily as bubble sort can. Each bubble sort pass scans the entire unsorted
array part, and it's the scan that allows early termination (if nothing's
swapped, nothing's out of order, so count the swaps per scan and stop if the
count's 0). Insertion sort, however, doesn't scan the unsorted array part, and
so has no easy way to discover anything about it.
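The swap-counting modification described above can be sketched in Python (the function name and per-pass counter are illustrative): a pass with zero swaps means nothing in the scanned part is out of order, so the sort can stop.

```python
def bubble_sort(a):
    """Bubble sort with early termination: count swaps per scan and
    stop when a full pass performs none."""
    for end in range(len(a) - 1, 0, -1):
        swaps = 0
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
        if swaps == 0:  # the scan found nothing out of order: done
            break
    return a
```

Insertion sort has no analogous counter to check: it never scans the unsorted part, so no quantity observed during its usual operation reveals that the rest of the array is already sorted.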
The answers, approximately verbatim,
and in no particular order:
-
Because a bubble sort shuffles values in an array which can detect when it has
become sorted.1
An insertion will select the next value size value in the array with out moving
other values, with this action it can't stop early as a bubble sort.2
-
Yes, it can be modified in a similar way.3
Insertion sort will stop when it finished comparing each element in an array
and swapping; it takes O(n2).4
for (int ...) ⇒ outer loop takes O(n).
for(int ...) → inner loop takes O(n).
⇒ O(n2).
-
No. Bubble sort can stop early when it detects that the array has been sorted
because the bubble sort doesn't need to compare the rest of the array.5 but insertion sort is
different, even the array were sorted, we still need compare between the insert
value and the sorted value in the array.
-
No, you cannot modify insertion sort similarly to bubble sort to stop when
detects a sorted array. Because insertion sort is a sort-in-i/p
sort.6 It sorts (finalize the
position) in coming element at the time of insertion itself.
Since insertion sort inserts new elements in sorted order, we cannot modify it
to check remaining array is sorted or not.7
-
No. Insertion sort requires a comparison of the currently picked index as a
min value to all other values in the sequence being sorted.8 This is what leads insertion sort to having O(n2)
behavior.9
-
Yes, insertion sort can be modified to stop when it detects a sorted array.
Both bubble sort and insertion sort move many values around, and work in an
similar way. Insertion sort slides the value to the right place, and can be
modified to keep track of the sorted parts of the array.10
-
Can not, we use gap to insert one unsorted value to sorted values. Can not
track the values in unsorted, have to go through unsorted values to fill in the
gap.
-
Bubble sort can detect a sorted array when it makes a pass through the sequence
and does not perform a single swap.
Insertion sort looks at the first value in the sequence of unsorted data, then
inserts it at the appropriate location in the sorted sequence. The best you
can do is have it sweep the sorted data right-to-left instead of
left-to-right. This makes it so you only have to do one comparison for each
value in a sorted sequence to determine if it's in the right location. However,
there is no way to immediately determine the array is sorted based on a
characteristic of the sort's usual operation.
-
Yes, it is possible to stop insertion sort after it detects it a sorted array.
Bubble sort starts from the bottom and goes up, sorting and swapping those
values. Insertion sort goes through the array and sees a value that is smaller
than value that is in the sort array and inserts the smaller value at the
proper place. Once all values are sorted, there are no more unsorted values
and the insertion sort would stop.11
-
A heap sort producing ascending order can use either a min heap or a max heap.
Which heap is the better choice? Justify your answer.
Either will work, but max heaps are simpler. Heapsort builds the heap into the
array being sorted. As elements are removed from the heap, the heap shrinks
from the bottom (at the leaves), opening up space in the array. Assuming an
ascending sort and the tree-array indexing given in class, space in the array
opens up at the right-end of the array; that is, for the larger values.
Building a max heap opens up space in the proper end of the array for the
values removed from the heap.
A min heap would have to open up space at the left end of the array, which
means the heap would have to build right-to-left (root-to-leaf) in the
array. Right-to-left heaps are easy to do (just pretend the heap is
left-to-right, then subtract each index from n - 1, where n is the
array size, to flip the index around), but are marginally more complicated than
left-to-right heaps.
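The max-heap version can be sketched in Python, assuming the standard 2i+1 / 2i+2 child indexing (function names are illustrative):

```python
def sift_down(a, root, end):
    """Restore the max-heap property for a[root..end]."""
    while True:
        child = 2 * root + 1
        if child > end:
            return
        if child + 1 <= end and a[child + 1] > a[child]:
            child += 1  # take the larger child
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child

def heapsort(a):
    n = len(a)
    # Build the max heap in place, bottom-up.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n - 1)
    # Repeatedly move the max into the space that opens at the right end.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end - 1)
    return a
```

The swap at the top of the second loop is exactly the space opening at the right end: the maximum moves to index end, the heap shrinks by one, and the array's right portion accumulates the largest values in ascending order.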
The answers, approximately verbatim,
and in no particular order:
-
Max heap. Using heap we start from array beginning to end,
- If use min heap, pop the min value should be stored at beginning of
array. We do this by
- remove min from heap,1
- add new unsorted value to the heap.2
- If use max heap, after find max value in the heap, we simply swap it with
the end value of array.
-
Max heap is the better choice. Because max heap, the root is the largest one
and all the children is less than parents. So we can put the root to the end
of the order and after that every time we got the rest of largest values. We
can put it into the end of the rest order.3
-
A min heap is the best choice because we want a heap sort producing ascending
order, not descending order.4
-
For an ascending order heapsort, the better choice would be a min heap, since
it is an ascending order you need to start with the smaller value, then
chose the next greater value (basically start with the smallest value then
chose the next smallest value, and so on).5
-
Min heap is the best choice. Being that we want the data to be sorted to
ascending order, we want the least value at the first position in the sequence.
A min heap has the least value at its head, since it guarantees all of its
descendant values are at least its value. The array based implementation of a
min heap further supports this as the head of the heap is stored at the first
position in the sequence (index 0).6
This property, the heap property, continues recursively in the heap.7
-
Min heap would be a better choice because the property of min heap would help
to produce a sort in an ascending order in more time efficient manner.8 The property of the
min heap that every value in parent node is less than its subnodes. So, we
know the values start from the smallest and increases ascendingly. So, using a
min heap is a better choice, when all the values need to be ascending order.
-
A max heap is the better choice because a heap sort repeatedly takes the max
heap element and stores it in an open space. In a max heap, the value of the
parent is always greater than the value of its child, therefore a heap sort can
start at the root and work its way down to its children.9
-
Heap10 is a guaranteed O(n log n) complexity
algorithm. For ascending behavior thus asymptotically both are
same.11 But the expense of swapping
elements is more in max heap compared to min heap.
∴ min heap is better.
-
A min heap would be a better choice for arranging elements in ascending order.
This would provide more efficient access to elements of lower value prior to
elements of grater value.12 One would
be able to access elements of lower value from a higher level13 then grater values instead of having to
traverse fully to leaf nodes at the bottom of the tree.14
-
Selection sort and bubble sort are both O(n²) sorts, but usually selection
sort is faster than bubble sort on the same array. A colleague of yours
explains the difference by noting that selection sort swaps values that are
usually far away from each other, and so is more efficient than bubble sort.
What do you think of your colleague's explanation? Justify your answer.
The explanation almost completely misses the point. Both bubble sort and
selection sort are O(n²) sorts, but only if you're counting data touches or
comparisons. Both sorts repeatedly scan the unsorted array part, but bubble
sort drags along the extreme value, while selection sort remembers its
index. When counting data movements per scan, bubble sort is worst-case O(n),
but selection sort is worst-case O(1), and that's why selection sort can be
more efficient than bubble sort, assuming expensive data movements.
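The difference in data movements can be demonstrated with instrumented Python versions of the two sorts (the counters and function names are illustrative):

```python
def bubble_sort_swaps(a):
    """Bubble sort drags the extreme value along: worst-case O(n)
    swaps per scan."""
    swaps = 0
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
    return swaps

def selection_sort_swaps(a):
    """Selection sort remembers the extreme value's index and swaps
    at most once per scan: worst-case O(1) swaps per scan."""
    swaps = 0
    n = len(a)
    for start in range(n - 1):
        smallest = start
        for i in range(start + 1, n):
            if a[i] < a[smallest]:
                smallest = i
        if smallest != start:
            a[start], a[smallest] = a[smallest], a[start]
            swaps += 1
    return swaps
```

On a reversed six-element array, for example, the bubble version performs 15 swaps while the selection version performs 3; both do the same O(n²) comparisons.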
The answers, approximately verbatim,
and in no particular order:
-
I do not think my colleague's explanation is accurate. The reason selection
sort is usually more efficient than bubble sort is because selection sort keeps
track of the smallest value, whereas bubble sort moves much data because it
“bubbles” each value into the correct place. Selection sort uses fewer
assignments than bubble sort, causing selection sort to be cheaper as well.
-
He is on the right track. Bubble sort is slow when a value being swapped is
far from its final destination. This is because it swaps the value X
times where X is the distance between the value and its destination.
Worst case, a value is swapped (n - 1) times.
Selection sort solves this issue by just pointing to values that need to be
swapped an once it selected the proper values it preforms only one swap. It is
the nature of bubble sort which limits its swaps to adjacent values that makes
it slower than selection sort.
-
I think my colleague's explanation is wrong. It does not matter if selection
sort swaps values far away from each other or near each other. It swaps values
when necessary,1 for example,
if we want to sort element in ascending order, then if a > b then
swaps it2
- largest element in sorted array.
- smallest element in unsorted array.
-
Yes, I believe the explanation. Bubble sort starts from bottom of array, swap
the lowest value to the top and starts again from the bottom of the array and
goes to the top swapping the value and sorts an array.
Selection sort starts from the top of an array and goes to the bottom, selects
the values to be swapped in order to make a sorted array. Selection sort is
thus faster, because it can select a value and swap it to the proper place in
an array.
-
The reason selection sort usually faster than bubble sort is because selection
sort does not swap values in the inner loop. It swaps only after it finds the
min value, so the number of swaps is less than bubble sort, not because swap
values are far away from each other.
-
I think that's right, because both bubble sort and selection sort swap values.
Bubble sort compare two values which are nearby; selection sort can compare two
values which are far away.3 For example, in a sequence list
maybe only two value are not in order. So we can just swap the two values
without comparing each one.4 So the selection sort should be more efficient than bubble sort in
some particular array.
-
I think the explanation is not relevant. The selection sort is faster than
bubble sort because it's number of swaps per iteration are less than bubble
sort.
Selection sort scans entire sequence and selects one max/min value and place in
its position. Swaps = O(1) per iteration.
Bubble sort compares every element and makes swaps at every new element till
the max value reaches its position (i.e., the bubbling up behavior). Swaps =
O(n) per iteration.
Thus: The expense of swaps is the difference between selection and bubble
sorts.
-
I agree that selection sort is faster than bubble sort.5 This is because selection sort uses
2/3 less assignments than bubble sort.6
-
Swapping values from “far away” is one property of selection sort that is
different from bubble sort. It does allow more flexibility in the items that
insertion sort7 may swap as
opposed to bubble sort. The only swapping that bubble sort may do is on
elements that neighbor each other. This characteristic causes bubble sort to
perform more operations in the form of swaps to move an element the same number
of index values that insertion sort could do in one swap. They are correct but
in an over-simplified way without justification.
-
Radix sort was described in class as working from left to right (from most
significant to least significant) through the digits. Would the sort work
correctly if it worked from right to left (from least significant to most
significant)? Justify your answer.
- No, not using the method discussed in class. When making piles in
left-to-right (most-significant-first) order, each pile has a relation to every
other pile; for example, every card in the 900s pile is larger than every card
in every other pile. This is why left-to-right order can repeatedly subdivide
the piles and still end up with sorted numbers.
Using right-to-left (least-significant-first) order to repeatedly subdivide
piles would result in chaos, because there would be no strong relation between
piles; for example, nothing useful can be said about the relation between the
cards in the 1s and 5s piles,
- Yes, but using a method different from that discussed in class. After
the first right-to-left pass, the individual piles would have to be
re-assembled into a single pile for the second pass over the 10s digits.
However, the relation between 1s piles, although weak, is still necessary: for
example, after the second pass, within any 10s pile, any card from the 2s pile
should come before any card from the 5s pile (assuming an ascending sort). The order can be
maintained by thinking about queues rather than piles. A single queue is
distributed into ten queues, one for each digit. After all numbers have been
distributed, each of the digit queues is re-assembled into a single queue in
the appropriate order for the next pass.
The right-to-left radix sort was (and may still be, in some specialized areas
such as voting) an economically and historically significant sort.
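The queue-of-queues method can be sketched in Python with collections.deque (the function name and the digits parameter, which fixes the number of passes, are assumptions of the example):

```python
from collections import deque

def radix_sort_lsd(nums, digits=4):
    """Right-to-left (least-significant-digit-first) radix sort.
    After each pass the ten digit queues are re-assembled into one
    queue in order 0..9, which preserves the order established by
    the earlier passes."""
    queue = deque(nums)
    for d in range(digits):
        buckets = [deque() for _ in range(10)]
        divisor = 10 ** d
        while queue:
            n = queue.popleft()
            buckets[(n // divisor) % 10].append(n)
        for b in buckets:  # re-assemble, digit-0 queue first
            queue.extend(b)
    return list(queue)
```

The stability of the queues is what carries the weak relation between 1s piles forward: within each later pile, numbers keep the relative order their earlier digits gave them.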
The answers, approximately verbatim,
and in no particular order:
-
Yes, the sort would work correctly because it sorts through decimal digit
numbers.1 It puts elements into groups such as
100s: 000-099, 100-199, ...2
10s: 00-09, 10-19, ...
So it doesn't matter because it puts elements into groups.3
-
No. Radix sort is doing bit comparison operations on the items within the
structure.4 Sorting sequence is dependent upon the
left-to-right characteristics of the bits each element is represented by within
system memory. For example, ordering 3-digit numbers in ascending order would
require 100's ordering then 10s ordering then ones ordering in standard
left-to-right comparison. In right-to-left ordering items with the same 1s
would group followed by same 10s grouping then 100s grouping, possibly only
creating an order by their position within the sequence relative to the closest
ordered grouping they were first part of.
In a right-to-left ordering
101        100
001   ⇒   010
100        001
010        101
-
No. The sort can not work correctly because from the right to left the digits
will change. for example: if we go through the order should be
left → right    right → left
000             100
001             010
010             100
011             110
100             001
101             011
110             101
111             111
So the most of order5 is different from
left-to-right order.
-
No. By observing the comparisons of most significant digits first you are
gaining extra information than just the order of that digit. The extra
information is that every number whose most significant digit is less than
another number of the same number of digit's most significant digit is
therefore less than that number. That is key to the sorting algorithm since it
is grouping the numbers into increasingly ordered groups.
If you were to look at the least significant digits first, you would not gain
this information. For example, if you look at the numbers 91 and 82, you can
see 82 is less than 91 just by comparing the most significant digits 8 < 9.
However, if you were to look at the least significant digits, the fact that 2
> 1 has no relation to the order of 91 and 82.
-
No, assume have three-digit numbers.
[picture]
We start from right, get 4 0s top and 4 1s bottom. Then go to 2nd column and
to 3rd column, the values in 3rd column is not ordered.
-
No, the sort won't work correctly.
Sorting from right-to-left (from least significant to most significant) may
place same range element in different buckets which cannot be reunited again.
E.g., seq = { 0 82 2 102 1000 100 20 1082 }
[picture]
The o/p6 is not sorted sequence,
thus will not sort correctly.
-
No radix sort would not work correctly if it worked from least significant to
most significant through the digits. This is true because radix sort is
designed to begin with the most significant digit, and work its way through.7
-
No, the most efficient way to sort8 in radix sort is to go from left-to-right. This
helps because we are sorting the most significant to the least significant.
The values that are least significant on the right side would be done, once the
sorts from left, as it will reach the end the values would be sorted. But
doing it from the right-to-left would take more time9 and thus would be less efficient way to use it.
-
The radix sort should work just as efficient10 if it were to work from right-to-left
because as long as it is consistent11 but
only in reverse it should be able to work from least significant to the most
significant (following the same sequence but only in reverse).