Lecture Notes for CS 512, Algorithm Design
17 April 2000, Greed Problems
- huffman codes
(a Java animation)
- given a set of n characters, encode them
- fixed-length encoding - use ceiling(log n) bits
- variable-length encoding - let more frequent characters be
represented with fewer bits than are less frequent characters
- how to encode according to frequency
- use a full binary tree with leaves representing characters to be
encoded
- build the tree so that more frequently appearing characters are
closer to the root - minimize the sum of the depth-frequency product
for each character
- the path from root to leaf gives the encoding - left is 0 and right
is 1
- how to build the tree
- does it have the optimal subproblem property
- let T be optimal and consider sibling characters x and
y
- form T' from T by combining x and y into a
character z with fz = fx fy
- let T'' be an optimal version of T'
- then replacing z in T'' with x and y would
produce a more optimal version of T, a contradiction
- T'' can't exist and T' is optimal
- the optimal subproblem property holds
- does it have the greedy choice property
- let T be an optimum tree T and the characters b and
c be furthest from the root of T; let fb <= fc
- x and y have the lowest frequencies with fx <=
fy; these are the two characters picked by the greedy
algorithm
- form T' by swapping x and b in T
- the depth-product sum for T is no less than the depth-product
sum for T', but T is optimal, so its sum is no more than
the sum for T' - that is the sums are equal.
- repeat the argument with characters c and y
- the optimal tree T can be converted into an optimal tree
incorporating the first greedy choice
- repeat as needed
- the greedy choice property holds
- the tree-building algorithm
- store the characters in a min-priority queue ordered by frequency
- repeatedly select the two smallest items, combine into a tree node
with a combined frequency and add the new node to the queue
- stop when the queue contains one item - the root of the optimal
tree
This page last modified on 17 April 2000.