Somewhat less obscurely, consider the difference
optimal encoding size - optimal' encoding sizewhere optimal' is the same tree as optimal, except x and min1 have been swapped. Because the only difference between optimal and optimal' is the positions of x and min1, the difference reduces to
[ f(min1)lo(min1) + f(x)lo(x)) ] - [ f(min1)lo'(min1) + f(x)lo'(x) ]Also because of the swapping, lo'(x) = lo(min1) and lo'(min1) = lo(x), clears lo from the difference:
[ f(min1)lo'(x) + f(x)lo'(min1)) ] - [ f(min1)lo'(min1) + f(x)lo'(x) ]which can be re-arranged to
[ f(min1)lo'(x) - f(min1)lo'(min1)] + [ f(x)lo'(min1) - f(x)lo'(x)) ]One more re-arrangement gives
(f(min1) - f(x)(lo'(min1) - lo'(x))Because f(min1) ≤ f(x) and lo'(x) ≤ lo'(min1), the product must be nonnegative, which means the greedy encoding size must be at most the optimal encoding size.
This page last modified on 24 January 2006.