Given that the data can undergo arbitrary preprocessing, bzip2 is not the best
possible compression algorithm available. However, all high-performance
algorithms produce compressed data that is unusable in the data structures
being considered. The 333,414 bytes of compressed data are used only as an
unrealizable lower-bound estimate, to illustrate the improvements provided by
Bloom filters.
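
A lower-bound estimate of this kind can be obtained along the following lines. This is only a minimal sketch, assuming the data is a list of words that is sorted and joined with newlines before compression; the function name compressed_size and the sample word list are hypothetical and do not reproduce the 333,414-byte figure above.

```python
import bz2

def compressed_size(words):
    """Return the bzip2-compressed size, in bytes, of a word list.

    Assumption: the words are sorted and newline-joined first, as one
    simple form of preprocessing. Other preprocessing may compress better.
    """
    raw = "\n".join(sorted(words)).encode("utf-8")
    return len(bz2.compress(raw, compresslevel=9))

# Hypothetical sample data, standing in for a real dictionary file.
sample_words = [f"word{i}" for i in range(10000)]
raw_size = len("\n".join(sorted(sample_words)).encode("utf-8"))
print(raw_size, compressed_size(sample_words))
```

The resulting byte count only bounds the storage needed; the compressed stream cannot answer membership queries without being decompressed first.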
This page last modified on 2006 January 24.