Lecture notes for CS 503, Advanced Programming I

Advanced Programming I Lecture Notes

4 April 2006 • Hashing Basics


Easy after a small amount of thought. Suppose p[] is an n-element identity permutation (that is, p[i] = i for 0 ≤ i < n). There are at least two ways to generate a random permutation in p:

for i = 0 to n - 1
  swap(p[i], p[random(n)])
and
<<<<<<< gen-hashb.html
#! /bin/bash
#
# ranperm256 - write to std-out a random permutation of the 256 integers
#   0..255.  The random permutation is written as a 16x16 matrix.

awk 'BEGIN {
       for (i = 0; i < 256; i++)
         rp[i] = i

       srand()
       for (i = 0; i < 255; i++) {
         j = i + int(rand()*(256 - i))
	 t = rp[i]
	 rp[i] = rp[j]
	 rp[j] = t
         }

       for (i = 0; i < 16; i++) {
         for (j = 0; j < 16; j++)
	   printf "%4d", rp[i*16 + j]
	 print ""
	 }
       }
    ' < /dev/null
=======
for i = 0 to n - 1
  swap(p[i], p[i + random(n - i)])
>>>>>>> 1.14
where random(n) returns at random one of the n integers 0 to n - 1. The first for loop, because each element in p can be swapped with n values, generates nn possible permutations. The second for loop, because element i can be swapped with n - i values, generates n! permutations.

There are n! possible permutations of n values. The second for loop potentially generates all possible permutations with equal probability (1/n!). The first for loop generates far more potential permutations than there are possible ones, making it less clear that all possible permutations are equally likely. In addition, statistical analysis of the permutations generated by the first for loop shows, surprisingly, that the generated permutations tend to clump together over the range of possible permutations and are not uniformly distributed. Of the two loops, the second loop is the more effective random-permutation generator, both conceptually and emperically.

The article Randomized Shuffling by Tim Rolfe (Dr Dobb's Journal, January, 2000) contains a detailed comparison of these two loops, including graphs showing statistical properties of the generated permutations.


This page last modified on 24 January 2006.