Programming Assignment 6 - Geometric Hashing

Advanced Programming I, Spring 2006

Due Date

This assignment is due by 5:30 p.m. on Thursday, 20 April.

See the assignment turn-in page (last modified on 18 January 2006) for instructions on turning in your assignment.

Hershey fonts define letters by a set of strokes, where each stroke is a chain of points. For example, the letter "H" is represented by the three strokes ((0, 0), (0, 4)), ((4, 0), (4, 4)), and ((0, 2), (4, 2)), where each stroke is a chain of two points.

Given a set C of characters from a Hershey font, the character-recognition problem involves determining whether a set of strokes describes one of the characters in C. If the unknown character and the characters in C are from the same font, character representation is fairly simple problem: group the characters in C by the number of strokes in each character, and within each group order the characters by the number of points in each stroke. Given an unknown character c, find in C the characters with the same number of strokes and sequence of chain sizes. Then a straightforward comparison of chains determines which, if any, character c matches.

Character recognition gets a bit harder when the unknown character is described only by its points and not by the set of strokes. In this representation the letter 'H' is ((0, 0), (0, 2), (0, 4), (4, 0), (4, 2), (4, 4)). The solution, however, is approximately the same: group the characters in C by number of points, and then order the points in each character in some way, such as by increasing x first and then by increasing y.

Character recognition gets still a bit harder when some of the points may be missing from the unknown character. The previous solution has to be extended to look at all known characters with at least the same number of points as the unknown character. In addition, the solution may not produce a single answer because an unknown character with missing points may match several characters. For example, an "I" missing the bottom horizontal stroke may match a "T", or a "Q" without the short stroke may match a "O". The usual approach in this case is to present all the possibilities in, for example, decreasing order of number of points matched.

The character recognition problem gets involved when the unknown character, in addition to having missing points, has undergone rigid-body rotations and scaling. Now the points in the unknown character are completely different from the points in the known characters and a simple comparison fails. However, because the transformations were rigid-body, the geometric relations among the unknown points still hold, and that is the key to implementing character recognition.

To solve character recognition with missing and transformed points, all the characters, known and unknown, should be placed in canonical (or standard) form; this eliminates variations and provides a consistent basis for comparison. The usual canonical form is to pick two points from the known characters, say the first and last or the first and second, and form a line segment S between the two points. Then transform (that is, rotate and scale) the character so that S co-incides with the unit segment on the Y axis.

Given a possibly incomplete set of possibly translated points from an unknown character, recognition proceeds by putting the unknown character in canonical format and comparing the results to the known characters in canonical format using the approach given for recognizing unknown characters with untranslated but possibly missing points described above.

There is a problem, however: which two points in the unknown character should be chosen to form S, the line segment transformed to the unit interval on the Y axis? In order to insure a match between two characters, the same two points for S have to be chosen on each character. And the points in the unknown character can be in any order, so it's impossible to say, for example, which are the first and last points.

The solution to this problem is to use brute force: examine all possible pairs of points in the unknown character. Put the unknown character in canonical form using each pair of points, and compare the now possibly canonical character to the known canonical characters, keeping track of the best match found so far.

This solution, although it works, is expensive: there are O(n²) possible canonical forms for the unknown character where n is the number of points in the unknown character, each of which has to be compared to the known characters. This is where geometric hashing comes in.

In geometric hashing, the known character points, in canonical form, are stored in a hash table. The points of each possible variant of the unknown character are hashed into the same table; collisions indicate possible matches. After all the points of a canonical version of the unknown character are hashed, the number of matching points gives an indication of likelihood that the known and unknown characters match.

The Problem

Write a program that uses geometric hashing to determine if a given unknown character matches a given known character.

Input

Input to your program comes from the procedure

void letters::trial(double * & normal, double * & distorted);

which stores into each pointer reference an array describing a letter. Each letter returned is randomly selected from the available letters; your program has the task of determining if the two letters returned by trial() are the same.

The contents of the normal array are a complete and correct description of the points in a letter as given by the font. The contents of the distorted array are also the points in a letter given by the font; however, the points may have undergone any of the following transformations:

The points may have been scaled by an arbitrary factor.
The points may have been rotated about the origin by an arbitrary angle.
Some subset of the points may be missing.

The scaling and rotation transformations are rigid-body: the same transformation was applied to all points. These transformations are applied probabilistically; if the die rolls right, it may be that none of them are applied, in which case the distorted description will also be a complete and correct description of the points in a letter given by the font.

Both arrays returned by trial() have the same format, so only the normal will be described.

normal[0] is the number of points in the character.
normal[2*i + 1] is the x coordinate of the ith point for 0 ≤ i < normal[0].
normal[2*(i + 1)] is the y coordinate of the ith point for 0 ≤ i < normal[0].

Both arrays should be freed by the caller when no longer needed. The program you submit should call letters::trial() exactly once; your program may call it as many times as necessary while you develop and implement your code.

You can find the files letters.h and letters.cc in the /export/home/class/CS-503/a6 assignment directory.

Output

When your program has finished processing the letters received from letters::trial(), it should call

bool letters::results(bool same_characters)

with true if your program has determined that the two characters received from letters::trial() are the same or false otherwise. The return value from letters::results() indicates if your program's determination is correct (true) or incorrect (false).

The program you submit should call letters::results() exactly once; your program may call it as many times as necessary while you develop and implement your code.

Testing

The procedures

double * letters::normal(char c);
double * letters::distorted(char c);

return an array describing the given letter c, 'A' ≤ c ≤ 'Z'. The array has the format given above and should be freed by the caller when no longer needed.

These procedures are for your convenience when developing and testing your code. The program you submit should not call either of these procedures.

This page last modified on 21 April 2006.