Data Structures & Algorithms, Fall 2010

Programming Assignment 3 - An Example Solution


Table of Contents

Introduction

This page presents a solution to the third programming assignment, which required a method that searches a binary tree for a matching subtree.

The first matching rule states that two empty trees match. Because that can be immediately tested, it's easiest to get that out of the way and then go on to the more complicated case if necessary.

Note that, if both trees are empty, match() returns the source tree, in accordance with the matching rule that the subtree returned comes from the source tree. Several implementations chose to ignore this part of the matching rules and returned the pattern, which resulted in the matched subtree contains nodes that are not from the source tree message during testing.

What happens if the pattern is empty but the tree is not? This case isn't covered by the rules, but there are two sensible answers: there's a match, or there isn't. If there isn't a match, return null. However, in regular-expression matching over strings, the empty-string pattern matches an empty string in the source. By analogy, the empty-pattern tree should match an empty tree in the source. Because the source tree isn't empty, the code has to rummage around for an empty node from the source tree, which is best handed off to a private method.

On to the second matching rule: two trees match if their roots have the same value and their respective child trees match.

Before applying the second rule, the code has to make sure the source-tree root has a value; that is, that the source tree isn't empty. Otherwise, just implement the rule. If the second rule fails, the search continues with the source tree's left and right children, in accordance with the definition of binary trees.

Despite the simplicity of this tree-matching code, many implementations went drastically wrong when they tried to implement it.

The first, minor, problem occured when implementations returned pattern rather than tree; this is the same problem described above. The second problem is more serious: way too many (eight out of ten) implementations ignored the simple definition given by the second matching rule, and coded the tree match as an ungainly sequence of heavily nested if statements and boolean flags.

Ignoring the structure of the second match rule had several unpleasant consequences. First, almost all (six out of eight) complex match implementations failed on at least one test, either because the code was wrong, or it missed a case (or both). Second, the complex implementations used orders of magnitude more code than the second-rule implementation, and were spread out over pages, which made the code difficult to understand and debug. Third, it took days to write, test, and debug the complex implementation, instead of the ten to 15 minutes it would have taken to implement the second matching rule directly.

The third problem was most serious: most implementations used the wrong test for equality. Instead of using equals(), code used ==:

tree.value() == pattern.value()

Fortunately for this poorly-written code, the tests use Integers as the generic type parameter, and for Integers equals() produces the same result as ==. However, if the tests were run with StringBuilder for example, all the tests that should have succeeded would have failed.

That's all that was needed to solve the third programming assignment. The only part left is findEmptyNode(), which is only interesting because of its non-recursive tree-search implementation.

Index


This page last modified on 16 December 2010.

Creative
    Commons License