CS 176, Introduction to Computer Science II

Lab Assignment 3 Answer


This program is supposed to find all occurrences of one string in another string. Going with the idea of extremes, two extremes with respect to string location are the beginning and the end of a string.

$ la3
Input a sentence:  plink plonk plunk
Input characters:  pl
plink plonk plunk
      ^^
plink plonk plunk
            ^^
Input a sentence:  
There's a problem with finding a match at the start of a string. Poking around in the searcher methods, the first thing next() does is advance the search index to avoid finding the same string it found last time. This suggests that perhaps next() is advancing too far the first time it's called.

Looking at the searcher constructor verifies the suggestion: the constructor sets the search index to the first character and the first call to next() advances it to the second character, which will miss a match starting at the first character of the string. The fix is simple: set the search index to the character before the first character of the string; this character doesn't exist, but that doesn't matter because it will never be accessed because the first thing next() does is advance the index to the first character.

searcher(string s, string w) {

 // Construct an instance of a searcher that finds, in left to right
 // order, all occurrences of w in s.

 str = fix(s);
 wrd = fix(w);
 search_index = -1;
 }
The first test case confirms the fix:
cl la3
Input a sentence:  plink plonk plunk
Input characters:  pl
plink plonk plunk
^^
plink plonk plunk
      ^^
plink plonk plunk
            ^^
Input a sentence:  
Going to the other extreme:
$ la3
Input a sentence:  plink plonk plunk
Input characters:  nk
plink plonk plunk
   ^^
plink plonk plunk
         ^^
Input a sentence:  
Another problem, and next() seems a likely suspect.
int next(void) {

 // Return the index of the next occurrence of wrd in str or -1 if no such
 // occurrence exists.

 search_index++;

 while (search_index + wrd.length() < str.length()) {

   bool found = true;

   for (int j = 0; (j < wrd.length()) && found; j++)
     found = (wrd[j] == str[search_index + j]);
   if (found) return search_index;

   search_index++;
   }

 return -1;
 }
The while loop controls which portions of the string get searched, and sure enough the loop terminates one character too early. The index search_index + wrd.length() references one past the end of a potential match. The for loop, however, never references one past the end of a match. There is no problem with bad array accesses when search_index + wrd.length() equals str.length() and the while loop quits one iteration too early.
while (search_index + wrd.length() <= str.length()) {

  bool found = true;

  for (int j = 0; (j < wrd.length()) && found; j++)
    found = (wrd[j] == str[search_index + j]);
  if (found) return search_index;

  search_index++;
  }
Using while (search_index + wrd.length() < str.length() + 1) { also works. However, using while (search_index < str.length()) { Does not work. Why? The second test case shows the fix:
cl la3
Input a sentence:  plink plonk plunk
Input characters:  nk
plink plonk plunk
   ^^
plink plonk plunk
         ^^
plink plonk plunk
               ^^
Input a sentence:  
The third test case relies on the more traditional extreme of size:
cl la3
Input a sentence:  blah
Input characters:  for a long time I was without style or grace
Input a sentence:  for
Input characters:  for a long time I was without style or grace
Input a sentence:  
Mercifully, it appears the code can handle the case when the searched-for string is longer then the string being searched, even if the first part of the string matches.


This page last modified on 6 June 2001.