'
) or a double-quote characters ("
); the
same delimiter character must appear on both ends of the quotation. Any
characters may appear in a quotation; if the delimiter character appears in the
quotation, it must be doubled (that is, appear as ''
or ""
).
Write an algorithm that accepts a string and finds a quotation in the string, stopping at the end of the first quotation it finds, or at the end of the string. The input string may or may not contain a quotation, and the quotation may or may not be syntactically correct.
This FSM treats a string with no quotation as an error; alternatively, you could draw the end arrow from Start to End. Most people who got close to the right answer tripped up over strings such as![]()
""""
.
Backwards reasoning traces from the symptoms causing distress to the problem causing the symptoms. For performance, distress is caused by excessive resource use; too much time or space, for example. After measuring the program to determine the symptoms, backwards reasoning, along with detailed trace measurements, would trace back to the sections of code that are causing the excessive resource use.
Notice how well the 80-20 rule works with backward reasoning. The 80-20 rule says that excessive resource use is generally caused by a small portion of the code. This means that you usually have only a few specific locations in code to find using backwards reasoning.
Their spam filter had a simple double loop implementation:
This algorithm has asymptotic running time italit(O(MP)), where M is the number of messages to check and P is the number of spam patterns for which to check. In a production environment with a high volume of mail, both M and P are large enough turn the spam filter into a bottleneck.for each message m for each spam pattern p if p found in m then do something.
The greatest performance improvement came from specializing the inner loop to cut down on the number of iterations. By sorting the patterns and indexing on the first letter, the spam filter could quickly and efficiently locate the patterns starting with a particular letter.
The worst-case asymptotic running time remains at italit(O(MP)) for the improved spam filter. The average-case asymptotic running time improved, although not as much as expected, for reasons explained by Kernighan and Pike in Chapter 7.
The 80-20 rule is a heuristic (or rule of thumb) stating that 80% of a program's execution time is spent in 20% of the code.
The 80-20 rule is important because it suggests that a program with poor performance may have one or two fairly small sections of hot-spot code that are causing the problem. The first step in improving the program's performance is to find the hot spots and fix them.
This page last modified on 28 February 2000.