Programming Assignment 4b - Cumulative Word Counts

Computer Algorithms I, Fall 2003


Due Date

This assignment is due by 5:00 p.m. on Monday, 24 November.

See the assignment turn-in page (last modified on 3 November 2003) for instructions on turning in your assignment.

The Problem

Up to this point, your word counter has been working on a page-by-page basis. mitm() receives a document, scrapes information off it, prints the information, and throws the information away before returning for the next document.

Modify mitm() to be cumulative; that is, the information scraped from each document is kept between calls to mitm(). The information printed by mitm() after each page represents the word counts for all documents browsed since your word counter was started.
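One way to keep information between calls is a static variable. The following is only a sketch, assuming C++; page_table() and record_page() are hypothetical names that are not part of the assignment, and the real mitm() signature may differ.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical types: word -> count for one page, and URL -> those counts.
using WordCounts = std::map<std::string, int>;
using PageTable = std::map<std::string, WordCounts>;

// Returns the single table shared by every call. A function-local static
// is constructed on first use and lives until the program exits, so its
// contents survive between calls.
PageTable& page_table() {
    static PageTable table;
    return table;
}

// Hypothetical helper that mitm() might call once per document.
void record_page(const std::string& url,
                 const std::vector<std::string>& words) {
    WordCounts counts;
    for (const std::string& w : words) {
        ++counts[w];  // operator[] starts a missing word at 0
    }
    page_table()[url] = counts;
}
```

After two calls to record_page() with different URLs, page_table() holds both pages, because the table is not destroyed when the helper returns.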

mitm() output is ordered by increasing lexicographic (dictionary) order on the words, as it was for Assignment 4a. When the same word appears in several documents, the lines for that word are ordered by decreasing word count; documents with the same word count may appear in any order. For example:

these http://www.drew.edu/ 1 232
to http://www.drew.edu/ 5 232
to http://www.monmouth.edu/ 2 105
to http://www.kean.edu/ 2 70
tour http://www.drew.edu/ 1 232
tournament http://www.kean.edu/ 1 70
traphagen http://www.drew.edu/ 2 232
trustees http://www.kean.edu/ 1 70
under http://www.kean.edu/ 1 70
undergrad http://www.drew.edu/ 1 232
university http://www.drew.edu/ 4 232
university http://www.monmouth.edu/ 2 105
university http://www.kean.edu/ 1 70
unsubscribe http://www.drew.edu/ 1 232
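
The ordering shown above can be expressed as a two-key comparison. This is a sketch under assumptions: C++ is assumed, Entry, output_order(), and sort_for_output() are hypothetical names, and only the first three columns of each output line are modeled. std::string compares by character code, which agrees with dictionary order for all-lowercase words like those in the sample.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one output line (fourth column omitted).
struct Entry {
    std::string word;
    std::string url;
    int count;
};

// True when a should be printed before b: words in increasing order,
// and within one word, larger counts first. Entries with the same word
// and count compare as equivalent, so they may land in either order.
bool output_order(const Entry& a, const Entry& b) {
    if (a.word != b.word) {
        return a.word < b.word;
    }
    return a.count > b.count;
}

void sort_for_output(std::vector<Entry>& entries) {
    std::sort(entries.begin(), entries.end(), output_order);
}
```

Because ties on both keys are left unordered, std::sort is sufficient here; std::stable_sort would only matter if tied documents had to keep their arrival order.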

If mitm() gets a page it has already scraped, it should scrape the page again and replace the page's old information with the new information.
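
Replacing, rather than adding to, a page's old information might look like the sketch below. It assumes C++ and a flat list of records; Record and replace_page() are hypothetical names, and a solution built on a different data structure would replace the old data differently.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one word's count on one page.
struct Record {
    std::string word;
    std::string url;
    int count;
};

// Discards every record previously stored for url, then stores the
// freshly scraped counts. For a URL seen for the first time nothing is
// removed, so the same function serves both cases.
void replace_page(std::vector<Record>& records, const std::string& url,
                  const std::map<std::string, int>& fresh) {
    // Erase-remove idiom: move the records to keep to the front, then
    // chop off the leftover tail.
    records.erase(
        std::remove_if(records.begin(), records.end(),
                       [&url](const Record& r) { return r.url == url; }),
        records.end());
    for (const auto& kv : fresh) {
        records.push_back(Record{kv.first, url, kv.second});
    }
}
```

Removing the old records first matters: simply adding the new counts to the old ones would double-count words that appear in both versions of the page, and would keep words that have since disappeared from it.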

You might want to read up on global and static variables in Deitel & Deitel (start at page 181).

You can find os-word-counter, my solution to this assignment, in the assignment directory /export/home/class/cs-305/pa4b, where os is either linux or solaris. Remember, the objective of this assignment is to cumulatively keep track of document word counts; it is not to faithfully reproduce the behavior of my solution. If my solution's wrong and you copy the error, you're going to lose points.


This page last modified on 11 November 2003.