See the assignment turn-in page (last modified on 3 November 2003) for instructions on turning in your assignment.
mitm() receives a document, scrapes information off it, prints the information, and throws the information away before returning for the next document.
Modify mitm() to become cumulative; that is, information scraped from each document is kept between calls to mitm(). The information printed by mitm() after each page represents the word counts for all documents browsed from the time your word counter was started.
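One way to keep information between calls is a function-local static table. What follows is only a minimal sketch, assuming C++ and whitespace-separated words; word_table() and record_document() are hypothetical names, and record_document() stands in for the scraping step because the real mitm() signature is not given on this page.

```cpp
#include <map>
#include <sstream>
#include <string>

// word -> (document URL -> occurrences of the word in that document)
typedef std::map<std::string, std::map<std::string, int> > WordTable;

// Hypothetical helper: returns the one table shared by every call.
// The function-local static is created on first use and keeps its
// contents until the program exits.
WordTable & word_table() {
  static WordTable table;
  return table;
}

// Hypothetical stand-in for the scraping step of mitm(): splits text on
// whitespace and adds each word's count to the shared table.
void record_document(const std::string & url, const std::string & text) {
  std::istringstream in(text);
  std::string word;
  while (in >> word)
    ++word_table()[word][url];
}
```

Because the table lives inside word_table(), nothing outside that function can reach it except through the returned reference; a global variable would also work but is visible to the whole program.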
mitm() output is ordered by increasing lexicographic (dictionary) order on the words, as it was for Assignment 4a. When the same word appears in several documents, the documents are ordered by decreasing word count; documents with the same word count may appear in any order. For example:
    these        http://www.drew.edu/      1  232
    to           http://www.drew.edu/      5  232
    to           http://www.monmouth.edu/  2  105
    to           http://www.kean.edu/      2   70
    tour         http://www.drew.edu/      1  232
    tournament   http://www.kean.edu/      1   70
    traphagen    http://www.drew.edu/      2  232
    trustees     http://www.kean.edu/      1   70
    under        http://www.kean.edu/      1   70
    undergrad    http://www.drew.edu/      1  232
    university   http://www.drew.edu/      4  232
    university   http://www.monmouth.edu/  2  105
    university   http://www.kean.edu/      1   70
    unsubscribe  http://www.drew.edu/      1  232
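The ordering rule can be written as a single comparison function handed to std::sort. This is a sketch under two assumptions: the code is C++, and "word count" means the number of times the word occurs in the document (the third field in the sample). Entry, prints_before() and order_for_output() are hypothetical names, and the fourth field of the sample lines is left out.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// One output line: a word, the document it came from, and how many
// times the word occurs in that document (assumed meaning of the
// third field in the sample output).
struct Entry {
  std::string word;
  std::string url;
  int count;
};

// True if a should be printed before b: words in increasing dictionary
// order, and for equal words the larger count first.
bool prints_before(const Entry & a, const Entry & b) {
  if (a.word != b.word)
    return a.word < b.word;
  return a.count > b.count;
}

// Rearranges entries into output order; ties may land in any order.
void order_for_output(std::vector<Entry> & entries) {
  std::sort(entries.begin(), entries.end(), prints_before);
}
```

std::string compares character codes, so uppercase letters sort before lowercase ones; the sketch assumes the words have already been reduced to a single case, as they are in the sample.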
If mitm() gets a page it has already scraped, it should scrape the page again and replace the page's old information with the new information.
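Replacement is simplest when the stored information is keyed by document. The following is a minimal sketch, assuming C++ and whitespace-separated words; PageTable, count_words() and store_page() are hypothetical names, not part of the assignment code.

```cpp
#include <map>
#include <sstream>
#include <string>

// word -> occurrences of the word in one document
typedef std::map<std::string, int> WordCounts;
// document URL -> that document's word counts
typedef std::map<std::string, WordCounts> PageTable;

// Hypothetical helper: counts the whitespace-separated words in text.
WordCounts count_words(const std::string & text) {
  WordCounts counts;
  std::istringstream in(text);
  std::string word;
  while (in >> word)
    ++counts[word];
  return counts;
}

// Stores the counts for url, discarding whatever was stored for it before.
void store_page(PageTable & pages, const std::string & url,
                const std::string & text) {
  pages[url] = count_words(text);
}
```

Keying the table by document makes replacement a single assignment. A table keyed by word would first need the page's old entries erased from every word that mentioned it.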
You might want to read up on global and static variables in Deitel & Deitel (start at page 181).
You can find os-word-counter, my solution to this assignment, in the assignment directory /export/home/class/cs-305/pa4b; os is one of linux or solaris.
Remember, the objective of this assignment is to cumulatively keep track of
document word counts; it is not to faithfully reproduce the behavior of my
solution. If my solution's wrong and you copy the error, you're going to lose
points.
This page last modified on 11 November 2003.