See the assignment turn-in page (last modified on 3 November 2003) for instructions on turning in your assignment.
http://pagelog/command?args
If the command portion of the URI is missing, the URI is syntactically incorrect.
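As an illustration of the URI form above, the following is a minimal sketch of splitting a page-logger URI into its command and args. The names Command and parse_command are hypothetical and not part of the assignment; the sketch also assumes the command is everything between http://pagelog/ and the first ?.

```cpp
#include <string>

// Hypothetical helper type: the pieces of a page-logger URI.
struct Command {
    std::string name;   // e.g. "output", "find", or "clear"
    std::string args;   // text after the first '?', empty if absent
    bool has_query;     // true if a '?' was present
};

// Splits uri into command and args. Returns false if the URI does not
// start with the page-logger prefix or if the command portion is missing.
bool parse_command(const std::string& uri, Command& out) {
    const std::string prefix = "http://pagelog/";
    if (uri.compare(0, prefix.size(), prefix) != 0) return false;

    const std::string rest = uri.substr(prefix.size());
    const std::string::size_type q = rest.find('?');

    out.has_query = (q != std::string::npos);
    out.name = rest.substr(0, q);  // whole string when there is no '?'
    out.args = out.has_query ? rest.substr(q + 1) : std::string();

    return !out.name.empty();      // a missing command is a syntax error
}
```

Checking that the command is one of the recognized names is left to the caller in this sketch.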
The commands your page-logger should recognize are output, find, and clear. The forms and behaviors of these commands are described below.
output:
http://pagelog/output?filename
outputs the cumulative word list to a file with the given name, or to std-out (which corresponds to std::cout) if filename or ?filename is missing. The new file should replace an old file of the same name.
filename may begin with a slash, as in, for example, the URI
http://pagelog/output?/tmp/words.out
In such cases the initial slash is part of the filename, which, to continue the example, is /tmp/words.out; this creates the file words.out in the directory /tmp. Filenames that are not absolute are interpreted relative to the directory in which the page-logger is running.
There are three filenames your code should recognize and treat specially:
http://pagelog/output?std::cout - Output should be written to std-out (which corresponds to std::cout). This is the default behavior in the absence of a filename.
http://pagelog/output?std::cerr - Output should be written to std-error (which corresponds to std::cerr).
http://pagelog/output?html - Output should be written to (that is, stored in) a storage block allocated within mitm() and returned via the resource parameter in mitm(); the output should not be null terminated. The size field of the resource contains the output's length in characters. The information in the block will be sent back to the requesting browser as an HTML page; the code calling mitm() will free the storage block.
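The storage-block handling for html output might be sketched as follows. This is only an illustration under stated assumptions: the Resource type here is a hypothetical stand-in for the actual resource parameter of mitm() (only a size field is mentioned above; the data field name is invented), and the use of malloc() assumes the calling code releases the block with free(). Check the real declaration and allocation rules before relying on any of this.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the resource parameter of mitm().
struct Resource {
    char* data;        // assumed name for the storage-block pointer
    std::size_t size;  // length of the output in characters
};

// Copies text into a newly allocated block without a terminating null.
// Returns false if the allocation fails.
bool store_in_resource(const std::string& text, Resource& res) {
    res.data = 0;
    res.size = 0;
    if (text.empty()) return true;  // nothing to allocate

    // Assumption: the caller frees the block with free().
    char* block = static_cast<char*>(std::malloc(text.size()));
    if (block == 0) return false;

    std::memcpy(block, text.data(), text.size());  // no '\0' is appended
    res.data = block;
    res.size = text.size();
    return true;
}
```

Because the block is not null terminated, code reading it should always use the size field rather than functions such as strlen().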
The special filenames must appear exactly as given above; the URI
http://pagelog/output?cout
creates a file called cout in the directory in which the page-logger is running and writes the word statistics to it.
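The filename rules above can be sketched as a small classification step followed by a file write. The names Destination, classify_output, and write_file are hypothetical; the exact-match comparison reflects the requirement that special filenames appear exactly as given.

```cpp
#include <fstream>
#include <string>

// Hypothetical names for the four places output can go.
enum class Destination { StdOut, StdErr, Html, File };

// Maps the filename part of an output command to a destination.
// Only exact matches are special; anything else names an ordinary file.
Destination classify_output(const std::string& filename) {
    if (filename.empty() || filename == "std::cout") return Destination::StdOut;
    if (filename == "std::cerr") return Destination::StdErr;
    if (filename == "html") return Destination::Html;
    return Destination::File;
}

// Writes text to filename, replacing any old file of the same name.
// Relative names are resolved against the current working directory.
bool write_file(const std::string& filename, const std::string& text) {
    std::ofstream out(filename.c_str(), std::ios::out | std::ios::trunc);
    if (!out) return false;
    out << text;
    out.flush();
    return out.good();
}
```

With this sketch, classify_output("cout") yields Destination::File, matching the example above in which a file called cout is created.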
find:
http://pagelog/find?[ keyword [ , keyword ]... ]
where the notation [ A ] means that A is optional (it may or may not appear), and the notation A... means zero or more repetitions of A (nothing, or A, or AA, or AAA, and so on).
Each keyword is taken to be a word in a Web page. It is not required that a keyword match the syntax used to define words in a Web page. For example, <300!!!> is a valid keyword; it won't be a useful keyword, but that's not the page-logger's problem.
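A minimal sketch of extracting the keyword list from the args of a find command follows. The function name split_keywords is hypothetical, and the sketch makes two assumptions the text above does not settle: keywords are taken verbatim between commas (no whitespace trimming), and empty pieces such as the one in "a,,b" are skipped.

```cpp
#include <string>
#include <vector>

// Splits the text after '?' in a find command into keywords.
// Assumptions: no trimming is done, and empty pieces are ignored.
std::vector<std::string> split_keywords(const std::string& args) {
    std::vector<std::string> keywords;
    std::string::size_type start = 0;

    while (start <= args.size()) {
        std::string::size_type comma = args.find(',', start);
        if (comma == std::string::npos) comma = args.size();

        // Keep the piece between start and the comma if it is non-empty.
        if (comma > start) keywords.push_back(args.substr(start, comma - start));
        start = comma + 1;
    }
    return keywords;
}
```

No syntax check is applied to the individual keywords, so a keyword such as <300!!!> passes through unchanged.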
In response to a find command, the page-logger should return a list of pages containing all the keywords given in the command. The list should be a sequence of URIs, each separated from the next by a newline character (the last URI, if any, need not be followed by a newline character). URIs in the list should be ordered by descending hit count, which is the total number of times all keywords in the find command match words in the page; URIs with identical hit counts can be arbitrarily ordered relative to one another. If a find command contains no keywords following the ?, or contains neither a ? nor keywords, then all pages collected so far should be listed.
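The selection and ordering rules above might be sketched as follows. The data layout (a map from page URI to per-word counts) and the names WordCounts, PageIndex, find_pages, and join_uris are assumptions made for illustration; your page-logger may store its information differently.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical storage: page URI -> (word -> occurrences in that page).
typedef std::map<std::string, int> WordCounts;
typedef std::map<std::string, WordCounts> PageIndex;

// Orders (hit count, URI) pairs by descending hit count.
static bool more_hits(const std::pair<int, std::string>& a,
                      const std::pair<int, std::string>& b) {
    return a.first > b.first;
}

// Returns the URIs of pages containing every keyword, highest hit count
// first. With no keywords, every page qualifies with a hit count of 0.
std::vector<std::string> find_pages(const PageIndex& pages,
                                    const std::vector<std::string>& keywords) {
    std::vector<std::pair<int, std::string> > hits;
    for (PageIndex::const_iterator p = pages.begin(); p != pages.end(); ++p) {
        int total = 0;
        bool has_all = true;
        for (std::size_t i = 0; i < keywords.size() && has_all; ++i) {
            WordCounts::const_iterator w = p->second.find(keywords[i]);
            if (w == p->second.end() || w->second <= 0) has_all = false;
            else total += w->second;
        }
        if (has_all) hits.push_back(std::make_pair(total, p->first));
    }
    std::stable_sort(hits.begin(), hits.end(), more_hits);

    std::vector<std::string> uris;
    for (std::size_t i = 0; i < hits.size(); ++i) uris.push_back(hits[i].second);
    return uris;
}

// Joins URIs with newlines; no newline follows the last URI.
std::string join_uris(const std::vector<std::string>& uris) {
    std::string out;
    for (std::size_t i = 0; i < uris.size(); ++i) {
        if (i > 0) out += '\n';
        out += uris[i];
    }
    return out;
}
```

Pages with equal hit counts keep the map's iteration order here, which is one acceptable choice since ties may be ordered arbitrarily.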
The page list should be placed in a block of storage and returned via the mitm() resource parameter, just as is done for the HTML output described above.
clear:
http://pagelog/clear
The clear command clears all cumulative page information gathered so far, effectively restarting the page-logger.
If an error occurs when processing a command, your code should pass back an informative error message using the same approach as described for HTML output above. The error message will be sent back to the requesting browser as an HTML page.
This page last modified on 8 December 2003.