The two most prominent things in the problem statement are
the mailbox reading program readmbox
and the mailbox. These are not the
only things mentioned in the problem statement - there are readmbox
commands and messages, for example - but all the other things can be associated
with either readmbox
or the mailbox. In terms of abstraction, it seems
that readmbox
and the mailbox are the most abstract things, in terms of
encompassing and hiding the most details, and so represent a good place to
start the design of the system.
Because the problem statement defines the structure of mailboxes, there isn't
much design we have to do with respect to the mailbox itself. readmbox
, on
the other hand, is defined largely in terms of what the user can do with it:
attach it to a mailbox, get the number of messages in the mailbox, and print
the contents of messages in the mailbox. That's about all the problem
statement says about readmbox
, so these actions will serve as the starting
point of readmbox
's design.
(As a software engineering aside, the previous paragraphs are a simple example of the object-oriented design technique called noun-verb analysis. As you move along your software engineering career, you'll probably run across many, many versions of noun-verb analysis.)
Of the three operations - attach, count, and print - the most important one
seems to be attach; attach must proceed count and print, and once attach is
designed, count and print should follow along directly. How should
readmbox
attach to the mailbox? There are two possibilities: readmbox
treats the mailbox as an external object and manipulates it as a file, or
readmbox
reads the contents into the mailbox and manipulates an internal
representation.
This problem doesn't dictate the choice of which way to attach to the mailbox,
and there's nothing in the problem that makes one way better than the other.
This is supposed to be an exercise in vector programming, so it seems more
appropriate to read the mailbox into readmbox
as some kind of vector.
But, enough of this - ten minutes are up. Let's write some code!
Main
readmbox
main()
is straightforward: initialize the mailbox and go
interact with the user.
<main()
>= (U->)
int main(int argc, char * argv[]) {
if (argc != 2) {
cerr << "Command format is \"" << argv[0] << " mailbox\".\n";
return EXIT_FAILURE;
}
mailbox mbox;
const string resp = mbox.initialize(argv[1]);
if (resp != "") {
cerr << resp << ".\n";
return EXIT_FAILURE;
}
do_io(mbox);
return EXIT_SUCCESS;
}
User interaction is also straightforward: prompt for a command, read it, validate it, and, if valid, execute it. "Execute" in this case means print the message associated with the input number.
<user interaction>= (U->) static void do_io(const mailbox & mbox) { const int mcnt = mbox.msgcnt(); if (mcnt == 1) cout << "There is 1 message.\n"; else cout << "There are " << mcnt << " messages.\n"; while (true) { cout << "* "; string resp; cin >> resp; if (cin.eof() || resp == "q" || resp == "Q") break; if (!is_number(resp)) { cout << "Please reply with a message number or 'q' to quit.\n"; continue; } int mno = atoi(resp.c_str()); if ((mno < 1) || (mno > mcnt)) { cout << "Message number " << mno << " is out of range; the valid range is [1, " << mcnt << "].\n"; continue; } mbox.output(cout, mno); } }
A message number is just a string of digits, and is_number()
checks to make
sure. The is_not_digit()
circumlocution is not necessary because the STL
provides the unary predicate adaptor not1()
, but we haven't got to function
adopters yet.
<number test>= (U->) static bool is_not_digit(char c) { return !isdigit(c); } static bool is_number(const string & str) { return find_if(str.begin(), str.end(), is_not_digit) == str.end(); }
That's it for the readmbox
program itself.
<readmbox.cc
>= #include <iostream> #include <string> #include <algorithm> #include <cctype> #include <stdlib.h> #include "mailbox-char.h" <number test> <user interaction> <main()
>
The Mailbox Class Interface
From the discussion of the main routine above, the mailbox's public interface
must provide functions for at least three operations: initialize the mailbox
from the contents of a mailbox file,
<public mailbox functions>= (U->) [D->] string initialize(const char * mbox_name);
return the number of messages in the mailbox,
<public mailbox functions>+= (U->) [<-D->] unsigned msgcnt(void) const;
and print the contents of a message in the mailbox.
<public mailbox functions>+= (U->) [<-D] void output(ostream & out, unsigned mno) const;
The private parts of the mailbox object begin to reveal how the mailbox is
represented within readmbox
.
The contents of the external mailbox is represented as a vector of characters.
<private mailbox elements>= (U->) [D->] typedef vector<char> Mailbox; Mailbox mbox;
Messages within the mailbox character vector mbox
are represented by a
vector of iterators into mbox
. Message i
begins at the character in
mbox
referenced by message_starts[i]
and ends just before (to the left
of) the character in mbox
referenced by message_starts[i + 1]
. For
simplicity and consistency, message_starts
also includes an iterator to a
phantom next message so the last message in the mailbox can be treated the same
as all the other messages in the mailbox; this representation is analogous to
the usual iterator range representation used by the STL.
The vector of mailbox iterators header_ends
keeps track of where each
message's header ends. The header for message i
ends just before
(immediately to the left of) the character in mbox
referenced by
header_ends[i]
. The intention is to have everything from
header_ends[i]
up to the next message start be the message content, but
there's a small problem with this intention: this range includes the blank line
between header and content as part of the content, which isn't correct. The
proper thing to do is have separate vectors indicating end of header and
beginning of content; maybe later (there's a similar problem at the other end
with the blank line before the start of the next message).
<private mailbox elements>+= (U->) [<-D->] typedef Mailbox::iterator Mailbox_itr; typedef vector<Mailbox_itr> Mailbox_itrs; Mailbox_itrs message_starts; Mailbox_itrs header_ends;
There are also some private functions to read the external mailbox; find the message starts, and find the header endings.
<private mailbox elements>+= (U->) [<-D] bool read(const char * mbox_name); void find_message_starts(void); bool find_header_ends(void);
And that's it for the mailbox class interface.
<mailbox-char.h
>=
#ifndef _mailbox_char_h_defined_
#define _mailbox_char_h_defined_
#include <vector>
#include <string>
class mailbox {
public:
<public mailbox functions>
private:
<private mailbox elements>
};
#endif
The Mailbox Class Implementation
Now all that's left is to implement the mailbox class functions.
<mailbox-char.cc
>= [D->]
#include <algorithm>
#include <fstream>
#include "mailbox-char.h"
The public member function initialize()
reads in the external mailbox file,
and then find the message starts and header ends. If anything goes wrong,
return a string describing the problem, otherwise return the empty string. The
assert()
statement provides a sanity check: if everything's o.k., the
number of message starts should equal the number of header ends.
<mailbox-char.cc
>+= [<-D->]
string mailbox::initialize(const char * mbox_name) {
if (!read(mbox_name))
return string("Can't open ") + mbox_name;
find_message_starts();
if (!find_header_ends())
return string("Mailbox appears to be corrupted: message header not ") +
string("followed by a blank line");
assert(msgcnt() == header_ends.size());
return string("");
}
The msgcnt()
public member function is simple, as long as you don't forget
the phantom message start at the end.
<mailbox-char.cc
>+= [<-D->]
unsigned mailbox::msgcnt(void) const {
return message_starts.size() - 1;
}
The output()
public member function is also simple, but with a few twists.
First, the message number mno
is between 1 and n
, inclusive, and so
needs to be knocked one to the left to be a sensible vector index. Also, don't
forget to skip the blank line separating the header from the content.
This code can also be written using STL's output stream iterators, but we haven't studied them yet.
<mailbox-char.cc
>+= [<-D->]
void mailbox::output(ostream & out, unsigned mno) const {
assert((1 <= mno) && (mno <= msgcnt()));
for (Mailbox_itr i = header_ends[mno - 1] + 1; i < message_starts[mno]; i++)
out << *i;
}
Reading the external mailbox is simple, but for a little trick: starting off
mbox
with a newline character. This trick simplifies some code in
find_message_starts()
.
This code can be streamlined by using the STL's input-stream iterators, but we haven't studied them yet.
<mailbox-char.cc
>+= [<-D->]
bool mailbox::read(const char * mbox_name) {
ifstream mbox_file(mbox_name);
if (!mbox_file)
return false;
mbox.push_back('\n');
char c;
while ((c = mbox_file.get()) != EOF)
mbox.push_back(c);
return true;
}
Finding the message starting points is simple, thanks to the little trick in
read()
, which eliminates the need to treat the first occurence of "From
"
in the mailbox as a special case. Note that end
is pushed into
message_starts
as the phantom message.
<mailbox-char.cc
>+= [<-D->]
void mailbox::find_message_starts(void) {
const char * marker = "\nFrom ";
const unsigned mlen = strlen(marker);
Mailbox_itr start = mbox.begin();
const Mailbox_itr end = mbox.end();
assert(mbox.size() > 0);
while (true) {
start = search(start, end, marker, marker + mlen);
if (start == end) break;
message_starts.push_back(start);
start += mlen;
}
message_starts.push_back(end);
}
Finding the end of a header works pretty much like finding the beginning of a message, except with different search text and a different end indicator. One difference is that RFC 822 requires that every header end in an blank line, even if the message has no content; failure to find the blank line indicates a trashed mailbox.
<mailbox-char.cc
>+= [<-D]
bool mailbox::find_header_ends(void) {
const char * marker = "\n\n";
const int mlen = strlen(marker);
for (unsigned i = 0; i < message_starts.size() - 1; i++) {
Mailbox_itr j = search(message_starts[i], message_starts[i + 1],
marker, marker + mlen);
if (j == message_starts[i + 1])
return false;
header_ends.push_back(j + 1);
}
return true;
}
mailbox-char.cc
>: D1, D2, D3, D4, D5, D6, D7
mailbox-char.h
>: D1
main()
>: D1, U2
readmbox.cc
>: D1
This page last modified on 3 July 2000.