See the assignment turn-in page (last modified on 3 November 2003) for instructions on turning in your assignment.
An anchor tag has the form
< a href = "URL" >
where URL is the reference to the external resource.
Outside of strings, letter case is ignored in tags; space characters in excess of those needed to separate the parts of a tag are also ignored outside of strings.
The href = part of the tag is called an attribute; anchor tags have many attributes, but the only one we're interested in is the href attribute, which, as shown above, is a string describing a URL to a resource. The href attribute is optional; an anchor tag need not have one.
Other HTML tags also have an href attribute (the link tag, for example), but the only href attributes we're interested in for this assignment are those associated with anchor tags.
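As a rough illustration of the rules above, here is one way the inside of a single tag might be examined. This is only a sketch under some assumptions: the names tokenize_tag() and anchor_href() are made up for this example and are not part of get.h or get.cc, attribute values are assumed to be double-quoted strings, and the caller is assumed to have already stripped the enclosing < and >.

```cpp
#include <cctype>
#include <string>
#include <vector>

// One lexical piece of a tag: a bare name, an '=' sign, or a quoted string.
// (Hypothetical helper type for this sketch.)
struct Token {
    enum Kind { Name, Equals, String } kind;
    std::string text;  // lower-cased for names, verbatim for strings
};

// Split the text between '<' and '>' into tokens. Letter case and extra
// spaces are ignored outside of strings; string contents are kept as-is.
std::vector<Token> tokenize_tag(const std::string& body) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < body.size()) {
        unsigned char c = static_cast<unsigned char>(body[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (c == '=') {
            tokens.push_back({Token::Equals, "="});
            ++i;
        } else if (c == '"') {
            std::size_t close = body.find('"', i + 1);
            if (close == std::string::npos) break;  // unterminated string: stop
            tokens.push_back({Token::String, body.substr(i + 1, close - i - 1)});
            i = close + 1;
        } else {
            std::string name;
            while (i < body.size()) {
                unsigned char d = static_cast<unsigned char>(body[i]);
                if (std::isspace(d) || d == '=' || d == '"') break;
                name += static_cast<char>(std::tolower(d));
                ++i;
            }
            tokens.push_back({Token::Name, name});
        }
    }
    return tokens;
}

// If `body` is the inside of an anchor tag with an href attribute, store the
// attribute's string in `url` and return true; otherwise return false.
bool anchor_href(const std::string& body, std::string& url) {
    std::vector<Token> t = tokenize_tag(body);
    if (t.empty() || t[0].kind != Token::Name || t[0].text != "a") {
        return false;  // not an anchor tag (a link tag, for example)
    }
    for (std::size_t i = 1; i + 2 < t.size(); ++i) {
        if (t[i].kind == Token::Name && t[i].text == "href" &&
            t[i + 1].kind == Token::Equals &&
            t[i + 2].kind == Token::String) {
            url = t[i + 2].text;
            return true;
        }
    }
    return false;  // an anchor tag without an href attribute
}
```

Tokenizing first keeps the case and spacing rules in one place, and it means the word href appearing inside some other attribute's string is not mistaken for the attribute itself. Single-quoted or unquoted values are not handled by this sketch.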
The command line to your program should contain a single URL; the href URLs in the associated resource should be written to std-out one per line. The resource associated with the command-line URL need not be an HTML document; if it isn't, chances are good that it won't contain any anchor tags, in which case your program should print nothing.
The only thing your program has to do is find the href URLs and print them to std-out. It doesn't have to check the URLs for validity, or figure out what kind of URLs they are, or retrieve the resources associated with the URLs; all that will come later.
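Finding the href URLs means first finding the tags in the resource. The sketch below shows one possible way to pull out the text between each < and its closing >. It assumes the whole resource is already in a std::string and that strings inside tags are double-quoted; the name tag_bodies() is made up for this example and does not come from get.h or get.cc.

```cpp
#include <string>
#include <vector>

// Collect the text between each '<' and its closing '>' in `doc`.
// A '>' inside a double-quoted string does not end the tag.
std::vector<std::string> tag_bodies(const std::string& doc) {
    std::vector<std::string> bodies;
    std::size_t i = 0;
    while ((i = doc.find('<', i)) != std::string::npos) {
        std::size_t j = i + 1;
        bool in_string = false;
        while (j < doc.size() && (in_string || doc[j] != '>')) {
            if (doc[j] == '"') in_string = !in_string;
            ++j;
        }
        if (j == doc.size()) break;  // no closing '>': not a complete tag
        bodies.push_back(doc.substr(i + 1, j - i - 1));
        i = j + 1;
    }
    return bodies;
}
```

Each body returned can then be checked for an anchor tag with an href attribute, and each URL found written to std-out on its own line. A resource with no tags at all yields an empty vector, so nothing is printed. The sketch does not treat HTML comments specially, and a stray < in ordinary text will swallow everything up to the next >; a more careful scanner would need to deal with both.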
You can use get() to get the resource associated with a URL. get() is defined in get.h and get.cc, which can be found in the assignment directory /export/home/class/cs-305/pa1b. Be careful if you copy these files to a more convenient location; you are responsible for making sure you are using the most recent version of these files. If you do copy the files, you shouldn't change them; get.h and get.cc are deleted if you turn them in.
You can find os-print-urls, my solution to this assignment, in the assignment directory; os is one of linux or solaris. Remember, the objective of this assignment is to correctly filter links in HTML documents; it is not to faithfully reproduce the behavior of my solution. If my solution's wrong and you copy the error, you're going to lose points.
This page last modified on 11 September 2003.