Programming Assignment 1 - XML Tag Nesting

Advanced Programming II, Spring 2004


Due Date

This assignment is due by 2:00 p.m. on Tuesday, 3 February.

See the assignment turn-in page (last modified on 9 February 2004) for instructions on turning in your assignment.

Background

For purposes of this assignment an XML document is a sequence of zero or more of tags interspersed with arbitrary other text (almost arbitray other text: see below).

There are two types of tags: open tags and close tags. The format of an open tag is

< name >

where name is a sequence of one or more alpha-numeric characters; an alpha-numeric character is any character that makes isalnum() true.

The format of a close tag is

< / name >

where name is defined as for open tags.

Case is not significant in name; names are case insensitive. For example, the name blob is identical to the name BlOb. Zero or more space characters may appear between any two components of a tag; a space character is any character that makes isspace() true. For example, the close tags </blob> and < / blob > are identical.

An open tag and a close tag match if all of the following are true:

  1. The name of the open tag matches the name of the close tag. For example, the names in the open and close tags <blob> and < / bLoB > match, but the names in the open and close tags <blob> and </blab> don't match.

  2. The open tag precedes the close tag in the document. For example, the open tag precedes the close tag in

    ... <blob> ... </blob> ...
    

    but does not precede the close tag in

    ... </blob> ... <blob> ...
    

  3. All the tags between the open and close tags also match (this is the nesting condition). For example, the blob tags match in

    ... <blob> <blab> </blab> </blob> ...
    

    (as do the blab tags) but do not match in

    ... <blob> </blab> <blab> </blob> ...
    

    or

    ... <blob> <blab> </blob> </blab> ...
    

The Problem

Write a program that reads an XML document from std-in and checks it for proper tag nesting. If the XML tags are properly nested in the document, your program should output nothing; otherwise, it should output an informative error message to std-out and stop.

Input

Input comes entirely from std-in. Arbitrary text may appear before and after a any tag. Your program may assume that every occurrence of the character < is interpreted as the start of an XML tag.

Output

If the input is a valid and properly nested XML document, your program should output nothing. If your program finds an error, it should print an informative error message and exit; there is no need to further process a document once an error is found.

Error messages should be a single line of informative text starting with ! (that's bang followed by space). Error messages should be sent to std-out, not std-err.

Testing

The program gen-xmldoc writes a (possibly invalid or improperly nested) XML document to std-out. Because gen-xmldoc randomly generates a XML document each time it's run, you may find it more convenient to send the output of gen-xmldoc to a file and then take input from the file:

$ ./gen-xmldoc > doc

$ ./your-program < doc

If you run gen-xmldoc with the -b option, it will generate a bad XML document. gen-xmldoc can be found in the assignment directory

/export/home/class/cs-509/pa1

You can find my answer to the assignment in the same directory. Remember, the objective of this assignment is to check XML nesting; the objective is not to faithfully reproduce the behavior of my solution. If my solution's wrong and you copy the error, you're going to lose points.


This page last modified on 3 February 2004.