Programming Assignment 2 — TML Tagging

CS 305 & 503 - Data Structures & Algorithms,
Fall 2008


Due Date

This assignment is due by 6:00 p.m. on Tuesday, 25 November.

See the assignment turn-in page (last modified on 16 October 2007) for instructions on turning in your assignment.

The absolute deadline for turning-in Assignment 2 is Friday, 28 November at 6:00 p.m.. It is not possible to turn-in Assignment 2 after the absolute deadline.

Background

A TML (Tree Mark-up Language) document is a sequence of zero or more of tags interspersed with arbitrary other text. There are two types of tags: open tags and close tags. The format of an open tag is

[ name ]

where name is a sequence of one or more alpha-numeric characters; an alpha-numeric character is any character that makes isalnum() true.

The format of a close tag is

[ / name ]

where name is defined as for open tags.

Case is not significant in name; names are case insensitive. For example, the name blob is identical to the name BlOb. Zero or more space characters may appear between any two components of a tag; a space character is any character that makes isspace() true. For example, the close tags [/blob] and [ / blob ] are identical.

An open tag and a close tag match if all of the following are true:

  1. The name of the open tag matches the name of the close tag. For example, the names in the open and close tags [blob] and [ / bLoB ] match, but the names in the open and close tags [blob] and [/blab] don't match.

  2. The open tag precedes the close tag in the document. For example, the open tag precedes the close tag in

    ... [blob] ... [/blob] ...
    

    but does not precede the close tag in

    ... [/blob] ... [blob] ...
    

  3. All the tags between the open and close tags also match (this is the nesting condition). For example, the blob tags match in

    ... [blob] [blab] [/blab] [/blob] ...
    

    (as do the blab tags) but do not match in

    ... [blob] [/blab] [blab] [/blob] ...
    

    or

    ... [blob] [blab] [/blob] [/blab] ...
    

The Problem

Write a program that reads an TML document from standard input and checks it for proper tag nesting. If the TML tags are properly nested in the document, your program should output nothing; otherwise, it should output an informative error message to standard error and stop. Your code should use no STL data structures or algorithms; if you want to use it, you have to implement it.

Input

Input comes entirely from std-in. Arbitrary text may appear before and after any tag. Your program may assume that every occurrence of the character [ is interpreted as the start of an TML tag.

Output

If the input is a valid and properly nested TML document, your program should output nothing. If your program finds an error, it should print an informative error message and exit; there is no need to further process a document once an error is found.

Error messages should be a single line of informative text starting with “! ” (that's bang followed by space). Error messages should be sent to standard error, not standard out. Document processing should stop after the first error is found.

Testing

The program /export/home/class/cs-305-503/a2/gen-tmldoc writes a (possibly invalid or improperly nested) TML document to std-out. Because gen-tmldoc randomly generates a TML document each time it's run, you may find it more convenient to send gen-tmldoc output to a file and then take input from the file:

$ ./gen-tmldoc > doc

$ ./your-program < doc

If you run gen-tmldoc with the -b option, it will generate a bad TML document.


This page last modified on 13 November 2008.