Programming Assignment 1 - Parsing Spreadsheets

Advanced Programming II, Fall 2002


Due Date

This assignment is due by 2:00 p.m. on Thursday, 19 September.

See the assignment turn-in page (last modified on 4 March 2002) for instructions on turning in your assignment.

Background

Most spreadsheet programs have the ability to write a spreadsheet as text into a file, producing a matrix of data items. For example, this


sid        name        lg  grade  tst 4  tst 3  tst 1  tst 2   os 4   os 3

132 coomer             A-  90.76   85.0   88.8   82.5   90.0   70.0  100.0
312 giambi             B+  86.60   83.8   85.0   82.5   85.0  100.0   83.3
231 jeter              A-  93.44   93.8   90.0   85.0   96.3   80.0  100.0
222 rivera             B   85.36   80.0   82.5   81.3   83.8  100.0   83.3
123 soriano            B+  87.91   88.8   87.5   82.5   87.5  100.0   85.0
321 vander wal         B   85.96   82.5   85.0   81.3   91.3   80.0   75.0
111 white              B   86.03   85.0   83.8   88.8   86.3  100.0   75.0
333 widger             B   85.60   86.3   82.5   82.5   86.3   80.0   75.0
213 williams           B   86.34   86.3   95.0   88.8   86.3   85.0   70.0

is an example of a grading spreadsheet written as a text file.

Perversely, it is often convenient or necessary to treat a text-file spreadsheet as if it were a real one, picking out data items from the matrix for further analysis. For example, I create grade histograms by running a program over the text representation of a grade spreadsheet; the program finds the data in the column of interest and uses the data to generate plotting commands.

The Problem

Write a program that reads a spreadsheet text file from std-in, accepts element specs on the command line, and writes to std-out the data from the spreadsheet found at the given element specs. The command format is

parsess [ element-spec ]...

where [ A ] means an optional A and A... means A repeated one or more times; taken together, the command accepts zero or more element specs.

Input

Your program should accept two forms of input: a textual spreadsheet from std-in and element specs from the command line.

Std-in contains a single textual spreadsheet; each line of input corresponds to a row of the spreadsheet. Within a row, each element is separated from adjacent elements by at least one space character; rows containing only space characters should be ignored. Note that, as with regular spreadsheets, a row need not have an entry for every column. You may assume that short rows are padded out to the right with space characters.

You may assume that adjacent columns are separated by a column of space characters that is at least one character wide. Your program should find as many columns as it can. Only space characters count as separators: other white-space characters, such as tabs, should not appear in the separating column between adjacent columns.

The command line contains a sequence of zero or more element specs, each of which has the form

row-spec,col-spec

(note the absence of space around the comma). Row and column specs have the same form, and will be described together:

  1. A non-negative integer n - The nth row or column. In the example above, the element spec 2,1 refers to the element giambi.

  2. A pair of non-negative integers i-j with i <= j - The entries i through j in a particular row or column. For example, the element spec 7,0-2 refers to the elements

    111 white B
    

    the element spec 2-4,2 refers to the elements

    B+
    A-
    B
    

    and the element spec 0-1,8-9 refers to the elements

    os 4  os 3
    70.0 100.0
    

The topmost row is row 0; the leftmost column is column 0.
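The element-spec grammar described above is small enough to check with one regular expression per row or column spec. The Python sketch below is one possible approach; the names parse_range and parse_element_spec are hypothetical, and signalling a malformed spec with ValueError is an assumption of the sketch, not part of the assignment.

```python
import re

# A row or column spec: a non-negative integer n, or a pair i-j.
_RANGE = re.compile(r"([0-9]+)(?:-([0-9]+))?")


def parse_range(text):
    """Return an inclusive (lo, hi) pair for a spec such as '7' or '0-2'."""
    match = _RANGE.fullmatch(text)
    if match is None:
        raise ValueError(text)
    lo = int(match.group(1))
    hi = int(match.group(2)) if match.group(2) is not None else lo
    if lo > hi:
        # The assignment requires i <= j for a pair i-j.
        raise ValueError(text)
    return lo, hi


def parse_element_spec(spec):
    """Return ((row_lo, row_hi), (col_lo, col_hi)) for 'row-spec,col-spec'."""
    parts = spec.split(",")
    if len(parts) != 2:
        raise ValueError(spec)
    return parse_range(parts[0]), parse_range(parts[1])
```

Representing a single index n as the range (n, n) lets the rest of a program treat an element, a row slice, a column slice, and a matrix uniformly.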

Output

Output is written to std-out and consists of the sequence of spreadsheet elements given by the command-line specifications; the elements are output in the same order as the specifications are given (in left-to-right order). Each set of elements should be separated from the next by one blank line.

As shown in the input section, spreadsheet elements are output in a way that preserves their shape in the spreadsheet: a row is output as a row, a column is output as a column, and a matrix is output as a matrix. It is not necessary to preserve the spacing between rows or columns; however, empty elements must be preserved.
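The shape-preserving output rule can be sketched as follows, assuming the spreadsheet is already held as a list of rows of cell strings and each spec has been reduced to inclusive (lo, hi) ranges. The names extract_block and format_block are made up for illustration. Two choices here are assumptions that the assignment does not settle: a cell outside the spreadsheet is treated as empty, and an empty element is preserved by padding each output column to its widest cell.

```python
def extract_block(table, rows, cols):
    """Return the sub-matrix for inclusive (lo, hi) row and column ranges."""
    (r0, r1), (c0, c1) = rows, cols
    block = []
    for r in range(r0, r1 + 1):
        row = table[r] if r < len(table) else []
        # Assumption: a cell outside the spreadsheet counts as empty.
        block.append([row[c] if c < len(row) else ""
                      for c in range(c0, c1 + 1)])
    return block


def format_block(block):
    """Render a block as text, one line per row, columns padded to align."""
    widths = [max(len(row[i]) for row in block)
              for i in range(len(block[0]))]
    lines = []
    for row in block:
        cells = [cell.ljust(w) for cell, w in zip(row, widths)]
        # Trailing padding is dropped; interior empty cells stay as blanks.
        lines.append("  ".join(cells).rstrip())
    return "\n".join(lines)
```

With this layout a single element, a row, a column, and a matrix all pass through the same two functions, and an empty element in the middle of a row still occupies its place in the output.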

If an incorrect element spec is given on the command line, an error message should be written to std-out in place of the spreadsheet section. For example, if the command

parsess 2,1 34-3,blue,4 4-5,7-8

read the example spreadsheet given above, the output would be:

giambi

Malformed element spec:  34-3,blue,4

83.8  100.0
87.5  100.0

Testing

The assignment directory /export/home/class/cs-509/pa1 contains some example spreadsheets. The assignment directory is available from any cslab or Linux machine.


This page last modified on 11 September 2002.