See the assignment turn-in page (last modified on 21 January 2003) for instructions on turning in your assignment.
Symbolic execution is an important technique in program analysis. For example, compilers use symbolic execution to determine when an expression is using an uninitialized variable. Each variable in a program can hold either of the symbolic values "defined" or "undefined". When a variable is created, it holds "undefined"; when assigned a value, a variable holds the value "defined". If the compiler finds an expression containing a variable holding "undefined", then the program is incorrectly using an uninitialized variable.
Here is an example, with the results of symbolic execution appearing in the comments:
int * ip; // ip = "undefined" int i; // i = "undefined" i = 3; // i = "defined" *ip = i; // error - ip undefined
The compiler finds the variable ip
is undefined in the expression
*ip
. Notice that neither the types of the variables nor the actual values
assigned to the variables matter; the only thing that matters is whether the
variables hold "undefined" or "defined".
r0
through r7
; memory is addressed by
identifiers in C++ syntax (see Deitel and Deitel, page 26), which are assumed
to be different from the register names.
The CPU executes executes a sequence of instructions starting at the beginning of the sequence and stopping at the end of the same sequence; instructions are executed in strict sequential order with no jumping or gotos.
An instruction is either a move instruction or an operator instructions. A move instruction has the form
move
dst src
and moves the value stored in src to the location given by dst; the effect of a move instruction is
dst can be either a memory-address identifier or a register name; src can be either a memory-address identifier, a register name, or an integer constant. However, both dst and src cannot be memory-address identifiers.
An operator instruction has the form
where opr is any of the standard C++ binary operators multiplication, division, modulus, addition, subtraction, and the bitwise operators left shift, right shift, AND, exclusive OR, and inclusive OR (See Deitel and Deitel Appendix A for the operators). dst must be a register name; src can be a register name or an integer constant. The effect of an operator instruction is to compute
=
dst op src
Two blocks are said to perform the same computation if and only if, when started from the same memory configuration, they both produce the same memory configuration at the end of their execution. Notice that the final contents of the CPU memory registers do not matter when determining if two computations compute the same values. Notice also that this property must hold for any initial memory configuration.
For example, the two blocks
move r0 a move r1 a + r0 r1 move a r0 |
move r0 a + r0 r0 move a r0 |
a = aval + aval
, where
aval
is the initial value stored in the memory location a
. Notice
that the difference in values stored in register r1
by the two blocks does
not matter when determining if the blocks produce the same computation.
Because your program is computing on symbolic values, it need not worry about matters such as precision and overflow.
std::isspace()
, excluding the line-ending
newline); at least one blank line separates the two blocks. Any non-space
character occurring after the second block should result in an input error,
which prints an informative error message to std-err and immediately terminates
the program.
An instruction block consists of one or more instructions. Each instruction appears in a single line, and a single line contains exactly one instruction.
Any syntactically incorrect input should immediately terminate execution with an informative error message written to std-err.
If the two blocks perform different computations for one or more memory-resident variables, the program should write the following error message to std-out for any one of the incorrect variables:
Block 1 computes value
<value
1>
for variable <variable>.
Block 2 computes value
<value
2>
for variable <variable>.
where <value
i>
is the value computed by block i and <variable>
is one
of the memory-location identifiers with incorrectly computed values.
Any syntactically incorrect input should immediately terminate execution with an
informative error message written to std-err.
gen-code
program in /export/home/class/cs-509/pa2
generates blocks of code you can use to
test your program. The command format is
gen-code
[ -d
]
By default, gen-code
writes to std-out two code blocks that compute
identical values. The -d
command-line option causes gen-code
to
write two code blocks that compute different values.
There are a few points you should keep in mind when using gen-code
:
gen-code
's output to a file so
you can test your program with repeatable input.
gen-code
creates good output by generating a random block of code
and then optimizing it. Because optimization is tricky (and because I wrote
the optimizer), it may happen that the optimizations are incorrect and that
optimized block computes different values then does the unoptimized block.
If your program indicates the good input from gen-code
computes different
values, your program may be correct and you should check the input to verify
your program's result.
-d
command-line option. gen-code
produces bad output by
generating two random blocks of code; your program should almost always
indicate an error in the bad output from gen-code
.
However, if gen-code
manages to generate two identical code blocks (which
could happen, as they say), then your program will indicate no errors. Again,
you should verify the input to your program to see if your program's correct.
gen-code
will create a random expression that's too
big to compile. When that happens, you'll get a "No more free registers."
error message. Just run gen-code
again.
This page last modified on 11 February 2003.