R. Clayton (rclayton@monmouth.edu)
One of the running themes in class is that your code is talking about you, and
that you want your code to say good things about you. In case you think I'm
being stupid or weird about this, here's a letter from a recent RISKS
digest on the same topic.
--
Date: Wed, 13 Mar 2002 18:26:28 +1300
From: "Dr Richard A. O'Keefe" <ok@cs.otago.ac.nz>
Subject: Bioinformatics state-of-the-art
Bioinformatics is a hot topic at this university, and the computer science department is just starting to get involved. As part of trying to learn about this field, I thought I'd read a couple of the better-known programs. To be honest, I thought I'd run splint (formerly known as lclint) over them and find a minor bug or two. I'm not going to name either of these programs, but one of them was particularly interesting because we were thinking of having a student make a parallel version to try out a parallel architecture one of our people is interested in, because normal runs of this program on recent PCs can take about 3 weeks.
I don't know what art these programs are state-of; possibly macrame. They certainly aren't even 1970's state of the programming art.
* indentation inconsistent, crazy, or both (fix with indent)
* lines up to 147 columns wide (fix with indent)
* lots of dead variables (fix with quick edit)
* array subscripts that could go negative (use unsigned char rather than char in a couple of dozen places, phew)
* failure to comprehend that C++ prototypes and C prototypes are different (fix by changing () to (void) in too many places)
* #define lint ... so that lint falls over (rename lint to Lint in a dozen files)
* assumption that long int = 32 bits (one program) or that int = 32 bits (the other) (not yet done, but use inttypes.h with a local backup)
* string->integer code that gets INT_MIN wrong (rip out, plug in code known since 60s)
* using %ld format with int arguments (*printf and *scanf), a real problem because the machines I have access to are 64-bitters and it'd be nice if the programs ran in LP64 mode
* gcc, lint, splint find variable used before initialised (see next item)
* technically legal syntax with no semantics: double matrix[][] as function argument. (Scream, bang head on wall, write this message.)
Is it reasonable to expect people with a biochemistry or mathematics background to write clean well-engineered code? No. For the importance of the topic, and the sums of money involved, is it reasonable to expect that they'll have their programs cleaned by someone else before release? I think it is. With the pervasive lack of quality I'm seeing, I don't trust _any_ of the results of these programs. I have to wonder how many published results obtained using these programs (and fed back into databases that are used to derive more results which are ...) are actually valid.
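A few of the items in the letter's list are easier to see in code. What follows is a minimal sketch in C, not taken from either program the letter describes; the function names count_bytes, parse_int and format_int are made up for illustration. It shows an unsigned char cast keeping an array subscript non-negative, a string-to-integer conversion that accepts INT_MIN (using the standard strtol, which may or may not be the replacement the letter's author had in mind), and a printf-style conversion that matches its int argument.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Count how often each byte value occurs in text.
   counts must have room for UCHAR_MAX + 1 entries. */
void count_bytes(const char *text, unsigned long counts[UCHAR_MAX + 1])
{
    assert(text != NULL && counts != NULL);
    for (size_t i = 0; i <= UCHAR_MAX; i++)
        counts[i] = 0;
    for (const char *p = text; *p != '\0'; p++) {
        /* Where plain char is signed, bytes above 127 are negative, so
           counts[*p] would index before the start of the array.
           Converting to unsigned char first keeps the subscript in range. */
        counts[(unsigned char)*p]++;
    }
}

/* Convert a decimal string to int.
   Returns 0 on success and stores the value in *out; returns -1 otherwise.
   The range check is done on the long result, so the text of INT_MIN
   is accepted rather than overflowing on the way to a negation. */
int parse_int(const char *s, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;              /* no digits, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;              /* does not fit in an int */
    *out = (int)v;
    return 0;
}

/* Format value into buf and return what snprintf returns.
   "%d" matches an int argument. "%ld" expects a long, which is 64 bits
   wide under LP64 while int stays at 32, so mixing them is undefined. */
int format_int(char *buf, size_t size, int value)
{
    return snprintf(buf, size, "%d", value);
}
```

For example, parse_int applied to the decimal text of INT_MIN should return 0 and store INT_MIN, while parse_int("12x", &v) should return -1 and leave v alone.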
This archive was generated by hypermail 2.0b3 on Fri May 10 2002 - 12:45:04 EDT