Lecture Notes for CS 325

Software Engineering for Safety Critical Systems


  1. safety critical systems - high reliability systems, high assurance systems

    1. software is only one part of a safe system

    2. one view - software itself is not unsafe

    3. another view - safety is a system-wide characteristic

  2. safety - first, do no harm

    1. safety - the inability to cause personal loss or harm

    2. completely safe systems are either very expensive to build (automobile travel) or impossible given current state of the art (nuclear power).

    3. like any other engineering discipline - safety is traded off against cost and risk

    4. risk - the severity of a hazard's effects times its probability of occurrence

    5. measures of risk - human life, replacement costs, legal costs
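The risk formula in item 4 can be sketched directly; the hazards, costs, and probabilities below are hypothetical illustrations, not data from any real system:

```python
# Risk = severity of a hazard's effects x its occurrence probability.
# All names and numbers are hypothetical.

def risk(severity_cost: float, probability_per_year: float) -> float:
    """Expected loss per year for one hazard."""
    return severity_cost * probability_per_year

# A frequent, cheap hazard can carry more risk than a rare, costly one:
minor = risk(severity_cost=10_000, probability_per_year=0.1)        # 1000.0 / year
severe = risk(severity_cost=5_000_000, probability_per_year=0.0001) # 500.0 / year
```

This is why risk, not raw severity, drives the trade-off in item 3: mitigation effort goes where expected loss is highest.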

    6. reliability and safety

      1. reliability - the system does what it is supposed to do

      2. unreliable systems can be safe - by failing to the safe side

      3. can safe systems be unreliable?
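Item 2 - an unreliable system that fails to the safe side - can be sketched as follows; the heater, sensor range, and thresholds are hypothetical:

```python
# "Failing to the safe side": when a component cannot be trusted, choose
# the safe action rather than guessing. All names/values are hypothetical.

SAFE_SHUTDOWN = "shutdown"

def heater_command(sensor_reading):
    """Return a heater command; any doubtful reading forces shutdown."""
    if sensor_reading is None:                 # sensor failed entirely
        return SAFE_SHUTDOWN                   # unreliable, but safe
    if not (0.0 <= sensor_reading <= 150.0):   # physically implausible value
        return SAFE_SHUTDOWN
    return "on" if sensor_reading < 70.0 else "off"
```

The system is unreliable (it stops delivering heat whenever the sensor misbehaves) yet safe (it never heats on bad data).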

    7. software quality and safety

      1. is perfect software the answer? - what is "perfection"?

      2. as part of a system, software errors may be caught elsewhere

      3. shifts the view from operational requirements to safety requirements

    8. software vs. hardware

      1. software is expensive to develop, cheap to produce, and flexible

      2. hardware is cheap to develop, expensive to produce, and inflexible

      3. the pressure is to replace hardware with software

      4. but hardware operates under physical laws, which do not fail

      5. and hardware failures are predictable

      6. a hardware-software split, with hardware providing absolute measures and software providing finer measures
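Item 6's split can be sketched in code by modeling the hardware interlock as an absolute clamp that software cannot override; the control law and limits are hypothetical:

```python
# Sketch of the hardware/software split: a (modeled) hardware interlock
# enforces an absolute limit; software provides the finer-grained control.
# Thresholds and the control law are hypothetical.

HARD_LIMIT = 100.0  # absolute ceiling enforced in hardware

def software_setpoint(demand: float) -> float:
    """Fine-grained control law computed in software."""
    return 0.8 * demand  # hypothetical control law

def hardware_interlock(value: float) -> float:
    """Model of a physical limiter: clamps no matter what software asks for."""
    return min(value, HARD_LIMIT)

def command(demand: float) -> float:
    return hardware_interlock(software_setpoint(demand))
```

Even if the software half is wrong (a bug demands far too much), the modeled hardware limit still caps the output - the predictable half backs up the flexible half.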

  3. hazards - identification and analysis

    1. identification

      1. hazard identification is difficult - lack of standards and categories

      2. the consensus view towards identifying and classifying hazards

        1. the Delphi method - anonymous opinions collected

        2. joint application design - representatives meet and hash things out

        3. operating hazard analysis - identify operating procedures, then identify hazard conditions and results

    2. analysis - where the identified hazards may occur

      1. fault tree analysis - from a hazard, identify causes

        1. and and or nodes, basic, undeveloped, and intermediate events

      2. event tree analysis - a bottom-up approach: start from an initiating event and trace forward to its possible outcomes

      3. failure modes and effects - consider system components and how they fail
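Item 1's fault trees - AND and OR gates over basic events - lend themselves to a small evaluator. The tree and event names below are hypothetical:

```python
# Minimal fault-tree evaluator: AND/OR gates over basic events.
# Event names and the example tree are hypothetical.

def AND(*children):
    return ("and", children)

def OR(*children):
    return ("or", children)

def occurs(node, events):
    """True if the hazard at this node occurs, given the set of basic events."""
    if isinstance(node, str):        # basic event (leaf)
        return node in events
    gate, children = node
    if gate == "and":
        return all(occurs(c, events) for c in children)
    return any(occurs(c, events) for c in children)

# Hypothetical hazard: overheating occurs if the sensor fails AND either
# the backup sensor fails OR the operator misses the alarm.
tree = AND("sensor_fails", OR("backup_fails", "alarm_missed"))
```

Evaluating the tree against different event sets shows which failure combinations reach the top-level hazard - the same question fault tree analysis asks by hand.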

  4. software processes for developing safety-critical systems

    1. avoiding adding hazards to the system during development

    2. requirements

      1. specification to turn informal techniques into more formal ones

      2. formal notations for requirements specification - statecharts, Petri nets, state machines

      3. validation - automated proofs are difficult or impossible
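A flavor of item 2's state-machine notations: a transition table that rejects anything the specification does not explicitly allow. The door-interlock example and its events are hypothetical:

```python
# Requirements-level state machine for a hypothetical door interlock.
# Only listed transitions are legal; everything else is a spec violation.

TRANSITIONS = {
    ("closed", "open_cmd"): "open",
    ("open", "close_cmd"): "closed",
    ("closed", "start_cmd"): "running",   # machine may only start while closed
    ("running", "stop_cmd"): "closed",
}

def step(state, event):
    """Return the next state, or raise if the spec forbids the transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"spec violation: {event!r} in state {state!r}")
```

Because the legal behavior is enumerated, a hazardous sequence (starting with the door open) is unrepresentable rather than merely discouraged - the appeal of formal notations over informal prose requirements.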

    3. design - should not introduce new hazards.

      1. avoid adding new functions

      2. formal methods based approach, or a transformational approach

        1. design to formal notation is difficult, correctness proofs are difficult, and changes are difficult to make

    4. implementation

      1. development tools - configuration management for system and tools

      2. formal verification - proving the implementation satisfies its specification

      3. runtime checking - self checking code and monitor code
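Item 3's two runtime-checking styles can be sketched together: self-checking code that asserts its own invariants, plus a separate monitor that gates the output. The dosing rule and bounds are hypothetical:

```python
# Runtime checking sketch: self-checking code plus an independent monitor.
# The dosing rule and all bounds are hypothetical.

def compute_dose(weight_kg: float) -> float:
    """Self-checking computation: validates its input and its own result."""
    assert 0 < weight_kg < 300, "implausible patient weight"
    dose = 1.5 * weight_kg                       # hypothetical dosing rule
    assert 0 < dose < 500, "computed dose out of safe range"
    return dose

def monitor(dose: float) -> float:
    """Independent monitor: the final gate before the value is acted on."""
    if not (0 < dose < 500):
        raise RuntimeError("monitor trip: unsafe dose blocked")
    return dose
```

The monitor duplicates the safety check on purpose: even if the self-checks are compiled out or bypassed, one last independent barrier remains between a faulty computation and the actuator.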


This page last modified on 26 April 2000.