Early planning and implementation dramatically ease the developmental burden of these requirements. Depending on the failure philosophy used (fault tolerance, control-path separation, etc.), design and implementation trade-offs will be made. Trying to incorporate these requirements late in the life cycle will impact the project cost, schedule, and quality. It can also impact safety, because an integrated design that incorporates software safety features such as those above from the start allows the system perspective to be taken into account and gives the design a better chance of being implemented as needed to meet the requirements in an elegant, simple, and more reliable way.
The sub-requirements and notes included in the requirement are a collection of best practices for the implementation of safety-critical software. These sub-requirements apply to components that reside in a safety-critical system and to components that control, mitigate, or contribute to a hazard, as well as to software used to command hazardous operations/activities. The requirements contained in this section complement the processes identified in NASA-STD-8719.13, NASA Software Safety Standard. Software engineering and software assurance disciplines each have specific responsibilities for providing project management with work products that meet the engineering, safety, quality, and reliability requirements on a project. A detailed explanation of the rationale for each of the notes can be found in CxP 70065.
Additional specific clarifications for a few of the requirement notes include:
Item a: Aspects to consider when establishing a known safe state include the state of the hardware and software, operational phase, device capability, configuration, file allocation tables, and boot code in memory.
Item d: Multiple independent actions by the operator help to reduce potential operator mistakes.
Item f: Memory modifications may occur due to radiation-induced errors, uplink errors, configuration errors, or other causes, so the computing system must be able to detect the problem and recover to a safe state. As an example, computing systems may implement error detection and correction, software executable and data load authentication, periodic memory scrub, and space partitioning to provide protection against inadvertent memory modification. Features of the processor and/or operating system can be utilized to protect against incorrect memory use.
Item g: Software needs to accommodate both nominal inputs (within specifications) and off-nominal inputs, from which recovery may be required.
Item h: The requirement is intended to preclude the inappropriate sequencing of commands. Appropriateness is determined by the project and conditions designed into the safety-critical system. Safety-critical software commands are commands that can cause or contribute to a hazardous event or operation. One must consider not only inappropriate sequencing of commands (as described in the original note) but also the execution of a command in the wrong mode or state. Safety-critical software commands must perform when needed (must work) or be prevented from performing when the system is not in a proper mode or state (must not work).
Item j: The intent is to establish a safe state following detection of an off-nominal indication. The safety mitigation must complete between the time that the off-nominal condition is detected and the time the hazard would occur without the mitigation. The safe state can either be an alternate state from normal operations or can be accomplished by detecting and correcting the fault or failure within the timeframe necessary to prevent a hazard and continuing with normal operations. The intent is to design in the ability of software to detect and respond to a fault or failure before it causes the system or subsystem to fail. If failure cannot be prevented, then design in the ability for the software to place the system into a safe state from which it can later recover. In this safe state, the system may not have full functionality, but will operate with reduced functionality.
Item k: Error handling is an implementation mechanism or design technique by which software faults and/or failures are detected, isolated, and recovered to allow for correct run-time program execution. The software error handling features that support safety-critical functions must detect and respond to hardware and operational faults and/or failures as well as faults in software data and commands from within a program or from other software programs.
Item l: The design of the system must provide sufficient sensors and effectors, as well as self-checks within the software, in order to enable the software to detect and respond to potential system hazards.
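As an illustration of Items d and h, the following is a minimal sketch of a command gate that requires two independent operator actions (an ARM followed by a FIRE) and rejects either command when the system is in the wrong mode. The class, the mode names, and the ARM/FIRE command pair are hypothetical and chosen only for this example; they are not taken from NASA-STD-8719.13 or CxP 70065, and a real project would define its own modes and appropriateness rules.

```python
from enum import Enum, auto


class Mode(Enum):
    # Hypothetical system modes, for illustration only.
    SAFE = auto()
    STANDBY = auto()
    OPERATIONAL = auto()


class CommandRejected(Exception):
    """Raised when a command arrives in the wrong mode or out of sequence."""


class HazardousCommandGate:
    """Hypothetical gate for one hazardous command: ARM first, then FIRE."""

    def __init__(self):
        self.mode = Mode.SAFE
        self.armed = False
        self.fired = False

    def set_mode(self, mode):
        # Any mode change clears a pending ARM so it cannot carry over
        # into a mode where the hazardous action is inappropriate.
        self.mode = mode
        self.armed = False

    def arm(self):
        # Must not work: arming is refused outside OPERATIONAL mode.
        if self.mode is not Mode.OPERATIONAL:
            raise CommandRejected(f"ARM not allowed in mode {self.mode.name}")
        self.armed = True

    def fire(self):
        # Second independent operator action (Item d).
        if self.mode is not Mode.OPERATIONAL:
            raise CommandRejected(f"FIRE not allowed in mode {self.mode.name}")
        # Sequencing check (Item h): FIRE is only valid after ARM.
        if not self.armed:
            raise CommandRejected("FIRE received without a prior ARM")
        self.armed = False  # one ARM authorizes exactly one FIRE
        self.fired = True   # stand-in for the real hazardous action
```

In this sketch, calling `fire()` without a prior `arm()`, or calling either command outside `Mode.OPERATIONAL`, raises `CommandRejected` and leaves the gate unchanged. Only the sequence `set_mode(Mode.OPERATIONAL)`, `arm()`, `fire()` performs the action, which is one possible way to express the "must work" and "must not work" behavior described in Item h.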