

SWE-205 - Determination of Safety-Critical Software

1. Requirements

3.7.1 The project manager, in conjunction with the SMA organization, shall determine if each software component is considered to be safety-critical per the criteria defined in NASA-STD-8739.8. 

1.1 Notes

NPR 7150.2, NASA Software Engineering Requirements, does not include any notes for this requirement.

1.2 History

SWE-205 - Last used in rev NPR 7150.2D

Rev | SWE Statement

A |

Difference between A and B: N/A

B |

Difference between B and C: NEW

C | 3.7.1 The project manager, in conjunction with the SMA organization, shall determine if each software component is considered to be safety-critical per the criteria defined in NASA-STD-8739.8.

Difference between C and D: No change.

D | 3.7.1 The project manager, in conjunction with the SMA organization, shall determine if each software component is considered to be safety-critical per the criteria defined in NASA-STD-8739.8.

1.3 Applicability Across Classes

Class       | A | B | C | D | E | F
Applicable? |   |   |   |   |   |

Key:    - Applicable | - Not Applicable


2. Rationale

It is important to determine the safety criticality of each software component to identify the most critical software system components and to ensure that the software safety-critical requirements and processes are followed.

3. Guidance

3.1 Classifying Software

Classifying software essentially provides pre-tailoring of software engineering requirements, software safety requirements, software assurance requirements, and other software requirements for different types and levels of software. Beyond classification, each project evaluates the project software to determine if the software is safety-critical. 

Safety-critical is a term “describing any condition, event, operation, process, equipment, or system that could cause or lead to severe injury, major damage, or mission failure if performed or built improperly, or allowed to remain uncorrected.” 

Software safety is defined as “the aspects of software engineering and software assurance that provide a systematic approach to identifying, analyzing, tracking, mitigating, and controlling hazards and hazardous functions of a system where software may contribute either to the hazard or to its mitigation or control, to ensure safe operation of the system.” 

The project can use the criteria in NASA-STD-8739.8 to determine the safety criticality of its software.

NASA-STD-8739.8B - SOFTWARE ASSURANCE AND SOFTWARE SAFETY STANDARD

4.2 Safety-Critical Software Determination

Software is classified as safety-critical if the software is determined by and traceable to a hazard analysis. Software is classified as safety-critical if it meets at least one of the following criteria:

a. Causes or contributes to a system hazardous condition/event,

b. Controls functions identified in a system hazard,

c. Provides mitigation for a system hazardous condition/event,

d. Mitigates damage if a hazardous condition/event occurs,

e. Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.

See Appendix A for guidelines associated with addressing software in hazard definitions. See Table 1, 3.7.1, SWE-205 for more details. Consideration for other independent means of protection (software, hardware, barriers, or administrative) should be a part of the system hazard definition process.
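The "at least one of the following criteria" rule quoted above can be sketched as a simple record per software component. This is a minimal illustration only: the class name, field names, and helper functions are hypothetical and are not defined by NASA-STD-8739.8, and the answers to each criterion are assumed to come from the project's hazard analysis, not from the code.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HazardCriteria:
    """Hypothetical record of hazard-analysis answers for one software component."""
    causes_or_contributes_to_hazard: bool = False           # criterion a
    controls_hazard_function: bool = False                  # criterion b
    provides_hazard_mitigation: bool = False                # criterion c
    mitigates_damage: bool = False                          # criterion d
    detects_reports_corrects_hazardous_state: bool = False  # criterion e


def criteria_met(criteria: HazardCriteria) -> list[str]:
    """Return the names of the criteria answered 'yes' for this component."""
    return [f.name for f in fields(criteria) if getattr(criteria, f.name)]


def is_safety_critical(criteria: HazardCriteria) -> bool:
    """A component is safety-critical if it meets at least one criterion."""
    return len(criteria_met(criteria)) > 0


# Example components (names are illustrative only).
telemetry_archiver = HazardCriteria()
valve_controller = HazardCriteria(controls_hazard_function=True)

print(is_safety_critical(telemetry_archiver))
print(is_safety_critical(valve_controller), criteria_met(valve_controller))
```

Recording which criteria were met, rather than only the yes/no result, keeps the determination traceable to the hazard analysis that produced it.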

See also SWE-023 - Software Safety-Critical Requirements and Topic 7.02 - Classification and Safety-Criticality.

3.2 Determining Software Safety Criticality

Software safety criticality is initially determined in the formulation phase using the NASA Software Assurance and Software Safety Standard, NASA-STD-8739.8. As the software is developed or changed and the computer software configuration items (CSCI), models, and simulations are identified, the safety-critical software determination can be reassessed and applied at lower levels. The software safety assessment and planning are performed for each software acquisition, development, and maintenance activity, and for changes to legacy/heritage systems. When the software in a system or subsystem is found to be safety-critical, the additional requirements in NASA-STD-8739.8 augment those associated with the software class requirements found in this document.

Software safety requirements contained in NASA-STD-8739.8B

Derived from NPR 7150.2D para 3.7.3, SWE-134: Table 1, SA Tasks 1 - 6

1. Analyze the software requirements and the software design and work with the project to implement NPR 7150.2 requirement items "a" through "l."

2. Assess that the source code satisfies the conditions in the NPR 7150.2 requirement "a" through "l" for safety-critical and mission-critical software at each code inspection, test review, safety review, and project review milestone.

3. Confirm that the values of the safety-critical loaded data, uplinked data, rules, and scripts that affect hazardous system behavior have been tested.

4. Analyze the software design to ensure the following:
   a. Use of partitioning or isolation methods in the design and code,
   b. That the design logically isolates the safety-critical design elements and data from those that are non-safety-critical.

5. Participate in software reviews affecting safety-critical software products.

6. Ensure the SWE-134 implementation supports and is consistent with the system hazard analysis.

See the software assurance tab for Considerations when identifying software subsystem hazard causes and for Considerations when identifying software causes in general software-centric hazard analysis.

3.3 Additional Guidance

Additional guidance related to this requirement may be found in the following materials in this Handbook:

3.4 Center Process Asset Libraries

SPAN - Software Processes Across NASA
SPAN contains links to Center-managed Process Asset Libraries. Consult these Process Asset Libraries (PALs) for Center-specific guidance including processes, forms, checklists, training, and templates related to Software Development. See SPAN in the Software Engineering Community of NEN. Available to NASA only. https://nen.nasa.gov/web/software/wiki

See the following link(s) in SPAN for process assets from contributing Centers (NASA Only). 

SPAN Links

4. Small Projects

No additional guidance is available for small projects.

5. Resources

5.1 References

5.2 Tools

Tools to aid in compliance with this SWE, if any, may be found in the Tools Library in the NASA Engineering Network (NEN). 

NASA users find this in the Tools Library in the Software Processes Across NASA (SPAN) site of the Software Engineering Community in NEN. 

The list is informational only and does not represent an “approved tool list”, nor does it represent an endorsement of any particular tool.  The purpose is to provide examples of tools being used across the Agency and to help projects and centers decide what tools to consider.

6. Lessons Learned

6.1 NASA Lessons Learned

No Lessons Learned have currently been identified for this requirement.

6.2 Other Lessons Learned

No other Lessons Learned have currently been identified for this requirement.

7. Software Assurance

SWE-205 - Determination of Safety-Critical Software
3.7.1 The project manager, in conjunction with the SMA organization, shall determine if each software component is considered to be safety-critical per the criteria defined in NASA-STD-8739.8. 

7.1 Tasking for Software Assurance

From NASA-STD-8739.8

1. Confirm that the hazard reports or safety data packages contain all known software contributions or events where software, either by its action, inaction, or incorrect action, leads to a hazard.

2. Assess that the hazard reports identify the software components associated with the system hazards per the criteria defined in NASA-STD-8739.8, Appendix A.

3. Assess that hazard analyses (including hazard reports) identify the software components associated with the system hazards per the criteria defined in NASA-STD-8739.8, Appendix A.

4. Confirm that the traceability between software requirements and hazards with software contributions exists.

5. Develop and maintain a software safety analysis throughout the software development life cycle.

7.2 Software Assurance Products

  • Hazard Analyses
  • Assessment of Hazard Analyses and Reports
  • The list of all software safety-critical components that have been identified by the system hazard analysis.
  • Software Safety Analysis results, including any risks or issues.
  • Analysis of updated hazard reports at review points.
  • Record of software safety determination for the software project.


    Objective Evidence

    • Evidence of confirmation of traceability between software requirements and hazards with software contributions.
    • Evidence that confirmation of all safety contributions or events in hazard reports has occurred.

    Objective evidence is an unbiased, documented fact showing that an activity was confirmed or performed by the software assurance/safety person(s). The evidence for confirmation of the activity can take any number of different forms, depending on the activity in the task. Examples are:

    • Observations, findings, issues, or risks found by the SA/safety person, which may be recorded in an audit or checklist record, email, memo, or entry in a tracking system (e.g., Risk Log).
    • Meeting minutes with attendance lists, or SA meeting notes or assessments of the activities, recorded in the project repository.
    • Status report, email, or memo containing statements that confirmation has been performed, with date (a checklist of confirmations could be used to record when each confirmation has been done).
    • Signatures on SA-reviewed or witnessed products or activities.
    • Status report, email, or memo containing a short summary of information gained by performing the activity. Some examples of using a “short summary” as objective evidence of a confirmation are:
      • To confirm that: “IV&V Program Execution exists”, the summary might be: IV&V Plan is in draft state. It is expected to be complete by (some date).
      • To confirm that: “Traceability between software requirements and hazards with SW contributions exists”, the summary might be: x% of the hazards with software contributions are traced to the requirements.
    • The specific products listed in the Introduction of 8.16 are also objective evidence, in addition to the examples listed above.

7.3 Metrics

  • # of software work product Non-Conformances identified by life cycle phase over time
  • # of safety-related requirement issues (Open, Closed) over time.
  • # of safety-related non-conformances identified by the life cycle phase over time.
  • # of Hazards containing software that have been tested vs. total # of Hazards containing software

See also Topic 8.18 - SA Suggested Metrics.

7.4 Guidance

NASA-STD-8739.8B - SOFTWARE ASSURANCE AND SOFTWARE SAFETY STANDARD

4.2 Safety-Critical Software Determination

Software is classified as safety-critical if the software is determined by and traceable to a hazard analysis. Software is classified as safety-critical if it meets at least one of the following criteria:

a. Causes or contributes to a system hazardous condition/event,

b. Controls functions identified in a system hazard,

c. Provides mitigation for a system hazardous condition/event,

d. Mitigates damage if a hazardous condition/event occurs,

e. Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.

Note: Software is classified as safety-critical if the software is determined by and traceable to hazard analysis. See Appendix A of NASA-STD-8739.8 for guidelines associated with addressing software in hazard definitions. See SWE-205. Consideration for other independent means of protection (e.g., software, hardware, barriers, or administrative) should be a part of the system hazard definition process.


Safety-Critical: A term describing any condition, event, operation, process, equipment, or system that could cause or lead to severe injury, major damage, or mission failure if performed or built improperly, or allowed to remain uncorrected. (Source NPR 8715.3)

Safety-Critical Software: Software is classified as safety-critical if the software is determined by and traceable to a hazard analysis. Software is classified as safety-critical if it meets at least one of the following criteria:

  1. Causes or contributes to a system hazardous condition/event,
  2. Controls functions identified in a system hazard,
  3. Provides mitigation for a system hazardous condition/event,
  4. Mitigates damage if a hazardous condition/event occurs,
  5. Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.

Safety-Critical Software can cause, contribute to, or mitigate human safety hazards or damage facilities. Safety-critical software is identified based on the results of the hazard analysis and the results of the Orbital Debris Assessment Report/End-Of-Mission Plan (where applicable). Examples of safety-critical software can be found in all types of systems, including Flight, Ground Support systems, Mission Operations Support Systems, and Test Facilities. See Appendix A for guidelines associated with addressing software in hazard definitions. Consideration for other independent means of protection (software, hardware, barriers, or administrative) should be a part of the system hazard definition process.

See also Topic 8.58 - Software Safety and Hazard Analysis

Task 1: Confirm that the hazard reports or safety data packages contain all known software contributions or events where software, either by its action, inaction, or incorrect action, leads to a hazard.

It is necessary for software assurance and software safety personnel to begin examining possible software hazards and determining whether the software might be safety-critical as early as possible. Several steps are important in determining this:

A. Learn who the key project personnel are and begin establishing a good working relationship with them. In particular, systems analysts, systems safety personnel, requirements development personnel, end-users, and those establishing operational concepts are some of the key people with initial knowledge of the project.

B. Gather all of the initial documents listed in the requirement as well as any others that the project is developing that may contain critical information on the project being developed. Don’t wait for signature copies, but begin getting acquainted with them as early as possible. Through the working relationships established, stay informed about the types of updates that are being made as the system concepts continue to be refined. Keep a list of the documents collected and their version dates as the system matures. Potential documents that may contain critical information include:

    • System-Level Preliminary Hazard Analyses (PHA)
    • Concept of Operations (ConOps)
    • Generic Hazard Lists (e.g. for project type, for software, or just generic hazards)
    • Critical Item List(s) (CIL)
    • Preliminary System Reliability Assessment or Analyses
    • Project/System Risk Assessments
    • Request for Proposals (RFP)
    • Computing System Safety Analyses
    • Software Security Assessment (NPR 7150.2, SWE-156 - Evaluate Systems for Security Risks, SWE-154 - Identify Security Risks)
    • Science Requirements Document

C. Become familiar with the documents in Step B and pay particular attention to the risks and potential hazards that might be mentioned in these documents. While reviewing these risks and potential hazards, think about the ways that the software might be involved in these risks. Possible examples of software contributions to potential hazards are found in the section below titled: Software Contributions to Hazards, Software in system hazard analysis.

D. As the initial hazard analyses are being done, software assurance and software safety people confirm that these analyses are as complete as possible for the stage of the project.

Task 2: Assess that the hazard reports identify the software components associated with the system hazards per the criteria defined in NASA-STD-8739.8, Appendix A.

Review each hazard report to see that the software components associated with the system hazards are identified, using the criteria defined in NASA-STD-8739.8, Appendix A. The Hazard Analysis done at this point should identify the initial set of planned safety-critical components. A list of all the safety-critical components should be included in the hazard reports. Keep this safety-critical components list for Tasks 4 and 5.

Task 3: Analyze the updated hazard reports and design at review points to determine if any newly identified software components are safety-critical.

At each milestone or review point, review any updated hazard analyses or new hazard reports. Review the current design to determine whether any new software has been identified as safety-critical. By this point in the project, some of the software may have been identified as control or mitigation software for one of the previously identified hazards and it may not have been thought about in earlier hazard reports. Verify that this newly identified software is now included in a hazard report, is included on the safety-critical components list, and has a corresponding requirement. As the project continues and requirements mature, any newly identified safety-critical software should be added to the hazard reports, so the reports contain a complete record of all safety-critical components.
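The review-point check described above can be sketched as two set comparisons: which safety-critical components are new since the previous review, and which components on the current list do not yet appear in any hazard report. The function names, component names, and the shape of the data (a mapping from hazard report ID to component names) are assumptions made for this illustration.

```python
def newly_identified_components(previous: set[str], current: set[str]) -> list[str]:
    """Components on the current safety-critical list that were not on the previous one."""
    return sorted(current - previous)


def missing_from_hazard_reports(
    safety_critical: set[str], hazard_reports: dict[str, set[str]]
) -> list[str]:
    """Safety-critical components not named in any hazard report."""
    covered = set().union(*hazard_reports.values())
    return sorted(safety_critical - covered)


# Hypothetical data from two consecutive review points.
previous_list = {"valve_controller", "fdir_monitor"}
current_list = {"valve_controller", "fdir_monitor", "tank_pressure_regulator"}
reports = {
    "HR-001": {"valve_controller"},
    "HR-002": {"fdir_monitor"},
}

print(newly_identified_components(previous_list, current_list))
print(missing_from_hazard_reports(current_list, reports))
```

Any component returned by the second function is a candidate for follow-up: it needs to be added to a hazard report and checked for a corresponding requirement.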

Task 4: Confirm that the traceability between the software requirements and the hazards with software contributions exists.

As the project progresses, review the hazard reports with software contributions and confirm that the associated safety-critical component is listed in the hazard reports and can be traced back to a requirement in the requirements document. Confirm these requirements trace to one or more tests, and that they include testing of the software-critical capabilities required.
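A traceability check of this kind can be sketched with two mappings: hazards to requirements, and requirements to tests. The identifiers and data layout below are hypothetical; in practice this information would come from the project's requirements management and hazard tracking tools.

```python
def trace_gaps(
    hazard_to_reqs: dict[str, list[str]], req_to_tests: dict[str, list[str]]
) -> tuple[list[str], list[str]]:
    """Return (hazards with no requirement, traced requirements with no test)."""
    untraced_hazards = sorted(h for h, reqs in hazard_to_reqs.items() if not reqs)
    traced_reqs = {r for reqs in hazard_to_reqs.values() for r in reqs}
    untested_reqs = sorted(r for r in traced_reqs if not req_to_tests.get(r))
    return untraced_hazards, untested_reqs


def percent_hazards_traced(hazard_to_reqs: dict[str, list[str]]) -> float:
    """Percentage of hazards with software contributions traced to a requirement."""
    if not hazard_to_reqs:
        return 0.0  # assumption: report 0% when no hazards are recorded yet
    traced = sum(1 for reqs in hazard_to_reqs.values() if reqs)
    return 100.0 * traced / len(hazard_to_reqs)


# Hypothetical trace data.
hazard_to_reqs = {"HR-001": ["SRS-010", "SRS-011"], "HR-002": []}
req_to_tests = {"SRS-010": ["TC-100"], "SRS-011": []}

print(trace_gaps(hazard_to_reqs, req_to_tests))
print(percent_hazards_traced(hazard_to_reqs))
```

The percentage is the kind of "short summary" figure that can serve as objective evidence for this confirmation, while the gap lists identify the specific items to follow up on.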

Task 5: Develop and maintain a software safety analysis throughout the software development life cycle.

Throughout software development, starting during the requirements phase, develop and maintain a software safety analysis. Topic 8.09 - Software Safety Analysis provides guidance on performing a software safety analysis.

7.4.1 Software Contributions to Hazards, Software in system hazard analysis:

Hazard Analysis must consider the software’s ability, by design, to cause or control a given hazard. It is a best practice to include the software within the system hazard analysis. The general hazard analysis must consider software common-mode failures that can occur in instances of redundant flight computers running the same software. A common mode failure is a specific type of common cause failure where several subsystems fail in the same way for the same reason. The failures may occur at different times and the common cause could be a design defect or a repeated event.

Software Safety Analysis supplements the system hazard analysis by assessing the software performing critical functions serving as a hazard cause or control. The review assures the following:

1) Compliance with the levied functional software requirements, including SWE-134 - Safety-Critical Software Design Requirements

2) That the software does not violate the independence of hazard inhibits, and

3) That the software does not violate the independence of hardware redundancy.

The Software Safety Analysis should follow the phased hazard analysis process. A typical Software Safety Analysis process begins by identifying the must-work and must-not-work functions in Phase 1 hazard reports. Between the Phase 1 and Phase 2 hazard analyses, the system hazard analysis and software safety analysis process should assess each function for compliance with the levied functional software requirements, including SWE-134. For example, Solar Array deployment (must-work function) software should place deployment effectors in the powered-off state when it boots up and should require the initializing and executing (arm and fire) commands in the correct order within 4 CPU cycles before removing a deployment inhibit. The analysis also assesses the channelization of the communication paths between the inputs/sensors and the effectors to assure there is no violation of fault tolerance by routing a redundant communication path through a single component. The system hazard analysis and software safety analysis also assure that the redundancy management performed by the software supports fault tolerance requirements. For example, software can’t trigger a critical sequence in a single fault-tolerant manner using a single sensor input. Considering how software can trigger a critical sequence is required for the design of triggering events such as payload separation, tripping FDIR responses that turn off critical subsystems, failover to redundant components, and providing closed-loop control of critical functions such as propellant tank pressurization.
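The Solar Array deployment example above can be sketched as a small sequencer that boots with the effectors powered off and the inhibit in place, and removes the inhibit only when a fire command follows an arm command within the allowed window. This is an illustrative sketch, not flight code: the class and attribute names are hypothetical, and it assumes that "within 4 CPU cycles" means the fire command must arrive no more than 4 cycles after the arm command.

```python
class DeploymentSequencer:
    """Hypothetical arm/fire sequencer for a deployment inhibit."""

    MAX_CYCLES_BETWEEN_COMMANDS = 4  # window taken from the example in the text

    def __init__(self) -> None:
        # Safe state at boot: effectors powered off, inhibit in place, not armed.
        self.effectors_powered = False
        self.inhibit_in_place = True
        self._armed_at_cycle: int | None = None

    def arm(self, cycle: int) -> None:
        """Record the cycle at which the arm command was received."""
        self._armed_at_cycle = cycle

    def fire(self, cycle: int) -> bool:
        """Remove the inhibit only if fire follows arm within the window."""
        armed_at = self._armed_at_cycle
        self._armed_at_cycle = None  # every fire attempt consumes the arm state
        if armed_at is None:
            return False  # out of order: fire without a preceding arm
        if not 0 <= cycle - armed_at <= self.MAX_CYCLES_BETWEEN_COMMANDS:
            return False  # too late: the operator must arm again
        self.inhibit_in_place = False
        return True


seq = DeploymentSequencer()
print(seq.fire(cycle=1))   # fire without arm is rejected
seq.arm(cycle=10)
print(seq.fire(cycle=12))  # fire within the window is accepted
```

Clearing the arm state on every fire attempt means a rejected command cannot leave the sequencer armed, so a stale arm command can never combine with a later, unrelated fire command.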

The design analysis portion of software safety analysis should be completed by Phase 2 safety reviews. At this point, the software safety analysis supports a requirements gap analysis to identify any gaps (SWE-184 - Software-related Constraints and Assumptions) and ensure the risk and control strategy documented in hazard reports are correct as stated. Between Phase 2 and 3 safety reviews, the system hazard analysis and software safety analysis support the analysis of test plans to assure adequate off-nominal scenarios (SWE-062 - Unit Test, SWE-065 - Test Plan, Procedures, Reports). Finally, in Phase 3, the system hazard analysis must verify that the final implementation and verification uphold the analysis by ensuring that test results permit closure of hazard verifications (SWE-068 - Evaluate Test Results), that the final hazardous commands support the single command and multi-step command needs, and that finalized pre-requisite checks are in place. See also Topic 8.01 - Off Nominal Testing.

The following sections include useful considerations and examples of software causes and controls:

7.4.2 Considerations when identifying software subsystem hazard causes: (This information is also included in Appendix A of NASA-STD-8739.8)

    1. Does software control any of the safety-critical hardware?
    2. Does software perform critical reconfiguration of the system during the mission?
    3. Does the software perform redundancy management for safety-critical hardware?
    4. Does the software determine when to perform a critical action?
    5. Does the software trigger logic to meet failure tolerance requirements?
    6. Does the software monitor hazard inhibits, safety-critical hardware/software, or issue a caution and warning alarm used to perform an operational control?
    7. Does the software process or display data used to make safety-critical decisions?
    8. Does the flight or ground software manipulate hazardous system effectors during prelaunch checkouts or terminal count?
    9. Does the software perform analysis that impacts automatic or manual hazardous operations?
    10. Does the software serve as an interlock preventing unsafe actions?
    11. Does the software contain stored command sequences that remove multiple inhibits from a hazard?
    12. Does the software initiate any stored command sequences, associated with a safety-critical activity, and if so, are they protected?
    13. Does software violate any hazard inhibits or hardware redundancy independence (channelized communication/power paths, stored command sequences/scripts, FDIR false positive, etc.)?
    14. Can the software controls introduce new hazard causes?
    15. Are the software safety-critical controls truly independent?
    16. Can common cause faults affect the software controls?
    17. Can any of the software controls used in operational scenarios cause a system hazard?
    18. Does the software control switch over to a backup system if a failure occurs in a primary system?
    19. Is the software's processing of sensor data used to make safety-critical decisions fault-tolerant?
    20. Does the software provide an approach for recovery if the system monitoring functions fail?
    21. Does the software allow the operators to disable safety-critical controls unintentionally?
    22. Does the software provide safety-critical cautions and warnings?
    23. Is the software capable of diagnosing and fixing safety-critical faults that might occur in operations?
    24. Does the software provide the health and status of safety-critical functions?
    25. Does the software process safety-critical commands (including autonomous commanding)?
    26. Can the software providing full or partial verification or validation of safety-critical systems generate a hazard if the software has a defect, fault, or error?
    27. Can a defect, fault, or error in the software used to process data or analyze trends that lead to safety decisions cause a system hazard?
    28. Do software capabilities exist to handle the potential use cases and planned operations throughout all phases of use, and through transitions between those phases/states?

7.4.3 Considerations when identifying software causes in a general software-centric hazard analysis:

See also Topic 8.58 - Software Safety and Hazard Analysis


Software Cause Areas to Consider

Potential Software Causes

Data errors


1.     Asynchronous communications

2.     Single or double event upset/bit flip or hardware induced error

3.     Communication to/from an unexpected system on the network

4.     An out-of-range input value, a value above or below the range

5.     Start-up or hardware initiation data errors

6.     Data from an antenna gets corrupted

7.     Failure of software interface to memory

8.     Failure of flight software to suppress outputs from a failed component

9.     Failure of software to monitor bus controller rates to ensure communication with all remote terminals on the bus schedule's avionics buses

10.  Ground or onboard database error

11.  Interface error

12.  Latent data

13.  Communication bus overload

14.  Missing or failed integrity checks on inputs, failure to check the validity of input/output data

15.  Excessive network traffic/babbling node - keeps the network so busy it inhibits communication from other nodes

16.  Sensors or actuators stuck at some value

17.  Wrong software state for the input

Commanding errors


1.     Command buffer error or overflow

2.     Corrupted software load

3.     Error in real-time command build or sequence build

4.     Failure to command during hazardous operations

5.     Failure to perform prerequisite checks before the execution of safety-critical software commands

6.     Ground or onboard database error for the command structure

7.     Error in command data introduced by command server error

8.     Incorrect operator input commands

9.     Wrong command or a miscalculated command sent

10.  Sequencing error, failure to issue commands in the correct sequence

11.  Command sent in wrong software state or software in an incorrect or unanticipated state

12.  An incorrect timestamp on the command

13.  Missing software error handling on incorrect commands

14.  Status messages on command execution not provided

15.  Memory corruption, critical data variables overwritten in memory

16.  Inconsistent syntax

17.  Inconsistent command options  

18.  Similarly named commands

19.  Inconsistent error handling rules

20.  Incorrect automated command sequence built into script containing single commands that can remove multiple inhibits to a hazard

Flight computer errors

1.     Board support package software error

2.     Boot load software error

3.     Boot Programmable Read-Only Memory (PROM) corruption preventing reset

4.     Buffer overrun

5.     CPU overload

6.     Cycle jitter

7.     Cycle over-run

8.     Deadlock

9.     Livelock

10.  Reset during program upload (PROM corruption)

11.  Reset with no restart

12.  Single or double event upset/bit flip or hardware induced error

13.  Time to reset greater than time to failure

14.  Unintended persistent data/configuration on reset

15.  Watchdog active during reboot causing infinite boot loop

16.  Watchdog failure

17.  Failure to detect and transition to redundant or backup computer

18.  Incorrect or stale data in redundant or backup computer

Operating systems errors

1.     Application software incompatibility with upgrades/patches to an operating system

2.     Defects in Real-Time Operating System (RTOS) Board Support software

3.     Missing or incorrect software error handling

4.     Partitioning errors

5.     Shared resource errors

6.     Single or double event upset/bit flip

7.     Unexpected operating system software response to user input

8.     Excessive functionality

9.     Missing function

10.  Wrong function

11.  Inadequate protection against operating system bugs

12.  Unexpected and aberrant software behavior

Programmable logic device errors

1.     High cyclomatic complexity levels (above 15)

2.     Errors in programming and simulation tools used for Programmable Logic Controller (PLC) development

3.     Errors in the programmable logic device interfaces

4.     Errors in the logic design

5.     Missing software error handling in the logic design

6.     PLC logic/sequence error

7.     Single or double event upset/bit flip or hardware induced error

8.     Timing errors

9.     Unexpected operating system software response to user input

10.  Excessive functionality

11.  Missing function

12.  Wrong function

13.  Unexpected and aberrant software behavior

Flight system time management errors


1.     Incorrect data latency/sampling rates

2.     Failure to terminate/complete process in a given time

3.     Incorrect time sync

4.     Latent data (Data delayed or not provided in required time)

5.     Mission elapsed time timing issues and distribution

6.     Incorrect function execution, performing a function at the wrong time, out of sequence, or when the program is in the wrong state

7.     Race conditions

8.     The software cannot respond to an off-nominal condition within the time needed to prevent a hazardous event

9.     Time function runs fast/slow

10.  Time skips (e.g., Global Positioning System time correction)

11.  Loss or incorrect time sync across flight system components

12.  Loss of or incorrect time synchronization between ground and spacecraft interfaces

13.  Unclear software timing requirements

14.  Asynchronous systems or components

15.  Deadlock conditions

16.  Livelock conditions

Coding, logic, and algorithm failures, algorithm specification errors


1.     Auto-coding errors as a cause

2.     Bad configuration data/no checks on external input files and data

3.     Division by zero

4.     Wrong sign

5.     Syntax errors

6.     Error coding software algorithm

7.     Error in positioning algorithm

8.     Case/type/conversion error/unit mismatch

9.     Buffer overflows

10.  High cyclomatic complexity levels (above 15)

11.  Dead code or unused code

12.  Endless do loops

13.  Erroneous outputs

14.  Failure of flight computer software to transition to or operate in a correct mode or state

15.  Failure to check safety-critical outputs for reasonableness and hazardous values and correct timing

16.  Failure to generate a process error upon detection of arithmetic error (such as divide-by-zero)

17.  Failure to create a software error log report when an unexpected event occurs

18.  Inadvertent memory modification

19.  Incorrect "if-then" and incorrect "else"

20.  Missing default case in a switch statement

21.  Incorrect implementation of a software change, software defect, or software non-conformance

22.  Incorrect number of functions or mathematical iteration

23.  Incorrect software operation if no commands are received or if a loss of commanding capability exists (inability to issue commands)

24.  Insufficient or poor coding reviews, inadequate software peer reviews

25.  Insufficient use of coding standards

26.  Interface errors

27.  Missing or inadequate static analysis checks on code

28.  Missing or incorrect parameter range and boundary checking

29.  Non-functional loops

30.  Overflow or underflow in the calculation

31.  Precision mismatch

32.  Resource contention (e.g., thrashing: two or more processes accessing a shared resource)

33.  Rounding or truncation fault

34.  Sequencing error (e.g., failure to issue commands in the correct sequence)

35.  Software is initialized to an unknown state; failure to properly initialize all system and local variables upon startup, including clocks

36.  Too many or too few parameters for the called function

37.  Undefined or non-initialized data

38.  Untested COTS, MOTS, or reused code

39.  Incomplete end-to-end testing

40.  Incomplete or missing software stress test

41.  Errors in the data dictionary or data dictionary processes

42.  Confusing feature names

43.  More than one name for the same feature

44.  Repeated code modules

45.  Failure to initialize a loop-control

46.  Failure to initialize (or reinitialize) pointers

47.  Failure to initialize (or reinitialize) registers

48.  Failure to clear a flag

49.  Scalability errors

50.  Unexpected new behavior or defects introduced in newer or updated COTS modules

51.  Not addressing pointer closure
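A few of the coding causes above (division by zero, missing parameter range and boundary checks, and a missing default case) lend themselves to a small defensive-coding sketch. The following Python is an assumption-laden illustration, not a method prescribed by this Handbook: the function names, the mode names, and the valve commands are all hypothetical.

```python
import math


def checked_ratio(numerator, denominator, lo, hi):
    """Return numerator / denominator, or raise ValueError.

    Raises instead of returning a value when the denominator is zero or the
    result is non-finite or outside the caller-supplied range [lo, hi].
    """
    if denominator == 0:
        raise ValueError("division by zero")
    result = numerator / denominator
    if not math.isfinite(result) or not (lo <= result <= hi):
        raise ValueError(f"result {result!r} outside [{lo}, {hi}]")
    return result


def valve_command_for_mode(mode):
    """Map a (hypothetical) mode name to a valve command.

    The final else branch plays the role of an explicit default case:
    an unrecognized mode is reported rather than silently ignored.
    """
    if mode == "SAFE":
        return "CLOSE"
    elif mode == "NOMINAL":
        return "OPEN"
    else:
        raise ValueError(f"unhandled mode: {mode!r}")
```

Whether an out-of-range value should raise an error, be clamped, or trigger a safing action is a project decision; raising is used here only to keep the sketch short.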

Fault tolerance and fault management errors

1.     Missing software error handling

2.     Missing or incorrect fault detection logic

3.     Missing or incorrect fault recovery logic

4.     Problems with the execution of emergency safing operations

5.     Failure to halt all hazard functions after an interlock failure

6.     The software cannot respond to an off-nominal condition within the time needed to prevent a hazardous event

7.     Common mode software faults

8.     A hazard causal factor occurrence isn't detected

9.     False positives in fault detection algorithms

10.  Failure to perform prerequisite checks before the execution of safety-critical software commands

11.  Failure to terminate/complete process in a given time

12.  Memory corruption, critical data variables overwritten in memory

13.  Single or double event upset/bit flip or hardware-induced error

14.  Incorrect interfaces, errors in interfaces

15.  Missing self-test capabilities

16.  Failing to consider stress on the hardware

17.  Incomplete end-to-end testing

18.  Incomplete or missing software stress test

19.  Errors in the data dictionary or data dictionary processes

20.  Failure to provide or ensure secure access for input data, commanding, and software modifications
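As one possible illustration of performing prerequisite checks before a safety-critical command and of halting hazardous functions when a check fails, consider the sketch below. It is written in Python under stated assumptions: execute_hazardous_command, its parameters, and the idea of a single safe_all callback are hypothetical and are not defined by this Handbook.

```python
def execute_hazardous_command(command, prerequisites, actuate, safe_all):
    """Run actuate(command) only if every prerequisite check passes.

    prerequisites: dict mapping a check name to a zero-argument callable
                   that returns True when the condition is satisfied.
    actuate:       callable that carries out the command.
    safe_all:      callable that halts all hazardous functions.

    Returns (True, []) if the command was issued, otherwise
    (False, [names of failed checks]) after calling safe_all().
    """
    failed = []
    for name, check in prerequisites.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that cannot run is treated the same as a failed check.
            ok = False
        if not ok:
            failed.append(name)

    if failed:
        safe_all()
        return False, failed

    actuate(command)
    return True, []
```

Every check is evaluated even after the first failure so that the returned list gives a complete picture for the error log.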

Software process errors

1.      Failure to implement software development processes or implementing inadequate processes

2.      Inadequate software assurance support and reviews

3.      Missing or inadequate software assurance audits

4.      Failure to follow the documented software development processes

5.      Missing, tailored, or incomplete implementation of the safety-critical software requirements in NPR 7150.2

6.      Missing, tailored, or incomplete implementation of the safety-critical software requirements in Space Station Program 50038, Computer-Based Control System Safety Requirements

7.      Incorrect or incomplete testing

8.      Inadequate testing of reused or heritage software

9.      Failure to open a software problem report when an unexpected event occurs

10.   Failure to include hardware personnel in reviews of software changes, software implementation, peer reviews, and software testing

11.   Failure to perform a safety review on all software changes and software defects

12.   Defects in COTS, MOTS, or OSS software

13.   Failure to perform assessments of available bug fixes and updates available in COTS software

14.   Insufficient use of coding standards

15.   Missing or inadequate static analysis checks on code

16.   Incorrect version loaded

17.   Incorrect configuration values or data

18.   No checks on external input files and data

19.   Errors in configuration data changes being uploaded to spacecraft

20.   Software/avionics simulator/emulator errors and defects

21.   Unverified software

22.   High cyclomatic complexity levels (over 15)

23.   Incomplete or inadequate software requirements analysis

24.   Compound software requirements

25.   Incomplete or inadequate software hazard analysis

26.   Incomplete or inadequate software safety analysis

27.   Incomplete or inadequate software test data analysis

28.   Unrecorded software defects found during informal and formal software testing

29.   Auto-coding tool faults and defects

30.   Errors in design models

31.   Software errors in hardware simulators due to a lack of understanding of hardware requirements

32.   Incomplete or inadequate software test data analysis

33.   Inadequate built-in-test coverage

34.   Inadequate regression testing and unit test coverage of flight software application-level source code

35.   Failure to test all nominal and planned contingency scenarios (breakout and re-rendezvous, launch abort) and complete mission duration (launch to docking to splashdown) in the hardware-in-the-loop environment

36.   Incomplete testing of unexpected conditions, boundary conditions, and software/interface inputs

37.   Use of persistence of test data, files, or config files in an operational scenario

38.   Failure to provide multiple paths or triggers from safe states to hazardous states

39.   Interface control documents and interface requirements documents errors

40.   System requirements errors

41.   Misunderstanding of hardware configuration and operation

42.   Hardware requirements and interface errors; incorrect description of the software/hardware functions and how they are to perform

43.   Missing or incorrect software requirements or specifications

44.   Missing software error handling

45.   Requirements/design errors not fully defined, detected, and corrected

46.   Failure to identify the safety-critical software items

47.   Failure to perform a function, performing the wrong function, performing the function incompletely

48.   An inadvertent/unauthorized event, an unexpected, unwanted event, an out-of-sequence event, the failure of a planned event to occur

49.   The magnitude or direction of an event is wrong

50.   Out-of-sequence event protection

51.   Multiple events/actions trigger simultaneously (when not expected)

52.   Error or exception handling missing or incomplete

53.   Inadvertent or incorrect mode transition for required vehicle functional operation; undefined or incorrect mode transition criteria; unauthorized mode transition

54.  Failure of flight software to correctly initiate proper transition mode

55.  Software state transition error

56.  Software termination in an unknown state

57.  Errors in the software data dictionary values
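The list above names high cyclomatic complexity (over 15) as a process-level cause. As a rough illustration of how such a threshold might be screened automatically, the Python sketch below approximates the complexity of a piece of Python source by counting decision points in its syntax tree. The function name and the exact set of counted constructs are assumptions; established analysis tools use their own counting rules and may report different numbers.

```python
import ast

# Syntax nodes treated as a single decision point in this approximation.
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                 ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source):
    """Approximate cyclomatic complexity of Python source text.

    Starts at 1 and adds one per branching construct, one per extra
    operand of an and/or expression, and one per comprehension loop
    and per comprehension filter.
    """
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            complexity += 1 + len(node.ifs)
    return complexity


def exceeds_threshold(source, threshold=15):
    """Return True when the approximate complexity is above the threshold."""
    return approx_cyclomatic_complexity(source) > threshold
```

A screen like this only flags candidates for review; it does not replace the static analysis and peer review activities listed above.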

Human-machine interface errors



1.     Incorrect data (unit conversion, incorrect variable type)

2.     Stale data

3.     Poor design of human-machine interface

4.     Too much, too little, incorrect data displayed

5.     Ambiguous or incorrect messages

6.     User display locks up/fails

7.     Missing software error handling

8.     Unsolicited command (command issued inadvertently, cybersecurity issue, or without cause)

9.     Wrong command or a miscalculated command sent

10.  Failure to display information or messages to a user

11.  Display refresh rate leads to an incorrect operator response

12.  Lack of ordering scheme for hazardous event queues (such as alerts) in the human-computer interface (i.e., priority versus time of arrival, for example, when an abort must go to the top of the queue)

13.  Incorrect labeling of operator controls in the human interface software

14.  Failure to check for constraints in algorithms/specifications and valid boundaries

15.  Failure of human interface software to check operator inputs

16.  Failure to pass along information or messages

17.  No onscreen instructions

18.  Undocumented features

19.  States that appear impossible to exit

20.  No cursor

21.  Failure to acknowledge an input

22.  Failure to advise when a change takes effect

23.  Wrong, misleading, or confusing information

24.  Poor aesthetics in the screen layout

25.  Menu layout errors

26.  Dialog box layout errors

27.  Obscured instructions

28.  Misuse of color

29.  Failure to allow tabbing navigation to edit fields (mouse-only input)
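Item 12 above concerns the ordering of hazardous event queues, where priority must take precedence over time of arrival (for example, an abort going to the top of the queue). A minimal Python sketch of one such ordering scheme follows. The AlertQueue class and the convention that a lower number means higher urgency are assumptions made for illustration.

```python
import heapq
import itertools


class AlertQueue:
    """Orders alerts by priority first, then by time of arrival.

    Hypothetical convention: a lower priority number is more urgent.
    """

    def __init__(self):
        self._heap = []
        # Monotonically increasing counter used as the arrival order.
        self._arrival = itertools.count()

    def push(self, priority, message):
        """Add an alert; ties in priority are served in arrival order."""
        heapq.heappush(self._heap, (priority, next(self._arrival), message))

    def pop(self):
        """Remove and return the most urgent alert message."""
        _priority, _arrival, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)
```

For example, pushing two caution alerts at priority 2 and then an abort at priority 0 yields the abort first, followed by the two cautions in the order they arrived.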

Security and virus errors

1.     Denial or interruption of service

2.     Spoofed or jammed inputs

3.     Missing capabilities to detect insider threat activities

4.     Inadvertent or intentional memory modification

5.     Inadvertent or unplanned mode transition

6.     Missing software error handling or defect handling

7.     Unsolicited command

8.     Stack-based buffer overflows 

9.     Heap-based attacks 

10.  Cybersecurity vulnerability or computer virus

11.  Inadvertent access to ground system software

12.  Destruct commands incorrectly allowed in a hands-off zone

13.  Communication to/from an unexpected system on the network
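One common way to reduce exposure to spoofed inputs and unsolicited commands is to authenticate each command and reject replays. The Python sketch below shows the idea using the standard library hmac and hashlib modules. It is a simplified illustration under stated assumptions, not a protocol defined by this Handbook: the message layout (an 8-byte counter followed by the payload), the function and class names, and the key handling are all hypothetical.

```python
import hashlib
import hmac


def sign_command(key: bytes, counter: int, payload: bytes) -> bytes:
    """Return an HMAC-SHA256 tag over an 8-byte counter plus the payload."""
    message = counter.to_bytes(8, "big") + payload
    return hmac.new(key, message, hashlib.sha256).digest()


class CommandVerifier:
    """Accepts a command only if its tag is valid and its counter is new."""

    def __init__(self, key: bytes):
        self._key = key
        self._last_counter = -1

    def accept(self, counter: int, payload: bytes, tag: bytes) -> bool:
        expected = sign_command(self._key, counter, payload)
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, tag):
            return False
        # A counter that does not increase indicates a replayed command.
        if counter <= self._last_counter:
            return False
        self._last_counter = counter
        return True
```

Key distribution, counter persistence across resets, and encryption of the payload are outside the scope of this sketch.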

Unknown Unknowns errors

1.     Undetected software defects

2.     Unknown limitations for COTS (operational, environmental, stress)

3.     COTS extra capabilities

4.     Incomplete or inadequate software safety analysis for COTS components

5.     Compiler behavior errors or undefined compiler behavior

6.     Software defects and investigations that are unresolved before the flight

Some safety-critical aspects are addressed in hardware, for example, valves failing to close when a fault occurs.

Safety products (including hazard reports, responses to launch site requirements, preliminary hazard analyses, etc.) begin with the Preliminary Hazard Analysis (PHA) and evolve and expand throughout the project life cycle. See also Topic 8.01 - Off Nominal Testing and Topic 8.08 - COTS Software Safety Considerations.

7.5 Additional Guidance

Additional guidance related to this requirement may be found in the following materials in this Handbook:


