3.7.1 The project manager, in conjunction with the SMA organization, shall determine if each software component is considered to be safety-critical per the criteria defined in NASA-STD-8739.8.
1.1 Notes
NPR 7150.2, NASA Software Engineering Requirements, does not include any notes for this requirement.
1.2 History
See SWE-205 History for the change history of this requirement.
1.3 Applicability Across Classes

Class:       A   B   CSC   C   D   DSC   E   F   G   H
Applicable:  1   1    1    1   1    1    1   0   0   0

Key: 1 = the requirement applies to software in that class; 0 = it does not.
2. Rationale
It is important to determine the safety criticality of each software component to identify the most critical software system components and to ensure that the software safety-critical requirements and processes are followed.
3. Guidance
Classifying software essentially provides pre-tailoring of software engineering requirements, software safety requirements, software assurance requirements, and other software requirements for different types and levels of software. Beyond classification, each project evaluates the project software to determine if the software is safety-critical.
Safety-critical is a term “describing any condition, event, operation, process, equipment, or system that could cause or lead to severe injury, major damage, or mission failure if performed or built improperly, or allowed to remain uncorrected.”
Software safety is defined as “the aspects of software engineering and software assurance that provide a systematic approach to identifying, analyzing, tracking, mitigating, and controlling hazards and hazardous functions of a system where software may contribute either to the hazard or to its mitigation or control, to ensure safe operation of the system.”
The project can use the criteria in NASA-STD-8739.8 (SWEREF-278) to determine the safety criticality of its software.
Safety-Critical Software Determination
Software is classified as safety-critical if it meets at least one of the following criteria:
Causes or contributes to a system hazardous condition/event,
Provides control or mitigation for a system hazardous condition/event,
Controls safety-critical functions,
Mitigates damage if a hazardous condition/event occurs,
Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.
Note: Software is also classified as safety-critical if it is so determined by, and is traceable to, a hazard analysis. See Appendix A of NASA-STD-8739.8 for guidelines associated with addressing software in hazard definitions. See SWE-205. Consideration for other independent means of protection (e.g., software, hardware, barriers, or administrative) should be a part of the system hazard definition process.
Software safety criticality is initially determined in the formulation phase using the NASA Software Assurance and Software Safety Standard, NASA-STD-8739.8. As the software is developed or changed and the computer software configuration items (CSCIs), models, and simulations are identified, the safety-critical software determination can be reassessed and applied at lower levels. The software safety assessment and planning are performed for each software acquisition, development, and maintenance activity, and for changes to legacy/heritage systems. When the software in a system or subsystem is found to be safety-critical, additional requirements in NASA-STD-8739.8, the NASA Software Assurance and Software Safety Standard, will augment those associated with the software class requirements found in NPR 7150.2.
Software safety requirements contained in NASA-STD-8739.8:
1. Analyze the software requirements and the software design and work with the project to implement NPR 7150.2, SWE-134 requirement items "a" through "l."
2. Assess that the source code satisfies the conditions in the NPR 7150.2, SWE-134 requirement "a" through "l" for safety-critical and mission-critical software at each code inspection, test review, safety review, and project review milestone.
3. Confirm that 100% code test coverage is addressed for all identified safety-critical software components, or assure that software developers provide a risk assessment explaining why full test coverage is not possible for the safety-critical code component.
4. Confirm that all identified safety-critical software components have a cyclomatic complexity value of 15 or lower. If not, assure that software developers provide a risk assessment explaining why the cyclomatic complexity value needs to be higher than 15 and why the software component cannot be structured to be lower than 15.
5. Confirm that the values of the safety-critical loaded data, uplinked data, rules, and scripts that affect hazardous system behavior have been tested.
6. Analyze the software design to ensure:
a. Use of partitioning or isolation methods in the design and code,
b. That the design logically isolates the safety-critical design elements and data from those that are non-safety-critical (a minimal sketch of this isolation pattern appears below).
7. Participate in software reviews affecting safety-critical software products.
See the software assurance tab for Considerations when identifying software subsystem hazard causes and for Considerations when identifying software causes in general software-centric hazard analysis.
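To make item 6 above concrete, here is a minimal C sketch of one isolation pattern, assuming a hypothetical deployment-inhibit module: the safety-critical state is file-scope static, so non-critical code can query it but can only change it through a narrow interface that enforces the prerequisite checks. The module and function names are illustrative, not taken from NPR 7150.2 or NASA-STD-8739.8.

```c
/* Illustrative only: isolating safety-critical state behind a narrow
 * interface. All names are hypothetical. */
#include <stdbool.h>
#include <stdio.h>

/* ---- safety-critical partition (would be its own compilation unit) ---- */
/* File-scope static: non-critical code cannot touch the inhibit state
 * directly, only through the functions below. */
static bool deploy_inhibit_engaged = true;

bool inhibit_release(bool arm_confirmed, bool fire_confirmed)
{
    /* Both prerequisite commands must have been verified before the
     * inhibit may be removed. */
    if (arm_confirmed && fire_confirmed) {
        deploy_inhibit_engaged = false;
        return true;
    }
    return false;  /* request rejected; inhibit stays engaged */
}

bool inhibit_is_engaged(void) { return deploy_inhibit_engaged; }

/* ---- non-critical code: can query, cannot mutate ---- */
int main(void)
{
    printf("inhibit engaged: %d\n", inhibit_is_engaged());
    inhibit_release(true, false);                 /* rejected */
    printf("inhibit engaged: %d\n", inhibit_is_engaged());
    return 0;
}
```

On a real project the critical partition would live in its own compilation unit or memory partition, with the boundary enforced by the build system and the platform rather than by convention alone.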
Additional guidance related to software safety may be found in related requirements in this Handbook.

4. Small Projects

No additional guidance is available for small projects.
5. Resources
5.1 References
SWEREF-278: NASA-STD-8739.8, NASA Software Assurance and Software Safety Standard.
5.2 Tools
See the Tools Table in this Handbook for tools supporting this requirement.
6. Lessons Learned
6.1 NASA Lessons Learned
No Lessons Learned have currently been identified for this requirement.
6.2 Other Lessons Learned
No other Lessons Learned have currently been identified for this requirement.
7. Software Assurance
7.1 Tasking for Software Assurance
1. Confirm that the hazard reports or safety data packages contain all known software contributions or events where software, either by its action, inaction, or incorrect action, leads to a hazard.
2. Assess that the hazard reports identify the software components associated with the system hazards per the criteria defined in NASA-STD-8739.8 Appendix A.
3. Assess that hazard analyses, including hazard reports, identify the software components associated with the system hazards.
4. Confirm that traceability between software requirements and hazards with software contributions exists.
5. Develop and maintain a software safety analysis throughout the software development life cycle.
7.2 Software Assurance Products
Hazard Analyses
Assessment of Hazard Analyses and Reports
A list of all safety-critical software components that have been identified by the system hazard analysis.
Software Safety Analysis results, including any risks or issues.
Analysis of updated hazard reports at review points.
Record of software safety determination for the software project.
Objective Evidence
Evidence of confirmation of traceability between software requirements and hazards with software contributions.
Evidence that confirmation of all safety contributions or events in hazard reports has occurred.
See the definition of objective evidence in this Handbook.
7.3 Metrics
# of software work product Non-Conformances identified by life-cycle phase over time.
# of safety-related requirement issues (Open, Closed) over time.
# of safety-related Non-Conformances identified by life-cycle phase over time.
# of hazards containing software that have been successfully tested vs. total # of hazards containing software.
7.4 Guidance
Safety-Critical Software Determination
Software is classified as safety-critical if it meets at least one of the following criteria:
Causes or contributes to a system hazardous condition/event,
Provides control or mitigation for a system hazardous condition/event,
Controls safety-critical functions,
Mitigates damage if a hazardous condition/event occurs,
Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.
Note: Software is also classified as safety-critical if it is so determined by, and is traceable to, a hazard analysis. See Appendix A of NASA-STD-8739.8 for guidelines associated with addressing software in hazard definitions. See SWE-205. Consideration for other independent means of protection (e.g., software, hardware, barriers, or administrative) should be a part of the system hazard definition process.
Safety-Critical: A term describing any condition, event, operation, process, equipment, or system that could cause or lead to severe injury, major damage, or mission failure if performed or built improperly, or allowed to remain uncorrected. (Source: NPR 8715.3)
Safety-Critical Software: Software is classified as safety-critical if it meets at least one of the following criteria:
a. Causes or contributes to a system hazardous condition or event,
b. Provides control or mitigation for a system hazardous condition or event,
c. Controls safety-critical functions,
d. Mitigates damage if a hazardous condition/event occurs,
e. Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.
Safety-critical software can cause, contribute to, or mitigate hazards to human safety, or can cause damage to facilities. Safety-critical software is identified based on the results of the hazard analysis and the results of the Orbital Debris Assessment Report/End-Of-Mission Plan (where applicable). Examples of safety-critical software can be found in all types of systems, including flight systems, ground support systems, mission operations support systems, and test facilities. See Appendix A of NASA-STD-8739.8 for guidelines associated with addressing software in hazard definitions. Consideration for other independent means of protection (software, hardware, barriers, or administrative) should be a part of the system hazard definition process.
Task 1: Confirm that the hazard reports or safety data packages contain all known software contributions or events where software, either by its action, inaction, or incorrect action, leads to a hazard.
It is necessary for software assurance and software safety personnel to begin examining possible software hazards and determining whether the software might be safety-critical as early as possible. Several steps are important in determining this:
A. Learn who the key project personnel are and begin establishing a good working relationship with them. In particular, systems analysts, systems safety personnel, requirements development personnel, end-users, and those establishing operational concepts are some of the key people with initial knowledge on the project.
B. Gather all of the initial documents listed in the requirement, as well as any others the project is developing that may contain critical information on the system being developed. Don’t wait for signature copies, but begin getting acquainted with them as early as possible. Through the working relationships established, stay informed about the types of updates being made as the system concepts continue to be refined. Keep a list of the documents collected and their version dates as the system matures.
C. Become familiar with the documents gathered in Step B and pay particular attention to the risks and potential hazards mentioned in them. While reviewing these risks and potential hazards, think about the ways the software might be involved in them. Possible examples of software contributions to potential hazards are found in the section below titled “Software Contributions to Hazards.”
D. As the initial hazard analyses are being done, software assurance and software safety personnel confirm that these analyses are as complete as possible for the stage of the project.
Task 2: Assess that the hazard reports identify the software components associated with the system hazards per the criteria defined in NASA-STD-8739.8 Appendix A.
Review each hazard report to see that the software components associated with the system hazards are identified, using the criteria defined in NASA-STD-8739.8 Appendix A. The hazard analysis done at this point should identify the initial set of planned safety-critical components. A list of all the safety-critical components should be included in the hazard reports. Keep this safety-critical components list for Tasks 4 and 5.
Task 3: Analyze the updated hazard reports and design at review points to determine if any newly identified software components are safety-critical.
At each milestone or review point, review any updated hazard analyses or new hazard reports. Review the current design to determine whether any new software has been identified as safety-critical. By this point in the project, some of the software may have been identified as control or mitigation software for one of the previously identified hazards and it may not have been thought about in earlier hazard reports. Verify that this newly identified software is now included in a hazard report, is included on the safety-critical components list, and has a corresponding requirement. As the project continues and requirements mature, any newly identified safety-critical software should be added to the hazard reports, so the reports contain a complete record of all safety-critical components.
Task 4: Confirm that the traceability between the software requirements and the hazards with software contributions exists.
As the project progresses, review the hazard reports with software contributions and confirm that each associated safety-critical component is listed in the hazard reports and can be traced back to a requirement in the requirements document. Confirm that these requirements trace to one or more tests, and that those tests cover the safety-critical capabilities required.
Task 5: Develop and maintain a software safety analysis throughout the software development lifecycle.
Throughout the software development, starting during the requirements phase, develop a software safety analysis. Topic 8.9, Software Safety Analysis, provides guidance on performing a software safety analysis.
Software Contributions to Hazards: Software in System Hazard Analysis
Hazard analysis must consider the software’s ability, by design, to cause or control a given hazard. It is a best practice to include the software within the system hazard analysis. The general hazard analysis must consider software common-mode failures that can occur in instances of redundant flight computers running the same software. A common-mode failure is a specific type of common-cause failure where several subsystems fail in the same way for the same reason. The failures may occur at different times, and the common cause could be a design defect or a repeated event.
Software Safety Analysis supplements the system hazard analysis by assessing the software performing critical functions serving as a hazard cause or control. The review assures the following:
1) Compliance with the levied functional software requirements, including SWE-134,
2) That the software does not violate the independence of hazard inhibits, and
3) That the software does not violate the independence of hardware redundancy.
The Software Safety Analysis should follow the phased hazard analysis process. A typical Software Safety Analysis process begins by identifying the must-work and must-not-work functions in Phase 1 hazard reports. Between Phase 1 and Phase 2 hazard analysis, the system hazard analysis and software safety analysis process should assess each function for compliance with the levied functional software requirements, including SWE-134. For example, software for Solar Array deployment (a must-work function) should place deployment effectors in the powered-off state when it boots up, and should require initialize and execute (arm and fire) commands to be received in the correct order within 4 CPU cycles before removing a deployment inhibit. The analysis also assesses the channelization of the communication paths between the inputs/sensors and the effectors to assure there is no violation of fault tolerance by routing a redundant communication path through a single component. The system hazard analysis and software safety analysis also assure that the redundancy management performed by the software supports fault tolerance requirements. For example, software cannot trigger a critical sequence from a single sensor input and still be single fault tolerant. Considering how software can trigger a critical sequence is required for the design of triggering events such as payload separation, tripping FDIR responses that turn off critical subsystems, failover to redundant components, and providing closed-loop control of critical functions such as propellant tank pressurization.
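The arm-before-fire ordering in the example above can be pictured as a small state machine. The following C sketch is illustrative only: the names are hypothetical, and the real timing constraint (the 4-CPU-cycle window) and effector I/O are not modeled; it shows only how out-of-order commands are rejected.

```c
/* Illustrative sketch: enforcing arm-before-fire command ordering for a
 * deployment effector. All names are hypothetical. */
#include <stdio.h>

typedef enum { STATE_SAFE, STATE_ARMED, STATE_FIRED } deploy_state_t;

/* At boot the state machine is SAFE, matching the "effectors powered
 * off at startup" behavior described above. */
static deploy_state_t state = STATE_SAFE;

int cmd_arm(void)
{
    if (state != STATE_SAFE) return -1;   /* out-of-sequence: reject */
    state = STATE_ARMED;
    return 0;
}

int cmd_fire(void)
{
    if (state != STATE_ARMED) return -1;  /* fire without arm: reject */
    state = STATE_FIRED;
    /* ... drive the deployment effector here ... */
    return 0;
}

int main(void)
{
    printf("fire w/o arm: %d\n", cmd_fire());  /* -1, rejected */
    printf("arm:          %d\n", cmd_arm());   /*  0, accepted */
    printf("fire:         %d\n", cmd_fire());  /*  0, accepted */
    return 0;
}
```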
The design analysis portion of the software safety analysis should be completed by Phase 2 safety reviews. At this point, the software safety analysis supports a requirements gap analysis to identify any gaps (SWE-184) and to ensure the risk and control strategy documented in hazard reports is correct as stated. Between Phase 2 and Phase 3 safety reviews, the system hazard analysis and software safety analysis support the analysis of test plans to assure adequate off-nominal scenarios (SWE-062, SWE-065a). Finally, in Phase 3, the system hazard analysis must verify that the final implementation and verification uphold the analysis by ensuring that test results permit closure of hazard verifications (SWE-068), that the final hazardous commands support the single-command and multi-step command needs, and that finalized prerequisite checks are in place.
The following sections include useful considerations and examples of software causes and controls:
Considerations when identifying software subsystem hazard causes (this information is also included in Appendix A.1.4 of NASA-STD-8739.8, SWEREF-278):
Does software control any of the safety-critical hardware?
Does software perform critical reconfiguration of the system during the mission?
Does the software perform redundancy management for safety-critical hardware?
Does the software determine when to perform a critical action?
Does the software trigger logic to meet failure tolerance requirements?
Does the software monitor hazard inhibits, safety-critical hardware/software, or issue a caution and warning alarm used to perform an operational control?
Does the software process or display data used to make safety-critical decisions?
Does the flight or ground software manipulate hazardous system effectors during prelaunch checkouts or terminal count?
Does the software perform analysis that impacts automatic or manual hazardous operations?
Does the software serve as an interlock preventing unsafe actions?
Does the software contain stored command sequences that remove multiple inhibits from a hazard?
Does the software initiate any stored command sequences associated with a safety-critical activity, and if so, are they protected?
Does software violate any hazard inhibits or hardware redundancy independence (channelized communication/power paths, stored command sequences/scripts, FDIR false positive, etc.)?
Can the software controls introduce new hazard causes?
Are the software safety-critical controls truly independent?
Can common cause faults affect the software controls?
Can any of the software controls used in operational scenarios cause a system hazard?
Does the software control switch-over to a backup system if a failure occurs in a primary system?
Is the software that processes sensor data used to make safety-critical decisions fault-tolerant?
Does the software provide an approach for recovery if the system monitoring functions fail?
Does the software allow the operators to disable safety-critical controls unintentionally?
Does the software provide safety-critical cautions and warnings?
Is the software capable of diagnosing and fixing safety-critical faults that might occur in operations?
Does the software provide the health and status of safety-critical functions?
Does the software process safety-critical commands (including autonomous commanding)?
Can the software providing full or partial verification or validation of safety-critical systems generate a hazard if the software has a defect, fault, or error?
Can a defect, fault, or error in the software used to process data or analyze trends that lead to safety decisions cause a system hazard?
Do software capabilities exist to handle the potential use cases and planned operations throughout all phases of use, and through transitions between those phases/states?
Considerations when identifying software causes in a general software-centric hazard analysis:
Computer Reset
Causes
Reset with no restart
Reset during program upload (PROM corruption)
Boot PROM corruption preventing reset
Watchdog active during reboot causing infinite boot loop
Controls
Disable FDIR and watchdogs during boot
Redundant computers
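As a sketch of the “disable FDIR and watchdogs during boot” control above, the C fragment below models a boot sequence. The watchdog functions are stand-ins for platform-specific hardware access, so all names here are assumptions.

```c
/* Illustrative boot sequence: keep the watchdog off during boot so a
 * long initialization cannot trigger the infinite-reboot-loop cause
 * listed above. The watchdog interface is hypothetical. */
#include <stdbool.h>
#include <stdio.h>

static bool wdt_enabled;   /* stand-in for a hardware watchdog register */

static void wdt_disable(void) { wdt_enabled = false; }
static void wdt_enable(void)  { wdt_enabled = true;  }

static void load_and_verify_flight_image(void)
{
    /* long-running: copy image, verify checksum, initialize devices ... */
}

int main(void)
{
    /* Boot can legitimately take longer than one watchdog period, so
     * the watchdog stays off until initialization completes. */
    wdt_disable();
    load_and_verify_flight_image();
    wdt_enable();   /* re-armed only once the system can service it */
    printf("boot complete, watchdog enabled: %d\n", wdt_enabled);
    return 0;
}
```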
Memory
Causes
Memory corruption
Out of Memory
Buffer overrun
Deadlock (trying to write to the same memory at the same time or trying to update while reading it)
Incorrect Data (unit conversion, incorrect variable type, etc.)
Stale Data
Controls
Visual indication of stale data
Watchdog timer
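A minimal sketch of the stale-data control above in C, assuming a hypothetical telemetry sample type and a 2-second staleness limit: each sample carries its production time, and the display layer flags values older than the limit (the “visual indication of stale data” control).

```c
/* Illustrative stale-data check. The sample type, field names, and the
 * 2-second limit are assumptions for this sketch. */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

typedef struct {
    double value;
    time_t sampled_at;   /* when the value was produced */
} sample_t;

#define STALE_LIMIT_S 2.0

bool sample_is_stale(const sample_t *s, time_t now)
{
    return difftime(now, s->sampled_at) > STALE_LIMIT_S;
}

int main(void)
{
    sample_t tank_pressure = { 2013.5, time(NULL) - 5 };  /* 5 s old */
    time_t now = time(NULL);

    /* A display layer would render this flag visually, e.g., greyed
     * or crossed-out digits. */
    printf("pressure %.1f %s\n", tank_pressure.value,
           sample_is_stale(&tank_pressure, now) ? "(STALE)" : "(fresh)");
    return 0;
}
```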
Events and Actions
Causes
Out-of-sequence event protection
Multiple events/actions trigger simultaneously (when not expected)
Error/Exception handling missing or incomplete
Inadvertent mode transition
Controls
Fault Management
Pre-requisite logic
Interlocks
Timekeeping
Causes
Time runs fast/slow
Time skips (e.g., Global Positioning System time correction)
Time sync across components
Oscillator Drift
Controls
Diverse/redundant time sources with fault down logic
Robust time sync design that can deal with the loss of external time sources
Pre-launch checkout of Oscillators
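One way to sketch the “diverse/redundant time sources with fault-down logic” control in C, with hypothetical source names: sources are ordered best-first, and the selector falls back to the next valid source as better ones drop out.

```c
/* Illustrative fault-down selection among redundant time sources.
 * Source names and ordering are assumptions for this sketch. */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    const char *name;
    bool valid;     /* cleared on loss of signal / failed self-check */
    double utc_s;   /* current estimate of UTC, seconds */
} time_source_t;

/* Ordered best-first: e.g., GPS, then a ground-synced clock, then the
 * free-running onboard oscillator. */
const time_source_t *select_time_source(const time_source_t *srcs, int n)
{
    for (int i = 0; i < n; i++)
        if (srcs[i].valid)
            return &srcs[i];
    return NULL;  /* total loss: caller must enter a safe hold */
}

int main(void)
{
    time_source_t sources[] = {
        { "GPS",         false, 0.0         },  /* lost lock */
        { "ground-sync", true,  1.0e9       },
        { "oscillator",  true,  1.0e9 + 0.3 },
    };
    const time_source_t *best = select_time_source(sources, 3);
    printf("using: %s\n", best ? best->name : "none");
    return 0;
}
```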
Timing Problems
Causes
Priority inversion
Failure to terminate/complete process in a given time
Data latency/sampling rate too slow
Race Conditions
Non-determinism
Controls
Static and Dynamic Analysis Tools
Coding standards
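For the priority-inversion cause above, one standard control on POSIX systems is a priority-inheritance mutex, sketched below: a low-priority task holding the lock is temporarily boosted instead of letting a medium-priority task starve the high-priority waiter. Availability of PTHREAD_PRIO_INHERIT depends on the OS and its configuration; link with -lpthread.

```c
/* Illustrative priority-inheritance mutex setup (POSIX threads). */
#include <pthread.h>
#include <stdio.h>

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;

    pthread_mutexattr_init(&attr);
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0) {
        fprintf(stderr, "priority inheritance not supported here\n");
        return 1;
    }
    pthread_mutex_init(&lock, &attr);

    /* ... tasks sharing the resource lock/unlock `lock` as usual ... */

    pthread_mutex_destroy(&lock);
    pthread_mutexattr_destroy(&attr);
    return 0;
}
```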
Coding, Logic, and Algorithm failures
Causes
Division by zero
Bad data in = bad data out (no parameter range & boundary checking)
Dead code
Unused code
Non-functional loops
Endless do loops
Incorrect passes (too many or too few or not at the correct time)
Incorrect “if-then” and incorrect “else” constructs
Too many or too few parameters for the called function
Case/type mismatch
Precision mismatch
Rounding or truncation fault
Resource contention (e.g., thrashing: two or more processes accessing a shared resource)
Bad configuration data/no checks on external input files and data
Inappropriate equation
Undefined or non-initialized data
Limit ranges
Relationship logic for interdependent limits
Overflow or underflow in the calculation
Controls
Use of industry-accepted coding standard
Use of safe math libraries
Robust software development, quality, and safety processes
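A minimal sketch of the range-checking and “safe math” controls in C, covering three causes listed above: division by zero, missing boundary checks, and overflow. A real project would normally take such helpers from a vetted library per its coding standard; the function names here are illustrative.

```c
/* Illustrative "safe math" helpers; names are assumptions. */
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Division that reports, rather than executes, a divide-by-zero; also
 * rejects the INT_MIN / -1 case, which overflows. */
bool safe_div(int num, int den, int *out)
{
    if (den == 0 || (num == INT_MIN && den == -1))
        return false;
    *out = num / den;
    return true;
}

/* Boundary check before a value is used in a hazardous computation. */
bool in_range(double v, double lo, double hi) { return v >= lo && v <= hi; }

int main(void)
{
    int q;
    if (!safe_div(100, 0, &q))
        printf("division rejected\n");
    printf("valve %% in range: %d\n", in_range(140.0, 0.0, 100.0));
    return 0;
}
```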
Input Failures
Causes
Noise
Sensors or actuators stuck at some value (all zeros, all ones, some other value)
A value above or below range
Value in range but incorrect
Physical units incorrect
Inadequate data sampling rate
Controls
Sensor Health Checks
Input Validation, Sanity Checks
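A sketch of the sensor health check / sanity check controls in C, covering two causes above: a value out of physical range and a sensor stuck at one value. The range limits and the stuck-reading threshold are assumptions for illustration.

```c
/* Illustrative sensor sanity checks; thresholds are assumptions. */
#include <stdbool.h>
#include <stdio.h>

#define STUCK_LIMIT 5   /* consecutive identical readings before flagging */

typedef struct {
    double last;
    int    repeats;
} stuck_detector_t;

bool reading_ok(stuck_detector_t *d, double v, double lo, double hi)
{
    if (v < lo || v > hi)
        return false;                 /* out of physical range */
    if (v == d->last) {               /* exact repeats suggest a stuck
                                         digital reading */
        if (++d->repeats >= STUCK_LIMIT)
            return false;
    } else {
        d->last = v;
        d->repeats = 0;
    }
    return true;
}

int main(void)
{
    stuck_detector_t det = { 0.0, 0 };
    double feed[] = { 3.1, 3.1, 3.1, 3.1, 3.1, 3.1 };
    for (int i = 0; i < 6; i++)
        printf("reading %.1f -> %s\n", feed[i],
               reading_ok(&det, feed[i], 0.0, 10.0) ? "ok" : "FLAGGED");
    return 0;
}
```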
User Interface Errors
Causes
Wrong commands are given by the operator
No commands are given by the operator
Status and messages not provided for operations, systems, and inhibits
Ambiguous or incorrect messages
User display locks up/fails
Controls
Two-step commands
GUI style guide
Software interlocks to prevent human error
Configuration Management
Causes
Incorrect version loaded
Incorrect configuration values
Controls
Version CRC check after software/configuration load
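A sketch of the “CRC check after software/configuration load” control in C, using the common CRC-32 (IEEE 802.3) polynomial; an actual project may mandate a different algorithm. The loader refuses to run the image if the computed CRC does not match the value stored alongside it.

```c
/* Illustrative post-load integrity check with bitwise CRC-32
 * (reflected polynomial 0xEDB88320). */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

uint32_t crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (-(crc & 1u)));
    }
    return ~crc;
}

int main(void)
{
    /* Standard check value: CRC-32 of "123456789" is 0xCBF43926. */
    const uint8_t image[] = "123456789";
    uint32_t expected = 0xCBF43926u;
    uint32_t actual = crc32(image, strlen((const char *)image));

    if (actual != expected) {
        printf("CRC mismatch: refuse to run loaded software\n");
        return 1;
    }
    printf("CRC ok (0x%08X): load accepted\n", actual);
    return 0;
}
```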
Security
Causes
Denial/Interruption of Service
Spoofed/Jammed inputs
Unauthorized input/access
Controls
Message filtering to detect spoofing
Data source validation checking features in the software
PLC Processor Faults
Some safety-critical aspects of PLC processor faults are addressed in hardware, for example, valves failing closed when a fault occurs.
Safety products (including hazard reports, responses to launch site requirements, preliminary hazard analyses, etc.) begin with the Preliminary Hazard Analysis (PHA) and evolve and expand throughout the project life cycle.