


- 1. The Requirement
- 2. Rationale
- 3. Guidance
- 4. Small Projects
- 5. Resources
- 6. Lessons Learned
- 7. Software Assurance
1. Requirements
4.4.1 The crewed space system shall provide the capability for the crew to monitor, operate, and control the crewed space system and subsystems, where:
- The capability is necessary to execute the mission; or
- The capability would prevent a catastrophic event; or
- The capability would prevent an abort.
1.1 Notes
NASA-STD-8719.29, NASA Technical Requirements for Human-Rating, does not include any notes for this requirement.
1.2 History
1.3 Applicability Across Classes
Class A B C D E F Applicable?
Key: - Applicable |
- Not Applicable
2. Rationale
This capability flows directly from the definition of human-rating. Within the context of this requirement, monitoring is the ability to determine where the vehicle is, its condition, and what it is doing. Monitoring helps to create situational awareness that improves the performance of the human operator and enhances the mission.
Determining the level of operation over individual functions is a decision made separately for specific space systems. Specifically, if a valve or relay can be controlled by a computer, then that same control could be offered to the crew to perform that function. However, a crew member probably could not operate individual valves that meter the flow of propellant to the engines, but the function could be replaced by a throttle that incorporates multiple valve movements to achieve a desired end state (reduce or increase thrust).
Meeting any of the three stated conditions invokes the requirement. The first condition recognizes that the crew performs functions to meet mission objectives and, in those cases, the crew is provided the designated capabilities. This does not mean that the crew is provided these capabilities for all elements of a mission. Many considerations are involved in making these determinations, including the capability to perform the function and reaction time. The second and third conditions recognize that, in many scenarios, the crew improves the performance of the system and that the designated capabilities support that performance improvement.
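The throttle example in this rationale can be sketched in a few lines. This is only an illustration of the idea that one crew-level control can stand in for several coordinated valve movements. The names `ValveCommand` and `throttle_to_valves`, the throttle limits, and the linear valve schedule are all hypothetical and are not taken from any NASA standard or flight system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValveCommand:
    """Coordinated valve positions produced from one throttle setting."""
    fuel_valve_pct: float      # fuel metering valve opening, 0-100
    oxidizer_valve_pct: float  # oxidizer metering valve opening, 0-100


# Hypothetical throttle range; a real engine defines its own limits.
MIN_THROTTLE_PCT = 40.0
MAX_THROTTLE_PCT = 100.0
# Hypothetical ratio between oxidizer and fuel valve openings.
OXIDIZER_RATIO = 0.9


def throttle_to_valves(throttle_pct: float) -> ValveCommand:
    """Map a single crew throttle input to a set of valve positions."""
    # Clamp the crew input so an out-of-range request cannot be commanded.
    clamped = max(MIN_THROTTLE_PCT, min(MAX_THROTTLE_PCT, throttle_pct))
    # Illustrative linear schedule; real schedules come from engine data.
    return ValveCommand(
        fuel_valve_pct=clamped,
        oxidizer_valve_pct=clamped * OXIDIZER_RATIO,
    )
```

The crew works with one desired end state (more or less thrust), while the software remains responsible for the individual valve movements.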
3. Guidance
Per NPR 8705.2 (reference 024), a human-rated system accommodates human needs, effectively utilizes human capabilities, controls hazards with sufficient certainty to be considered safe for human operations, and provides the capability to safely recover from emergency situations. The concept of human-rating a space system entails three fundamental tenets:
- Human-rating is the process of evaluating and assuring that the total system can safely conduct the required human missions.
- Human-rating includes the incorporation of design features and capabilities that accommodate human interaction with the system to enhance overall safety and mission success.
- Human-rating includes the incorporation of design features and capabilities to enable safe recovery of the crew from hazardous situations.
This capability flows directly from that human-rating concept. The Rationale in Section 2 explains how monitoring, operation, and control are interpreted for this requirement and how each of the three stated conditions invokes it.
See Topic 7.24 - Human Rated Software Requirements for other Software Requirements related to Human Rated Software.
3.1 Software Tasks for Crew Operations Capabilities
To ensure that the crewed space system provides the capability for the crew to monitor, operate, and control the system and subsystems, the following software tasks should be implemented:
- Human-Machine Interface Design: Design and implement an intuitive and robust human-machine interface (HMI) that enables the crew to effectively monitor, operate, and control the space system and its subsystems. The interface should display critical information clearly and provide controls that are easy to use and understand. The HMI design should take into consideration the Display Standards in Appendix F of NASA Spaceflight Human-System Standard, Volume 2: Human Factors, Habitability, And Environmental Health (NASA-STD-3001, Vol 2, Rev D) (reference 498).
- Real-time Data Monitoring: Develop and implement systems for real-time monitoring of the health and status of critical systems and subsystems. This includes displaying performance data, system status, and alerts to the crew to facilitate timely decision-making and actions.
- Control System Implementation: Develop and implement control systems that allow the crew to interact with and manage the operation of the space system and its subsystems. This includes manual override capabilities for automated systems to ensure the crew can take control when necessary to prevent catastrophic events or mission aborts.
- Safety-Critical Software Requirements: Ensure that safety-critical software requirements are thoroughly defined and implemented. This includes verifying that the software supports functions necessary to execute the mission, prevent catastrophic events, and prevent aborts.
- Redundancy and Fault Tolerance: Design and implement a system with redundancy and fault tolerance to ensure continuous operation even in the presence of faults or failures. This helps in maintaining control and monitoring capabilities essential for mission success and safety.
- Safety Reviews: Perform safety reviews on all software changes and defects to verify that the system has redundancy and fault tolerance to ensure continuous operation even in the presence of faults or failures. This ensures that each fault has a fault detection and recovery mechanism and the modifications do not introduce new vulnerabilities or increase the risk of failure due to the fault.
- Independent Verification and Validation (IV&V): Ensure independent verification and validation is performed to confirm that the monitoring, operation, and control systems meet specified requirements and are effective in all operational scenarios. IV&V activities should include rigorous testing and analysis of these systems.
  - IV&V Analysis Results: Assure that the crew monitoring and control capabilities have been implemented in the software and independently verified and validated to meet safety and mission requirements.
  - IV&V Participation: Involve the IV&V provider in reviews, inspections, and technical interchange meetings to provide real-time feedback and ensure a thorough assessment.
  - IV&V Management and Technical Measurements: Track and evaluate the performance and results of IV&V activities to ensure continuous improvement and risk management.
- Simulation and Testing: Perform extensive simulations and testing to verify that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios. This includes testing for unexpected conditions and boundary conditions.
- Independent Testing: Ensure independent software testing, including IV&V testing, is performed to verify that the capability is available for the crew to monitor, operate, and control the crewed space system and subsystems, where:
  - The capability is necessary to execute the mission; or
  - The capability would prevent a catastrophic event; or
  - The capability would prevent an abort.
- Error Handling and Recovery Mechanisms: Implement robust error handling and recovery mechanisms to address errors and faults detected during operation. This includes ensuring that error handling is adequate and that the system can recover from errors without leading to hazardous or catastrophic events.
- Configuration Management: Maintain strict configuration management to ensure that the correct software versions and configurations are used. This reduces the risk of errors due to incorrect or inconsistent configurations that could affect monitoring and control capabilities.
- Training and Documentation: Provide comprehensive training and documentation for the crew on how to use the monitoring, operation, and control systems. This includes detailed procedures, troubleshooting guides, and emergency protocols to ensure the crew is well-prepared to handle any situation. This is best done by providing a User Manual with instructions and applicable information about each error/fault and how the system can recover from it.
By implementing these tasks, the crewed space system can be designed to provide the necessary capabilities for the crew to monitor, operate, and control the system and subsystems, ensuring mission success, safety, and reliability.
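Two of the tasks above, manual override of automated systems and real-time limit monitoring with crew alerts, can be sketched as follows. This is a minimal illustration under stated assumptions, not a flight design: the names `Command`, `select_command`, and `check_limits`, and the parameter name and limits in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    AUTOMATION = "automation"
    CREW = "crew"


@dataclass(frozen=True)
class Command:
    source: Source
    value: float


def select_command(auto_cmd: Command,
                   crew_cmd: Optional[Command],
                   override_engaged: bool) -> Command:
    """Choose which command reaches the subsystem.

    The crew command wins only when the override is engaged and a crew
    command is actually present; otherwise automation stays in control.
    """
    if override_engaged and crew_cmd is not None:
        return crew_cmd
    return auto_cmd


def check_limits(name: str, value: float,
                 low: float, high: float) -> Optional[str]:
    """Return an alert message for the crew display, or None if in limits."""
    if value < low:
        return f"ALERT {name}: {value} below lower limit {low}"
    if value > high:
        return f"ALERT {name}: {value} above upper limit {high}"
    return None
```

For example, `check_limits("cabin_pressure_kpa", 95.0, 97.0, 103.0)` returns an alert string, while a value of 100.0 returns `None`. A real system would also need to address redundancy, latency, and command authentication, which this sketch leaves out.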
3.2 Additional Guidance
Additional guidance related to this requirement may be found in the following materials in this Handbook:
3.3 Center Process Asset Libraries
SPAN - Software Processes Across NASA
SPAN contains links to Center-managed Process Asset Libraries. Consult these Process Asset Libraries (PALs) for Center-specific guidance including processes, forms, checklists, training, and templates related to Software Development. See SPAN in the Software Engineering Community of NEN. Available to NASA only. https://nen.nasa.gov/web/software/wiki (reference 197)
See the following link(s) in SPAN for process assets from contributing Centers (NASA Only).
| SPAN Links |
|---|
| To be developed later. |
4. Small Projects
No additional guidance is available for small projects. The community of practice is encouraged to submit guidance candidates for this paragraph.
5. Resources
5.1 References
5.2 Tools
NASA users find this in the Tools Library in the Software Processes Across NASA (SPAN) site of the Software Engineering Community in NEN.
The list is informational only and does not represent an “approved tool list”, nor does it represent an endorsement of any particular tool. The purpose is to provide examples of tools being used across the Agency and to help projects and centers decide what tools to consider.
6. Lessons Learned
6.1 NASA Lessons Learned
No Lessons Learned have currently been identified for this requirement.
6.2 Other Lessons Learned
No other Lessons Learned have currently been identified for this requirement.
7. Software Assurance
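4.4.1 The crewed space system shall provide the capability for the crew to monitor, operate, and control the crewed space system and subsystems, where:
- The capability is necessary to execute the mission; or
- The capability would prevent a catastrophic event; or
- The capability would prevent an abort.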
7.1 Tasking for Software Assurance
- Ensure the development, implementation, and testing of robust control algorithms capable of managing critical functions with crew intervention. These algorithms must undergo thorough testing to guarantee their reliability and safety in all operational scenarios.
- Ensure redundancy and fault tolerance are included in the design so that critical functions can continue to operate autonomously, or the crew can monitor, operate, and control the crewed space system and subsystems, even in the presence of faults or failures. This includes implementing backup systems and failover mechanisms.
- Ensure that integrated real-time monitoring and diagnostic tools are used to continuously assess the health and status of critical systems and subsystems. These tools should detect anomalies, trigger autonomous responses to mitigate potential catastrophic events, and alert the crew to the situation for potential intervention.
- Employ safety analysis techniques such as 8.07 - Software Fault Tree Analysis and 8.05 - SW Failure Modes and Effects Analysis to identify potential hazards and failure modes. This helps in designing controls and mitigations to allow the crew to effectively monitor, operate, and control the space system during various critical operations.
- Ensure extensive simulations and testing are conducted to verify that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios. This includes testing for unexpected situations and boundary conditions.
- Confirm that strict configuration management is maintained to ensure that the correct software versions and configurations are used. This reduces the risk of errors due to incorrect or inconsistent configurations that could impact crew operations.
- Ensure robust error handling and recovery mechanisms to address errors stemming from detected faults. This ensures that error handling is adequate and that the system can recover from errors without leading to hazardous or catastrophic events.
- Perform safety reviews on all software changes and software defects.
- Confirm that 100% code test coverage is addressed for all identified safety-critical software components or that software developers provide a technically acceptable rationale or a risk assessment explaining why the test coverage is not possible or why the risk does not justify the cost of increasing coverage for the safety-critical code component.
- Analyze the software test plans and software test procedures to confirm that they cover the software requirements and provide adequate verification of hazard controls, specifically that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios. (See SWE-071 - Update Test Plans and Procedures tasks.) Ensure that the project has developed and executed test cases to test the software system’s recovery from faults.
- Analyze the software test procedures for the following:
  - Coverage of the software requirements,
  - Acceptance or pass/fail criteria,
  - The inclusion of operational and off-nominal conditions, including boundary conditions,
  - Requirements coverage and hazards per SWE-066 - Perform Testing and SWE-192 - Software Hazardous Requirements, respectively.
- Perform test witnessing for safety-critical software to ensure that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios.
- Confirm that test results are sufficient verification artifacts for the hazard reports.
- Confirm independent software testing, including IV&V testing, is performed to verify that the capability is available for the crew to monitor, operate, and control the crewed space system and subsystems, where:
  - The capability is necessary to execute the mission; or
  - The capability would prevent a catastrophic event; or
  - The capability would prevent an abort.
- Ensure comprehensive training and documentation for operators are available.
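The test procedure analysis described in the tasking above (requirements coverage, pass/fail criteria, and inclusion of off-nominal conditions) lends itself to a simple automated cross-check. The sketch below assumes the project can export its procedures into a structure like the hypothetical `Procedure` record. The field names and the `analyze_procedures` helper are illustrative only and do not reflect any particular NASA tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Procedure:
    """Hypothetical summary of one software test procedure."""
    proc_id: str
    requirement_ids: List[str]    # requirements the procedure verifies
    has_pass_fail_criteria: bool  # acceptance criteria are stated
    covers_off_nominal: bool      # includes off-nominal/boundary cases


def analyze_procedures(requirements: List[str],
                       procedures: List[Procedure]) -> Dict[str, List[str]]:
    """Report gaps an assurance analyst would want to follow up on."""
    covered = {rid for proc in procedures for rid in proc.requirement_ids}
    return {
        # Requirements that no procedure claims to verify.
        "uncovered_requirements": sorted(set(requirements) - covered),
        # Procedures without stated acceptance or pass/fail criteria.
        "missing_criteria": sorted(
            p.proc_id for p in procedures if not p.has_pass_fail_criteria),
        # Procedures that exercise nominal conditions only.
        "nominal_only": sorted(
            p.proc_id for p in procedures if not p.covers_off_nominal),
    }
```

A report with three empty lists does not by itself show that the procedures are adequate; it only indicates that no gaps of these three kinds were found in the exported data.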
7.2 Software Assurance Products
- 8.52 - Software Assurance Status Reports
- 8.54 - Software Requirements Analysis
- 8.55 - Software Design Analysis
- 8.56 - Source Code Quality Analysis
- 8.57 - Testing Analysis
- 8.58 - Software Safety and Hazard Analysis
- 8.59 - Audit Reports
- Test Witnessing Signatures (See SWE-066 - Perform Testing)
Objective Evidence
- System design showing that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios.
- Software design that shows how the system design allows the crew to effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios.
- Completed Hazard Analyses and Hazard Reports identifying all of the potential hazard faults with their associated crew operations.
- Completed software safety and hazard analysis results
- Software Fault Tree Analysis (FTA) and Software Failure Modes and Effects Analysis (FMEA)
- Audit reports, specifically the Functional Configuration Audit (FCA) and Physical Configuration Audit (PCA)
- SWE work product assessments for Software Test Plan, Software Test Procedures, Software Test Reports, and User Manuals
- Results from the use of automated tools for code coverage and other verification and validation activities.
- Comprehensive user manual that includes instructions for recovering from off-nominal conditions.
7.3 Metrics
For the requirement that the crewed space system shall provide the capability for the crew to monitor, operate, and control the crewed space system and subsystems, where the capability is necessary to execute the mission, prevent a catastrophic event, or prevent an abort, the following software assurance metrics are necessary:
- Verification and Validation Metrics:
  - Test Coverage: Ensure comprehensive test coverage for all scenarios involving crew monitoring, operation, and control of the system and subsystems. This includes normal operations, failure modes, and recovery scenarios.
  - Defect Density: Track the number of defects identified during testing per thousand lines of code to ensure software reliability and robustness.
  - Requirements Traceability: Ensure each requirement, including those for crew monitoring, operation, and control capabilities, is traced to its implementation and corresponding test cases.
- Safety Metrics:
  - Hazard Analysis: Identify and evaluate potential hazards related to crew monitoring and control functions, ensuring adequate mitigation strategies are in place.
  - Safety-critical Requirements Compliance: Verify that all safety-critical requirements are met and adequately tested to prevent failures during crew operations.
- Quality Metrics:
  - Code Quality: Use metrics such as cyclomatic complexity and static analysis results to ensure the code is maintainable and less prone to errors.
  - Code Churn: Measure changes in the codebase to monitor stability and identify areas of frequent modification that may need more rigorous testing.
- Performance Metrics:
  - Response Time: Measure the time taken for the system to respond to crew inputs for monitoring, operation, and control to ensure the timely execution of commands.
  - System Uptime: Ensure the system is available and operational when needed, especially during critical mission phases.
- Configuration Management Metrics:
  - Version Control: Ensure proper version control for all software components involved in crew monitoring and control capabilities to track changes and maintain consistency.
  - Change Requests: Monitor the number of change requests and their impact on the system's reliability and safety.
- Training Metrics:
  - Personnel Training Completion: Ensure all personnel involved in the development, testing, and operation of the crew monitoring and control system have completed the necessary training.
- Independent Verification and Validation (IV&V) Metrics:
  - IV&V Analysis Results: Provide assurance that the crew monitoring and control capabilities have been independently verified and validated to meet safety and mission requirements.
  - IV&V Participation: Involve the IV&V provider in reviews, inspections, and technical interchange meetings to provide real-time feedback and ensure thorough assessment.
  - IV&V Management and Technical Measurements: Track and evaluate the performance and results of IV&V activities to ensure continuous improvement and risk management.
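Two of the metrics above have simple arithmetic definitions that can be sketched directly: defect density (defects per thousand lines of code) and a response time check against a budget. The function names and the budget value in the usage note are hypothetical; a project would take the actual response time requirement from its own specifications.

```python
from typing import Sequence


def defect_density(defect_count: int, source_lines: int) -> float:
    """Defects per thousand source lines of code (KSLOC)."""
    if source_lines <= 0:
        raise ValueError("source_lines must be positive")
    return defect_count / (source_lines / 1000.0)


def worst_response_ms(samples_ms: Sequence[float]) -> float:
    """Largest measured response time to a crew input, in milliseconds."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    return max(samples_ms)


def within_response_budget(samples_ms: Sequence[float],
                           budget_ms: float) -> bool:
    """True when every measured response meets the (assumed) budget."""
    return worst_response_ms(samples_ms) <= budget_ms
```

For instance, 12 defects found in 48,000 lines of code gives a defect density of 0.25 defects per KSLOC. Checking the worst case rather than the average is a deliberate choice here, since a single slow response during a critical phase is what matters for crew control.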
Examples of potential SA metrics are:
- # of potential hazards that could lead to catastrophic events
- # of Non-Conformances identified during each testing phase (Open, Closed, Severity)
- Code coverage data: % of code that has been executed during testing
- % of traceability completed for all hazards to software requirements and test procedures
- # of hazards with completed test procedures/cases vs. total # of hazards over time
- # of Non-Conformances identified while confirming hazard controls are verified through test plans/procedures/cases
- # of Hazards containing software that has been tested vs. total # of Hazards containing software
- # of safety-related Non-Conformances
- # of Safety Critical tests executed vs. # of Safety Critical tests witnessed by SA
- Software code/test coverage percentages for all identified safety-critical components (e.g., # of paths tested vs. total # of possible paths)
- # of safety-critical requirement verifications completed vs. total # of safety-critical requirement verifications
- Test coverage data for all identified safety-critical software components
- # of Software Requirements that do not trace to a parent requirement
- % of traceability completed in each area: System Level requirements to Software requirements; Software Requirements to Design; Design to Code; Software Requirements to Test Procedures
- Defect trends for trace quality (# of circular traces, orphans, widows, etc.)
- # of Configuration Management Audits conducted by the project – Planned vs. Actual
These metrics help ensure that the software supporting crew monitoring, operation, and control capabilities is reliable, safe, and meets the specified requirements. For detailed guidance, refer to the Software Assurance and Software Safety Standard (NASA-STD-8739.8) and the NASA Procedural Requirements (NPR 7150.2).
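Several of the example metrics above are counts or percentages over trace data, such as the number of software requirements that do not trace to a parent requirement and the share of hazards that have test procedures. The sketch below assumes trace data is available as plain dictionaries; the identifiers (`SRS-…`, `HZ-…`, `TP-…`) and function names are hypothetical.

```python
from typing import Dict, List, Optional


def requirements_without_parent(
        parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Software requirements that do not trace to a parent requirement."""
    return sorted(req for req, parent in parent_of.items() if not parent)


def percent_hazards_with_tests(
        tests_for_hazard: Dict[str, List[str]]) -> float:
    """Percentage of hazards that have at least one test procedure."""
    if not tests_for_hazard:
        return 0.0
    with_tests = sum(1 for tests in tests_for_hazard.values() if tests)
    return 100.0 * with_tests / len(tests_for_hazard)


# Illustrative data only.
parents = {"SRS-1": "SYS-10", "SRS-2": None, "SRS-3": "SYS-11", "SRS-4": ""}
hazards = {"HZ-1": ["TP-1"], "HZ-2": [], "HZ-3": ["TP-2", "TP-3"], "HZ-4": ["TP-4"]}

print(requirements_without_parent(parents))  # ['SRS-2', 'SRS-4']
print(percent_hazards_with_tests(hazards))   # 75.0
```

Tracking these values over time, rather than as a single snapshot, matches the "over time" wording used in the list of example metrics.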
See also Topic 8.18 - SA Suggested Metrics
7.4 Guidance
To ensure the crewed space system is fully capable of enabling the crew to monitor, operate, and control the system and its subsystems, the following essential software assurance and safety tasks should be implemented:
- Human-Machine Interface Design: Ensure that an intuitive and robust human-machine interface (HMI) that empowers the crew to effectively monitor and control the space system and its subsystems is designed and implemented. The interface must clearly display critical information and provide user-friendly controls. The HMI design should take into consideration the Display Standards in Appendix F of NASA Spaceflight Human-System Standard, Volume 2: Human Factors, Habitability, And Environmental Health (NASA-STD-3001, Vol 2, Rev D) (reference 498).
- Real-Time Data Monitoring: Ensure real-time monitoring systems for the health and status of critical systems and subsystems are developed and implemented. This will include displaying performance data, system status, and alerts to ensure the crew can make timely, informed decisions.
- Control System Implementation: Ensure control systems that allow the crew to interact with and manage the operation of the space system and its subsystems, including manual override capabilities for automated systems, are developed and implemented. This guarantees that the crew can take command when necessary to prevent catastrophic events or mission aborts.
- Safety-Critical Software Requirements: Ensure that safety-critical software requirements are thoroughly defined and implemented to ensure that the software supports mission execution, prevents catastrophic events, and avoids aborts.
- Redundancy and Fault Tolerance: Ensure a system is designed to include built-in redundancy and fault tolerance to ensure continuous operation, even in the face of faults. This design is crucial for maintaining the necessary control and monitoring capabilities for mission success and safety.
- Software Safety and Hazard Analysis: Develop and maintain a Software Safety Analysis throughout the software development life cycle. Assess that the Hazard Analyses (including hazard reports) identify the software components associated with the system hazards per the criteria defined in NASA-STD-8739.8, Appendix A. (See SWE-205 - Determination of Safety-Critical Software.) Perform these on all new requirements, requirement changes, and software defects to determine their impact on the software system's reliability and safety. Confirm that all safety-critical requirements related to crew monitoring and control capabilities have been implemented and adequately tested to prevent failures during mission-critical operations. It may be necessary to discuss these findings during the Safety Review so the reviewers can weigh the impact of implementing the changes. (See Topic 8.58 - Software Safety and Hazard Analysis.)
- Hazard Analysis/Hazard Reports: Confirm that a comprehensive hazard analysis was conducted to identify potential hazards that could result from critical software behavior. This analysis should include evaluating existing and potential hazards and recommending mitigation strategies for identified hazards. The Hazard Reports should contain the results of the analyses and proposed mitigations. (See Topic 5.24 - Hazard Report Minimum Content.)
- Software Safety Analysis: To develop this analysis, utilize safety analysis techniques such as 8.07 - Software Fault Tree Analysis and 8.05 - SW Failure Modes and Effects Analysis to identify potential hazards and failure modes. This helps in designing controls and mitigations for the operation of critical functions. When generating this SA product, see Topic 8.09 - Software Safety Analysis for additional guidance.
- Safety Reviews: Perform safety reviews on all software changes and defects to verify that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios, that could result in a catastrophic event. This ensures that each fault has a fault detection mechanism and the modifications do not introduce new vulnerabilities or increase the risk of failure due to the fault.
- Peer Reviews: Participate in peer reviews on all software changes and software defects affecting safety-critical software and hazardous functionality to verify that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios. (See SWE-134 - Safety-Critical Software Design Requirements tasks.)
- Change Requests: Monitor the number of software change requests and software defects and their impact on the system's reliability and safety. Increases in the number of changes may be indicative of requirements issues or code quality issues resulting in potential schedule slips. (See SWE-053 - Manage Requirements Changes, SWE-080 - Track and Evaluate Changes.)
- Test Witnessing: Perform test witnessing for safety-critical software to verify that the crew can effectively monitor, operate, and control the space system under various conditions, including nominal and off-nominal scenarios. (See SWE-066 - Perform Testing.) This includes witnessing tests to:
  - Confirm that the crew can effectively monitor, operate, and control the space system under various conditions without resulting in catastrophic consequences. This could include:
    - Measuring the time taken for the system to detect and report faults to the crew so they can implement mitigation procedures in a timely and accurate manner. A prolonged period could cause catastrophic consequences.
    - Ensuring the system is available and operational when needed, especially during critical mission phases.
  - Uncover unrecorded software defects and confirm they get documented and recorded.
  - Confirm robust error handling and recovery mechanisms to address any errors and faults encountered during operation are implemented. This includes ensuring that error handling is adequate, and that the system can recover from errors without leading to hazardous or catastrophic events.
- Simulation and Testing: Ensure extensive simulations and testing are performed to confirm that the crew can effectively monitor, operate, and control the space system under all conditions, including both nominal and off-nominal scenarios. This will include rigorous testing for unexpected situations and boundary conditions.
- Test Results Assessment: Confirm that test results are assessed and recorded and that the test results are sufficient verification artifacts for the hazard reports. (See SWE-068 - Evaluate Test Results.)
- Configuration Management: Ensure strict configuration management is maintained to guarantee that only the correct software versions and configurations are used. (See SWE-187 - Control of Software Items for more information.) This proactive measure will significantly reduce the risk of errors due to incorrect or inconsistent configurations that could compromise monitoring and control capabilities. This also includes performing the SWE-187 tasking.
  - Assess that the software safety-critical items, including the hazard reports and safety analysis, are configuration-managed. (See SWE-081 - Identify Software CM Items tasking.)
- Code Coverage: Confirm that 100% code test coverage is addressed for all identified safety-critical software components or ensure that software developers provide a risk assessment explaining why the test coverage is impossible or why the risk does not justify the cost of increasing coverage for the safety-critical code component. This includes normal operations, failure modes, fault detection, isolation, and recovery procedures. (See SWE-189 - Code Coverage Measurements, SWE-219 - Code Coverage for Safety Critical Software.)
- Training and Documentation: Ensure comprehensive training and documentation for the crew on using the monitoring, operation, and control systems are available. This will encompass detailed procedures, troubleshooting guides, and emergency protocols to ensure the crew is thoroughly prepared to handle any situation.
By implementing these decisive tasks, the crewed space system will be designed to provide the necessary capabilities for the crew to monitor, operate, and control the system and its subsystems effectively, ensuring mission success, safety, and reliability.
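The Code Coverage task above sets a two-part acceptance rule for each safety-critical component: either 100% test coverage, or a documented risk assessment explaining the shortfall. A minimal sketch of that rule follows. The `ComponentCoverage` record and the `coverage_findings` function are hypothetical, and judging whether a rationale is technically acceptable remains a human review step that this check does not replace.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComponentCoverage:
    """Hypothetical coverage record for one safety-critical component."""
    name: str
    coverage_pct: float  # measured code test coverage, 0-100
    rationale: str = ""  # risk assessment text, if coverage is short


def coverage_findings(components: List[ComponentCoverage]) -> List[str]:
    """Names of components with neither full coverage nor a rationale."""
    findings = []
    for comp in components:
        if comp.coverage_pct >= 100.0:
            continue  # fully covered, nothing to flag
        if comp.rationale.strip():
            continue  # shortfall is documented; send to human review
        findings.append(comp.name)
    return findings
```

Components returned by `coverage_findings` are candidates for non-conformances; components that pass only because of a rationale would still need that rationale assessed by software assurance.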
7.5 Additional Guidance
Additional guidance related to this requirement may be found in the following materials in this Handbook: