HR-33 - Inadvertent Operator Action

1. Requirements

4.3.3 The space system shall be designed to tolerate inadvertent operator action (minimum of one inadvertent action), as verified by a human error analysis, without causing a catastrophic event.

1.1 Notes

An operator is defined as any human who commands or interfaces with the space system during the mission, including humans in the control centers. The appropriate level of protection (i.e., one, two, or more inadvertent actions) is determined by the integrated human error and hazard analysis per NPR 8705.2.

1.2 History

HR-33 - First published in NASA-STD-8719.29. First used in Software Engineering Handbook Version D.

SWEHB Rev | HR Rev   | Requirement Statement
D         | Baseline | 4.3.3 The space system shall be designed to tolerate inadvertent operator action (minimum of one inadvertent action), as verified by a human error analysis, without causing a catastrophic event.

1.3 Applicability Across Classes

Class       | A | B | C | D | E | F
------------|---|---|---|---|---|---
Applicable? |   |   |   |   |   |

Key: ✓ - Applicable | ✗ - Not Applicable



2. Rationale

One of the main causes of erroneous software behavior is inadvertent or erroneous operator input. Tolerating inadvertent operator input requires general one-fault tolerance (i.e., error checking). This requirement ensures that multiple, unique, and independent commands are used when disabling a Must Work Function and forces operator error analysis before execution. It also protects against disabling redundant control strings by preventing the scripting or combining of two or more of the “unique and independent” commands, such that multiple commands could otherwise be issued by a single operator action without error checking.

3. Guidance

Some strategies to mitigate the input of faulty data are to employ two-stage commanding for critical commands, provide the operator with information on the implications of the command along with requested confirmation feedback, and perform error checks on operator inputs.

Where the loss of a function could result in a hazard, the requirement provides the necessary command fault tolerance to disable the function by one of the two methods below:

  a. For systems with redundant control string(s) active before disabling the primary string, one independent and unique command shall disable each control string.
  b. For systems with redundant control string(s) activated after disabling the primary string, two independent and unique commands shall be required to deactivate/disable the active control string.
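A minimal sketch of the two-stage ("arm"/"fire") commanding pattern described above, in Python; the `ControlString` class and its method names are invented for illustration and do not come from the standard. A single inadvertent disable command is rejected, so disabling requires two deliberate, independent operator actions.

```python
class InadvertentActionError(Exception):
    """Raised when a hazardous command arrives without the required arm step."""


class ControlString:
    """Hypothetical control string guarded by two-stage commanding."""

    def __init__(self, name):
        self.name = name
        self.enabled = True
        self.armed = False

    def arm_disable(self):
        # First unique command: arms the disable operation only; nothing
        # hazardous happens yet.
        self.armed = True

    def fire_disable(self):
        # Second, independent command: rejected unless the arm step ran,
        # so one inadvertent action cannot disable the string.
        if not self.armed:
            raise InadvertentActionError(
                f"{self.name}: disable rejected, not armed")
        self.enabled = False
        self.armed = False  # any later disable must be re-armed
```

Because the two commands are separate operator actions, scripting or combining them into a single action would defeat the protection, which is why the Rationale above requires preventing such combining.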

Case A applies to systems that use multiple computers simultaneously to monitor, collect, and perform operational tasks. It also covers systems that use one primary computer at a time, with redundant computer(s) collecting the same sensor and input data as the primary so that they can take immediate control if the primary system goes offline, as well as backup systems that are powered off but are powered on and initialized to a known safe state before the primary control string is disabled.

Case B covers systems that use a single primary active computer at a time. The redundant backup computer(s) would be powered off or offline and would require time to initialize before taking primary operational control of the system. In addition, the primary string must be disabled before Failure Detection, Isolation, and Recovery (FDIR) or manual operator actions can bring the backup system online.

A loss of capability could result from the loss of sensor or command input, loss of control processing, or disabling of effector capability. A command to disable a control string for a Must Work Function is typically issued because of a failure or maintenance activity; otherwise, it is assumed that all redundant control strings are available. In addition, FDIR or manual recovery must meet the time-to-event requirements to preclude the hazard.

See Topic 7.24 - Human Rated Software Requirements for other Software Requirements related to Human Rated Software. 

3.1 Software Tasks For Tolerating Inadvertent Operator Actions

To ensure that the space system can tolerate inadvertent operator actions without causing catastrophic events, the following software tasks should be implemented:

  1. Human Error Analysis: Conduct a detailed human error analysis to identify potential operator errors that could lead to catastrophic events. This analysis should include various scenarios and the likelihood of each error occurring.  This involves evaluating human-machine interface design, operator training, and operational procedures and implementing checks, mitigations, and error handling to prevent inadvertent actions that could cause catastrophic events.
  2. Simulation and Testing: Develop, implement, and execute software simulations to model and test the impact of inadvertent operator actions. This includes conducting tests to verify that the software can handle at least one inadvertent action without resulting in catastrophic consequences. The flight operations team should conduct simulations to thoroughly test the various scenarios.
  3. Safety Reviews: Perform safety reviews on all software changes and defects. This ensures that any modifications do not introduce new vulnerabilities or increase the risk of inadvertent actions leading to catastrophic events.
  4. Formal and Informal Testing: Carry out both formal and informal testing to uncover unrecorded software defects. This includes testing unexpected conditions, boundary conditions, and software/interface inputs.
  5. Error Handling and Recovery: Implement robust error handling and recovery mechanisms to address errors resulting from inadvertent operator actions. This includes ensuring adequate error handling and that the system can recover from errors without leading to catastrophic events.
  6. Automated Verification and Validation: Use automated tools for static analysis, dynamic analysis, code coverage, cyclomatic complexity, and other verification and validation activities. This helps identify potential software defects that could result in catastrophic events due to inadvertent operator actions.
  7. Configuration Management: Maintain strict configuration management to ensure that the correct software versions and configurations are used. This reduces the risk of errors due to incorrect or inconsistent configurations.
  8. Training and Documentation: Provide comprehensive training and documentation for operators to minimize the chances of inadvertent actions. This includes clear instructions, warnings, and recovery procedures. This is best done by providing a User Manual with instructions and applicable information about each error and how to gracefully recover from it.
  9. Independent Verification and Validation (IV&V): Ensure there is independent verification and validation to ensure that the software meets its specified requirements and that any modifications do not introduce new vulnerabilities or increase the risk of failure due to operator errors. 
    1. IV&V Analysis Results: Assure that the capability to tolerate inadvertent operator actions has been independently verified and validated to meet safety and mission requirements. 
    2. IV&V Participation: Involve the IV&V provider in reviews, inspections, and technical interchange meetings to provide real-time feedback and ensure thorough assessment. 
    3. IV&V Management and Technical Measurements: Track and evaluate the performance and results of IV&V activities to ensure continuous improvement and risk management.
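As an illustration of the error-checking and error-handling tasks above (items 1, 4, and 5), the sketch below shows a command handler that rejects unknown inputs (e.g., a mistyped command) and holds hazardous commands pending operator confirmation. The command names and handler are hypothetical, not drawn from any NASA system.

```python
# Hypothetical command vocabulary; OPEN_VALVE is treated as hazardous.
VALID_COMMANDS = {"OPEN_VALVE", "CLOSE_VALVE", "SAFE_MODE"}


def handle_command(cmd, confirmed=False):
    """Reject unknown commands; require explicit confirmation for hazardous ones."""
    if cmd not in VALID_COMMANDS:
        # Error check: an inadvertent (e.g., mistyped) command is never executed.
        return ("REJECTED", "unknown command")
    if cmd == "OPEN_VALVE" and not confirmed:
        # Hazardous command: report implications and wait for confirmation.
        return ("PENDING", "confirmation required")
    return ("EXECUTED", cmd)
```

A fault-injection test for task 2 would feed mistyped and unconfirmed commands to this handler and assert that none of them execute.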

By implementing these tasks, the space system can be designed to tolerate and recover from inadvertent operator actions and ensure safety and reliability.

3.2 Additional Guidance

Additional guidance related to this requirement may be found in the following materials in this Handbook:

3.3 Center Process Asset Libraries

SPAN - Software Processes Across NASA
SPAN contains links to Center managed Process Asset Libraries. Consult these Process Asset Libraries (PALs) for Center-specific guidance including processes, forms, checklists, training, and templates related to Software Development. See SPAN in the Software Engineering Community of NEN. Available to NASA only. https://nen.nasa.gov/web/software/wiki

See the following link(s) in SPAN for process assets from contributing Centers (NASA Only). 

SPAN Links

To be developed later. 

4. Small Projects

No additional guidance is available for small projects. The community of practice is encouraged to submit guidance candidates for this paragraph.

5. Resources

5.1 References



5.2 Tools

Tools to aid in compliance with this SWE, if any, may be found in the Tools Library in the NASA Engineering Network (NEN). 

NASA users find this in the Tools Library in the Software Processes Across NASA (SPAN) site of the Software Engineering Community in NEN. 

The list is informational only and does not represent an “approved tool list”, nor does it represent an endorsement of any particular tool.  The purpose is to provide examples of tools being used across the Agency and to help projects and centers decide what tools to consider.

6. Lessons Learned

6.1 NASA Lessons Learned

No Lessons Learned have currently been identified for this requirement.

6.2 Other Lessons Learned

No other Lessons Learned have currently been identified for this requirement.

7. Software Assurance

HR-33 - Inadvertent Operator Action
4.3.3 The space system shall be designed to tolerate inadvertent operator action (minimum of one inadvertent action), as verified by a human error analysis, without causing a catastrophic event.

7.1 Tasking for Software Assurance

  1. Confirm that a detailed software command error analysis is complete to identify potential operator errors that could lead to catastrophic events. This analysis should include various commanding scenarios and the likelihood of each command error occurring.
  2. Analyze that the software test plans and software test procedures cover the software requirements and provide adequate verification of hazard controls, specifically the off-nominal commanding scenarios to mitigate the impact of inadvertent operator actions. (See SWE-071 - Update Test Plans and Procedures tasks). Ensure that the project has developed and executed test cases to test the impact of inadvertent operator actions. This includes conducting tests to verify that the system can handle at least one inadvertent action without resulting in catastrophic consequences.
  3. Perform safety reviews on all software changes and software defects. This ensures that any modifications do not introduce new vulnerabilities or increase the risk of inadvertent actions leading to catastrophic events.
  4. Perform test witnessing for safety-critical software to ensure the impact of inadvertent operator actions is mitigated. (See SWE-066 - Perform Testing tasks)  
  5. Confirm that both formal and informal testing to uncover unrecorded software defects has been completed. This includes testing unexpected conditions, boundary conditions, hazardous conditions, and software/interface inputs.
  6. Confirm robust error handling and recovery mechanisms to address errors resulting from inadvertent operator actions. This includes ensuring adequate error handling and that the system can recover from errors without leading to catastrophic events.
  7. Confirm the use of automated tools for static analysis, dynamic analysis, and other verification and validation activities. This helps identify potential software defects that could result in catastrophic events due to inadvertent operator actions.
  8. Confirm that strict configuration management is maintained to ensure that the correct software versions and configurations are used. This reduces the risk of errors due to incorrect or inconsistent configurations. (See tasks in SWE-187 - Control of Software Items)
  9. Ensure that comprehensive training and documentation for operators is available to minimize the chances of inadvertent actions. This includes clear instructions, warnings, and recovery procedures.

7.2 Software Assurance Products

  1. 8.52 - Software Assurance Status Reports 
  2. 8.56 - Source Code Quality Analysis 
  3. 8.57 - Testing Analysis 
  4. 8.58 - Software Safety and Hazard Analysis 
  5. 8.59 - Audit Reports
  6. Test Witnessing Signatures (see SWE-066 - Perform Testing)


Objective Evidence

  1. Detailed human error analysis, including various scenarios and the likelihood of each error occurring.
  2. Verification test reports showing that the system can handle at least one inadvertent action without resulting in catastrophic consequences.
  3. Audit reports, specifically the Functional Configuration Audit (FCA) and Physical Configuration Audit (PCA).
  4. Completed safety reviews on all software changes and software defects.
  5. Results from the use of automated tools for static analysis, dynamic analysis, code coverage, cyclomatic complexity, and other verification and validation activities.
  6. SWE work product assessments for Software Test Plan, Software Test Procedures, Software Test Reports, and User Manuals. 
  7. SA assessment that source code meets the requirements of SWE-134 - Safety-Critical Software Design Requirements at inspections and reviews, including any risks and issues.

Objective evidence is an unbiased, documented fact showing that an activity was confirmed or performed by the software assurance/safety person(s). The evidence for confirmation of the activity can take any number of different forms, depending on the activity in the task. Examples are:

  • Observations, findings, issues, and risks found by the SA/safety person; these may be expressed in an audit or checklist record, email, memo, or entry into a tracking system (e.g., Risk Log).
  • Meeting minutes with attendance lists or SA meeting notes or assessments of the activities and recorded in the project repository.
  • Status report, email or memo containing statements that confirmation has been performed with date (a checklist of confirmations could be used to record when each confirmation has been done!).
  • Signatures on SA reviewed or witnessed products or activities, or
  • Status report, email or memo containing a short summary of information gained by performing the activity. Some examples of using a “short summary” as objective evidence of a confirmation are:
    • To confirm that: “IV&V Program Execution exists”, the summary might be: IV&V Plan is in draft state. It is expected to be complete by (some date).
    • To confirm that: “Traceability between software requirements and hazards with SW contributions exists”, the summary might be x% of the hazards with software contributions are traced to the requirements.
  • The specific products listed in the Introduction of 8.16 are also objective evidence, as are the examples listed above.


7.3 Metrics

For the requirement that the space system shall be designed to tolerate inadvertent operator action (minimum of one inadvertent action), as verified by a human error analysis, without causing a catastrophic event, the following considerations and software assurance metrics are necessary:

  1. Human Error Analysis:
    • Conduct or confirm a comprehensive software error analysis to identify potential operator errors and their impact on the software. This involves evaluating command interface design, command error handling, operator training, and operational procedures to ensure that inadvertent actions are anticipated and appropriately mitigated.
  2. Verification and Validation Metrics:
    • Test Coverage: Ensure comprehensive test coverage for scenarios involving inadvertent operator actions, including normal operations, failure modes, and recovery procedures.
    • Defect Density: Track the number of defects identified during testing per thousand lines of code to ensure software reliability and robustness.
    • Requirements Traceability: Ensure each requirement, including those for tolerating inadvertent operator actions, is traced to its implementation and corresponding test cases to maintain comprehensive coverage and validation.
  3. Safety Metrics:
    • Hazard Analysis: Identify and evaluate potential hazards related to inadvertent operator actions, ensuring adequate mitigation strategies are in place.
    • Safety-critical Requirements Compliance: Verify that all safety-critical requirements related to tolerating inadvertent operator actions are met and adequately tested to prevent failures during mission-critical operations.
  4. Quality Metrics:
    • Code Quality: Use metrics such as cyclomatic complexity and static analysis results to ensure the code is maintainable and less prone to errors. Specifically, ensure that safety-critical software components have a cyclomatic complexity value of 15 or lower, or provide a technically acceptable rationale if this value is exceeded.
    • Code Churn: Measure changes in the codebase to monitor stability and identify areas of frequent modification that may need more rigorous testing.
  5. Performance Metrics:
    • Response Time: Measure the time taken for the system to detect and respond to inadvertent operator actions to ensure timely and accurate execution of mitigation procedures.
    • System Uptime: Ensure the system is available and operational when needed, especially during critical mission phases, to support tolerance of inadvertent operator actions.
  6. Configuration Management Metrics:
    • Version Control: Ensure proper version control for all software components involved in tolerating inadvertent operator actions to track changes and maintain consistency.
    • Change Requests: Monitor the number of change requests and their impact on the system's reliability and safety.
  7. Training Metrics:
    • Personnel Training Completion: Ensure that all personnel involved in the development, testing, and operation of the system have completed the necessary training to handle inadvertent operator actions and single system failures.
  8. Independent Verification and Validation (IV&V) Metrics:
    • IV&V Analysis Results: Assure that the capability to tolerate inadvertent operator actions has been independently verified and validated to meet safety and mission requirements.
    • IV&V Participation: Involve the IV&V provider in reviews, inspections, and technical interchange meetings to provide real-time feedback and ensure thorough assessment.
    • IV&V Management and Technical Measurements: Track and evaluate the performance and results of IV&V activities to ensure continuous improvement and risk management.
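The requirements-traceability metric above can be sketched as a simple check over a mapping of requirement IDs to verifying test cases; the IDs below are made up for illustration.

```python
# Hypothetical trace data: requirement ID -> test cases that verify it.
trace = {
    "HR-33.1": ["TC-101", "TC-102"],
    "HR-33.2": [],            # no verifying test yet -- should be flagged
    "HR-33.3": ["TC-205"],
}

# Requirements with no verifying test case.
untraced = sorted(req for req, tests in trace.items() if not tests)

# Percentage of requirements traced to at least one test case.
trace_pct = round(100.0 * (len(trace) - len(untraced)) / len(trace), 1)
```

The same counting pattern applies to the hazard-to-test traceability percentages in the example metrics below.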

Examples of potential SA metrics are:

  • # of potential hazards that could lead to catastrophic events 
  • # of Non-Conformances identified during each testing phase (Open, Closed, Severity) 
  • Code coverage data: % of code that has been executed during testing 
  • Static Analysis metrics: 
  • # of total errors and warnings identified by the Static Analysis tool 
  • # of errors and warnings evaluated vs. # of total errors and warnings identified by the Static Analysis tool 
  • # of static code errors and warnings identified as “positives” vs. # of total errors and warnings identified by the tool 
  • Total # of static code analysis "positives" vs.  # of "positives" resolved. Trend over time. 
  • # of static code errors and warnings resolved by Severity vs. # of static code errors and warnings identified by Severity by the tool 
  • % of traceability completed for all hazards to software requirements and test procedures 
  • # of hazards with completed test procedures/cases vs. total # of hazards over time 
  • # of Non-Conformances identified while confirming hazard controls are verified through test plans/procedures/cases 
  • # of Hazards containing software that has been tested vs. total # of Hazards containing software 
  • # of safety-related Non-Conformances 
  • # of Safety Critical tests executed vs. # of Safety Critical tests witnessed by SA 
  • Software code/test coverage percentages for all identified safety-critical components (e.g., # of paths tested vs. total # of possible paths)  
  • # of safety-critical requirement verifications vs. total # of safety-critical requirement verifications completed 
  • Test coverage data for all identified safety-critical software components 

These metrics ensure that the software supporting tolerance of inadvertent operator actions is reliable, safe, and meets the specified requirements. For detailed guidance, refer to the Software Assurance and Software Safety Standard (NASA-STD-8739.8) and the NASA Procedural Requirements (NPR 7150.2), which provide a comprehensive framework.

See also Topic 8.18 - SA Suggested Metrics

7.4 Guidance

To ensure that the space system can tolerate inadvertent operator actions without causing catastrophic events, the following software assurance and software safety tasks should be implemented:

  1. Human Error Analysis: Confirm that a detailed software error analysis is complete to identify potential operator errors that could lead to catastrophic events. This analysis should include the various scenarios and the likelihood of each error occurring, along with ensuring:
    1. Comprehensive test coverage for scenarios involving inadvertent operator actions, including normal operations, failure modes, and recovery procedures. 
    2. Tests for unexpected conditions, boundary conditions, and software/interface inputs will be performed including robust error handling and recovery mechanisms to address errors resulting from inadvertent operator actions. 
    3. The project has developed and plans to execute simulations that model and test the impact of inadvertent operator actions. This includes conducting tests to verify that the software system can handle inadvertent actions without resulting in catastrophic consequences. 
    4. Each requirement, including those for tolerating inadvertent operator actions, is traced to its implementation and corresponding test cases to maintain comprehensive coverage and validation. 
  2. Simulations and Testing: Ensure that the project has developed and executed simulations to model and test the impact of inadvertent operator actions. This includes conducting tests to verify that the system can handle at least one inadvertent action without resulting in catastrophic consequences.
  3. Test Witnessing: Perform test witnessing for safety-critical software to ensure the impact of inadvertent operator actions is mitigated. (See SWE-066 - Perform Testing) This includes witnessing tests to:  
    1. Confirm that the system can handle inadvertent actions without resulting in catastrophic consequences. This could include: 
      1. Measuring the time taken for the system to detect and respond to inadvertent operator actions to ensure timely and accurate execution of mitigation procedures. A prolonged period could cause catastrophic consequences. 
      2. Ensuring the system is available and operational when needed, especially during critical mission phases, to support tolerance of inadvertent operator actions. 
    2. Uncover unrecorded software defects and confirm they get documented and recorded. 
    3. Confirm there is robust error handling and recovery mechanisms to address errors resulting from inadvertent operator actions. This includes ensuring adequate error handling and that the system can recover from errors without leading to catastrophic events 
  4. Software Safety and Hazard Analysis: Develop and maintain a Software Safety Analysis throughout the software development life cycle. Assess that the Hazard Analyses (including hazard reports) identify the software components associated with the system hazards per the criteria defined in NASA-STD- 8739.8, Appendix A. (See SWE-205 - Determination of Safety-Critical Software) Perform these on all new requirements, requirement changes, and software defects to determine their impact on the software system's reliability and safety. Confirm that all safety-critical requirements related to tolerating inadvertent operator actions have been implemented and adequately tested to prevent catastrophic events during mission-critical operations. It may be necessary to discuss these findings during the Safety Review so the reviewers can weigh the impact of implementing the changes. (See Topic 8.58 – Software Safety and Hazard Analysis.)
    1. Hazard Analysis/Hazard Reports: Confirm that a comprehensive hazard analysis was conducted to identify potential hazards related to inadvertent operator actions that could result from critical software behavior. This analysis should include evaluating existing and potential hazards and recommending mitigation strategies for identified hazards. The Hazard Reports should contain the results of the analyses and proposed mitigations (See Topic 5.24 - Hazard Report Minimum Content)
    2. Software Safety Analysis: To develop this analysis, utilize safety analysis techniques such as 8.07 - Software Fault Tree Analysis and 8.05 - SW Failure Modes and Effects Analysis to identify potential hazards and failure modes. This helps in designing controls and mitigations for the operation of critical functions. When generating this SA product, see Topic 8.09 - Software Safety Analysis for additional guidance.
  5. Safety Reviews: Perform safety reviews on all software changes and software defects. This ensures that any modifications do not introduce new vulnerabilities or increase the risk of inadvertent actions leading to catastrophic events.
  6. Peer Reviews: Participate in peer reviews on all software changes and software defects affecting safety-critical software and hazardous functionality. (See SWE-134 - Safety-Critical Software Design Requirements tasks.) This ensures that any modifications control the input of bad data and do not introduce new vulnerabilities or increase the risk of inadvertent actions leading to catastrophic events. 
    1. Change Requests: Monitor the number of software change requests and software defects and their impact on the system's reliability and safety. Increases in the number of changes may be indicative of requirements issues or code quality issues resulting in potential schedule slips. (See SWE-053 - Manage Requirements Changes, SWE-080 - Track and Evaluate Changes)
  7. Test Results Assessment: Confirm that test results are assessed and recorded and that the test results are sufficient verification artifacts for the hazard reports. (See SWE-068 - Evaluate Test Results tasks.)
  8. Formal and Informal Testing: Ensure that both formal and informal testing to uncover unrecorded software defects has been completed. This includes testing unexpected conditions, boundary conditions, and software/interface inputs.
  9. Automated Verification and Validation: Confirm the use of automated tools for static analysis, dynamic analysis, code coverage, cyclomatic complexity, and other verification and validation activities. This helps identify potential software defects that could result in catastrophic events due to inadvertent operator actions. (See SWE-135 - Static Analysis ) 
    1. Code Quality: Use metrics such as cyclomatic complexity and static analysis results to ensure the code is maintainable and less prone to errors. Specifically, confirm that safety-critical software components have a cyclomatic complexity value of 15 or lower, or software developers must provide a technically acceptable rationale if this value is exceeded. (See SWE-220 - Cyclomatic Complexity for Safety-Critical Software, SWE-135 - Static Analysis) 
    2. Code Coverage: Confirm that 100% code test coverage is addressed for all identified safety-critical software components, or ensure that software developers provide a risk assessment explaining why full test coverage is not possible for the safety-critical code component. (See SWE-189 - Code Coverage Measurements, SWE-219 - Code Coverage for Safety Critical Software)
    3. Software Volatility: Measure changes in the codebase to monitor stability and identify areas of frequent modification that may need more rigorous testing. (See SWE-200 - Software Requirements Volatility Metrics) 
    4. Verification Testing: The verification analysis activity ensures that the safety requirements for the software were properly flowed down from the system safety requirements, traced to tests/test procedures, and adequately tested. (See SWE-066 - Perform Testing, SWE-071 - Update Test Plans and Procedures, SWE-192 - Software Hazardous Requirements, SWE-194 - Delivery Requirements Verification, and Topic 8.57 - Testing Analysis)
    5. Validation Testing: Software validation is a software engineering activity that confirms that the software product, as provided (or as it will be provided), fulfills its intended use in its intended environment. In other words, validation testing ensures that “you built the right thing.” (See SWE-055 - Requirements Validation, SWE-070 - Models, Simulations, Tools, SWE-073 - Platform or Hi-Fidelity Simulations, and Topic 8.57 - Testing Analysis)
  10. Configuration Management: Ensure that strict configuration management is maintained so that the correct software versions and configurations are used. This reduces the risk of errors due to incorrect or inconsistent configurations, tracks changes, and maintains consistency. (See the SWE-187 - Control of Software Items tasking.)
  11. Training and Documentation: Ensure that comprehensive training and documentation for operators is available to minimize the chances of inadvertent actions. This includes clear instructions, warnings, and recovery procedures.
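As one concrete way to confirm the cyclomatic-complexity threshold cited in item 9 above, the sketch below approximates McCabe complexity by counting decision points in Python source with the standard `ast` module. This simplified count is illustrative only; a project would normally rely on a qualified static-analysis tool.

```python
import ast

THRESHOLD = 15  # the SWE-220 value cited above


def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.For, ast.While,
                             ast.ExceptHandler, ast.IfExp)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of and/or adds a decision point.
            decisions += len(node.values) - 1
    return decisions + 1


def exceeds_threshold(source):
    """True when the source needs a documented rationale per SWE-220."""
    return cyclomatic_complexity(source) > THRESHOLD
```

A gate like `exceeds_threshold` could run in continuous integration so that any safety-critical module exceeding the threshold is flagged for a documented rationale.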

By performing these tasks, the space system can be designed to tolerate inadvertent operator actions and ensure safety and reliability.

7.5 Additional Guidance

Additional guidance related to this requirement may be found in the following materials in this Handbook:
