- 1. The Requirement
- 2. Rationale
- 3. Guidance
- 4. Small Projects
- 5. Resources
- 6. Lessons Learned
- 7. Software Assurance
5.5.4 The project manager shall implement process assessments for all high-severity software non-conformances (closed loop process).
NPR 7150.2, NASA Software Engineering Requirements, does not include any notes for this requirement.
History of this requirement: SWE-204 History
1.3 Applicability Across Classes
A & B = Always Safety Critical; C & D = Sometimes Safety Critical; E & F = Never Safety Critical.
The purpose of this requirement is to understand why a high severity software non-conformance or defect occurred and to make process changes that prevent additional high severity software non-conformances or defects, thereby reducing software defects.
To keep defects from occurring, we have to understand why a defect or software non-conformance occurred. Root Cause Analysis is one technique that will help you address the requirement. It is a structured evaluation method, used in science and engineering problem-solving, that identifies the root causes of an undesired outcome and the actions adequate to prevent a recurrence. Root cause analysis can be decomposed into four steps:
- Clearly identify and describe the problem.
- Establish a timeline from the normal situation up to the time the problem occurred.
- Distinguish between the root cause and other causal factors (e.g., using event correlation).
- Establish a causal graph between the root cause and the problem.
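The last step above, establishing a causal graph between the root cause and the problem, can be illustrated with a small sketch. This is only one possible representation, not a method prescribed by the Handbook: the graph contents, the node names, and the function `causal_chain` are hypothetical and chosen purely for illustration.

```python
from collections import deque

# Hypothetical causal graph: each key is a cause, each value lists its direct effects.
CAUSAL_GRAPH = {
    "schedule pressure": ["no review of configuration changes"],
    "no review of configuration changes": ["wrong timeout value deployed"],
    "wrong timeout value deployed": ["watchdog reset during downlink"],
    "watchdog reset during downlink": ["loss of science data"],
}


def causal_chain(graph, root_cause, problem):
    """Return the shortest cause-to-effect path from root_cause to problem, or None."""
    queue = deque([[root_cause]])
    visited = {root_cause}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == problem:
            return path
        # Follow every direct effect of the current node that has not been seen yet.
        for effect in graph.get(node, []):
            if effect not in visited:
                visited.add(effect)
                queue.append(path + [effect])
    return None


if __name__ == "__main__":
    chain = causal_chain(
        CAUSAL_GRAPH, "no review of configuration changes", "loss of science data"
    )
    print(" -> ".join(chain))
```

If no path exists between a candidate cause and the problem, the function returns `None`, which suggests that the candidate is not linked to the problem in the graph as currently drawn.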
It is up to the project, engineering, and assurance to decide on the definition of high severity for their project. The intent of this requirement is to assess any critical or high severity software defect or non-conformance to find out why the defect happened and what could be done to avoid generating these types of defects in the future on the project.
Using the project definition of high severity defects, the project performs a root cause analysis to determine why the defect occurred and what could have been done to prevent it. This information generally serves as input to a remediation process whereby corrective actions are determined and taken to prevent the problem from reoccurring in the future.
Proactive management consists of preventing problems from occurring. Many techniques can be used for this purpose, ranging from good design practices to analyzing in detail problems that have already occurred and taking actions to make sure they never recur. Speed is not as important here as the accuracy and precision of the diagnosis. The focus is on addressing the real cause of the problem rather than its effects.
A factor is considered the root cause of a problem if removing it prevents the problem from recurring. A causal factor, conversely, is one that affects an event's outcome but is not the root cause. Although removing a causal factor can benefit an outcome, it does not prevent its recurrence with certainty.
The goal of the requirement is to identify the root cause of the software problem, defect, or non-conformance. The next step is to trigger long-term corrective actions to address the root cause identified during root cause analysis and make sure that the problem does not resurface.
Definitions Pertaining to Root Cause Analysis
Cause (Causal Factor)
An event or condition that results in an effect. Anything that shapes or influences the outcome.
Proximate Cause(s)
The event(s) that occurred, including any condition(s) that existed immediately before the undesired outcome, directly resulted in its occurrence and, if eliminated or modified, would have prevented the undesired outcome. Also known as the direct cause(s).
Root Cause(s)
One of the multiple factors (events, conditions, or organizational factors) that contributed to or created the proximate cause and subsequent undesired outcome and, if eliminated or modified, would have prevented the undesired outcome. Typically multiple root causes contribute to an undesired outcome.
Root Cause Analysis (RCA)
A structured evaluation method that identifies the root causes of an undesired outcome and the actions adequate to prevent a recurrence. Root cause analysis should continue until organizational factors have been identified, or until data are exhausted.
Event
A real-time occurrence describing one discrete action, typically an error, failure, or malfunction. Examples: pipe broke, power lost, lightning struck, the person opened a valve, etc.
Condition
Any as-found state, whether or not resulting from an event, that may have safety, health, quality, security, operational, or environmental implications.
Organizational Factors
Any operational or management structural entity that exerts control over the system at any stage in its life cycle, including but not limited to the system’s concept development, design, fabrication, test, maintenance, operation, and disposal.
Examples: resource management (budget, staff, training); policy (content, implementation, verification); and management decisions.
Contributing Factor
An event or condition that may have contributed to the occurrence of an undesired outcome but, if eliminated or modified, would not by itself have prevented the occurrence.
Barrier
A physical device or an administrative control used to reduce the risk of the undesired outcome to an acceptable level. Barriers can provide physical intervention (e.g., a guardrail) or procedural separation in time and space (e.g., lock-out-tag-out procedure).
Severity is defined as the degree of impact a defect has on the development or operation of a component application being tested.
The greater the effect on system functionality, the higher the severity assigned to the defect. A Software Assurance engineer usually works with the engineering group to determine the severity level of a defect.
The higher the priority, the sooner the defect should be resolved.
Defects that leave the software system unusable should be given a higher priority than defects that cause a minor function of the software to fail.
A business goal of all NASA software development organizations is to reduce software defects and non-conformances. The only way to succeed is to look at why the software defects and non-conformances occurred and fix the process or step that caused the software defects and non-conformances.
Additional guidance related to process assessments may be found in the following related requirements in this Handbook:
4. Small Projects
No additional guidance is available for small projects.
5.3 Training resources for NASA
6. Lessons Learned
6.1 NASA Lessons Learned
No Lessons Learned have currently been identified for this requirement.
6.2 Other Lessons Learned
No other Lessons Learned have currently been identified for this requirement.
7. Software Assurance
7.1 Tasking for Software Assurance
- Perform or confirm that a root cause analysis has been completed on all identified high severity software nonconformances, the results are recorded, and that the results have been assessed for adequacy.
- Confirm that the project analyzed the processes identified in the root cause analysis associated with the high severity software nonconformances.
- Assess opportunities for process improvement on the processes identified in the root cause analysis associated with the high severity software nonconformances.
- Perform or confirm tracking of corrective actions to closure on high severity software non-conformances.
7.2 Software Assurance Products
- Root Cause Analysis (Includes results, and any problem reports or findings recorded from Analysis)
- Record of Corrective Action Closures (Confirmation or trending showing closure status of findings/problem reports.)
- SA assessment of process improvement opportunities.
Definition of objective evidence
- Evidence that confirmation of Task 2 has occurred.
Objective evidence is an unbiased, documented fact showing that an activity was confirmed or performed by the software assurance/safety person(s). The evidence for confirmation of the activity can take any number of different forms, depending on the activity in the task. Examples are:
- Observations, findings, issues, or risks found by the SA/safety person, which may be expressed in an audit or checklist record, email, memo, or entry into a tracking system (e.g., Risk Log).
- Meeting minutes with attendance lists, or SA meeting notes or assessments of the activities, recorded in the project repository.
- Status report, email, or memo containing statements that confirmation has been performed, with date (a checklist of confirmations could be used to record when each confirmation has been done).
- Signatures on SA reviewed or witnessed products or activities, or
- Status report, email, or memo containing a short summary of information gained by performing the activity. Some examples of using a “short summary” as objective evidence of a confirmation are:
- To confirm that: “IV&V Program Execution exists”, the summary might be: IV&V Plan is in draft state. It is expected to be complete by (some date).
- To confirm that: “Traceability between software requirements and hazards with SW contributions exists”, the summary might be x% of the hazards with software contributions are traced to the requirements.
- # of Root Cause Analyses performed
- # of Non-Conformances identified by each root cause analysis
- # of Corrective Actions (CAs) raised by SA vs. total #
- Attributes (Type, Severity, # of days Open, Life-cycle Phase Found)
- State (Open, In work, Closed)
- Trends of CA closures over time
- Trend the # of inconsistencies or corrective actions identified, and # closed
- Total # of Non-Conformances over time (Open, Closed, # of days Open, and Severity of Open)
- # of Non-Conformances in current reporting period (Open, Closed, Severity)
- # of software process Non-Conformances by life-cycle phase over time
- # of software work product Non-Conformances identified by life-cycle phase over time
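Several of the counts listed above (open versus closed non-conformances, severity of open items, days open, and counts by life-cycle phase) can be derived from a simple record per non-conformance. The following is a minimal sketch under stated assumptions: the `NonConformance` fields, the day-number convention, the severity labels, and the `summarize` function are all hypothetical and do not reflect any particular NASA tracking tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class NonConformance:
    ident: str
    severity: str                     # project-defined label, e.g. "high" or "low"
    phase_found: str                  # life-cycle phase in which it was found
    opened_day: int                   # day number on which it was opened
    closed_day: Optional[int] = None  # None while the item is still open

    def days_open(self, today: int) -> int:
        end = self.closed_day if self.closed_day is not None else today
        return end - self.opened_day


def summarize(records, today):
    """Compute a few of the listed metrics for one reporting date."""
    open_records = [r for r in records if r.closed_day is None]
    return {
        "total": len(records),
        "open": len(open_records),
        "closed": len(records) - len(open_records),
        "open_by_severity": dict(Counter(r.severity for r in open_records)),
        "by_phase": dict(Counter(r.phase_found for r in records)),
        "max_days_open": max((r.days_open(today) for r in open_records), default=0),
    }


# Illustrative data only.
SAMPLE = [
    NonConformance("NC-1", "high", "test", opened_day=10, closed_day=25),
    NonConformance("NC-2", "high", "integration", opened_day=20),
    NonConformance("NC-3", "low", "test", opened_day=30),
]

if __name__ == "__main__":
    print(summarize(SAMPLE, today=40))
```

Calling `summarize` once per reporting period and storing the results gives the over-time trends the list refers to.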
Task 1: Perform a root cause analysis on all identified high severity software non-conformances and record the results.
Software assurance personnel review the list of non-conformances and choose all those that are marked as “high priority.” Typically, those will be the non-conformances that cause a complete crash of the software or a major problem, such as preventing a primary software function from performing, allowing a hazard to occur, or producing erroneous critical results. The high priority non-conformances all fall into a category where the software could not be released for use until they are fixed or a work-around is identified.
Examine each of these non-conformances to determine the underlying cause of the failure. If the non-conformances seem to be related, it might be possible to do the root cause analysis as a group rather than individually, but it is important to make sure that the analysis delves into the problem deeply enough to either confirm that the root cause is the same issue or identify the individual root causes of each non-conformance.
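As an illustration of the selection and grouping just described, the sketch below filters a list of non-conformances down to those marked high priority and groups them by affected component as candidates for a joint analysis. The record layout, the field names `priority` and `component`, and the choice of component as the grouping key are assumptions made for this example. A shared component only suggests that items may be related; the analysis itself still has to confirm or refute a common root cause.

```python
from collections import defaultdict


def select_high_priority(non_conformances):
    """Keep only the non-conformances marked as high priority."""
    return [nc for nc in non_conformances if nc["priority"] == "high"]


def group_by_component(non_conformances):
    """Group non-conformance ids by affected component (a hypothetical relatedness hint)."""
    groups = defaultdict(list)
    for nc in non_conformances:
        groups[nc["component"]].append(nc["id"])
    return dict(groups)


# Illustrative data only.
SAMPLE = [
    {"id": "NC-101", "priority": "high", "component": "telemetry"},
    {"id": "NC-102", "priority": "low", "component": "telemetry"},
    {"id": "NC-103", "priority": "high", "component": "telemetry"},
    {"id": "NC-104", "priority": "high", "component": "guidance"},
]

if __name__ == "__main__":
    candidates = select_high_priority(SAMPLE)
    print(group_by_component(candidates))
```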
Follow a typical root cause analysis process, which usually consists of these basic steps:
- Define the problem
- Collect the data relating to the problem
- Identify what is causing the problem
- Prioritize the causes
- Identify solutions to the underlying problem
- Implement the change
- Monitor and sustain
To “define the problem,” identify the high priority non-conformances that need a root cause analysis and determine whether some can be analyzed as a group or whether an individual analysis should be done for each.
For “Collect the data relating to the problem,” get a good understanding of what actually happened when the non-conformance occurred. Think about: When does the problem occur? What is the software doing when the problem occurs? Is there a particular activity that seems to cause the problem to occur? How was the problem discovered? Collect any available details about the failure.
For “Identify what is causing the problem,” think about all the possible causes of the problem. There are several techniques that can be used to help with this process. One of the most popular is the Fishbone process where many possible causes are identified and then sorted into useful categories. The “bones” of the fish are each labeled with a broad category of possible causes and then populated using brainstorming to identify potential causes for this case.
“Prioritize the causes” – From a list of the potential causes, or from something like a fishbone chart, the most likely causes are selected.
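The fishbone sorting and the prioritization step described above can be sketched as a mapping from category to candidate causes, followed by a ranking. This is a simplified illustration, not a prescribed technique: the category names, the example causes, the team-assigned likelihood scores, and the functions `add_cause` and `prioritize` are all hypothetical.

```python
def add_cause(fishbone, category, cause):
    """Record a brainstormed cause under one 'bone' (category) of the diagram."""
    fishbone.setdefault(category, []).append(cause)


def prioritize(fishbone, scores, top_n=3):
    """Return the top_n (category, cause) pairs by score; unscored causes count as 0."""
    candidates = [
        (scores.get(cause, 0), category, cause)
        for category, causes in fishbone.items()
        for cause in causes
    ]
    # Highest score first; Python's sort is stable, so ties keep brainstorming order.
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [(category, cause) for _, category, cause in candidates[:top_n]]


if __name__ == "__main__":
    fishbone = {}
    add_cause(fishbone, "Process", "code review skipped")
    add_cause(fishbone, "Tools", "static analyzer misconfigured")
    add_cause(fishbone, "Requirements", "ambiguous timing requirement")
    # Hypothetical likelihood scores agreed by the analysis team (higher = more likely).
    scores = {
        "code review skipped": 5,
        "ambiguous timing requirement": 3,
        "static analyzer misconfigured": 1,
    }
    for category, cause in prioritize(fishbone, scores, top_n=2):
        print(f"{category}: {cause}")
```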
“Identify solutions to the underlying problem” – The most likely causes are further explored (or tested) until they are narrowed down to the actual cause or causes. Then a solution can be found to correct the problem (non-conformance).
“Implement the change” – This step is done by the developers, but software assurance assures that the change is correctly implemented and actually fixes the non-conformance. It is also important to review other areas in the code where a similar non-conformance might exist and assure that those areas are also fixed.
“Monitor and sustain” – In addition to assuring that similar non-conformances in other areas of the code are also fixed, software assurance should check for other problems with similar causes when reviewing other parts of the system or when checking on changes to the system.
Task 2: Confirm that the project analyzed the processes identified in the root cause analysis associated with the high severity software non-conformances. If any of the processes used were determined to be root causes of the non-conformance, consider what changes and improvements to the process could prevent similar non-conformances in the future.
Task 3: Assess opportunities for process improvement on the processes identified as causal factors in the root cause analysis associated with the high severity software non-conformances. Decide whether any changes should be made to the processes in use, what to change, and how to determine whether the change was effective in preventing a similar high severity non-conformance.
Task 4: Perform tracking of corrective actions to closure on high severity software non-conformances.
As part of completing the root cause analysis, software assurance will track all the corrective actions identified to closure.
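Tracking corrective actions to closure can be pictured as a small state machine over the Open, In work, and Closed states. The sketch below is an assumption-laden illustration: the allowed transitions, the `CorrectiveAction` class, and the helper functions are hypothetical and would normally be provided by the project's problem-reporting or tracking system.

```python
from dataclasses import dataclass, field

# Assumed transition rules; a real project defines its own workflow.
VALID_TRANSITIONS = {
    "Open": {"In work"},
    "In work": {"Closed", "Open"},
    "Closed": set(),
}


@dataclass
class CorrectiveAction:
    ident: str
    description: str
    state: str = "Open"
    history: list = field(default_factory=list)

    def move_to(self, new_state):
        """Change state, rejecting transitions the assumed workflow does not allow."""
        if new_state not in VALID_TRANSITIONS[self.state]:
            raise ValueError(f"{self.ident}: cannot go from {self.state} to {new_state}")
        self.history.append((self.state, new_state))
        self.state = new_state


def open_actions(actions):
    """Return the ids of corrective actions that are not yet closed."""
    return [a.ident for a in actions if a.state != "Closed"]


def all_closed(actions):
    """True only when every corrective action has reached the Closed state."""
    return all(a.state == "Closed" for a in actions)


if __name__ == "__main__":
    ca1 = CorrectiveAction("CA-1", "Add peer review step for configuration changes")
    ca2 = CorrectiveAction("CA-2", "Add regression test for timeout handling")
    ca1.move_to("In work")
    ca1.move_to("Closed")
    print(open_actions([ca1, ca2]), all_closed([ca1, ca2]))
```

Keeping the transition history with each action also provides a simple record that can support the objective evidence for this task.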
Record the number of root cause analyses that software assurance has done on the project and make a list of the root causes found in each analysis. This list of root causes can be used in future reviews of the system to improve the quality of the software by paying close attention to these root causes and preventing similar issues.
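One lightweight way to keep the record described above is to store the root causes found by each analysis and tally how often each one recurs. The sketch below is only illustrative: the analysis identifiers, the root cause wording, and the function `tally_root_causes` are hypothetical.

```python
from collections import Counter


def tally_root_causes(analyses):
    """Return (number of analyses, root causes ranked by how often they were found).

    analyses maps an analysis id to the list of root causes that analysis identified.
    """
    counts = Counter(cause for causes in analyses.values() for cause in causes)
    return len(analyses), counts.most_common()


if __name__ == "__main__":
    # Illustrative data only.
    analyses = {
        "RCA-1": ["missing peer review", "ambiguous requirement"],
        "RCA-2": ["missing peer review"],
        "RCA-3": ["untested configuration"],
    }
    total, ranked = tally_root_causes(analyses)
    print(f"{total} analyses performed")
    for cause, count in ranked:
        print(f"{count} x {cause}")
```

Root causes that appear in several analyses are natural candidates for extra attention in later reviews.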