
NASA-STD-8739.8B, Software Assurance and Software Safety Standard



1. SCOPE

1.1 Document Purpose

1.1.1 The purpose of the Software Assurance and Software Safety Standard is to define the requirements to implement a systematic approach to software assurance, software safety, and Independent Verification and Validation (IV&V) for software created, acquired, provided, used, or maintained by or for NASA. Various personnel in the program, project, engineering, facility, or Safety and Mission Assurance (SMA) organizations can perform the activities required to satisfy these requirements. The Software Assurance and Software Safety Standard provides a basis for personnel to perform software assurance, software safety, and IV&V activities consistently throughout the life of the software.

1.1.2 The Software Assurance and Software Safety Standard, in accordance with NPR 7150.2, NASA Software Engineering Requirements, supports the implementation of the software assurance, software safety, and IV&V sub-disciplines. The application and approach to meeting the Software Assurance and Software Safety Standard vary based on the system and software products and processes to which they are applied. The Software Assurance and Software Safety Standard stresses coordination between the software assurance sub-disciplines and system safety, system reliability, hardware quality, system security, and software engineering to maintain the system perspective and minimize duplication of effort.

1.1.3 The objectives of the Software Assurance and Software Safety Standard include the following:

a. Ensuring that the processes, procedures, and products used to produce and sustain the software conform to all specified requirements and standards that govern those processes, procedures, and products.

(1) A set of activities that assess adherence to, and the adequacy of the software processes used to develop and modify software products.
(2) A set of activities that define and assess the adequacy of software processes to provide evidence that establishes confidence that the software processes are appropriate for and produce software products of suitable quality for their intended purposes.

b. Determining the degree of software quality obtained by the software products.
c. Ensuring that the software systems are safe and that the software safety-critical requirements are followed.
d. Ensuring that the software systems are secure.
e. Employing rigorous analysis and testing methodologies to identify objective evidence and conclusions to provide an independent assessment of critical products and processes throughout the life cycle.

1.1.4 The Software Assurance and Software Safety Standard is compatible with all software life cycle models. The Software Assurance and Software Safety Standard does not impose a particular life cycle model on a software project.

1.1.5 In this standard, all mandatory actions (i.e., requirements) are denoted by statements containing the term “shall.” The term “may” denotes a discretionary privilege or permission; “can” denotes a statement of possibility or capability; “should” denotes a good practice that is recommended but not required; “will” denotes an expected outcome; and “are/is” denotes descriptive material.

1.2 Applicability

1.2.1 This standard is approved for use by NASA Headquarters and NASA Centers, including Component Facilities and Technical and Service Support Centers. This NASA Technical Standard applies to the assurance of software created by or for NASA projects, programs, facilities, and activities and defines the requirements for those activities. This directive is applicable to the Jet Propulsion Laboratory, a Federally Funded Research and Development Center, only to the extent specified in the NASA/Caltech Prime Contract. This standard may also apply to other contractors, grant recipients, or parties to agreements to the extent specified or referenced in their contracts, grants, or agreements.

1.3 Documentation and Deliverables

1.3.1 The Software Assurance and Software Safety Standard is not intended to designate the format of program/project/facility documentation and deliverables. The software assurance and software safety data, information, and plans may be considered to be quality records with a retention period as specified in NRRS 1441.1. The format of the documentation is a program/project/facility decision. The software assurance and software safety organizations should keep records, reports, metrics, analyses, and trending results and should keep copies of their project plans for future reference and improvements. The software assurance and software safety plans (e.g., the Software Assurance Plan) can be standalone documents or incorporated within other documents (e.g., part of a Software Management Plan, a Software Development Plan or part of a Program or Project Safety and Mission Assurance (SMA) plan).

1.4 Request for Relief

1.4.1 Tailoring of this standard for application to a specific program or project is documented as part of program or project requirements and approved by the responsible Center Technical Authority (TA) in accordance with NPR 8715.3, NASA General Safety Program Requirements. Section 4.5 of this standard contains the principles related to tailoring this standard’s requirements.

2. APPLICABLE AND REFERENCE DOCUMENTS

2.1 Applicable Documents

The applicable documents are accessible via the NASA Technical Standards System at https://standards.nasa.gov, or the NASA Online Directives Information System at https://nodis3.gsfc.nasa.gov/main_lib.cfm, or may be obtained directly from the Standards Developing Organizations.

  • NPR 1400.1 NASA Directives and Charters Procedural Requirements
  • NPR 7120.5 NASA Space Flight Program and Project Management Requirements
  • NPR 7120.10 Technical Standards for NASA Programs and Projects
  • NPR 7150.2 NASA Software Engineering Requirements
  • NPR 8000.4 Agency Risk Management Procedural Requirements
  • NPR 8715.3 NASA General Safety Program Requirements
  • NASA-HDBK-2203 NASA Software Engineering Handbook
  • NASA-HDBK-4008 Programmable Logic Devices Handbook
  • NRRS 1441.1 NASA Records Retention Schedules

2.2 Reference Documents

The reference documents listed in this section are not incorporated by reference within this standard but may provide further clarification and guidance.

2.2.1 Government Documents

  • NPD 2810.1 NASA Information Security Policy
  • NPD 8720.1 NASA Reliability and Maintainability Program Policy
  • NPR 1441.1 NASA Records Management Program Requirements
  • NPR 2210.1 Release of NASA Software
  • NPR 2810.1 Security of Information and Information Systems
  • NPR 2830.1 NASA Enterprise Architecture Procedures
  • NPR 2841.1 Identity, Credential, and Access Management
  • NPR 7120.7 NASA Information Technology Program and Project Management Requirements
  • NPR 7120.8 NASA Research and Technology Program and Project Management Requirements
  • NPR 7120.10 Technical Standards for NASA Programs and Projects
  • NPR 7120.11 Health and Medical Technical Authority Implementation
  • NPR 7123.1 NASA Systems Engineering Processes and Requirements
  • NPR 8000.4 Agency Risk Management Procedural Requirements
  • NASA-STD-1006 Space System Protection Standard
  • NASA-STD-2601 Minimum Cybersecurity Requirements for Computing Systems
  • NASA-STD-7009 Standard for Models and Simulations
  • NASA-STD-8729.1 NASA Reliability And Maintainability Standard For Spaceflight And Support Systems
  • NASA-HDBK-7009 NASA Handbook for Models and Simulations: An Implementation Guide for NASA-STD-7009
  • NASA-HDBK-8709.22 Safety and Mission Assurance Acronyms, Abbreviations, and Definitions
  • NASA-HDBK-8739.23 NASA Complex Electronics Handbook for Assurance Professionals
  • NIST SP 800-37 Risk Management Framework
  • NIST SP 800-40 Guide to Enterprise Patch Management Planning: Preventive Maintenance for Technology
  • NIST SP 800-53 Security and Privacy Controls for Information Systems and Organizations
  • NIST SP 800-70 National Checklist Program for Information Technology products: Guidelines for Checklist Users and Developers
  • NIST SP 800-115 Technical Guide to Information Security Testing and Assessment
  • NFS 1813.301-79 Supporting Federal Policies, Regulations, and NASA Procedural Requirements
  • NFS 1852.237-72 Access to Sensitive Information
  • NFS 1852.237-73 Release of Sensitive Information

2.2.2 Non-Government Documents

  • CMMI-DEV, V2.0 CMMI® for Development, Version 2.0
  • IEEE 730 Institute of Electrical and Electronics Engineers (IEEE) Standard for Software Quality Assurance Processes
  • IEEE 828 IEEE Standard for Configuration Management in Systems and Software Engineering
  • IEEE 982.1 IEEE Standard Measures of the Software Aspects of Dependability
  • IEEE 1012 IEEE Standard for System, Software, and Hardware Verification and Validation
  • IEEE 1028 IEEE Standard for Software Reviews and Audits
  • IEEE 1633 IEEE Recommended Practice on Software Reliability
  • IEEE 15026-1 Systems and software engineering--Systems and software assurance--Part 1: Concepts and vocabulary
  • IEEE 29119-4 Software and systems engineering -- Software testing -- Part 4: Test techniques
  • ISO 26514 Systems and software engineering–requirements for designers and developers of user documentation
  • ISO 24765 System and Software Engineering – Vocabulary

2.3 Order of Precedence

2.3.1 This standard establishes requirements to implement a systematic approach to Software Assurance, Software Safety, and IV&V for software created, acquired, provided, or maintained by or for NASA but does not supersede nor waive established Agency requirements found in other documentation.

2.3.2 Conflicts between the Software Assurance and Software Safety Standard and other requirements documents are resolved by the responsible SMA and engineering TA(s), per NPR 1400.1, NASA Directives and Charters Procedural Requirements, and NPR 7120.10, Technical Standards for NASA Programs and Projects.

3. ACRONYMS AND DEFINITIONS

3.1 Acronyms and Abbreviations

  • CMMI® Capability Maturity Model Integration
  • COTS Commercial-Off-The-Shelf
  • GOTS Government-Off-The-Shelf
  • HDBK Handbook
  • IEEE Institute of Electrical and Electronics Engineers
  • IPEP IV&V Project Execution Plan
  • IV&V Independent Verification and Validation
  • MC/DC Modified Condition/Decision Coverage
  • MOTS Modified-Off-The-Shelf
  • NASA National Aeronautics and Space Administration
  • NIST National Institute of Standards and Technology
  • NPD NASA Policy Directive
  • NPR NASA Procedural Requirements
  • NRRS NASA Records Retention Schedule
  • OSMA NASA Headquarters Office, Safety and Mission Assurance
  • OSS Open Source Software
  • PLC Programmable Logic Controller
  • PROM Programmable Read-Only Memory
  • RTOS Real-Time Operating System
  • SMA Safety and Mission Assurance
  • SP Special Publication
  • SWE Software Engineering
  • TA Technical Authority

3.2 Definitions

Accredit. The official acceptance of a software development tool, model, or simulation, including associated data, to use for a specific purpose.
Acquirer. The entity or individual who specifies the requirements and accepts the resulting software products. The Acquirer is usually NASA or an organization within the Agency but can also refer to the prime contractor-subcontractor relationship.

Analyze. Review results in-depth, look at relationships of activities, examine methodologies in detail, and follow methodologies such as Failure Mode and Effects Analysis, Fault Tree Analysis, trending, and metrics analysis. Examine processes, plans, products, and task lists for completeness, consistency, accuracy, reasonableness, and compliance with requirements. The analysis may include identifying missing, incomplete, or inaccurate products, relationships, deliverables, activities, required actions, etc.

Approve. When the responsible originating official, or designated decision authority, of a document, report, condition, etc., has agreed, via their signature, to the content and indicates the document is ready for release, baselining, distribution, etc. Usually, one “approver” and several stakeholders need to “concur” for official acceptance of a document, report, etc. For example, the project manager would approve the Software Development Plan, but SMA would concur on it.

Assess. Judge results against plans or work product requirements. Assess includes judging for practicality, timeliness, correctness, completeness, compliance, evaluation of rationale, etc., reviewing activities performed, and independently tracking corrective actions to closure.
Assure. When software assurance personnel make certain that others have performed the specified software assurance, management, and engineering activities.

Audit. Formal review to assess compliance with hardware or software requirements, specifications, baselines, safety standards, procedures, instructions, codes, and contractual and licensing requirements. (Source NPR 8715.3)

Bi-directional Traceability. Association among two or more logical entities that are discernible in either direction (to and from an entity). (Source IEEE Definition)
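
The following Python sketch is illustrative only and is not part of this standard; the requirement and test-case identifiers are hypothetical. It shows one way a bi-directional trace can be recorded so that the association is discernible in either direction:

    # Illustrative only; requirement and test-case identifiers are hypothetical.
    from collections import defaultdict

    forward = defaultdict(set)   # requirement -> test cases that verify it
    backward = defaultdict(set)  # test case -> requirements it verifies

    def add_trace(requirement_id, test_id):
        """Record one association so it can be queried in either direction."""
        forward[requirement_id].add(test_id)
        backward[test_id].add(requirement_id)

    add_trace("SRS-101", "TC-017")
    add_trace("SRS-101", "TC-018")
    add_trace("SRS-205", "TC-018")

    print(forward["SRS-101"])   # test cases tracing from requirement SRS-101
    print(backward["TC-018"])   # requirements tracing back from test case TC-018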

Concur. A documented agreement that a proposed course of action is acceptable.

Condition. (1) measurable qualitative or quantitative attribute that is stipulated for a requirement and that indicates a circumstance or event under which a requirement applies (2) description of a contingency to be considered in the representation of a problem, or a reference to other procedures to be considered as part of the condition (3) true or false logical predicate (4) logical predicate involving one or more behavior model elements (5) Boolean expression containing no Boolean operators.

Configuration Item. (1) item or aggregation of hardware, software, or both that is designated for configuration management and treated as a single entity in the configuration management process (2) component of an infrastructure or an item which is or will be under control of configuration management (3) aggregation of work products that is designated for configuration management and treated as a single entity in the configuration management process (4) any system element or aggregation of system elements that satisfies an end use function and is designated by the acquirer for separate configuration control (5) item or aggregation of software that is designed to be managed as a single entity and its underlying components, such as documentation, data structures, scripts. (Source IEEE Definition)

Configuration items can vary widely in complexity, size, and type, ranging from an entire system including all hardware, software, and documentation, to a single module or a minor hardware component. CIs have four common characteristics: defined functionality; replaceable as an entity; unique specification; formal control of form, fit, and function. See Also: hardware configuration item, computer software configuration item, configuration identification, and critical item.

Confirm. Check to see that activities specified in the software engineering requirements are adequately done and evidence of the activities exists as proof. Confirm includes ensuring activities are done completely and correctly and have expected content according to approved tailoring.

Critical. A condition that may cause severe injury or occupational illness, or major property damage to facilities, systems, or flight hardware.
Deliverable. Product or item that has to be completed and delivered under the terms of an agreement or contract. Products may also be deliverables, e.g., software requirements specifications and detailed design documents.

Develop. To produce or create a product or document and mature or advance the product or document content.

Ensure. When software assurance or software safety personnel perform the specified software assurance and software safety activities themselves.
Event. (1) occurrence of a particular set of circumstances (2) external or internal stimulus used for synchronization purposes (3) change detectable by the subject software (4) fact that an action has taken place (5) singular moment in time at which some perceptible phenomenological change (energy, matter, or information) occurs at the port of a unit.

Failure. Inability of a system, subsystem, component, or part to perform its required function within specified limits. (Source NPR 8715.3)

Hazard. A state or a set of conditions, internal or external to a system that has the potential to cause harm. (Source NPR 8715.3)
Hazard Analysis. Identifying and evaluating existing and potential hazards and the recommended mitigation for the hazard sources found.
Hazard Control. Means of reducing the risk of exposure to a hazard. (Source NPR 8715.3)

Hazardous Operation/Work Activity. Any operation or other work activity that, without the implementation of proper mitigations, has a high potential to result in loss of life, serious injury to personnel or public, or damage to property due to the material or equipment involved or the nature of the operation/activity itself.

Independent Verification and Validation. Verification and validation performed by an organization that is technically, managerially, and financially independent of the development organization. (Source IEEE Definition)

Inhibit. Design feature that prevents the operation of a function.

Insight. An element of Government surveillance that monitors contractor compliance using Government-identified metrics and contracted milestones. Insight is a continuum that can range from low intensity such as reviewing quarterly reports to high intensity such as performing surveys and reviews. (Source NPR 7123.1)

Maintain. To continue to have; to keep in existence, to stay up-to-date and correct.

Mission Critical. [1] Item or function that must retain its operational capability to assure no mission failure (i.e., for mission success). [2] An item or function, the failure of which may result in the inability to retain operational capability for mission continuation if corrective action is not successfully performed. (Source NASA-STD-8729.1)

Mission Success. Meeting all mission objectives and requirements for performance and safety. (Source NPR 8715.3)

Monitor. (1) software tool or hardware device that operates concurrently with a system or component and supervises, records, analyzes, or verifies the operation of the system or component; (2) collect project performance data with respect to a plan, process, produce performance measures, and report and disseminate performance information.

Participate. To be a part of the activity, audit, review, meeting, or assessment.

Perform. Software assurance does the action specified. Perform may include making comparisons of independent results with similar activities performed by engineering; performing audits; and reporting results to engineering.

Product. A result of a physical, analytical, or another process. The item delivered to the customer (e.g., hardware, software, test reports, data) and the processes (e.g., system engineering, design, test, logistics) that make the product possible. (Source NASA-HDBK-8709.22)

Program. A strategic investment by a Mission Directorate or Mission Support Office that has a defined architecture and technical approach, requirements, funding level, and management structure that initiates and directs one or more projects. A program implements a strategic direction that the Agency has identified as needed to accomplish Agency goals and objectives. (Source NPR 7120.5)

Program Manager. A generic term for the person who is formally assigned to be in charge of the program. A program manager could be designated as a program lead, program director, or some other term, as defined in the program's governing document. A program manager is responsible for the formulation and implementation of the program, per the governing document with the sponsoring MDAA.

Project. A specific investment having defined goals, objectives, requirements, life cycle cost, a beginning, and an end. A project yields new or revised products or services that directly address NASA’s strategic needs. They may be performed wholly in-house; by Government, industry, academia partnerships; or through contracts with private industry. (Source NPR 7150.2)

Project Manager. The entity or individual who accepts the resulting software products. Project managers are responsible and accountable for the safe conduct and successful outcome of their program or project in conformance with governing programmatic requirements. The project manager is usually NASA but can also refer to the prime contractor-subcontractor relationship as well.
Provider. A Provider is a NASA or contractor organization that is tasked by an accountable organization (i.e., the Acquirer) to produce a product or service. (Source NASA-HDBK-8709.22)

Regression testing. (1) selective retesting of a system or component to verify that modifications have not caused unintended effects and that the system or component still complies with its specified requirements (2) testing following modifications to a test item or its operational environment, to identify whether regression failures occur. (Source IEEE Definition)
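
The following Python sketch is illustrative only and is not part of this standard; the function and its expected values are hypothetical. It shows selective retesting in miniature: checks that passed before a modification are re-run afterward to confirm that no unintended effects were introduced:

    # Illustrative regression check; scale_thrust and its expected outputs are hypothetical.
    def scale_thrust(commanded_percent):
        """Clamp a commanded thrust percentage to the allowed 0-100 range."""
        return max(0.0, min(100.0, commanded_percent))

    def run_regression_suite():
        # Cases that passed before the modification; re-run them after every change.
        cases = [(-5.0, 0.0), (42.5, 42.5), (150.0, 100.0)]
        for value, expected in cases:
            assert scale_thrust(value) == expected, f"regression failure for input {value}"

    run_regression_suite()
    print("Regression suite passed")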

Risk. The combination of (1) the probability (qualitative or quantitative) of experiencing an undesired event, (2) the consequences, impact, or severity that would occur if the undesired event were to occur, and (3) the uncertainties associated with the probability and consequences. (Source NPR 8715.3)

A risk is an uncertain future event, or combination of events, that could threaten the achievement of performance objectives or requirements. A "problem," on the other hand, describes an issue that is certain or near certain to exist now, or an event that has been determined with certainty or near certainty to have occurred and is threatening the achievement of an objective or requirement. It is generally at the discretion of the decision authority to define at what level of certainty (i.e., likelihood) an event may be classified and addressed as a “problem” rather than as a “risk.” A risk may be conditional upon a problem, i.e., it may be uncertain at present whether an existing issue will develop into performance-objective consequences, or to what extent it will do so.

Risk Posture. A characterization of risk based on conditions (e.g., criticality, complexity, environments, performance, cost, schedule) and a set of identified risks, taken as a whole which allows an understanding of the overall risk or provides a target risk range or level, which can then be used to support decisions being made.

Safe State. A system state in which hazards are inhibited, and all hazardous actuators are in a non-hazardous state. The system can have more than one Safe State.

Safety. Freedom from those conditions that can cause death, injury, occupational illness, damage to or loss of equipment or property, or damage to the environment. In a risk-informed context, safety is an overall mission and program condition that provides sufficient assurance that accidents will not result from the mission execution or program implementation, or, if they occur, their consequences will be mitigated. This assurance is established by means of the satisfaction of a combination of deterministic criteria and risk criteria. (Source NPR 8715.3)

Safety Analysis. Generic term for a family of analyses, which includes but is not limited to, preliminary hazard analysis, system (subsystem) hazard analysis, operating hazard analysis, software hazard analysis, sneak circuit, and others. Software safety analysis consists of a number of tools and techniques to identify safety risks and formulate effective controls. These techniques are used to help identify the hazards during the Hazard Analysis process, which in turn identifies the safety-critical software. The Safety Analysis techniques often used to support the Hazard Analysis are the Software Fault Tree Analysis and the Software Failure Modes and Effects Analysis. The Software Fault Tree Analysis and the Software Failure Modes and Effects Analysis are used to help identify hazards, hazard causes, and potential failure modes.

Safety-Critical. A term describing any condition, event, operation, process, equipment, or system that could cause or lead to severe injury, major damage, or mission failure if performed or built improperly or allowed to remain uncorrected. (Source NPR 8715.3)

Safety-Critical Software. Software is classified as safety-critical when the classification is determined by, and traceable to, a hazard analysis. Software is classified as safety-critical if it meets at least one of the following criteria:

a. Causes or contributes to a system hazardous condition/event,
b. Controls functions identified in a system hazard,
c. Provides mitigation for a system hazardous condition/event,
d. Mitigates damage if a hazardous condition/event occurs,
e. Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.

Software. Defined as (1) computer programs, procedures, and associated documentation and data pertaining to the operation of a computer system (2) all or a part of the programs, procedures, rules, and associated documentation of an information processing system (3) program or set of programs used to run a computer (4) all or part of the programs which process or support the processing of digital information (5) part of a product that is the computer program or the set of computer programs. This definition applies to software developed by NASA, software developed for NASA, software maintained by or for NASA, Commercial-Off-The-Shelf (COTS), Government-Off-The-Shelf (GOTS), Modified-Off-The-Shelf (MOTS), Open Source Software (OSS), reused software components, auto-generated code, embedded software, the software executed on processors embedded in programmable logic devices (see NASA-HDBK-4008), legacy, heritage, applications, freeware, shareware, trial or demonstration software, and OSS components. (Source NPR 7150.2)

Software Assurance. (1) a set of activities that assess adherence to, and the adequacy of the software processes used to develop and modify software products. Software assurance also determines the degree to which the desired results from software quality control are being obtained. (2) set of activities that define and assess the adequacy of software processes to provide evidence that establishes confidence that the software processes are appropriate for and produce software products of suitable quality for their intended purposes. (Source IEEE Definition)

A key attribute of software assurance is the objectivity of the software assurance function with respect to the project.

Software Developer. A person, organization, or system that develops software based on program/project requirements.

Software Life Cycle. The period that begins when a software product is conceived and ends when the software is no longer available for use. The software life cycle typically includes a concept phase, requirements phase, design phase, implementation phase, test phase, installation and checkout phase, operation and maintenance phase, and sometimes, retirement phase.

Software Peer Review. An examination of a software product to detect and identify software anomalies, including errors and deviations from standards and specifications. (Source IEEE Definition)

Software Safety. The aspects of software engineering, system safety, and software assurance, that provide a systematic approach to identifying, analyzing, tracking, mitigating, and controlling hazards and hazardous functions of a system where software may contribute either to the hazard(s) or to its detection, mitigation or control, to ensure safe operation of the system.

Software Validation. (1) confirmation, through the provision of objective evidence, that the requirements for a specific intended use or application have been fulfilled (2) process of providing evidence that the system, software, or hardware and its associated products satisfy requirements allocated to it at the end of each life cycle activity, solve the right problem (e.g., correctly model physical laws, implement business rules, and use the proper system assumptions), and satisfy intended use and user needs (3) the assurance that a product, service, or system meets the needs of the customer and other identified stakeholders (4) process of evaluating a system or component during or at the end of the development process to determine whether it satisfies specified requirements (5) confirmation in a timely manner, through automated techniques where possible, through the provision of objective evidence, that the requirements for a specific intended use or application have been fulfilled. (Source IEEE Definition)

Note: Validation in a system life cycle context is the set of activities ensuring and gaining confidence that a system is able to accomplish its intended use, goals, and objectives (meet stakeholder requirements) in the intended operational environment. The right system has been built or is operating to meet business objectives. Validation demonstrates that the system can be used by the users for their specific tasks. "Validated" is used to designate the corresponding status. Multiple validations can be carried out if there are different intended uses.

Software Verification. Confirmation that products properly reflect the requirements specified for them. In other words, verification ensures that “you built it right.” (Source IEEE Definition)

Supplier. Any organization which provides a product or service to a customer. By this definition, suppliers may include vendors, subcontractors, contractors, flight programs/projects, and the NASA organization supplying science data to a principal investigator. The classical definition of a supplier is a subcontractor, at any tier, performing contract services or producing the contract articles for a contractor. (Source NASA-HDBK-8709.22)
System Safety. Application of engineering and management principles, criteria, and techniques to optimize safety and reduce risks within the constraints of operational effectiveness, time, and cost.

Tailoring. The process used to adjust a prescribed requirement to accommodate the needs of a specific task or activity (e.g., program or project). Tailoring may result in changes, subtractions, or additions to a typical implementation of the requirement. (Source NPR 7150.2)

Track. To follow and note the course or progress of the product.

4. SOFTWARE ASSURANCE AND SOFTWARE SAFETY REQUIREMENTS

4.2 Safety-Critical Software Determination

Software is classified as safety-critical when the classification is determined by, and traceable to, a hazard analysis. Software is classified as safety-critical if it meets at least one of the following criteria:

a. Causes or contributes to a system hazardous condition/event,

b. Controls functions identified in a system hazard,

c. Provides mitigation for a system hazardous condition/event,

d. Mitigates damage if a hazardous condition/event occurs,

e. Detects, reports, and takes corrective action if the system reaches a potentially hazardous state.

See Appendix A for guidelines associated with addressing software in hazard definitions. See Table 1, 3.7.1, SWE-205 for more details. Consideration for other independent means of protection (software, hardware, barriers, or administrative) should be a part of the system hazard definition process.
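
The following Python sketch is illustrative only and is not part of this standard; the command, inhibit conditions, and state names are hypothetical. It shows one way software could perform prerequisite checks before executing a hazardous command, remaining in a safe state unless every inhibit condition is satisfied:

    # Illustrative only; the command and inhibit conditions are hypothetical.
    class CommandRejected(Exception):
        pass

    def arm_pyro(system_state):
        """Execute a hazardous command only after all prerequisite checks pass."""
        prerequisites = {
            "vehicle_in_safe_hold": system_state.get("vehicle_in_safe_hold", False),
            "range_safety_consent": system_state.get("range_safety_consent", False),
            "fault_tolerance_ok": system_state.get("fault_tolerance_ok", False),
        }
        failed = [name for name, ok in prerequisites.items() if not ok]
        if failed:
            # Remain in the safe state and report the rejection for disposition.
            raise CommandRejected(f"prerequisite checks failed: {failed}")
        return "PYRO_ARMED"

    try:
        arm_pyro({"vehicle_in_safe_hold": True})
    except CommandRejected as err:
        print(err)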

4.3 Software Assurance and Software Safety Requirements

4.3.1  The responsible project manager shall ensure the performance of the software assurance, software safety, and IV&V activities; the applicable requirements are defined in Table 1. In this document, the phrase “Software Assurance and Software Safety Tasks” means that the roles and responsibilities for completing these requirements may be delegated within the project consistent with the scope and scale of the project. The Center SMA Director designates SMA TA(s) for programs, facilities, and projects, providing direction, functional oversight, and assessment for all Agency software assurance, software safety, and IV&V activities.


Table 1. Software Assurance and Software Safety Requirements Mapping Matrix

4.4 Independent Verification & Validation Requirements

4.4.1 IV&V Overview

4.4.1.1  IV&V is a technical discipline of software assurance that employs rigorous analysis and testing methodologies to identify objective evidence and conclusions to provide an independent assessment of critical products and processes throughout the software development life cycle. The evaluation of products and processes throughout the life cycle demonstrates whether the software is fit for nominal operations (required functionality, safety, dependability, etc.) and off-nominal conditions (response to faults, responses to hazardous conditions, etc.). The goal of the IV&V effort is to contribute assurance conclusions provided to the project and stakeholders based on evidence found in software development artifacts and risks associated with the intended behaviors of the software.

4.4.1.2  Three parameters define the independence of IV&V: technical independence, managerial independence, and financial independence.

a.   Technical independence requires that the personnel performing the IV&V analysis are not involved in the development of the system or its elements. The IV&V team establishes an understanding of the problem and how the system addresses the problem. Through technical independence, the IV&V team’s different perspective allows it to detect subtle errors overlooked by personnel focused on developing the system.

b.   Managerial independence requires that the personnel performing the IV&V analysis are not in the same organization as the development and program management team. Managerial independence also means that the IV&V team makes its own decisions about which segments of the system and its software to analyze and test, chooses the IV&V analysis methods to apply, and defines the IV&V schedule of activities. While independent from the development and program management organization, the IV&V team provides its findings in a timely manner to both of those organizations. The submission of findings to the program management organization should not include any restrictions (e.g., requiring the approval of the development organization) or any other adverse pressures from the development group.

c.   Financial independence requires that the control of the IV&V budget be vested in a group independent of the software development organization. Financial independence does not necessarily mean that the IV&V team controls the budget but that the finances should be structured so that funding is available for the IV&V team to complete its analysis or test work. No adverse financial pressure or influence is applied.

4.4.1.3  The IV&V process starts early in the software development life cycle, providing feedback to the project and development organization, allowing products to be modified at optimal times and in a timely fashion, thereby reducing overall project risk. The feedback also answers project stakeholders’ questions about system properties (correctness, robustness, safety, security, etc.) so that they can make informed decisions with respect to the development and acceptance of the system and its software.

4.4.1.4  The IV&V provider performs two primary activities, often concurrently: verification and validation. Each of the activities provides a different perspective on the system/software.

a.   Verification is the process of evaluating a system and its software to provide objective evidence as to whether or not a product conforms to the build-to requirements and design specifications. Verification holds from the requirements through the design and code and into testing. Verification demonstrates that the products of a given development phase satisfy the conditions imposed at the start of or during that phase.

b.   Validation develops objective evidence that shows that the content of the engineering artifact is the right content for the developed system/software.

The content is accurate and correct if the objective evidence demonstrates that it satisfies the system requirements (e.g., user needs, stakeholder needs, etc.), fully describes the required capability/functionality needed, and solves the right problem.

4.4.1.5  The main goal of the IV&V effort is to identify and generate objective evidence that supports or refutes the correct operation of the system. The IV&V provider typically works with the development team, which provides artifacts such as concept studies, operations concepts, and requirements that define the overall project. The IV&V provider uses these materials to develop an independent understanding of the project’s commitment to NASA, which forms the basis for validating lower-level technical artifacts.

4.4.1.6  Two principles help guide the development and use of objective evidence.

a.   Performing IV&V throughout the entire development lifetime is the first principle; potential problems should be detected as early as possible in the development life cycle. Performing IV&V throughout the entire development lifetime provides the IV&V team with sufficient information to establish a basis for the analysis results and provides early objective evidence to the development and program management groups to help keep the development effort on track early in the life cycle.

b.   The second principle is “appropriate assurance.” Given that it is not possible to provide IV&V on all aspects of a project’s software, the IV&V provider and project should balance risks against available resources to define an IV&V program for each project that provides IV&V so that the software will operate correctly, safely, reliably, and securely throughout its operational lifetime. The IPEP documents this tailored approach and summarizes the cost/benefit trade-offs made in the scoping process.

4.4.1.7  The IV&V requirements are analyzed and partitioned according to the type of artifact. The requirements do not imply or require the use of any specific life cycle model. It is also important to understand that IV&V applies to any life cycle development process. The IV&V requirements document the potential scope of analysis performed by the IV&V provider and the key responsibility of the software project to provide the information needed to perform that analysis. Additionally, the risk assessment is used to scope the IV&V analysis to help determine the prioritization of activities and the level of rigor associated with performing those activities. The scoping exercise results are captured in the IV&V Project Execution Plan, as documented below.


Appendix A. GUIDELINES FOR HAZARD DEVELOPMENT INVOLVING SOFTWARE

A.1 Software Contributions to Hazards

A.1.1 Hazard Analysis should consider software’s ability, by design, to cause or control a given hazard. It is a best practice to include the software within the system hazard analysis. The general hazard analysis should consider software common-mode failures that can occur in instances of redundant flight computers running the same software.

A.1.2 Software safety analysis supplements the system hazard analysis by assessing software that performs critical functions and serves as a hazard cause or control. A typical software safety analysis process identifies the must-work and must-not-work functions in the hazard reports. The system hazard analysis and software safety analysis process should assess each function for compliance with the levied functional software requirements. The system hazard analysis and software safety analysis also assure that the redundancy management performed by the software supports fault tolerance requirements.

A.1.3 The second part of the safety review should complete the design analysis portion of the software safety analysis. The software safety analysis supports a requirements gap analysis to identify gaps (SWE-184) and ensures that the risk and control strategy documented in the hazard reports is correct as stated. The system hazard analysis and software safety analysis support the analysis of test plans to assure adequate coverage of off-nominal scenarios (SWE-62, SWE-65). Finally, the system hazard analysis should verify the final implementation and uphold the analysis by ensuring that test results permit the closure of hazard verifications (SWE-68).

A.1.4 Considerations when identifying software causes in a general software-centric hazard analysis are found in Table 2 below; an illustrative sketch of defensive checks for a few of the listed coding causes follows the table.

Table 2. Software Cause Areas to Consider and Potential Software Causes
Data errors
  1. Asynchronous communications
  2. Single or double event upset/bit flip or hardware induced error
  3. Communication to/from an unexpected system on the network
  4. An out-of-range input value, a value above or below the range
  5. Start-up or hardware initiation data errors
  6. Data from an antenna gets corrupted
  7. Failure of software interface to memory
  8. Failure of flight software to suppress outputs from a failed component
  9. Failure of software to monitor bus controller rates to ensure communication with all remote terminals on the bus schedule's avionics buses
  10. Ground or onboard database error
  11. Interface error
  12. Latent data
  13. Communication bus overload
  14. Missing or failed integrity checks on inputs, failure to check the validity of input/output data
  15. Excessive network traffic/babbling node - keeps the network so busy it inhibits communication from other nodes
  16. Sensors or actuators stuck at some value
  17. Wrong software state for the input
Commanding errors
1. Command buffer error or overflow
2. Corrupted software load
3. Error in real-time command build or sequence build
4. Failure to command during hazardous operations
5. Failure to perform prerequisite checks before the execution of safety-critical software commands
6. Ground or onboard database error for the command structure
7. Error in command data introduced by command server error
8. Incorrect operator input commands
9. Wrong command or a miscalculated command sent
10. Sequencing error, failure to issue commands in the correct sequence
11. Command sent in wrong software state or software in an incorrect or unanticipated state
12. An incorrect timestamp on the command
13. Missing software error handling on incorrect commands
14. Status messages on command execution not provided
15. Memory corruption, critical data variables overwritten in memory
16. Inconsistent syntax
17. Inconsistent command options
18. Similarly named commands
19. Inconsistent error handling rules
20. Incorrect automated command sequence built into script containing single commands that can remove multiple inhibits to a hazard
Flight computer errors
1. Board support package software error
2. Boot load software error
3. Boot Programmable Read-Only Memory (PROM) corruption preventing reset
4. Buffer overrun
5. CPU overload
6. Cycle jitter
7. Cycle over-run
8. Deadlock
9. Livelock
10. Reset during program upload (PROM corruption)
11. Reset with no restart
12. Single or double event upset/bit flip or hardware induced error
13. Time to reset greater than time to failure
14. Unintended persistent data/configuration on reset
15. Watchdog active during reboot causing infinite boot loop
16. Watchdog failure
17. Failure to detect and transition to redundant or backup computer
18. Incorrect or stale data in redundant or backup computer
Operating systems errors
1. Application software incompatibility with upgrades/patches to an operating system
2. Defects in Real-Time Operating System (RTOS) Board Support software
3. Missing or incorrect software error handling
4. Partitioning errors
5. Shared resource errors
6. Single or double event upset/bit flip
7. Unexpected operating system software response to user input
8. Excessive functionality
9. Missing function
10. Wrong function
11. Inadequate protection against operating system bugs
12. Unexpected and aberrant software behavior
Programmable logic device errors
1. High cyclomatic complexity levels (above 15)
2. Errors in programming and simulation tools used for Programmable Logic Controller (PLC) development
3. Errors in the programmable logic device interfaces
4. Errors in the logic design
5. Missing software error handling in the logic design
6. PLC logic/sequence error
7. Single or double event upset/bit flip or hardware induced error
8. Timing errors
9. Unexpected operating system software response to user input
10. Excessive functionality
11. Missing function
12. Wrong function
13. Unexpected and aberrant software behavior
Flight system time management errors
1. Incorrect data latency/sampling rates
2. Failure to terminate/complete process in a given time
3. Incorrect time sync
4. Latent data (Data delayed or not provided in required time)
5. Mission elapsed time timing issues and distribution
6. Incorrect function execution, performing a function at the wrong time, out of sequence, or when the program is in the wrong state
7. Race conditions
8. The software cannot respond to an off-nominal condition within the time needed to prevent a hazardous event
9. Time function runs fast/slow
10. Time skips (e.g., Global Positioning System time correction)
11. Loss or incorrect time sync across flight system components
12. Loss or incorrect time Synchronization between ground and spacecraft Interfaces
13. Unclear software timing requirements
14. Asynchronous systems or components
15. Deadlock conditions
16. Livelocks conditions
Coding, logic, and algorithm failures, algorithm specification errors
1. Auto-coding errors as a cause
2. Bad configuration data/no checks on external input files and data
3. Division by zero
4. Wrong sign
5. Syntax errors
6. Error coding software algorithm
7. Error in positioning algorithm
8. Case/type/conversion error/unit mismatch
9. Buffer overflows
10. High cyclomatic complexity levels (above 15)
11. Dead code or unused code
12. Endless do loops
13. Erroneous outputs
14. Failure of flight computer software to transition to or operate in a correct mode or state
15. Failure to check safety-critical outputs for reasonableness and hazardous values and correct timing
16. Failure to generate a process error upon detection of arithmetic error (such as divide-by-zero)
17. Failure to create a software error log report when an unexpected event occurs
18. Inadvertent memory modification
19. Incorrect "if-then" and incorrect "else"
20. Missing default case in a switch statement
21. Incorrect implementation of a software change, software defect, or software non-conformance
22. Incorrect number of functions or mathematical iteration
23. Incorrect software operation if no commands are received or if a loss of commanding capability exists (inability to issue commands)
24. Insufficient or poor coding reviews, inadequate software peer reviews
25. Insufficient use of coding standards
26. Interface errors
27. Missing or inadequate static analysis checks on code
28. Missing or incorrect parameter range and boundary checking
29. Non-functional loops
30. Overflow or underflow in the calculation
31. Precision mismatch
32. Resource contention (e.g., thrashing: two or more processes accessing a shared resource)
33. Rounding or truncation fault
34. Sequencing error (e.g., failure to issue commands in the correct sequence)
35. Software is initialized to an unknown state; failure to properly initialize all system and local variables upon startup, including clocks
36. Too many or too few parameters for the called function
37. Undefined or non-initialized data
38. Untested COTS, MOTS, or reused code
39. Incomplete end-to-end testing
40. Incomplete or missing software stress test
41. Errors in the data dictionary or data dictionary processes
42. Confusing feature names
43. More than one name for the same feature
44. Repeated code modules
45. Failure to initialize a loop-control
46. Failure to initialize (or reinitialize) pointers
47. Failure to initialize (or reinitialize) registers
48. Failure to clear a flag
49. Scalability errors
50. Unexpected new behavior or defects introduced in newer or updated COTS modules
51. Not addressing pointer closure
Fault tolerance and fault management errors
1. Missing software error handling
2. Missing or incorrect fault detection logic
3. Missing or incorrect fault recovery logic
4. Problems with the execution of emergency safing operations
5. Failure to halt all hazard functions after an interlock failure
6. The software cannot respond to an off-nominal condition within the time needed to prevent a hazardous event
7. Common mode software faults
8. A hazard causal factor occurrence isn't detected
9. False positives in fault detection algorithms
10. Failure to perform prerequisite checks before the execution of safety-critical software commands
11. Failure to terminate/complete process in a given time
12. Memory corruption, critical data variables overwritten in memory
13. Single or double event upset/bit flip or hardware induced error
14. Incorrect interfaces, errors in interfaces
15. Missing self-test capabilities
16. Failing to consider stress on the hardware
17. Incomplete end-to-end testing
18. Incomplete or missing software stress test
19. Errors in the data dictionary or data dictionary processes
20. Failure to provide or ensure secure access for input data, commanding, and software modifications

Software process errors
1. Failure to implement software development processes or implementing inadequate processes
2. Inadequate software assurance support and reviews
3. Missing or inadequate software assurance audits
4. Failure to follow the documented software development processes
5. Missing, tailored, or incomplete implementation of the safety-critical software requirements in NPR 7150.2
6. Missing, tailored, or incomplete implementation of the safety-critical software requirements in Space Station Program 50038, Computer-Based Control System Safety Requirements
7. Incorrect or incomplete testing
8. Inadequate testing of reused or heritage software
9. Failure to open a software problem report when an unexpected event occurs
10. Failure to include hardware personnel in reviews of software changes, software implementation, peer reviews, and software testing
11. Failure to perform a safety review on all software changes and software defects
12. Defects in COTS, MOTS, or OSS software
13. Failure to perform assessments of available bug fixes and updates available in COTS software
14. Insufficient use of coding standards
15. Missing or inadequate static analysis checks on code
16. Incorrect version loaded
17. Incorrect configuration values or data
18. No checks on external input files and data
19. Errors in configuration data changes being uploaded to spacecraft
20. Software/avionics simulator/emulator errors and defects
21. Unverified software
22. High cyclomatic complexity levels (over 15)
23. Incomplete or inadequate software requirements analysis
24. Compound software requirements
25. Incomplete or inadequate software hazard analysis
26. Incomplete or inadequate software safety analysis
27. Incomplete or inadequate software test data analysis
28. Unrecorded software defects found during informal and formal software testing
29. Auto-coding tool faults and defects
30. Errors in design models
31. Software errors in hardware simulators due to a lack of understanding of hardware requirements
32. Incomplete or inadequate software test data analysis
33. Inadequate built-in-test coverage
34. Inadequate regression testing and unit test coverage of flight software application-level source code
35. Failure to test all nominal and planned contingency scenarios (breakout and re-rendezvous, launch abort) and complete mission duration (launch to docking to splashdown) in the hardware in the loop environment
36. Incomplete testing of unexpected conditions, boundary conditions, and software/interface inputs
37. Use of persistence of test data, files, or config files in an operational scenario
38. Failure to provide multiple paths or triggers from safe states to hazardous states
39. Interface control documents and interface requirements documents errors
40. System requirements errors
41. Misunderstanding of hardware configuration and operation
42. Hardware requirements and interface errors, Incorrect description of the software/hardware functions and how they are to perform
43. Missing or incorrect software requirements or specifications
44. Missing software error handling
45. Requirements/design errors (not fully defined, detected, and corrected)
46. Failure to identify the safety-critical software items
47. Failure to perform a function, performing the wrong function, performing the function incompletely
48. An inadvertent/unauthorized event, an unexpected, unwanted event, an out-of-sequence event, the failure of a planned event to occur
49. The magnitude or direction of an event is wrong
50. Out-of-sequence event protection
51. Multiple events/actions trigger simultaneously (when not expected)
52. Error or exception handling missing or incomplete
53. Inadvertent or incorrect mode transition for required vehicle functional operation; undefined or incorrect mode transition criteria; unauthorized mode transition
54. Failure of flight software to correctly initiate proper transition mode
55. Software state transition error
56. Software termination in an unknown state
57. Errors in the software data dictionary values
Human-machine interface errors
1. Incorrect data (unit conversion, incorrect variable type)
2. Stale data
3. Poor design of human machine interface
4. Too much, too little, incorrect data displayed
5. Ambiguous or incorrect messages
6. User display locks up/fails
7. Missing software error handling
8. Unsolicited command (command issued inadvertently, cybersecurity issue, or without cause)
9. Wrong command or a miscalculated command sent
10. Failure to display information or messages to a user
11. Display refresh rate leads to an incorrect operator response
12. Lack of ordering scheme for hazardous event queues (such as alerts) in the human-computer interface (i.e., priority versus time of arrival, for example, when an abort must go to the top of the queue)
13. Incorrect labeling of operator controls in the human interface software
14. Failure to check for constraints in algorithms/specifications and valid boundaries
15. Failure of human interface software to check operator inputs
16. Failure to pass along information or messages
17. No onscreen instructions
18. Undocumented features
19. States that appear impossible to exit
20. No cursor
21. Failure to acknowledge an input
22. Failure to advise when a change takes effect
23. Wrong, misleading, or confusing information
24. Poor aesthetics in the screen layout
25. Menu layout errors
26. Dialog box layout errors
27. Obscured instructions
28. Misuse of color
29. Failure to allow tabbing navigation to edit fields (mouse only input)
Security and virus errors
1. Denial or interruption of service
2. Spoofed or jammed inputs
3. Missing capabilities to detect insider threat activities
4. Inadvertent or intentional memory modification
5. Inadvertent or unplanned mode transition
6. Missing software error handling or detect handling
7. Unsolicited command
8. Stack-based buffer overflows
9. Heap-based attacks
10. Cybersecurity vulnerability or computer virus
11. Inadvertent access to ground system software
12. Destruct commands incorrectly allowed in a hands-off zone
13. Communication to/from an unexpected system on the network
Unknown Unknowns errors
1. Undetected software defects
2. Unknown limitations for COTS (operational, environmental, stress)
3. COTS extra capabilities
4. Incomplete or inadequate software safety analysis for COTS components
5. Compiler behavior errors or undefined compiler behavior
6. Software defects and investigations that are unresolved before the flight
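
The following Python sketch is illustrative only and is not part of this standard; the function names, limits, and modes are hypothetical. It shows defensive checks that address a few of the coding causes listed in Table 2 (division by zero, missing range and boundary checks, and a missing default case):

    # Illustrative only; names, limits, and modes are hypothetical examples of
    # guards against a few causes listed in Table 2.
    def average_rate(total_counts, elapsed_seconds):
        """Guard against divide-by-zero instead of propagating an arithmetic error."""
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        return total_counts / elapsed_seconds

    def validate_temperature(celsius, low=-40.0, high=85.0):
        """Range and boundary check on an input value before it is used."""
        if not (low <= celsius <= high):
            raise ValueError(f"temperature {celsius} outside [{low}, {high}]")
        return celsius

    def handle_mode(mode):
        """Always provide a default case so unanticipated states are detected."""
        if mode == "NOMINAL":
            return "continue"
        elif mode == "SAFE_HOLD":
            return "suspend hazardous outputs"
        else:
            # Default case: unexpected or unanticipated state.
            return "report fault and transition to safe state"

    print(average_rate(120, 60.0))
    print(validate_temperature(21.5))
    print(handle_mode("UNKNOWN"))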