As NASA organizations use Artificial Intelligence (AI) to make decisions, the need for robust Software Assurance (SA) becomes important.
AI has emerged as a project option, changing the way we interact with and use data. At this time, we recommend limiting the use of AI to non-safety-critical applications.
Software Assurance plays a crucial role in the AI development lifecycle. It involves a systematic process of monitoring, assessing, and improving the software development and AI implementation processes. In the context of AI, SA aims to:
evaluate the use of AI in software development activities,
validate the accuracy of algorithms,
ensure model robustness, and
ensure that the software meets the highest standards of performance.
AI introduces unique challenges to traditional SA methodologies. Unlike conventional software, many AI systems continue to learn and adapt based on new data. This dynamic nature makes it difficult to establish fixed criteria for testing and validation. SA in AI must evolve alongside the models it scrutinizes, which calls for a more iterative and adaptive approach.
1.1 Data Quality
The quality of the data used to train and test AI models directly influences their performance and reliability. Garbage in, garbage out (GIGO) holds in the world of AI, emphasizing the critical importance of high-quality data. SA should also ensure the security and configuration control aspects of the data used to train and test AI. Data quality encompasses several dimensions, including accuracy, completeness, consistency, timeliness, and relevance. In the context of AI, accuracy is particularly vital, as even small inaccuracies in the training data can lead to significant errors in predictions. Ensuring that data is representative and unbiased is also crucial to avoid reinforcing existing biases within AI models.
There are a number of key considerations for software assurance on AI models:
1.1.1 Validating The Training Data
SA needs to review (evaluate and approve) how the project and engineering are testing the data used to train the AI model to ensure it is accurate, representative, and unbiased. This helps identify and mitigate potential issues with the model's performance and predictions. Effective data preprocessing and cleaning are foundational steps in ensuring data quality for AI. This involves identifying and addressing missing values, handling outliers, and normalizing data to create a standardized and reliable dataset. SA processes must include rigorous checks at these stages to guarantee that the data fed into AI models is of the highest quality. SA should analyze and think through all possible scenarios to make sure that the AI actions are correct.
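The preprocessing checks described above (missing values, outliers, normalization) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `validate_column`, the z-score outlier rule, and the threshold values are assumptions chosen for the example, not a prescribed SA method.

```python
import math
import statistics

def validate_column(values, z_threshold=3.0):
    """Check one numeric column: count missing values, flag outliers, normalize."""
    # Treat None and NaN as missing.
    present = [v for v in values if v is not None and not math.isnan(v)]
    missing = len(values) - len(present)
    if not present:
        return {"missing": missing, "outliers": [], "normalized": []}

    # Flag values whose z-score exceeds the (assumed) threshold.
    mean = statistics.fmean(present)
    stdev = statistics.pstdev(present)
    outliers = [v for v in present if stdev > 0 and abs(v - mean) / stdev > z_threshold]

    # Min-max normalize the present values into [0, 1].
    low, high = min(present), max(present)
    span = high - low
    normalized = [(v - low) / span if span else 0.0 for v in present]
    return {"missing": missing, "outliers": outliers, "normalized": normalized}

# Hypothetical sensor readings with one gap and one suspicious value.
report = validate_column([1.0, 2.0, None, 3.0, 2.5, 100.0], z_threshold=1.5)
print(report["missing"], report["outliers"])
```

A reviewer might ask the project for a report of this kind per training dataset, together with the rationale for each threshold.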
Unlike traditional software, AI models that are generative or continually learning may require ongoing testing even after deployment, if the model continues to learn and evolve with new real-world data. SA needs to ensure that the software engineering process includes repeatedly testing the model's behavior and performance over time and evaluating the results. Testing AI models is a multifaceted process. It involves validating the accuracy of predictions, assessing model generalization to new data, and evaluating performance under diverse conditions. The dynamic nature of some AI techniques may require continuous testing throughout the development lifecycle. Rigorous testing not only ensures the reliability of AI applications but also contributes to building trust among end-users. The dynamic nature of AI necessitates continuous monitoring and feedback loops. SA processes should include mechanisms for monitoring of AI applications in development, production, and in testing. This ongoing evaluation helps identify and address issues promptly, ensuring that the AI system remains accurate and effective as it encounters new data.
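One way to picture the monitoring and feedback loop described above is a rolling accuracy check over the most recent predictions. The sketch below is illustrative only; the class name `AccuracyMonitor`, the window size, and the accuracy threshold are assumptions, and a real project would define its own metrics and limits.

```python
from collections import deque

class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag degradation."""

    def __init__(self, window=100, min_accuracy=0.9):
        # deque(maxlen=...) silently drops the oldest result when full.
        self.results = deque(maxlen=window)
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        self.results.append(prediction == actual)

    def accuracy(self):
        return sum(self.results) / len(self.results) if self.results else None

    def needs_review(self):
        # True when recent accuracy falls below the agreed threshold.
        acc = self.accuracy()
        return acc is not None and acc < self.min_accuracy

monitor = AccuracyMonitor(window=4, min_accuracy=0.75)
for predicted, observed in [(1, 1), (1, 1), (0, 1), (0, 1)]:
    monitor.record(predicted, observed)
print(monitor.accuracy(), monitor.needs_review())
```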
Comprehensive documentation and traceability are essential aspects of AI applications. Documenting the entire development and testing process, along with the data used, facilitates transparency and allows for effective debugging and auditing. In the event of issues or unexpected outcomes, traceability enables developers and software assurance teams to identify and rectify problems efficiently. SA needs to ensure that sufficient documentation and traceability exist for the AI application and associated data. SA should also ensure that all inputs are documented so that, if the model needs to be recreated, they can be fed into the AI tool to produce the same, correct results.
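As a sketch of what documenting the inputs for model recreation could look like, the example below records a fingerprint of the training data together with the hyperparameters and random seed. The function name `build_manifest` and the manifest fields are hypothetical; only the standard `hashlib` and `json` modules are used.

```python
import hashlib
import json

def build_manifest(dataset_bytes, hyperparameters, seed):
    """Record the inputs needed to recreate a model, with tamper-evident hashes."""
    manifest = {
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "hyperparameters": hyperparameters,
        "seed": seed,
    }
    # Hash a canonical (sorted-key) JSON form so the manifest itself can be verified.
    canonical = json.dumps(manifest, sort_keys=True)
    manifest["manifest_sha256"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return manifest

# Hypothetical training inputs.
first = build_manifest(b"time,value\n0,1.0\n", {"learning_rate": 0.001, "epochs": 5}, seed=42)
second = build_manifest(b"time,value\n0,1.0\n", {"epochs": 5, "learning_rate": 0.001}, seed=42)
print(first["manifest_sha256"] == second["manifest_sha256"])
```

Identical inputs yield identical hashes regardless of key order, so a mismatch between two manifests indicates that the data, configuration, or seed changed.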
Automated tools and machine learning models can be used to analyze large codebases, identify patterns, and predict potential issues more efficiently than manual review. SA needs to analyze the engineering data or perform independent static code analysis to check for code defects, software quality objectives, code coverage objectives, software complexity values, and software security objectives. SA needs to confirm the static analysis tool(s) are used with checkers to identify security and coding errors and defects. SA needs to confirm that the project addresses the errors and defects and assesses the results from the static analysis tools used by software assurance, software safety, engineering, or the project. SA should confirm that Software Quality Objectives or software quality threshold levels are defined and set for static code analysis defects, or software security objectives.
It's important to establish clear quality criteria and variables to assess the AI model's performance, quality, risk, security, maintainability, and compliance with requirements. The SA process should confirm that these criteria are defined and aim to continuously improve the code quality towards all of the criteria.
While the goal is to maximize code quality, projects must also consider factors like features, costs, and schedule. The SA process should help identify the critical vulnerabilities to address based on objectives, risks, and schedules.
Protecting sensitive data used to train and operate the AI model is important. Appropriate access controls and security measures must be in place. SA should confirm that the project has proper security controls and configuration management in place.
One of the most significant challenges in AI software assurance is addressing bias. Biased training data can lead to incorrect outcomes. Software assurance should look at how the project identifies and mitigates bias in the data and the algorithms. SA should also ensure ongoing monitoring and adjustment are used to ensure objectivity in AI applications.
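One simple, illustrative way to look for bias in model outputs is to compare the rate of positive predictions across groups in the data (for example, different operating regimes or data sources). The sketch below assumes binary predictions and a group label per record; the function names and the choice of this particular measure are assumptions, and other fairness or bias measures may be more appropriate for a given project.

```python
def selection_rates(predictions, groups):
    """Return the fraction of positive predictions for each group."""
    totals, positives = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    return {g: positives[g] / totals[g] for g in totals}

def positive_rate_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())

# Hypothetical predictions for two groups, "a" and "b".
preds = [1, 1, 0, 1, 0, 0, 0, 1]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(selection_rates(preds, labels), positive_rate_gap(preds, labels))
```

A large gap does not by itself prove bias, but it gives SA a concrete number to track as part of ongoing monitoring.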
1.1.9 Requirements Verification And Validation
AI applications introduce the unique challenge of nondeterministic behavior. The behavior of AI systems may not be predictably repeatable, which complicates verification and validation. SA should work to develop verification and validation criteria that address the probabilistic nature of AI systems to establish qualitative measures for validation and verification test acceptance, pass, and fail criteria.
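As one possible quantitative complement to such criteria, a test for a probabilistic system can be run many times and accepted only if a statistical lower bound on the pass rate meets the required level. The sketch below uses the Wilson score interval; the function names, the 95% confidence level (z = 1.96), and the required rates are assumptions for illustration.

```python
import math

def wilson_lower_bound(passes, trials, z=1.96):
    """Lower bound of the Wilson score interval for an observed pass rate."""
    if trials == 0:
        return 0.0
    p = passes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom

def accept(passes, trials, required_rate, z=1.96):
    """Accept only if we are confident the true pass rate meets the requirement."""
    return wilson_lower_bound(passes, trials, z) >= required_rate

# Hypothetical campaign: 98 passes in 100 repeated runs of the same test.
print(round(wilson_lower_bound(98, 100), 3))
print(accept(98, 100, required_rate=0.90), accept(98, 100, required_rate=0.95))
```

The design choice here is that a raw pass rate of 98% is not treated as proof of 98% reliability; with only 100 runs, the evidence supports a lower guaranteed rate.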
In summary, software assurance for AI models requires a comprehensive approach that focuses on validating training data, continuous testing, leveraging AI/ML tools, defining quality benchmarks, balancing project goals and risks, ensuring data security and privacy, and addressing ethical considerations. This helps ensure the AI system is accurate, robust, secure, and compliant.
In the era of artificial intelligence, software quality assurance emerges as a linchpin for ensuring the reliability, accuracy, and ethicality of AI applications. By focusing on data quality as a foundational element, SA plays a pivotal role in mitigating challenges such as bias, ensuring transparency, and building trust in AI systems. As technology continues to advance, a proactive and adaptive approach to SA will be crucial in unlocking the full potential of AI while safeguarding against unintended consequences. Through continuous improvement, collaboration, and a commitment to ethical standards, the marriage of SA and AI promises a future where intelligent systems enhance our lives responsibly and reliably.
1.3 Additional Guidance
Links to Additional Guidance materials for this subject have been compiled in the Relevant Links table. See the Additional Guidance in the Resources tab.
2. Generative AI Metrics
Defining metrics for the complexity of generative AI models requires tailoring your evaluation to the unique characteristics of these systems—such as their architecture, size, computational requirements, capabilities, and usability. Unlike traditional software complexity metrics (e.g., cyclomatic complexity), generative AI complexity is often evaluated through model-specific engineering, mathematical, and operational characteristics. Below is a comprehensive guide to key metrics that you can use:
2.1 Architectural Complexity
Metrics:
Number of Layers: The depth or number of layers in the neural network architecture (e.g., 12 layers for the smallest GPT-2 model, 96 layers for the largest GPT-3 model).
Parameters: Total number of trainable parameters (e.g., billions for most large language models).
Attention Heads: In transformer models, attention heads drive complexity. Evaluate the number of heads per layer and their interactions.
Non-Linearity: Measure the types and number of activation functions (e.g., ReLU, GELU).
Why Important:
These metrics indicate the model's capacity to learn complex patterns. Larger and deeper architectures typically have higher expressivity but come with increased computational cost.
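To make the relationship between depth, width, and parameter count concrete, the sketch below uses a common rule of thumb for decoder-only transformers: roughly 12 x d_model^2 weights per layer (attention plus a 4x-wide feed-forward block), plus the token embedding table. It ignores biases, layer norms, and positional embeddings, so it is an estimate under those assumptions. The function name is hypothetical, and the example values are the published figures for the smallest GPT-2 model.

```python
def estimate_transformer_parameters(num_layers, d_model, vocab_size):
    """Rough parameter count for a decoder-only transformer."""
    # Attention projections (4 * d^2) plus feed-forward with 4x expansion (8 * d^2).
    per_layer = 12 * d_model ** 2
    # Token embedding table (assumed shared with the output layer).
    embeddings = vocab_size * d_model
    return num_layers * per_layer + embeddings

# Smallest GPT-2 configuration: 12 layers, width 768, vocabulary of 50,257 tokens.
print(estimate_transformer_parameters(num_layers=12, d_model=768, vocab_size=50257))
```

The estimate lands close to the roughly 124 million parameters usually quoted for that model, which is why this approximation is useful for quick comparisons.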
2.2 Computational Complexity
Metrics:
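Floating-Point Operations (FLOPs): The number of floating-point computations required to perform training and inference. (This differs from FLOPS, floating-point operations per second, which measures hardware throughput.)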
Memory Requirements: GPU or RAM usage during training and inference—especially significant for deployment on constrained systems.
Inference Time: Latency in generating outputs. Lower latency on the same hardware generally indicates a less computationally demanding, more efficient model.
Power Consumption: Energy required for training and inference, relevant for sustainable AI practices.
Why Important:
These metrics determine the model's scalability and operational costs for deployment and training. For example, models with high FLOPs and memory requirements are often harder to scale.
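A widely used back-of-the-envelope estimate for dense transformer models is about 6 floating-point operations per parameter per training token, and about 2 per parameter per generated token at inference. The sketch below applies that rule of thumb; the function names, the hardware throughput, and the utilization factor are assumptions for illustration, and measured values can differ substantially.

```python
def training_flops(num_parameters, num_tokens):
    """Approximate total training compute: ~6 operations per parameter per token."""
    return 6 * num_parameters * num_tokens

def inference_flops_per_token(num_parameters):
    """Approximate forward-pass compute: ~2 operations per parameter per token."""
    return 2 * num_parameters

def training_days(total_flops, device_flops_per_second, num_devices, utilization=0.4):
    """Convert an operation count into wall-clock days for an assumed cluster."""
    seconds = total_flops / (device_flops_per_second * num_devices * utilization)
    return seconds / 86400

# Hypothetical model: 1 billion parameters trained on 20 billion tokens.
total = training_flops(1_000_000_000, 20_000_000_000)
print(total, inference_flops_per_token(1_000_000_000))
print(training_days(total, device_flops_per_second=1e14, num_devices=8, utilization=0.5))
```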
2.3 Model Representational Complexity
Metrics:
Expressive Power: The ability of the model to learn and represent complex functions or dynamics.
Entropy of Outputs: Capturing the diversity and unpredictability of model outputs during inference.
Embedding Space Size: The dimensionality of the embeddings used internally (e.g., 768 for the smallest GPT-2 model, 12288 for the largest GPT-3 model).
Why Important:
These metrics highlight how effectively the model can generalize across diverse tasks and inputs while maintaining rich representations.
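The entropy of outputs mentioned above can be computed with the standard Shannon entropy formula, either from a model's probability distribution over next tokens or empirically from sampled outputs. The sketch below shows both; the function names are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def empirical_entropy(samples):
    """Entropy of the observed frequency distribution of sampled outputs."""
    counts = Counter(samples)
    total = len(samples)
    return shannon_entropy(count / total for count in counts.values())

print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))   # uniform over four outcomes
print(empirical_entropy(["a", "a", "b", "b"]))      # two equally frequent outputs
```

Higher entropy means more diverse, less predictable outputs; an entropy near zero means the model almost always produces the same output.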
2.4 Training Complexity
Metrics:
Dataset Size: The volume of training data required (e.g., tokens or examples in billions for large models).
Training Iterations: Number of epochs or updates needed to achieve convergence.
Learning Rate Dynamics: The adaptation of learning rates during training, which impacts convergence speed.
Optimization Complexity: Evaluate the type of optimizer used (e.g., Adam vs. Adafactor) and its configuration.
Why Important:
High training complexity can imply longer development times and greater hardware requirements to train a model properly.
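As an illustration of learning rate dynamics, the sketch below implements one commonly used schedule: a linear warmup followed by cosine decay to zero. The function name and parameter values are assumptions; projects may use entirely different schedules.

```python
import math

def learning_rate(step, total_steps, peak_lr, warmup_steps):
    """Linear warmup to peak_lr, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        # Ramp up linearly so early updates are small.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1 + math.cos(math.pi * min(1.0, progress)))

# Hypothetical run: 100 steps, 10 warmup steps, peak learning rate of 1.0.
for step in (0, 10, 55, 100):
    print(step, learning_rate(step, total_steps=100, peak_lr=1.0, warmup_steps=10))
```

Recording the schedule alongside the optimizer configuration supports the reproducibility of training runs.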
2.5 Fine-Tuning and Adaptation Complexity
Metrics:
Number of Parameters Adapted: How much of the model can or must be fine-tuned for specific tasks (e.g., fine-tuning full models vs. adapter layers in PEFT [Parameter-Efficient Fine-Tuning]).
Data Requirements for Fine-Tuning: The amount of task-specific data required to adapt the model.
Domain Generalization: The model’s ability to generalize across new domains without full retraining.
Why Important:
Assessing fine-tuning complexity helps determine the model’s usability for downstream applications.
2.6 Output Complexity
Metrics:
Sequence Length: The maximum number of tokens or characters the model can process or generate in a single inference step.
Coherence Score: How logically connected the outputs are over long sequences (subjective or algorithmic measures).
Temperature and Diversity: Configurations used during inference and their influence on creativity or randomness of generative outputs.
Why Important:
Output complexity impacts the quality and usability of generative and conversational results, especially for tasks requiring coherence, relevance, or creativity.
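The effect of the temperature setting can be seen in a small sketch of temperature-scaled softmax, the standard way sampling temperature is applied to a model's output scores (logits). The function name and example logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, temperature=0.5)  # more deterministic
hot = softmax_with_temperature(logits, temperature=2.0)   # more random
print(cold)
print(hot)
```

Because temperature changes how often lower-ranked outputs are sampled, the inference configuration should be recorded with any test results that depend on it.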
2.7 Interpretability Complexity
Metrics:
Explainability: How easy it is to understand the internal workings of the model (e.g., decision-making pathways or attention distributions).
Saliency Maps: Highlights in the input that influence the outputs, which are useful for interpretability tools.
Layer Contribution Analysis: Understanding which layers contribute most to model performance.
Bias and Fairness Audits: The complexity of detecting and mitigating bias in the model outputs.
Why Important:
Interpretability metrics are crucial for ethical AI deployment and trust-building in sensitive applications.
2.8 Real-World Deployment Complexity
Metrics:
Scalability: How easy it is to scale up or down the model architecture for different hardware configurations.
Latency: The time taken for the model to respond or process input in real-world usage scenarios.
API Complexity: The ease or difficulty of integrating the model into applications (e.g., REST APIs vs. custom libraries).
Security and Robustness: Complexity of ensuring the model is robust to adversarial attacks or misuse.
Why Important:
Deployment complexity plays a significant role in practical utility, customer satisfaction, and security of generative AI solutions.
2.9 Best Practices for Defining Metrics
Task-Specific Design: Tailor metrics to your specific use case, whether it's text generation, image generation, or conversational AI.
Benchmarking: Use standard benchmarks (e.g., GLUE, SuperGLUE), automatic metrics (e.g., BLEU, ROUGE), or human evaluation to assess performance alongside complexity.
Holistic View: Combine several complexity metrics for a more complete picture (architectural, computational, and deployment complexity).
Comparative Analysis: Compare your model against others (e.g., GPT, BERT, DALL-E) to contextualize complexity scores.
2.10 Tools and Frameworks for Complexity Evaluation
Example: You can use tools like these for computation-heavy components:
Weights & Biases (W&B): For tracking FLOPs, memory use, and other training metrics.
Hugging Face Benchmarking Tools: For evaluating inference performance.
Explainability Libraries: Captum, SHAP, or LIME for interpretability complexity.
Energy Usage Estimators: Like CodeCarbon, to assess power consumption.
By defining and measuring these complexity metrics, you can assess generative AI models more effectively, ensure performance optimization, and improve deployment decisions.
This checklist provides comprehensive data and evidence required to certify software for human-rated missions.
It ensures compliance with applicable safety standards, regulatory requirements (NASA NPR 7150.2D [SWEREF-083], SSP 50038 [SWEREF-014], FAA 450.141 Computing systems, NASA-STD-8739.8B [SWEREF-278]), mission-critical functionality, and stakeholder acceptance of residual risks, demonstrating that the software is safe, reliable, and mission-ready for crewed spaceflight operations.
See PAT-082 - Software Certification in Human-Rated Missions Checklist.
2. Key Software Compliance Data Needs
2.1 Summary Table of Key Software Compliance Data Needs
Requirements
Access to the software requirements or software user stories, software acceptance criteria, software use cases, software functional specifications, software behaviors, software feature descriptions, or software tickets.
Access to the System/Software requirements traceability data
Access to the hazard control software requirements traceability data
Explanation of software safety constraints and assumptions
Access to the software requirements analysis results
Design
Access to the software design
Explanation of the software fault containment and management approach
Access to the software design analysis results
Access to the Interface Control Documents (ICDs) for software interfaces
Access to the data dictionary data
Data demonstrating prerequisite checks for all hazardous commands
Explanation of how the software design and requirements meet the safety-critical requirements
Explanation of how the safety-critical code is modifiable
Demonstrate that the applicable human factors design principles are in place in the software
Development
Access to the software operational plans or user’s manual
Explanation of the Fault Tolerance approach used in the design
Access to the source code
Approval data for all AI uses in flight software before deployment.
Demonstrate that the displayed confidence levels are correct for all critical AI decisions and predictions.
Demonstrate that the AI systems used in critical space flight software are managed with transparency to maintain safety, reliability, and accountability.
Access to the software defect reports (open/closed with severity), anomaly logs, residual risk acceptance with stakeholder sign-off
Access to all tailoring/waiver/deviation and TBR/TBD/FWD closure records.
Evidence that the approved software processes were followed during all software development activities
Evidence of closure of peer review/inspection findings for requirements, architecture/design, code, and test artifact reviews
Evidence of qualification and pedigree of reused/third‑party software
Provide an understanding of the certification process for the software models & simulations used to qualify flight software.
Data showing the Worst‑Case Execution Time (WCET), timing analysis, and scheduling evidence, including partitioning and multicore interference analysis where relevant.
Verification & Validation
Access to the software test results
Access to the MC/DC coverage data for safety-critical components
Access to the static and dynamic analysis testing approach and results
Demonstrate the fault injection resilience under worst-case conditions
Demonstrate how the software handles spurious signals and invalid inputs
Provide an understanding of the certification process for the software test environments used to qualify flight software.
Provide results of cyclomatic complexity analyses
Provide a list of identified software issues, anomalies, and defects, with clear and traceable justifications for acceptance of any unresolved items.
Provide data showing the operational test results and data confirming system resource margins under nominal and worst-case scenarios, including CPU/memory/throughput margins data
Access to the IV&V reports and findings and an understanding of the IV&V scope
Hazard Analysis
Access to the software related hazard analysis reports, fault tree analysis, and FMEA addressing all software hazards and mitigation strategies.
Demonstrate that all AI systems generate warnings and recommendations for anomalies that could lead to hazards
Demonstrate operator action validation
Explanation of the software safing procedures
Configuration Management
Access to the software documentation
Access to the software change logs
Access to the software version control, baseline definitions, and audit records for all changes.
Operational Procedures
Explanation of the software control sequences
Explanation of any software manual safing data and processes
Cybersecurity
Demonstrate that the software implements encryption, authentication, and access control, and follows secure coding practices
Access to the results showing penetration testing and vulnerability analysis
2.2 Key Software Compliance Data Needs Rationale
Section: Requirements
Item
Rationale
Access to the software requirements or software user stories, software acceptance criteria, software use cases, software functional specifications, software behaviors, software feature descriptions, or software tickets
Demonstrates the software capability definition and software testing criteria. SWEHB emphasizes complete, correct, validated SRS content as the foundation for safe, verifiable software; early clarity prevents costly downstream defects and safety gaps.
Access to the System/Software requirements traceability data
Bi‑directional traceability is required to prove every design, code, and test artifact maps to a validated system need—preventing orphan functionality and ensuring hazards are controlled.
Access to the hazard control software requirements traceability data
Hazard controls must trace explicitly from hazard analysis to software requirements and back to verification evidence to avoid missing or partial mitigations.
Explanation of software safety constraints and assumptions
Documented constraints and assumptions prevent hidden coupling and incorrect operating expectations—frequent root causes in lessons learned for hazardous behavior.
Access to the software requirements analysis results
Structured requirements analysis (ambiguity checks, safety impacts, HSI considerations) provides objective evidence that requirements are testable, feasible, and complete before design.
Section: Design
Item
Rationale
Access to the software design
Architecture/design descriptions enable validation of partitioning, redundancy, timing, and safety mechanisms; undocumented design is a recurring integration‑failure source in SWEHB.
Explanation of the software fault containment and management approach
A clear FDIR strategy limits fault propagation, preserves redundancy, and supports safe recovery for safety‑critical functions.
Access to the software design analysis results
Design reviews/analyses expose interface risks, safety gaps, and performance limits early, reducing costly rework during integration/test.
Access to the Interface Control Documents (ICDs) for software interfaces
ICDs prevent interface mismatch failures—one of the most common categories of integration defects called out in SWEHB checklists.
Access to the data dictionary data
Data dictionaries provide authoritative definitions and units, preventing misinterpretation and unit conversion defects across teams/tools. (SWEHB practice captured in certification checklist.)
Data demonstrating prerequisite checks for all hazardous commands
Prerequisite checks are explicitly required for safety‑critical design to block unsafe command execution unless the system state is validated.
Explanation of how the design and requirements meet safety-critical requirements
Explicit mapping from safety‑critical requirements to design elements verifies that every control is implemented and testable—closing common gaps seen in hazard mitigations.
Explanation of how the safety-critical code is modifiable
Modular, isolated safety‑critical code reduces regression risk during updates and enables focused assurance/maintenance. (Emphasized in SWEHB certification guidance.)
Demonstrate applicable human‑factors design principles in the software
HSI evidence (clear displays, alerts, error tolerance) mitigates operator error and automation surprise—frequent contributors to unsafe states in lessons learned and SWEHB checklists.
Section: Development
Item
Rationale
Access to the software operational plans or user’s manual
Operational manuals drive correct crew/operator actions; unclear procedures have historically led to unsafe system states—documented guidance is essential.
Explanation of the Fault Tolerance approach used in the design
Development evidence must confirm the implemented redundancy, monitoring, and recovery match the approved design/FDIR strategy.
Access to the source code
Assurance/IV&V require code access to assess correctness, safety patterns, and compliance; “code is the design” for verification in SWEHB practice.
Approval data for all AI uses in flight software before deployment
The human‑rated certification checklist in SWEHB requires governance of novel tech; AI decisions must be bounded, transparent, and verified before mission use.
Demonstrate correct AI confidence levels for critical decisions/predictions
Accurate confidence cues preserve appropriate human trust; misleading confidence can induce hazardous operator actions (SWEHB human‑rated checklist).
Demonstrate AI transparency/accountability in critical flight software
Transparency and auditable behavior enable assurance teams to verify AI outputs, edge cases, and failure handling consistent with safety requirements (SWEHB PAT guidance).
Access to defect reports, anomaly logs, and residual risk acceptance
Defect trends and residual‑risk approvals provide readiness evidence and ensure risks are visible and accepted prior to flight (SWEHB PAT‑082).
Access to tailoring/waiver/deviation and TBR/TBD/FWD closures
Tailoring/waiver records show what SWEHB/NPR expectations were modified and how verification gaps were closed—preventing undocumented process escapes.
Data showing the Worst-Case Execution Time (WCET), timing analysis, and scheduling evidence
Deterministic timing and schedulability evidence prevents deadline misses and resource starvation, which are common latent hazards noted in SWEHB certification checklists.
Section: Verification & Validation
Item
Rationale
Access to the software test results
Test evidence must trace to requirements and hazards to prove correctness and safety under nominal/off‑nominal conditions (SWEHB).
Access to MC/DC coverage data for safety‑critical components
For safety‑critical logic, MC/DC helps reveal untested decision paths—improving confidence that critical branches cannot hide defects (SWEHB practice).
Static and dynamic analysis approach/results
Static/dynamic analyses expose memory, concurrency, and API defects early, complementing functional tests with deeper correctness checks (SWEHB PAT‑082).
Fault‑injection resilience under worst‑case conditions
Fault‑injection validates robustness of error handling and safe‑state transitions under realistic fault scenarios (SWEHB).
Handling spurious signals and invalid inputs
Defensive input handling and integrity checks are SWEHB safety‑critical design requirements to prevent unsafe behavior from noise or corruption.
Certification of software test environments
Accredited test environments assure fidelity so test results represent flight behavior—preventing false positives/negatives (SWEHB PAT‑082).
Cyclomatic complexity analyses results
Complexity correlates with defect likelihood and test effort; tracking supports risk‑based testing and maintainability (SWEHB checklist).
Issues/anomalies/defects list with unresolved‑item justifications
Residual issues require documented operational mitigations and rationale to ensure informed risk acceptance before flight (SWEHB).
Operational test results & resource margins (CPU/memory/throughput)
Resource margin evidence demonstrates the system sustains mission loads without entering unsafe degraded states (SWEHB).
Access to IV&V reports/findings & IV&V scope understanding
Independent assessment adds rigor and identifies latent risks; scope clarity ensures findings are dispositioned (SWEHB/SMA IV&V discussion).
Section: Hazard Analysis
Item
Rationale
Hazard analysis, FTA, FMEA
Hazard analyses formally identify software causes/contributions and define controls; they are the backbone for safety‑critical requirements and tests (SWEHB).
AI warnings & recommendations for anomalies
AI components must surface anomaly cues and recommended mitigations in time for operator/software safing to prevent hazard escalation (SWEHB PAT‑082 scope).
Operator Action Validation
Operator action validation ensures human‑software interactions are unambiguous and error‑tolerant—reducing unsafe actions under stress (SWEHB checklist).
Software safing procedures
Safing procedures must reliably place the system into known safe states on detection of hazardous conditions—captured in SWEHB safety‑critical design items.
Section: Configuration Management
Item
Rationale
Access to the software documentation
CM control of documentation prevents divergence between what is built, tested, and flown—frequent contributors to integration defects (SWEHB PAT‑082).
Access to the software change logs
Change history enables traceability and impact assessment for safety‑critical components/interfaces (SWEHB).
Version control, baseline definitions, audit records
Version/baseline and audits ensure the exact tested configuration proceeds to flight and prevent unauthorized changes (SWEHB checklist).
Section: Operational Procedures
Item
Rationale
Explanation of the software control sequences
Clear control sequences reduce operator confusion and ensure predictable behavior during time‑critical operations (SWEHB).
Explanation of any software manual safing data/processes
Manual safing provides a human‑initiated fallback when automation fails; procedures must be explicit, tested, and accessible (SWEHB).
Section: Cybersecurity
Item
Rationale
Encryption, authentication, secure coding, access control
Security controls protect command authority, data integrity, and system availability—security weaknesses can become safety hazards (SWEHB).
Penetration testing and vulnerability analysis results
Pen/vuln testing finds exploitable weaknesses before flight, supporting remediation and risk reduction (SWEHB checklist practice).
3. Example Potential Safety Case for Human-Rated Software Certification
This safety case demonstrates that the software used in this human-rated mission adheres to rigorous safety, quality, and regulatory standards. Based on the evidence provided, the software is flight-ready and capable of supporting critical mission operations while ensuring the safety of the crew and spacecraft under both nominal and adverse conditions.
1. Requirements and Traceability
Argument: The software requirements are clearly defined, traceable, and aligned with safety-critical mission needs.
Evidence:
Comprehensive Software Requirements Specification (SRS) covering high-level mission-critical systems (e.g., navigation, propulsion, anomaly detection, life support, and abort operations).
Verified safety requirements (fault tolerance, redundancy, and safe initialization/termination).
Acceptable quality of detailed low-level safety-critical requirements, including specifics like algorithm designs and timing constraints.
A completed and validated Requirements Traceability Matrix (RTM) showing bi-directional traceability from requirements through design, code, and test results.
Reviewed system-level safety analyses to document "Must Work Function" (MWF) and "Must Not Work Function" (MNWF) requirements, prerequisite checks for hazardous commands, and mitigation strategies.
2. Software Design and Architecture
Argument: The software architecture is resilient, modular, and designed for fault tolerance and safety-critical operations.
Evidence:
Architecture documentation detailing modular fault isolation, redundancy, and resiliency mechanisms.
Block diagrams illustrating fault containment, fail-safe control paths, and separation of critical functions.
Documentation and analysis of safety-critical subsystems (e.g., propulsion, crew displays, navigation) with clearly defined responsibilities.
Verified Interface Control Documents (ICDs), ensuring compatibility between internal software, hardware systems, and external interactions.
Safety validation evidence for safeguards like fault containment, error detection, operator validation, integrity checks, and anomaly recovery processes.
Independent redundant system designs ensuring physical and logical separation to mitigate single points of failure.
Validation of fault-tolerant mechanisms, including cosmic radiation protection in CPU designs.
3. Hazard Analysis and Safety Evidence
Argument: All hazards associated with software functionality are identified, analyzed, and mitigated to acceptable levels of risk.
Evidence:
A complete Hazard Analysis Report (HAR) identifying software-driven hazards and the mitigation strategies in place.
Fault Tree Analysis (FTA) and Failure Mode and Effects Analysis (FMEA), or a completed System Theoretic Process Analysis (STPA), showing robust fault prevention and recovery mechanisms.
Time-to-effect (TTE) analyses ensuring hazardous conditions can be addressed by safing systems within operational thresholds.
Residual risk documentation showing resolution or acceptance of remaining risks by stakeholders.
4. Verification and Validation (V&V) Evidence
Argument: Rigorous testing, validation, and coverage analyses demonstrate software compliance with safety-critical requirements.
Evidence:
Unit testing, system integration testing, end-to-end validation, and operational flight simulations confirming that expected functional performance aligns with safety goals.
Validation of reused components (COTS, GOTS, OSS, MOTS) to ensure compatibility and reliable integration into human-rated environments.
Coverage analysis demonstrating:
100% Statement Coverage.
100% Decision Coverage.
100% Modified Condition/Decision Coverage (MC/DC) for safety-critical components.
Static analysis reports showing compliance with coding standards and identification/remediation of software defects.
Fault injection testing results validating responses to corrupted data, anomalies during power disruptions, and memory errors.
Worst-case response timing analysis confirming safing systems meet TTE requirements under degraded conditions.
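To illustrate what the MC/DC evidence listed above has to show, the sketch below searches a decision's truth table for "independence pairs": two test vectors that differ in exactly one condition and produce different outcomes, demonstrating that the condition independently affects the decision (unique-cause MC/DC). The decision `safe_to_fire` and its conditions are hypothetical, and the function is an illustration, not a replacement for a qualified coverage tool.

```python
from itertools import product

def mcdc_independence_pairs(decision, num_conditions):
    """For each condition, list vector pairs that differ only in that condition
    and flip the decision outcome."""
    pairs = {i: [] for i in range(num_conditions)}
    for vector in product([False, True], repeat=num_conditions):
        for i in range(num_conditions):
            if vector[i]:
                continue  # visit each pair once, starting from the False side
            flipped = vector[:i] + (True,) + vector[i + 1:]
            if decision(*vector) != decision(*flipped):
                pairs[i].append((vector, flipped))
    return pairs

# Hypothetical safety-critical decision with three conditions.
def safe_to_fire(armed, pressure_ok, abort_requested):
    return armed and pressure_ok and not abort_requested

pairs = mcdc_independence_pairs(safe_to_fire, 3)
for index, found in pairs.items():
    print(index, found)
```

A test suite achieves MC/DC for this decision if, for every condition, it contains at least one of the listed pairs; a condition with no pairs cannot affect the outcome and may indicate redundant logic.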
5. Configuration Management and Change Tracking
Argument: Configuration management processes ensure version control and traceability for all software changes.
Evidence:
Documentation showing version-controlled baselines for flight-ready software and data loads, including configuration hashes and release notes.
Audit records verifying modifications, regression testing, impact analyses, and stakeholder approvals.
6. Cybersecurity and Security Validation
Argument: The software architecture incorporates robust cybersecurity measures to mitigate threats in operational environments.
Evidence:
Penetration testing results validating resilience against cyberattacks and unauthorized system access during pre-launch and flight.
Vulnerability analysis reports confirming detection, resolution, and closure of security-related risks.
7. Defect Management and Residual Risks
Argument: All software defects have been resolved or mitigated to acceptable levels of residual risk.
Evidence:
Defect reports showing all open and closed defects categorized by severity and justifications for acceptance of residual risks.
Logs documenting defect resolutions and testing data validating the outcomes of mitigation measures.
Residual risk acceptance documentation signed off by stakeholders, with sufficient evidence showing safe system behavior despite unresolved minor risks.
8. Resource Utilization and Performance Metrics
Argument: The software demonstrates sufficient resource margins and acceptable performance under normal and worst-case conditions.