

8.25 - Artificial Intelligence And Software Assurance

1.0 Introduction

As NASA organizations increasingly embed Artificial Intelligence (AI) in software systems and decision workflows, the need for robust Software Assurance (SA) grows. AI’s probabilistic behavior, dependence on data, and susceptibility to drift and supply‑chain changes introduce assurance challenges that differ from conventional software. This Topic defines SA expectations across the AI system lifecycle, expanding prior guidance on data quality, continuous testing, documentation, security, bias, and V&V with first‑class evaluation, uncertainty management, safety engineering, architectural resilience, human oversight, and continuous change management.

Applicability. Until Agency policy evolves, projects should limit AI use to non‑safety‑critical applications unless a documented AI safety case and risk controls are approved by the appropriate authority.

2.0 Scope

This Topic applies to machine learning (ML), foundation‑model applications (including LLMs), fine‑tuned or open‑weight models, retrieval‑augmented systems, compound AI systems, and agentic systems that plan and take actions using tools. It covers development, integration, deployment, operation, monitoring, and retirement.

3.0 Key Definitions

  • AI system: Software that uses data‑driven models to produce predictions, classifications, generations, recommendations, or actions.
  • Foundation model: Large pre‑trained multi‑task model consumed via API or fine‑tuning.
  • Compound AI system: Architecture orchestrating multiple AI and non‑AI components (e.g., retrieval, guardrails, models, tools).
  • Agentic system: AI that plans, uses tools, and takes actions autonomously within defined authority.
  • Data drift / Concept drift: Changes in input distributions or input–output relationships that degrade performance.
  • Uncertainty calibration: Correspondence between expressed confidence and probability of correctness.

4.0 Policy Statement

Projects that incorporate AI should implement Software Assurance practices that:

  1. treat evaluation as a first‑class engineering activity;
  2. ensure traceability across all AI artifacts;
  3. design human oversight appropriate to risk;
  4. address security across the full AI threat surface;
  5. manage uncertainty and communicate it to users;
  6. architect for modularity and resilience;
  7. apply safety engineering tailored to AI; and
  8. plan for continuous change in models, data, and dependencies.

5.0 Roles and Teaming

Projects should form cross‑functional teams that include domain experts, SA/IV&V, AI evaluation specialists, data engineers, prompt/retrieval engineers, platform/MLOps, and system architects; and provide AI literacy to all stakeholders who interact with AI outputs.

6.0 Assurance Requirements and Activities

6.1 Evaluation as a First‑Class Activity

Rationale. Evaluation determines whether the AI system works as intended before and after deployment. It must detect drift, safety violations, invalid outputs, and behavior changes.

Software Assurance Expectations

  • Pre‑deployment evaluation plans should define success criteria, curated benchmarks, probabilistic acceptance thresholds, and human‑judgment workflows for ambiguous tasks (evaluation‑driven development).
  • Production evaluation should implement lightweight monitoring, failure‑mode detection, alerting, and regression tracking distinct from lab tests.
  • Evaluation datasets, metrics, and results should be version‑controlled and traceable (see §6.8).
  • Link to Handbook testing practices (SWE‑066, SWE‑068, SWE‑193) for planning, execution, and acceptance criteria.
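
The sketch below (Python, illustrative only) shows one way to implement a probabilistic acceptance threshold for a pre‑deployment or production evaluation gate. The Wilson lower bound, the pass/fail scoring, and the threshold values are assumptions for illustration, not prescribed criteria.

    # Sketch: an evaluation gate with a probabilistic acceptance threshold.
    # Assumes the project's benchmark harness produces per-case pass/fail
    # scores (1 = pass, 0 = fail); thresholds are illustrative only.
    import math

    def pass_rate_lower_bound(passes: int, total: int, z: float = 1.96) -> float:
        """Wilson lower bound on the true pass rate, so a small sample cannot
        clear the gate on luck alone."""
        if total == 0:
            return 0.0
        p = passes / total
        denom = 1 + z**2 / total
        center = p + z**2 / (2 * total)
        margin = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
        return (center - margin) / denom

    def evaluation_gate(results: list[int], required_rate: float = 0.93) -> bool:
        """Promote only if the lower confidence bound on the pass rate meets
        the documented acceptance threshold (cf. SWE-068 acceptance criteria)."""
        return pass_rate_lower_bound(sum(results), len(results)) >= required_rate

    # Example: 480 of 500 regression cases passed.
    print(evaluation_gate([1] * 480 + [0] * 20))   # True for a 0.93 threshold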

6.2 Data Lifecycle Quality and Drift Management

Rationale. AI behavior is shaped by training data, fine‑tuning sets, retrieval sources, prompts/in‑context examples, guardrail data, and evaluation benchmarks—not just training datasets.

Software Assurance Expectations

  • Data provenance (source, licensing, representativeness) should be documented for every dataset and retrieval corpus; configuration‑managed per SWE‑070.
  • Implement drift detection (data and concept drift) with defined thresholds and retraining/refresh criteria.
  • Synthetic data use should include validation for relevance and bias; limitations must be documented.
  • Extend existing data‑quality checks (accuracy, completeness, timeliness, relevance, bias) to retrieval pipelines and prompt exemplars.
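
As an illustration of the drift‑detection expectation above, the following sketch applies a per‑feature two‑sample Kolmogorov-Smirnov test against a reference window. The feature names, window contents, and p‑value threshold are assumptions a project would replace with its documented drift criteria.

    # Sketch: per-feature data-drift check against a reference window.
    import numpy as np
    from scipy.stats import ks_2samp

    def drift_report(reference: dict[str, np.ndarray],
                     production: dict[str, np.ndarray],
                     p_threshold: float = 0.01) -> dict[str, bool]:
        """Two-sample Kolmogorov-Smirnov test per numeric feature; True means
        the production distribution differs enough to trigger review or
        retraining per the project's documented thresholds."""
        report = {}
        for name, ref_values in reference.items():
            stat, p_value = ks_2samp(ref_values, production[name])
            report[name] = p_value < p_threshold
        return report

    # Example with a deliberately shifted feature (illustration only).
    rng = np.random.default_rng(0)
    ref = {"sensor_temp": rng.normal(20.0, 2.0, 5000)}
    prod = {"sensor_temp": rng.normal(23.0, 2.0, 5000)}   # drifted mean
    print(drift_report(ref, prod))                        # {'sensor_temp': True}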

6.3 Security: Threats, Supply Chain, and Red Teaming

Rationale. AI expands the attack surface: model poisoning, adversarial inputs, compromised model weights/APIs, prompt injection, excessive agency, and embedding/vector weaknesses.

Software Assurance Expectations

  • AI supply chain assurance: verify provenance of models/components, cryptographically sign artifacts, audit dependencies; define incident response for compromised components.
  • Assess OWASP GenAI risks (prompt injection, system prompt leakage, excessive agency) and implement guardrails and input/output filtering.
  • Conduct AI red‑teaming integrated with threat modeling; track findings to closure under SWE‑156 (security risks).
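
A minimal sketch of input/output guardrail hooks for prompt‑injection screening, system‑prompt leakage detection, and an excessive‑agency tool allowlist. The patterns and tool names are illustrative assumptions; pattern matching alone is not sufficient and would be layered with model‑based filters and the red‑team findings described above.

    # Sketch: minimal guardrail hooks for a GenAI pipeline (illustrative only).
    import re

    INJECTION_PATTERNS = [
        r"ignore (all|any) (previous|prior) instructions",
        r"reveal (the )?system prompt",
        r"disable (the )?safety",
    ]
    ALLOWED_TOOLS = {"search_docs", "summarize"}   # excessive-agency control

    def screen_user_input(text: str) -> bool:
        """Return False if the input matches a known injection pattern."""
        lowered = text.lower()
        return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)

    def screen_tool_call(tool_name: str) -> bool:
        """Block tool invocations outside the approved allowlist."""
        return tool_name in ALLOWED_TOOLS

    def screen_model_output(text: str, system_prompt: str) -> bool:
        """Flag outputs that leak the system prompt verbatim."""
        return system_prompt not in text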

6.4 Safety Engineering

Rationale. AI can cause harm via confidently wrong outputs, unpredictable behavior on novel inputs, or autonomous actions. Safety engineering complements security.

Software Assurance Expectations

  • Produce an AI Safety Case documenting hazards (e.g., hallucinations, unsafe tool use, drift failures), mitigations, operating conditions, and residual risk; review at KDPs.
  • Implement runtime safety monitoring (e.g., flagged content rates, refusal/abstention metrics, confidence distributions) and circuit breakers for violations.
  • Align testing with hazard scenarios under SWE‑066/SWE‑068/SWE‑193.
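
The following sketch illustrates a runtime circuit breaker driven by a flagged‑content rate over a sliding window. The window size, trip threshold, and the metric itself are assumptions a project would derive from the hazards documented in its AI Safety Case.

    # Sketch: a sliding-window circuit breaker on a runtime safety metric.
    from collections import deque

    class SafetyCircuitBreaker:
        def __init__(self, window: int = 200, max_flag_rate: float = 0.02):
            self.events = deque(maxlen=window)   # 1 = flagged output, 0 = clean
            self.max_flag_rate = max_flag_rate
            self.open = False                    # open = AI output suppressed

        def record(self, flagged: bool) -> None:
            self.events.append(1 if flagged else 0)
            if len(self.events) == self.events.maxlen:
                rate = sum(self.events) / len(self.events)
                if rate > self.max_flag_rate:
                    self.open = True             # trip: route to human/fallback

        def allow_ai_output(self) -> bool:
            return not self.open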

6.5 Model Uncertainty and Reliability

Rationale. Generative models produce fluent outputs that may be incorrect; uncertainty must be quantified, detected, and communicated.

Software Assurance Expectations

  • Implement confidence/uncertainty indicators, consistency checks (e.g., multiple generations), retrieval grounding, and abstention rules for low confidence.
  • Define escalation paths: return top‑N candidates with confidence, route to human review, or decline to answer below thresholds.
  • Verify uncertainty handling in requirements V&V (SWE‑066/SWE‑068/SWE‑193) acknowledging probabilistic acceptance criteria.
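
A minimal sketch of a self‑consistency check with abstention: sample several generations and answer only when agreement clears a threshold. The generate callable is a hypothetical stand‑in for the project's model interface, and the sample count and agreement threshold are illustrative.

    # Sketch: self-consistency with abstention for low-agreement cases.
    from collections import Counter
    from typing import Callable, Optional

    def answer_with_abstention(prompt: str,
                               generate: Callable[[str], str],
                               samples: int = 5,
                               min_agreement: float = 0.6) -> Optional[str]:
        """Sample several generations; return the majority answer only if its
        share of the samples clears the agreement threshold, else abstain so
        the caller can escalate to human review or return top-N + confidence."""
        outputs = [generate(prompt).strip() for _ in range(samples)]
        answer, count = Counter(outputs).most_common(1)[0]
        if count / samples >= min_agreement:
            return answer
        return None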

6.6 Human‑AI Interaction and Oversight

Rationale. Effective oversight depends on risk level, tempo, autonomy, and mitigation of automation bias/cognitive surrender.

Software Assurance Expectations

  • Select oversight mode (in‑loop, on‑loop, out‑of‑loop) with documented authority boundaries and escalation for out‑of‑scope situations; required approval gates for agentic actions.
  • Measure oversight effectiveness (override rate, time‑on‑task) and adjust interaction design to support critical evaluation (context, uncertainty cues).
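
A small sketch of oversight‑effectiveness metrics computed from review logs; the record fields (ai_recommendation, human_decision, review_seconds) are assumed names, not a defined schema.

    # Sketch: override rate and mean review time from oversight logs.
    def oversight_metrics(records: list[dict]) -> dict:
        overrides = sum(1 for r in records
                        if r["human_decision"] != r["ai_recommendation"])
        mean_time = sum(r["review_seconds"] for r in records) / len(records)
        return {"override_rate": overrides / len(records),
                "mean_review_seconds": mean_time}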

6.7 Architecture: Modularity, Substitutability, and Operational Resilience

Rationale. Compound systems accumulate hidden dependencies; design must isolate components and make model/provider substitution routine.

Software Assurance Expectations

  • Decompose systems into loosely coupled modules (retrieval, inference, guardrails, tools, orchestration) with explicit interface contracts and observable behaviors.
  • Maintain model routing and version isolation to swap models/APIs without systemic regressions; define rollback procedures.
  • Monitor performance time‑series, drift indicators, and safety metrics; promote/rollback based on evaluation signals (§6.1).
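
The sketch below illustrates model routing with version isolation and one‑step rollback, so that promotion and rollback are configuration changes rather than code changes. The model identifiers and registry structure are illustrative assumptions.

    # Sketch: a model router with pinned versions and rollback.
    from dataclasses import dataclass, field

    @dataclass
    class ModelRouter:
        registry: dict[str, str]                     # task -> active model version
        history: dict[str, list[str]] = field(default_factory=dict)

        def promote(self, task: str, model_version: str) -> None:
            """Record the previous version so rollback is a one-step operation."""
            self.history.setdefault(task, []).append(self.registry.get(task, ""))
            self.registry[task] = model_version

        def rollback(self, task: str) -> str:
            previous = self.history[task].pop()
            self.registry[task] = previous
            return previous

    router = ModelRouter(registry={"summarization": "model-a-v1"})
    router.promote("summarization", "model-b-v2")    # evaluation signals good
    router.rollback("summarization")                 # regression detected: revert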

6.8 Documentation and Traceability

Rationale. Debugging and accountability require traceability across models, data, prompts, retrieval configs, guardrails, tool definitions, evaluation assets, and third‑party dependencies.

Software Assurance Expectations

  • Version and configuration‑manage all AI artifacts; maintain experiment tracking across datasets, prompts, hyperparameters, and results.
  • Ensure reproducibility for model recreation per SWE‑070; record rationale for configuration choices.
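
A minimal sketch of recording a reproducible fingerprint for an AI configuration (dataset hash, prompt template, model version, parameters) to support traceability and model recreation per SWE‑070. The field names are assumptions; projects would normally rely on their experiment‑tracking tooling rather than hand‑rolled records.

    # Sketch: a reproducible configuration fingerprint for traceability.
    import datetime
    import hashlib
    import json

    def config_fingerprint(dataset_hash: str, prompt_template: str,
                           model_version: str, params: dict) -> dict:
        payload = {
            "dataset_hash": dataset_hash,
            "prompt_template": prompt_template,
            "model_version": model_version,
            "params": params,
        }
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        return {"fingerprint": digest,
                "recorded_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
                **payload}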

6.9 Continuous Change and Sustainment

Rationale. AI systems exist in continuous change: data shifts, model updates, API/pricing changes, infrastructure scaling, and team upskilling.

Software Assurance Expectations

  • Budget and plan for monitoring, retraining/fine‑tuning, retrieval refresh, evaluation updates, provider/API migrations, and FinOps for AI (cost observability and optimization).
  • Maintain deprecation/retirement plans for models and dependencies; define migration triggers and contingencies.
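
A small sketch of per‑request cost observability for a hosted model, one element of FinOps for AI. The price table, model name, and token counts are illustrative assumptions only.

    # Sketch: per-request cost tracking for a hosted model (assumed prices).
    PRICE_PER_1K_TOKENS = {"model-a": {"input": 0.0005, "output": 0.0015}}  # USD

    def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
        p = PRICE_PER_1K_TOKENS[model]
        return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

    print(request_cost("model-a", input_tokens=1200, output_tokens=400))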

6.10 Core Data Quality Principles (Expanded and Integrated)

Rationale.
Data quality is one of the most critical determinants of AI system performance and reliability. Because AI models infer behavior from data, even minor inaccuracies or biases can propagate into significant operational errors. High‑quality data is essential across the full lifecycle—training, testing, fine‑tuning, retrieval pipelines, prompt exemplars, guardrails, and evaluation datasets.

6.10.1 Core Quality Dimensions

SA should ensure that all data used across the AI lifecycle adheres to the following dimensions:

  • Accuracy – Data must correctly represent ground truth; even small inaccuracies can materially distort predictions.
  • Completeness – Missing or partial data must be detected, addressed, or mitigated.
  • Consistency – Data must be coherent across sources, formats, and time.
  • Timeliness – Data must reflect the current operational environment and be refreshed as conditions evolve.
  • Relevance – Data must be appropriate for the target task and domain.
  • Representativeness & Fairness – Data must reflect deployment contexts and be assessed for hidden biases to prevent discriminatory outcomes.

6.10.2 Expanded SA Responsibilities for Data Quality

a. Provenance and Licensing

SA should ensure documentation of:

  • data origin
  • ownership and licensing
  • usage rights and restrictions
  • representativeness relative to deployment environments

All datasets and retrieval indexes must be configuration‑controlled per related SWE requirements.

b. Data Security and Configuration Control

SA should ensure appropriate security controls, access restrictions, and configuration management for all data used in training, testing, or inference. Retrieval pipelines must be monitored for ingestion of untrusted or unsafe content.

c. Data Drift and Concept Drift Management

Projects should implement mechanisms to detect:

  • data drift (input distribution changes)
  • concept drift (changes in input‑to‑output relationships)

Thresholds, alerting processes, and retraining or refresh triggers must be documented.

d. Retrieval and Context Data Quality

For retrieval‑augmented systems, SA should evaluate:

  • retrieval ranking quality
  • index update frequency
  • context noise levels
  • prompt exemplar accuracy, representativeness, and relevance
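
To make retrieval ranking quality measurable, the sketch below computes recall@k and mean reciprocal rank against a project‑built relevance judgment set; the document IDs and judgment structure are assumptions.

    # Sketch: retrieval ranking quality metrics for a RAG pipeline.
    def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
        hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
        return hits / max(len(relevant), 1)

    def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
        """runs: list of (ranked retrieved IDs, relevant ID set) per query."""
        total = 0.0
        for retrieved, relevant in runs:
            rank = next((i + 1 for i, d in enumerate(retrieved) if d in relevant), None)
            total += 1.0 / rank if rank else 0.0
        return total / len(runs)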

e. Synthetic Data Validation

If used, synthetic data must be validated for:

  • representativeness
  • bias or leakage
  • compatibility with real‑world conditions
  • potential for harmful model overfitting or artifact introduction

f. Evaluation Data Quality

Evaluation datasets must follow the same quality principles as training data and must remain relevant as environments evolve.

6.10.3 Key Considerations for Software Assurance of AI Models

SA should:

  • evaluate and approve all data preparation and validation processes
  • ensure data is representative of expected operational distributions
  • identify and mitigate data and model bias sources
  • analyze edge‑case scenarios to detect data gaps
  • ensure all data pipelines feeding AI systems are monitored, secured, documented, and under configuration control

6.10.4 Integration with Other SA Functions

This section directly supports and should be applied in conjunction with:

  • §6.2 Data Lifecycle and Drift Management
  • §6.1 Evaluation (evaluation data quality)
  • §6.5 Uncertainty Management
  • §6.7 Architecture Resilience (retrieval dependencies)
  • SWE‑070, SWE‑156, and related SWE requirements

7.0 Integration with Existing SWEHB Requirements

Topic 8.25 links to and extends the following Handbook entries:

  • SWE‑033 (Acquisition vs. Development Assessment), SWE‑034 (Acceptance Criteria), SWE‑066 (Perform Testing), SWE‑068 (Evaluate Test Results), SWE‑070 (Models, Simulations, Tools), SWE‑086 (Continuous Risk Management), SWE‑146 (Auto‑generated Source Code), SWE‑151 (Cost Estimate Conditions), SWE‑156 (Evaluate Systems for Security Risks), SWE‑193 (Acceptance Testing for Affected System and Software Behavior), SWE‑205 (Determination of Safety‑Critical Software), SWE‑211 (Test Levels of Non‑Custom Developed Software).

8.0 Documentation Artifacts (Minimum Set)

Projects should produce and maintain:

  1. AI Suitability Assessment (includes NIST AI RMF “Map” considerations and error‑tolerance/interpretability needs).
  2. Evaluation Plan and Benchmarks (lab and production).
  3. Data Provenance & Quality Reports (training/fine‑tuning/retrieval/prompt exemplars/synthetic data).
  4. AI Safety Case (hazards, mitigations, operating envelope).
  5. Security Threat Model & Red‑Team Findings (including GenAI risks).
  6. Architecture Contracts & Substitution Plan (model/provider swap readiness).

What Software Managers Need to Know: 

  • AI/ML = statistical, data‑driven systems, not explicitly programmed code. Behavior depends on training data and modeling assumptions.
  • Use AI/ML only when rules cannot be coded deterministically or the problem is inherently statistical. AI/ML is not needed when explicit logic suffices.
  • Data quality drives correctness. AI/ML output is only as good as the training/testing data. Poor, incomplete, or biased data leads to erroneous results.
  • Explainability is required. Models and data must be transparent, traceable, and testable for V&V and safety investigations.
  • Security risks expand. Data files, off‑the‑shelf AI libraries, and model inputs introduce new attack vectors. Treat AI/ML data as safety‑critical when used in critical systems.
  • AI/ML still must satisfy all NPR 7150.2 SWEs. Requirements, architecture, design, test, V&V, cybersecurity, and configuration management all apply.
  • Data management becomes part of the software product. Must archive, version, validate, and document all training/testing datasets, model weights, and assumptions. (SWE‑042, SWE‑206, SWE‑196).
  • Testing is harder. Traditional code coverage does not apply; projects need to show that each neural‑network component affects outputs (SWE‑219); see the coverage sketch after this list. Validation must use data beyond the training and test sets (SWE‑055).
  • Model‑of‑model risk. Using simulation‑generated data compounds approximation error; must ensure simulation assumptions aren’t lost.
  • Operational confidence remains limited. No provable means yet exists to certify AI/ML for safety‑critical roles; must maintain human oversight and independent verification.
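
As referenced in the testing bullet above, the following sketch computes neuron coverage for a small feed‑forward network, one published way to show that test data exercises each neural‑network component. The toy network, activation threshold, and coverage definition are illustrative assumptions, not a certified method for meeting SWE‑219.

    # Sketch: neuron-coverage measurement for a toy feed-forward network.
    import numpy as np

    def relu(x: np.ndarray) -> np.ndarray:
        return np.maximum(x, 0.0)

    def neuron_coverage(weights, biases, test_inputs, threshold: float = 0.0) -> float:
        """Fraction of neurons activated above `threshold` by at least one
        test input; low coverage flags parts of the network the tests never
        exercise."""
        covered = [np.zeros(b.shape, dtype=bool) for b in biases]
        for x in test_inputs:
            h = x
            for layer, (W, b) in enumerate(zip(weights, biases)):
                h = relu(h @ W + b)
                covered[layer] |= (h > threshold)
        return float(sum(c.sum() for c in covered) / sum(c.size for c in covered))

    # Toy two-layer network and random test set, for illustration only.
    rng = np.random.default_rng(1)
    W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
    W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)
    tests = rng.normal(size=(100, 4))
    print(neuron_coverage([W1, W2], [b1, b2], tests))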

Software Management Expectations & Software Assurance Focus Areas:

  • Define and verify quality criteria. AI requires explicit measures for performance, risk, robustness, and compliance. SA validates these criteria and ensures improvements over time.
  • Balance quality with schedule and cost. SA helps identify the most critical vulnerabilities and risks when trade‑offs are necessary.
  • Use AI/ML tools to improve SA. Automated static analysis and other tools can detect defects, security issues, and coverage gaps more efficiently. SA must validate tool usage and results.
  • Support verification & validation of probabilistic systems. AI is not fully deterministic; SA must help define V&V criteria appropriate for probabilistic, learning systems.
  • Ensure privacy, security, and ethical compliance. Protection of training data, configuration control, and ethical considerations (fairness, transparency, risk) are now integral responsibilities for management.

AI may be used in safety‑critical code, but the project needs to develop and vet the plan and approach for development, training, and testing. The project needs to be able to explain why AI is being used in a safety‑critical application and whether the AI will generate actions with or without independent checks.

Appendix A — Assurance Checklist 

  • Suitability: Is AI the right approach for the problem and context? (Document rationale.)
  • Data: Are provenance, licensing, representativeness documented for training/retrieval/prompt exemplars? Are drift monitors deployed?
  • Evaluation: Are lab and production evaluations defined with probabilistic criteria and human‑judgment workflows?
  • Security: Have GenAI risks and supply‑chain threats been assessed and red‑teamed?
  • Safety: Is there a safety case and runtime safeguards/circuit breakers?
  • Uncertainty: Are confidence/abstention mechanisms implemented and communicated to users?
  • Oversight: Are autonomy boundaries, approval gates, and escalation paths defined and measured?
  • Architecture: Are components modular with substitution/rollback plans?
  • Traceability: Are models, prompts, data, guardrails, tools, and evaluations versioned and reproducible (see SWE‑070)?
  • Sustainment: Are budgets/plans in place for monitoring, retraining, provider/API changes, and retirement?

Additional Guidance

Links to Additional Guidance materials for this subject have been compiled in the Relevant Links table. See the Additional Guidance in the Resources tab.

2. Generative AI Metrics

Defining metrics for the complexity of generative AI models requires tailoring your evaluation to the unique characteristics of these systems—such as their architecture, size, computational requirements, capabilities, and usability. Unlike traditional software complexity metrics (e.g., cyclomatic complexity), generative AI complexity is often evaluated through model-specific engineering, mathematical, and operational characteristics.

Projects should tailor metrics to architecture, computational footprint, operational context, and safety requirements. 

Architectural & Computational

  • Layers/parameters, attention heads, embedding dimensions; FLOPs, memory, latency, and power consumption.

Representational & Output

  • Expressive power, output entropy/diversity; sequence length; long‑range coherence scores; temperature and sampling diversity.

Training & Adaptation

  • Dataset size/quality, convergence iterations, optimizer configuration; parameters adapted (PEFT vs. full fine‑tuning); data requirements for domain generalization.

Interpretability & Safety

  • Explainability indicators (saliency/attention analysis), bias/fairness audits, safety‑violation rates, abstention/refusal rates.

Deployment & Operations

  • Scalability, end‑to‑end latency, API integration complexity, robustness to adversarial inputs, drift indicators, and operational cost (inference/training).

Below is a comprehensive guide to key metrics that you can use:

2.1 Architectural Complexity

Metrics:

  • Number of Layers: The depth or number of layers in the neural network architecture (e.g., 12 layers for GPT-2 small, 96 layers for GPT-3).
  • Parameters: Total number of trainable parameters (e.g., billions for most large language models).
  • Attention Heads: In transformer models, attention heads drive complexity. Evaluate the number of heads per layer and their interactions.
  • Non-Linearity: Measure the types and number of activation functions (e.g., ReLU, GELU).

Why Important:

These metrics indicate the model's capacity to learn complex patterns. Larger and deeper architectures typically have higher expressivity but come with increased computational cost.

2.2 Computational Complexity

Metrics:

  • Floating-Point Operations (FLOPs): The number of computations required to perform training and inference.
  • Memory Requirements: GPU or RAM usage during training and inference—especially significant for deployment on constrained systems.
  • Inference Time: Latency in generating outputs. Models with faster inference are generally less computationally demanding and more efficient to operate.
  • Power Consumption: Energy required for training and inference, relevant for sustainable AI practices.

Why Important:

These metrics determine the model's scalability and operational costs for deployment and training. For example, models with high FLOPs and memory requirements are often harder to scale.
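
The following sketch (PyTorch, illustrative only) measures two of the computational metrics above, parameter count and mean inference latency, on an assumed toy model; real projects would instrument their actual model and hardware.

    # Sketch: measuring parameter count and inference latency for a toy model.
    import time
    import torch
    import torch.nn as nn

    # Toy MLP standing in for the model under evaluation.
    model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
    model.eval()

    n_params = sum(p.numel() for p in model.parameters())

    x = torch.randn(1, 768)
    with torch.no_grad():
        model(x)                                   # warm-up run
        start = time.perf_counter()
        for _ in range(100):
            model(x)
        latency_ms = (time.perf_counter() - start) / 100 * 1000

    print(f"parameters: {n_params:,}   mean inference latency: {latency_ms:.2f} ms")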

2.3 Model Representational Complexity

Metrics:

  • Expressive Power: The ability of the model to learn and represent complex functions or dynamics.
  • Entropy of Outputs: Capturing the diversity and unpredictability of model outputs during inference.
  • Embedding Space Size: The dimensionality of the embeddings used internally (e.g., 768 for GPT-2 small, 12,288 for GPT-3).

Why Important:

These metrics highlight how effectively the model can generalize across diverse tasks and inputs while maintaining rich representations.
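
One way to quantify the entropy of outputs mentioned above is the mean per‑token Shannon entropy of the model's probability distributions, as in the sketch below; the probability matrix here is synthetic for illustration.

    # Sketch: mean per-token entropy from per-token probability distributions.
    import numpy as np

    def mean_token_entropy(probs: np.ndarray) -> float:
        """probs: (tokens, vocab) with rows summing to 1. Higher mean entropy
        indicates more diverse, less predictable generations."""
        eps = 1e-12
        return float((-probs * np.log2(probs + eps)).sum(axis=1).mean())

    # Synthetic distributions for a 16-token generation over a 100-token
    # vocabulary (illustration only).
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(16, 100))
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    print(mean_token_entropy(probs))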

2.4 Training Complexity

Metrics:

  • Dataset Size: The volume of training data required (e.g., tokens or examples in billions for large models).
  • Training Iterations: Number of epochs or updates needed to achieve convergence.
  • Learning Rate Dynamics: The adaptation of learning rates during training, which impacts convergence speed.
  • Optimization Complexity: Evaluate the type of optimizer used (e.g., Adam vs. AdaFactor) and its configuration.

Why Important:

High training complexity can imply longer development times and greater hardware requirements to train a model properly.

2.5 Fine-Tuning and Adaptation Complexity

Metrics:

  • Number of Parameters Adapted: How much of the model can or must be fine-tuned for specific tasks (e.g., fine-tuning full models vs. adapter layers in PEFT [Parameter-Efficient Fine-Tuning]).
  • Data Requirements for Fine-Tuning: The amount of task-specific data required to adapt the model.
  • Domain Generalization: The model’s ability to generalize across new domains without full retraining.

Why Important:

Assessing fine-tuning complexity helps determine the model’s usability for downstream applications.

2.6 Output Complexity

Metrics:

  • Sequence Length: The maximum number of tokens or characters the model can process or generate in a single inference step.
  • Coherence Score: How logically connected the outputs are over long sequences (subjective or algorithmic measures).
  • Temperature and Diversity: Configurations used during inference and their influence on creativity or randomness of generative outputs.

Why Important:

Output complexity impacts the quality and usability of generative and conversational results, especially for tasks requiring coherence, relevance, or creativity.
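
The sketch below computes distinct‑n, a simple lexical diversity measure that complements the temperature and diversity settings above; the sample generations are placeholders.

    # Sketch: distinct-n diversity across a set of generations.
    def distinct_n(generations: list[str], n: int = 2) -> float:
        """Ratio of unique n-grams to total n-grams across outputs; values
        near 1.0 indicate high lexical diversity."""
        ngrams, total = set(), 0
        for text in generations:
            tokens = text.split()
            grams = list(zip(*(tokens[i:] for i in range(n))))
            ngrams.update(grams)
            total += len(grams)
        return len(ngrams) / total if total else 0.0

    # Placeholder generations, for illustration only.
    print(distinct_n(["the probe scanned the surface",
                      "the probe scanned the crater"]))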

2.7 Interpretability Complexity

Metrics:

  • Explainability: How easy it is to understand the internal workings of the model (e.g., decision-making pathways or attention distributions).
  • Saliency Maps: Highlights in the input that influence the outputs, which are useful for interpretability tools.
  • Layer Contribution Analysis: Understanding which layers contribute most to model performance.
  • Bias and Fairness Audits: The complexity of detecting and mitigating bias in the model outputs.

Why Important:

Interpretability metrics are crucial for ethical AI deployment and trust-building in sensitive applications.

2.8 Real-World Deployment Complexity

Metrics:

  • Scalability: How easy it is to scale up or down the model architecture for different hardware configurations.
  • Latency: The time taken for the model to respond or process input in real-world usage scenarios.
  • API Complexity: The ease or difficulty of integrating the model into applications (e.g., REST APIs vs. custom libraries).
  • Security and Robustness: Complexity of ensuring the model is robust to adversarial attacks or misuse.

Why Important:

Deployment complexity plays a significant role in practical utility, customer satisfaction, and security of generative AI solutions.

2.9 Best Practices for Defining Metrics

  1. Task-Specific Design: Tailor metrics to your specific use case, whether it's text generation, image generation, or conversational AI.
  2. Benchmarking: Use standard benchmarks such as GLUE, SuperGLUE, BLEU, ROUGE, or human evaluation to assess performance alongside complexity.
  3. Holistic View: Combine several complexity metrics for a more complete picture (architectural, computational, and deployment complexity).
  4. Comparative Analysis: Compare your model against others (e.g., GPT, BERT, DALL-E) to contextualize complexity scores.

2.10 Tools and Frameworks for Complexity Evaluation

Example: You can use tools such as the following to evaluate computation-heavy components and related metrics:

  • Weights & Biases (W&B): For tracking FLOPs, memory use, and other training metrics.
  • Hugging Face Benchmarking Tools: For evaluating inference performance.
  • Explainability Libraries: Captum, SHAP, or LIME for interpretability complexity.
  • Energy Usage Estimators: Like CodeCarbon, to assess power consumption.
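
As a usage illustration for an energy estimator, the sketch below wraps a workload in CodeCarbon's EmissionsTracker; the project name and the stand‑in workload are assumptions, and the exact API should be confirmed against the library's current documentation.

    # Sketch: estimating emissions for a workload with CodeCarbon.
    from codecarbon import EmissionsTracker

    tracker = EmissionsTracker(project_name="genai-complexity-eval")  # assumed name
    tracker.start()
    _ = sum(i * i for i in range(10_000_000))   # stand-in for training/inference work
    emissions_kg = tracker.stop()               # estimated kg CO2-equivalent
    print(f"estimated emissions: {emissions_kg} kg CO2eq")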

By defining and measuring these complexity metrics, you can assess generative AI models more effectively, ensure performance optimization, and improve deployment decisions.


3. Resources

3.1 References

No references have currently been identified for this Topic.


3.2 Tools


Tools to aid in compliance with this topic, if any, may be found in the Tools Library in the NASA Engineering Network (NEN). 

NASA users find this in the Tools Library in the Software Processes Across NASA (SPAN) site of the Software Engineering Community in NEN. 

The list is informational only and does not represent an “approved tool list”, nor does it represent an endorsement of any particular tool.  The purpose is to provide examples of tools being used across the Agency and to help projects and centers decide what tools to consider.


3.3 Additional Guidance

Additional guidance related to this requirement may be found in the following materials in this Handbook:

3.4 Center Process Asset Libraries

SPAN - Software Processes Across NASA
SPAN contains links to Center managed Process Asset Libraries. Consult these Process Asset Libraries (PALs) for Center-specific guidance including processes, forms, checklists, training, and templates related to Software Development. See SPAN in the Software Engineering Community of NEN. Available to NASA only. https://nen.nasa.gov/web/software/wiki

See the following link(s) in SPAN for process assets from contributing Centers (NASA Only). 

SPAN Links



3.5 Related Activities

This Topic is related to the following Life Cycle Activities:

Related Links