7.18 - Documentation Guidance

1. Purpose

Provide minimum content guidance for software project plans, reports, and procedures.  This guidance originally appeared in the version of this Handbook associated with NPR 7150.2A, which contained requirements for software documentation.  Those requirements have since been removed from the NPR and incorporated into the guidance provided in this Handbook topic.

1.1 Introduction

Guidance for each document is provided via the linked pages below. Software documentation resources and lessons learned are also reiterated on those respective pages.

The tab labels are abbreviated as follows:


To supplement and implement this guidance, NASA-specific documentation templates, examples, checklists, and more are available in Software Processes Across NASA (SPAN), accessible to NASA users from the SPAN tab in this Handbook.  Center-specific guidance and resources, such as templates, are available in Center Process Asset Libraries (PALs).

2. Resources

2.1 References


2.2 Tools



3. Lessons Learned

3.1 NASA Lessons Learned

The NASA Lessons Learned database contains the following lessons learned related to software project documentation:

  • Mars Observer Inertial Reference Loss. Lesson Number 0310 501: Design for Maintenance, Lesson Learned No. 4: "Design flexibility of the flight computer and software is critical to the ability to uplink software patches for the correction of unexpected in-flight spacecraft anomalies." Document Structure Reference: SwDD - Software Design Description
  • Thrusters Fired on Launch Pad (1975) (Plan for safe exercise of command sequences). Lesson Number 0403 507: "When command sequences are stored on the spacecraft and intended to be exercised only in the event of abnormal spacecraft activity, the consequences should be considered of their being issued during the system test or the pre-launch phases." Document Structure Reference: STP - Software Test Plan
  • Interface Control and Verification. Lesson Number 0569 508: Problems occurred during the Mars Pathfinder spacecraft integration and test due to out-of-date or incomplete interface documentation. (While this lesson involves a hardware-related problem, it illustrates the need for accuracy in interface documentation.) Investigation showed that the main wiring harness was built in accordance with documentation that had not been updated after design changes were made. In part, this was due to independently prepared Mechanical Interface Control Drawings (MICDs) by the Government and the contractor. The MICDs should have had periodic verification for accuracy and compatibility. Document Structure Reference: IDD - Interface Design Description
  • Consider Language Differences When Conveying Requirements to Foreign Partners (1997) (Diagrams may be useful in requirements specifications). Lesson Number 0608 511: "It is especially important when working with foreign partners to document requirements in terms that describe the intent very clearly; include graphics where possible." Document Structure Reference: SRS - Software Requirements Specification
  • Mars Climate Orbiter Mishap Investigation Board - Phase I Report. Lesson Number 0641 513: "The MCO Mishap Investigation Board (MIB)...determined that the root cause for the loss of the MCO spacecraft was the failure to use metric units in the coding of a ground software file." The data in the ... file was required to be in metric units per existing software interface documentation, and the trajectory modelers assumed the data was provided in metric units per the requirements. In fact, the angular momentum (impulse) data was in English units rather than metric units. The failure to properly communicate what was in the interface documentation is a warning that the effective implementation of the IDD requirement includes adequate communication of its contents, not just the writing and recording. Document Structure Reference: IDD - Interface Design Description
  • Preliminary Design Review. Lesson Learned 0655 514:  By not holding a PDR "one or a number of potential problems which could result in an adverse impact on the system, subsystem, and/or project might not be identified in a timely manner. This oversight might later result in a condition having a significant effect on quality, reliability, capability, schedule, and/or cost...Conduct a formal Preliminary Design Review (PDR) at the system and subsystem levels prior to the start of subsystem detail design, to assure that the proposed design and associated implementation approach will satisfy the system and subsystem functional requirements." Document Structure Reference: SwDD - Software Design Description
  • Critical Design Review for Unmanned Missions. Lesson Learned 0657 515:  "In the absence of a CDR, potential problems with adverse impacts on the subsystem, system, or project may not be identified in a timely manner. This oversight may later result in a condition having a significant effect on quality, reliability, capability, schedule, or cost." Document Structure Reference: SwDD - Software Design Description
  • Fault Tolerant Design. Lesson Number 0707 517: "Systems which do not incorporate fault tolerant design (FTD) as a part of their development process will experience a higher risk of a severely degraded or prematurely terminated mission, or it may result in excessively large weight volume, or high cost to achieve an acceptable level of performance by using non-optimized redundancy or overdesign...Incorporate hardware and software features in the design of spacecraft equipment which tolerate the effects of minor failures and minimize switching from the primary to the secondary string. This increases the potential availability and reliability of the primary string." Document Structure Reference: SwDD - Software Design Description
  • Pre-Flight Problem/Failure Reporting Procedures. Lesson Number 0733 519: Impact of Non-Practice states: "Without ... formal reporting procedures, problems/failures, particularly minor glitches, may be overlooked or not considered serious enough to investigate or report to Project Management. This could result in recurrence of the problem/failure during the mission and result in a significant degradation in performance." Document Structure Reference: CR-PR - Software Change Request - Problem Report
  • Problem Reporting and Corrective Action System. Lesson Number 0738 520: Impact of Non-Practice states: "Hardware/software problems that require further investigation may not be identified and tracked. The development of corrective action and the need for improvement will not be highlighted in engineering. Opportunities for the early elimination of the causes of failures and valuable trending data can be overlooked." Practice states: "A closed-loop Problem (or Failure) Reporting and Corrective Action System (PRACAS or FRACAS) is implemented to obtain feedback about the operation of ground support equipment used for the manned spaceflight program." Document Structure Reference: CR-PR - Software Change Request - Problem Report
  • Software Design for Maintainability. Lesson Number 0838 526: Impact of Non-Practice: "Because of increases in the size and complexity of software products, software maintenance tasks have become increasingly more difficult. Software maintenance should not be a design afterthought; it should be possible for software maintainers to enhance the product without tearing down and rebuilding the majority of code." Document Structure Reference: SwDD - Software Design Description
  • Probable Scenario for Mars Polar Lander Mission Loss (1998) (Effects of an incomplete software requirements specification). Lesson Number 0938 529: "All known hardware operational characteristics, including transients and spurious signals, must be reflected in the software requirements documents and verified by test." Document Structure Reference: SRS - Software Requirements Specification
  • Probable Scenario for Mars Polar Lander Mission Loss (1998) (Importance of including known hardware characteristics). Lesson Number 0938 529: "1. Project test policy and procedures should specify actions to be taken when a failure occurs during test. When tests are aborted, or known to have had flawed procedures, they must be rerun after the test deficiencies are corrected. When test article hardware or software is changed, the test should be rerun unless there is a clear rationale for omitting the rerun. 2. All known hardware operational characteristics, including transients and spurious signals, must be reflected in the software requirements documents and verified by test." Document Structure Reference: Test - Software Test Procedures
  • MPL Uplink Loss Timer Software/Test Errors (1998) (Plan to test against full range of parameters). Lesson Number 0939 530: "Unit and integration testing should, at a minimum, test against the full operational range of parameters. When changes are made to database parameters that affect logic decisions, the logic should be re-tested." Document Structure Reference: STP - Software Test Plan
  • Computer Software/Software Safety Policy Requirements/Potential Inadequacies (Cover essential requirements for the project). Lesson Learned 1021 532: "NASA is committed to assuring that required program management plans and any subordinate plans such as software or safety management plans cover the essential requirements for programs where warranted by cost, size, complexity, lifespan, risk, and consequence of failure." Document Structure Reference: SDP-SMP - Software Development - Management Plan
  • International Space Station (ISS) Program/Computer Hardware-Software/Software (Plan realistic but flexible schedules). Lesson Number 1062 536: "NASA should realistically reevaluate the achievable ... software development and test schedule and be willing to delay ... deployment if necessary rather than potentially sacrificing safety." Document Structure Reference: STP - Software Test Plan
  • Computer Hardware-Software/Software Development Tools/Maintenance. Lesson Number 1128 540: "NASA concurs with the finding that no program-wide plan exists addressing the maintenance of COTS software development tools. A programmatic action has been assigned to develop the usage requirements for COTS/modified off-the-shelf software including the associated development tools. These guidelines will document maintenance and selection guidelines to be used by all of the applicable program elements." Document Structure Reference: Maint - Software Maintenance Plan 
  • Computer Hardware-Software/International Space Station/Software Development (Plan for user involvement). Lesson Learned 1132 542: "The lack of user involvement results in increased schedule and safety risk to the program... follow a concurrent engineering approach to building software that involves users and other key discipline specialists early in the software development process to provide a full range of perspectives and improve the understanding of requirements before code is developed." Document Structure Reference: SDP-SMP - Software Development - Management Plan
  • International Space Station (ISS) Program/Computer Hardware-Software/International Partner Source Code (Maintenance Agreements.) Lesson Number 1153 543: The Recommendation states to "Solidify long-term source code maintenance and incident investigation agreements for all software being developed by the International Partners as quickly as possible, and develop contingency plans for all operations that cannot be adequately placed under NASA's control." Document Structure Reference: Maint - Software Maintenance Plan
  • Lack of Education and Training in the Use and Processes of Independent Verification & Validation (IV&V) for Software Within NASA (2001). Lesson Number 1173 544: "While NASA has made major changes to emphasize the need to utilize IV&V on safety critical projects, the technology is not well understood by program managers and other relevant NASA personnel." Document Structure Reference: Train - Software Training Plan
  • Deep Space 2 Telecom Hardware-Software Interaction (1999) (Plan to test as you fly). Lesson Number 1197 545: "To fully validate performance, test integrated software and hardware over the flight operational temperature range." Document Structure Reference: STP - Software Test Plan
  • ADEOS-II NASA Ground Network (NGN) Development and Early Operations – Central/Standard Autonomous File Server (CSAFS/SAFS) Lessons Learned. Lesson Number 1346 550: Use of commercial off the shelf (COTS) products: "Match COTS tools to project requirements. Deciding to use a COTS product as the basis of system software design is potentially risky, but the potential benefits include quicker delivery, less cost, and more reliability in the final product. The following lessons were learned in the definition phase of the [software] development. Document Structure Reference: SwDD - Software Design Description
    • "Use COTS products and re-use previously developed internal products.
    • "Create a prioritized list of desired COTS features.
    • "Talk with local experts having experience in similar areas.
    • "Conduct frequent peer and design reviews.
    • "Obtain demonstration versions of COTS products.
    • "Obtain customer references from vendors.
    • "Select a product appropriately sized for your application.
    • "Choose a product closely aligned with your project's requirements.
    • "Select a vendor whose size will permit a working relationship.
    • "Use vendor tutorials, documentation, and vendor contacts during COTS evaluation period."
  • Lessons Learned From Flights of "Off the Shelf" Aviation Navigation Units on the Space Shuttle, GPS (Importance of Accurate Interface Control Document.) Lesson Learned 1370 551:  "If the integrator and user do not have access to firmware and firmware requirements, the ICD may be the only written source of information on unit parameters. Developers of software that will interface with the unit must examine the ICD closely. .... An inaccurate ICD will lead to software and procedural issues that will have to be addressed before a system can be certified as operational. An accurate ICD is also needed for instrumentation port data that is critical during the test and verification phase of a project...Short development schedules may result in changes to the ICD while host vehicle software requirements are being defined and software is in development and test. A disciplined process of checks must be in place to ensure that the ICD and software requirements for units that interface with the [hardware and instruments] are consistent. Individuals who have knowledge of both [hardware] requirements and requirements for other interfacing units must be able to communicate and be involved in any changes made to the ICD." Document Structure Reference: SwDD - Software Design Description
  • Kennedy Space Center (KSC) Projects and Resources Online (KPRO) Software Development and Implementation (Project team planning). Lesson Learned 1384 552: "When planning and selecting team resources for a project, consider how the resources can work together and support each other, along with the skills required. This can be a factor in meeting or delaying software project milestones if an alternative resource has not been endorsed by the team members." Document Structure Reference: SDP-SMP - Software Development - Management Plan
  • Take CM Measures to Control the Renaming and Reuse of Old Command Files. Lesson Number 1481 556: The Mars Odyssey mission ran into a version control issue when the team discovered an improperly named file call script. It was determined that the team had taken an old Mars Global Surveyor file to reuse. The file was renamed, but its code creation time captured in the header was not changed. This caused the system to label the file as an old file. As a result, the operations team had to manually specify the correct file to use until subsequent code fixes were implemented. Document Structure Reference: VDD - Version Description Document
  • MER Spirit Flash Memory Anomaly (2004). Lesson Learned 1483 557:  "Shortly after the commencement of science activities on Mars, an MER rover lost the ability to execute any task that requested memory from the flight computer. The cause was incorrect configuration parameters in two operating system software modules that control the storage of files in system memory and flash memory. Seven recommendations cover enforcing design guidelines for COTS software, verifying assumptions about software behavior, maintaining a list of lower priority action items, testing flight software internal functions, creating a comprehensive suite of tests and automated analysis tools, providing downlinked data on system resources, and avoiding the problematic file system and complex directory structure." Document Structure Reference: SwDD - Software Design Description
  • Develop and Test the Launch Procedure Early (1997). Lesson Number 0609 565: The Abstract states: "During the terminal countdown for the first attempted launch of Cassini, spacecraft telemetry channels indicated a false alarm condition that delayed verification of spacecraft readiness for launch, and contributed to a delay on the first launch day. The anomaly was traced to erroneous telemetry documentation. Develop and release the launch procedure early enough for comprehensive testing before launch. Rigorously test and verify all telemetry channels and their alarms and ensure documentation such as telemetry definitions is kept up-to-date." Document Structure Reference: SDD - Software Data Dictionary
  • NASA Study of Flight Software Complexity. Lesson Learned 2050 571:  "Flight software development problems led NASA to study the factors that have led to the accelerating growth in flight software size and complexity. The March 2009 report on the NASA Study on Flight Software Complexity contains recommendations in the areas of systems engineering, software architecture, testing, and project management." Document Structure Reference: SwDD - Software Design Description
  • Place Flight Scripts Under Configuration Management Prior to ORT (Project attention to configuration control). Lesson Number 2476 574: "Project attention to the configuration control of flight scripts is likely to prevent the generation of unnecessary software iterations, improve the rigor of mission system engineering processes, and ensure consistency in the test and operations environments." Document Structure Reference: SCMP - Software Configuration Management Plan
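
The Mars Climate Orbiter lesson above (Lesson Number 0641) involves an interface value whose units were documented but not enforced in the software that produced it. As a minimal sketch of one way an interface could carry its unit along with the value, the Python below tags an impulse value with a unit string and converts it at the boundary. The names `Impulse` and `to_newton_seconds`, the unit strings, and the record layout are hypothetical illustrations and are not taken from any NASA interface definition.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton seconds
# (0.45359237 kg * 9.80665 m/s^2, exact by definition of the pound-force).
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical interface record: a value plus the unit it is expressed in."""
    value: float
    unit: str  # assumed unit labels: "N*s" (metric) or "lbf*s" (English)


def to_newton_seconds(impulse: Impulse) -> float:
    """Return the impulse in newton seconds; reject any undeclared unit."""
    if impulse.unit == "N*s":
        return impulse.value
    if impulse.unit == "lbf*s":
        return impulse.value * LBF_S_TO_N_S
    raise ValueError(f"unsupported impulse unit: {impulse.unit!r}")


if __name__ == "__main__":
    # The consumer always works in metric, whatever the producer supplied.
    print(to_newton_seconds(Impulse(1.0, "lbf*s")))
```

In this sketch, a producer that writes English-unit data must say so explicitly, and a consumer that receives an unrecognized unit fails loudly instead of silently treating the number as metric.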

3.2 Other Lessons Learned

  • “Both experience and research have shown that the parameters under control of the moderator have significant effect on the results of an inspection. Where teams do not have their own baselines of data, Agency-wide heuristics have been developed to assist in planning. Document Structure Reference: Inspect - Software Inspection, Peer Reviews, Inspections

For example, analyzing a database of over 2,400 inspections across the Agency, researchers have found that inspections with team sizes of 4 to 6 inspectors find 12 defects on average, while teams outside this range find on average only 7. 320

Heuristics for how many pages can be reviewed by a team during a single inspection vary greatly according to the type of the document, which is to be expected since the density of information varies greatly between a page of requirements, a page of a design diagram, and a page of code. Similar to the heuristics for team size, inspection teams that follow the heuristics for document size also find on average significantly more defects than those which do not.  320 The recommended heuristics for document size are listed below. All of these values assume that inspection meetings will be limited to 2 hours.

Inspection Type      Target      Range
-------------------  ----------  ---------------
Functional Design    20 Pages    10 to 30 Pages
Software Req.        20 Pages    10 to 30 Pages
Arch. Design         30 Pages    20 to 40 Pages
Detailed Design      35 Pages    25 to 45 Pages
Source Code          500 LOC     400 to 600 LOC
Test Plans           30 Pages    20 to 40 Pages
Test Procedures      35 Pages    25 to 45 Pages

Teams have consistently found that inspection meetings should last at most 2 hours at a time. Beyond that, it is hard, if not impossible, for teams to retain the required level of intensity and freshness to grapple effectively with technical issues found. This heuristic is a common rule of thumb found across the inspection literature.

Over the course of hundreds of inspections and analysis of their results, NASA's Jet Propulsion Laboratory (JPL) has identified key lessons learned, which lead to more effective inspections, including:

      • Inspection meetings are limited to 2 hours.
      • Material is covered during the inspection meeting within an optimal page rate range that has been found to give maximum error finding ability.
      • Statistics on the number of defects, the types of defects, and the time expended by engineers on the inspections are kept.”
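
The planning heuristics quoted above (team size of 4 to 6 inspectors, the document size ranges per inspection type, and the 2-hour meeting limit) can be checked mechanically before an inspection is scheduled. The Python below is a minimal sketch under that assumption; the names `InspectionPlan` and `check_plan` and the wording of the findings are hypothetical, and only the numeric ranges come from the text above.

```python
from dataclasses import dataclass
from typing import List

# Recommended size ranges per inspection type, as listed above.
# Sizes are in pages, except "Source Code", which is in lines of code (LOC).
SIZE_RANGES = {
    "Functional Design": (10, 30),
    "Software Req.": (10, 30),
    "Arch. Design": (20, 40),
    "Detailed Design": (25, 45),
    "Source Code": (400, 600),
    "Test Plans": (20, 40),
    "Test Procedures": (25, 45),
}
TEAM_SIZE_RANGE = (4, 6)   # inspectors
MAX_MEETING_HOURS = 2.0    # per inspection meeting


@dataclass
class InspectionPlan:
    """Hypothetical record describing one planned inspection."""
    inspection_type: str
    size: int             # pages, or LOC for source code
    team_size: int
    meeting_hours: float


def check_plan(plan: InspectionPlan) -> List[str]:
    """Return a list of heuristic deviations; an empty list means none found.

    Raises KeyError if the inspection type is not one of the listed types.
    """
    findings = []
    low, high = SIZE_RANGES[plan.inspection_type]
    if not low <= plan.size <= high:
        findings.append(f"size {plan.size} is outside the range {low} to {high}")
    t_low, t_high = TEAM_SIZE_RANGE
    if not t_low <= plan.team_size <= t_high:
        findings.append(f"team size {plan.team_size} is outside {t_low} to {t_high}")
    if plan.meeting_hours > MAX_MEETING_HOURS:
        findings.append(f"meeting length {plan.meeting_hours} h exceeds {MAX_MEETING_HOURS} h")
    return findings


if __name__ == "__main__":
    for finding in check_plan(InspectionPlan("Source Code", 900, 8, 3.0)):
        print(finding)
```

A plan that deviates from the heuristics is not necessarily wrong; in this sketch the findings are advisory, which matches the text's framing of these values as heuristics for teams without their own baseline data.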





