
1. Principle

Include in the software design a robust, well-thought-out response to resource oversubscription.

1.1 Rationale

Resource oversubscription is a severe fault condition that can cause unpredictable software behavior and render the system inoperable. Timely detection of oversubscription, together with a planned response, can preserve critical system capabilities.










2. Examples and Discussion

Many resources can become oversubscribed during system operation. Examples include buffer overflows, rate group time boundary overruns, and excessive inputs or interrupts. The usual response is to reduce the demand placed on the system by non-essential items, especially if they are the cause of the oversubscription. The system can also generate error messages, taking care that the messages themselves do not add to the overload, and attempt to throttle the demand presented by external entities. The system may additionally be reconfigured to reduce or eliminate non-essential functionality, or to use less demanding (even if less accurate) algorithms. In severe cases, interrupts may be locked out, and processes or even the computer may have to be shut down.
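One way to picture the "reduce non-essential demand" response is a bounded buffer that sheds non-essential items first and rate-limits its own warnings. The following is a minimal sketch under stated assumptions, not a flight implementation: the class name `SheddingBuffer`, the `essential` flag, and the `warn_every` parameter are all hypothetical names introduced for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SheddingBuffer:
    """Bounded buffer that sheds non-essential items first when full (illustrative)."""
    capacity: int
    warn_every: int = 100  # hypothetical: emit one warning per this many drops
    items: deque = field(default_factory=deque)
    dropped: int = 0
    warnings: list = field(default_factory=list)

    def offer(self, item, essential: bool) -> bool:
        """Try to enqueue an item; return True if it was stored."""
        if len(self.items) < self.capacity:
            self.items.append((item, essential))
            return True
        if essential:
            # Make room by evicting the oldest non-essential entry, if any.
            idx = next(
                (i for i, (_, is_ess) in enumerate(self.items) if not is_ess),
                None,
            )
            if idx is not None:
                del self.items[idx]
                self._record_drop()
                self.items.append((item, essential))
                return True
        # Either the newcomer is non-essential, or the buffer holds only
        # essential data: the incoming item is dropped.
        self._record_drop()
        return False

    def _record_drop(self) -> None:
        self.dropped += 1
        # Rate-limit warnings so that reporting does not add to the overload.
        if (self.dropped - 1) % self.warn_every == 0:
            self.warnings.append(
                f"buffer oversubscribed: {self.dropped} item(s) dropped"
            )
```

Whether to evict the oldest or the newest non-essential entry, and what to do when only essential data remains, are design decisions that belong in the requirements; the choices made here are assumptions.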

An example of a graceful response to an overload is the Apollo Lunar Module (LM) onboard software, which correctly handled an unplanned scenario in which the LM's ascent stage rendezvous radar was incorrectly switched on, overloading the Apollo Guidance Computer with input data. In this case, task prioritization (Guidance, Navigation, and Control (GNC) had higher priority than the radar) prevented the critical GNC functions from being starved of processor time, saving the mission and crew.
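The general idea of priority-based protection can be sketched as a per-cycle planner that admits tasks in priority order until the time budget is used up. This is an illustrative sketch only and does not reproduce the Apollo Guidance Computer executive; the task names, costs, and the "lower number is more critical" convention are assumptions.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    priority: int   # assumed convention: lower number = more critical
    cost_ms: float  # estimated execution time for this cycle


def plan_cycle(tasks, budget_ms):
    """Return (scheduled, shed) task names for one cycle of budget_ms.

    Tasks are considered in priority order. A task that does not fit in
    the remaining budget is shed; a cheaper lower-priority task may still
    be admitted afterwards if it fits.
    """
    scheduled, shed = [], []
    remaining = budget_ms
    for task in sorted(tasks, key=lambda t: t.priority):
        if task.cost_ms <= remaining:
            scheduled.append(task.name)
            remaining -= task.cost_ms
        else:
            shed.append(task.name)
    return scheduled, shed


# Hypothetical overload: the radar input task demands more than is left
# after the critical tasks have been admitted.
tasks = [
    Task("radar_input", priority=3, cost_ms=15.0),
    Task("guidance", priority=1, cost_ms=10.0),
    Task("display", priority=2, cost_ms=4.0),
]
scheduled, shed = plan_cycle(tasks, budget_ms=20.0)
```

In this hypothetical overload, `guidance` and `display` are scheduled and `radar_input` is shed, so the critical function is not starved of processor time.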

It is recommended that the system design include monitoring of resource usage, with appropriate thresholds set to trigger carefully designed escalation responses, and that the detection method and response protocol be explicitly documented in the requirements and verified.
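A threshold-driven escalation scheme of this kind might look like the sketch below. The response levels, the threshold values, and the "escalate immediately, recover only after several calm samples" policy are all assumptions chosen for illustration; a real project would take them from its documented requirements.

```python
from enum import IntEnum


class Response(IntEnum):
    NOMINAL = 0
    WARN = 1       # generate a warning message
    THROTTLE = 2   # ask external systems to reduce demand
    SHED = 3       # drop non-essential functionality
    SAFE_MODE = 4  # halt processes or shut down


# Hypothetical utilization thresholds, checked from most to least severe.
THRESHOLDS = [
    (0.95, Response.SAFE_MODE),
    (0.90, Response.SHED),
    (0.80, Response.THROTTLE),
    (0.70, Response.WARN),
]


def select_response(utilization: float) -> Response:
    """Map a utilization fraction (0.0 to 1.0) to a response level."""
    for limit, response in THRESHOLDS:
        if utilization >= limit:
            return response
    return Response.NOMINAL


class EscalationMonitor:
    """Escalates at once; de-escalates only after `recover_after`
    consecutive samples below the current level (assumed policy)."""

    def __init__(self, recover_after: int = 3):
        self.recover_after = recover_after
        self.level = Response.NOMINAL
        self._calm = 0

    def update(self, utilization: float) -> Response:
        target = select_response(utilization)
        if target >= self.level:
            self.level = target
            self._calm = 0
        else:
            self._calm += 1
            if self._calm >= self.recover_after:
                self.level = target
                self._calm = 0
        return self.level
```

Delaying recovery avoids rapid toggling between levels when utilization hovers near a threshold, which is one reason the escalation logic itself deserves explicit verification.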

This design principle is closely related to the 9.12 Resource Margins principle. Implementing run-time measurements enables monitoring of margins during development and protects against oversubscription once the software has been deployed operationally.
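As a rough illustration of such a run-time measurement, a high-water-mark gauge can record peak usage of a resource and report the remaining margin. The class name `ResourceGauge`, its methods, and the example numbers are hypothetical; this is a sketch, not a prescribed design.

```python
class ResourceGauge:
    """Tracks the peak (high-water mark) usage of one resource (illustrative)."""

    def __init__(self, name: str, capacity: float):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.name = name
        self.capacity = capacity
        self.peak = 0.0

    def record(self, used: float) -> None:
        """Record one usage sample, keeping only the maximum seen so far."""
        if used > self.peak:
            self.peak = used

    def margin(self) -> float:
        """Fraction of capacity never used so far (negative if exceeded)."""
        return 1.0 - self.peak / self.capacity

    def meets_margin(self, required: float) -> bool:
        """True if the observed margin is at least the required fraction."""
        return self.margin() >= required


# Hypothetical example: a 1000-byte buffer sampled three times.
gauge = ResourceGauge("telemetry_buffer", capacity=1000)
for sample in (400, 750, 600):
    gauge.record(sample)
```

The same gauge can feed margin reports during development and an oversubscription monitor in operations, since both need only the observed peak and the capacity.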


3. Inputs

Visible to editors only

Excerpts from two documents are included below, but no information on the source documents is available. These documents should be properly referenced.

3.1 ARC

  • 3.7.2.4.5 Response to Resource Over-Subscription - The software design should accommodate unintended situations where resource usage is oversubscribed. The action to be taken in such situations should be specified as part of the requirements on the design.


Note: Examples of these situations include buffers overflowing, exceeding a rate group time boundary, and excessive inputs or interrupts. There are several common methods for tolerating these situations, most of which relate to reducing demand from non-essential items, especially if they are the source of over subscription:

     a. Generate warning messages when appropriate.
     b. Instruct external systems to reduce their demands.
     c. Lock out interrupts.
     d. Change operational behavior to handle the load. For example, the software may use faster but less accurate algorithms to keep up with the load.
     e. Reduce the functionality of the software, or even halt or suspend a process or shutdown a computer.

3.2 GSFC

None

3.3 JPL

  • 4.11.4.5 Response to resource over-subscription - The software design shall contain a robust response to situations where computer resources are oversubscribed. The action to be taken in such situations shall be specified as part of the requirements on the design.


Note: Examples of these situations include buffers overflowing, exceeding a rate group time boundary, and excessive inputs or interrupts. There are several common methods for tolerating these situations, most of which relate to reducing demand from non-essential items, especially if they are the source of over subscription:

     a. Generate warning messages when appropriate.
     b. Instruct external systems to reduce their demands.
     c. Lock out interrupts.
     d. Change operational behavior to handle the load. For example, the software may use faster but less accurate algorithms to keep up with the load.
     e. Reduce the functionality of the software, or even halt or suspend a process or shutdown a computer.

3.4 MSFC

None


4. Resources

4.1 References




Visible to editors only

Enter the necessary modifications to be made in the table below:

SWEREFs to be added | SWEREFs to be deleted


SWEREFs called out in the text: 439, 675

SWEREFs NOT called out in text but listed as germane: NONE



5. Lessons Learned

5.1 NASA Lessons Learned

The NASA Lessons Learned database (SWEREF-439) contains the following lessons learned related to resource oversubscription:

  • Science Data Downlink Process Must Address Constraints Stemming from Fixed Deep Space Network (DSN) Assets. Lesson Learned 1843 (SWEREF-675): "Given their minimal ability to mitigate DSN resource limitations, flight projects must consider mission design and mission operations improvements that may help to achieve Level 1 requirements, such as the 9 measures effectively employed by the Spitzer project."