Event Logging
A Flight/Ground/Test Event Logging Facility
Prepared by
Daniel Dvorak
(Jet Propulsion Laboratory,
California Institute of Technology)
Domain Description
Desired Program
Terminology
Detailed Requirements
Non Requirements
Use Cases
Domain Description
The onboard control software
for spacecraft such as Mars Pathfinder and Cassini is composed of many
subsystems including executive control, navigation, attitude control, imaging,
and telecommunications. The software in all of these subsystems needs to
be instrumented for several purposes: to report required telemetry
data to Earth, to report warnings and errors, to verify internal behavior
during system testing, and to provide ground operators with detailed data
when investigating in-flight anomalies. These reportable events
can range in importance from purely informational events to major errors.
It is desirable to provide a uniform mechanism for reporting such events
and controlling their subsequent processing.
In the domain of deep-space
missions there are practical limits to how much event data can be saved
and transmitted. First, radiation-hardened flight processors are several
years behind the speed and memory of their commercial cousins, with most
of the memory intended for science data. Second, downlink rates from deep
space to Earth can be very low (e.g., ~300 bps from Pluto using X-band
transmission from a 2-meter spacecraft antenna), so it's impractical to
send everything. Third, the new breed of semi-autonomous spacecraft may
contact Earth only once a week, so the least important data may have to
be deleted to make room for more important data, particularly data from
science instruments.
The relative importance of
any particular event depends on several factors. Program context, such
as the distinction between a warning and an error, will rank some events
as inherently more important than others. Some faults can cause an event
to recur at a high rate, but this must not be allowed to flood the memory
pool. Some events may be of low importance when first reported but suddenly
become more important when a subsequent error event gets reported. Some
events may be so routine that they need not be saved and reported unless
specifically requested by ground operators.
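One way to picture the "low importance until an error occurs" case is a small bounded buffer of recent routine events that is only committed to the log when an error arrives. The sketch below is an illustration under assumptions, not a prescribed design: the class name `ContextBuffer`, the capacity, and the numeric severity codes are all hypothetical.

```python
from collections import deque

# Hypothetical numeric severity codes (assumed for this sketch).
GREEN, YELLOW, RED = 0, 1, 2

class ContextBuffer:
    """Holds recent routine events and promotes them when an error arrives."""

    def __init__(self, capacity=4):
        # A bounded deque drops its oldest entry automatically, so a
        # high-rate routine event cannot grow memory without limit.
        self.recent = deque(maxlen=capacity)
        self.log = []

    def signal(self, event_id, severity):
        if severity == RED:
            # The error makes the preceding routine events relevant context.
            self.log.extend(self.recent)
            self.recent.clear()
            self.log.append((event_id, severity))
        elif severity == YELLOW:
            self.log.append((event_id, severity))
        else:
            self.recent.append((event_id, severity))

buf = ContextBuffer(capacity=4)
for event_id in range(1, 7):      # six routine events, only four are kept
    buf.signal(event_id, GREEN)
buf.signal(99, RED)               # the error flushes the retained context
```

After the final call, `buf.log` holds the four most recent routine events followed by the error.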
When an event is downlinked
to Earth, it will be processed automatically for display, archiving, possible
alerting, and possible historical summary or other analysis. As such, an
event should be represented in a way that facilitates automated processing
rather than manual inspection.
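As an illustration of a machine-friendly representation, the sketch below packs an event into a fixed binary header followed by a short payload. The field widths, byte order, and function names are assumptions made for this example only; the problem statement does not define a wire format.

```python
import struct

# Hypothetical wire layout (an assumption for illustration only):
# big-endian uint32 time stamp, uint16 event ID, uint8 severity,
# uint8 payload length.
HEADER = struct.Struct(">IHBB")

def encode_event(timestamp, event_id, severity, payload=b""):
    """Pack one event into a compact byte record."""
    if len(payload) > 255:
        raise ValueError("payload too long for the one-byte length field")
    return HEADER.pack(timestamp, event_id, severity, len(payload)) + payload

def decode_event(record):
    """Recover the fields on the ground without any text parsing."""
    timestamp, event_id, severity, length = HEADER.unpack_from(record)
    payload = record[HEADER.size:HEADER.size + length]
    return timestamp, event_id, severity, payload

def downlink_seconds(num_bytes, bits_per_second=300):
    """Transmission time at a given rate, ignoring framing overhead."""
    return num_bytes * 8 / bits_per_second

# A made-up event: ID 101, severity 1, with one 32-bit float reading.
record = encode_event(1_000_000, 101, 1, struct.pack(">f", 21.5))
```

With this assumed layout the record is 12 bytes long, so at the ~300 bps rate mentioned earlier it would occupy the link for roughly a third of a second.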
The Desired Program
Your task is to design an object-oriented
event logging facility (Elf) that spacecraft programmers
will use to instrument flight code, that ground operators will control
during mission operations to select different levels of logging, and that
system test tools will connect to in order to monitor and audit test results.
As such, you are designing a facility that spans the flight, ground, and
test domains. There are five main elements to be designed:
- Design a base class for events that includes attributes for time stamp,
  event ID, and event severity, plus any other attributes needed for the
  entry policy and retention policy. This class must be extensible for
  events that need to be logged with additional event-specific data.
- Design an event signaling mechanism (whether a method, template function,
  or macro) that programmers will use to instrument their code. This
  mechanism should incur minimal overhead when the entry policy is set to
  discard the given event.
- Design a parameterized entry policy that filters events based on their
  type, ID, severity, and frequency of occurrence.
- Design a parameterized retention policy that discards logged events based
  on type, severity, age, and population limit.
- Provide a mechanism for ground operators to dynamically change the
  parameters of any policy.
- Optional: Design a ground-based GUI that maintains an up-to-date display
  of event occurrences, with selectable views by time, by severity, by type,
  and by ID.
You may assume that a
Data Transport subsystem exists for uplinking commands from ground to spacecraft
and for downlinking data from spacecraft to ground. You may also assume
that a Data Management subsystem exists for saving data products such as
events and making them available to other subsystems (such as Data Transport)
as needed.
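The first element, an extensible event base class, might look like the following minimal sketch. The class names, the field order, and the event ID value are hypothetical. Python is used only for illustration and does not enforce the declared field types at run time.

```python
from dataclasses import dataclass
from enum import IntEnum

class Severity(IntEnum):
    """Ordered severity levels (numeric values are an assumption)."""
    GREEN = 0
    YELLOW = 1
    RED = 2

@dataclass(frozen=True)
class Event:
    """Base class: the attributes every event must carry."""
    timestamp: float
    event_id: int
    severity: Severity

@dataclass(frozen=True)
class BusVoltageLow(Event):
    """Application-specific event that adds one typed data field."""
    volts: float

# Hypothetical event ID 101 for a low bus voltage warning.
event = BusVoltageLow(timestamp=1.0, event_id=101,
                      severity=Severity.YELLOW, volts=21.4)
```

Because the subclass extends `Event`, code that filters or retains events can work against the base attributes alone, while ground tools that know the subclass can read `volts` directly.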
Terminology
An event is any noteworthy
state, as determined by a system engineer or designer or developer. For
example, a bus voltage below 22 volts or a memory pool over 98% full might
be considered noteworthy states. An event is said to have occurred
(in a software sense) when it is detected in a conditional statement and
can therefore be acted upon. An event occurrence is said to have been signaled
when an Elf signaling mechanism has been called. A signaled event is said to
have been logged if an event record is created, submitted to Data Management,
and accepted, subject to an "entry policy". A data
product is a transportable object that can be stored by Data Management
and downlinked by Data Transport. An event object is a data product
containing information describing the occurrence of a particular event.
An event type or event class is a data type that specifies
the kinds of data that describe an event occurrence. An event identifier
is a label for a kind of event. (An event identifier is useful in distinguishing
among different kinds of events that use the same event type.) An event
severity is a measure of the level of importance of an event occurrence.
An entry policy controls what signaled events are logged. As an
example, a policy may control entry based on event type, event severity,
event identifier, frequency of event occurrences, and equality to the previous
event of this type. A retention policy controls how long a logged
event is retained. As an example, a policy might depend on factors such
as age and number of currently retained events.
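To make the entry-policy definition concrete, here is a rough sketch that filters on severity, event identifier, frequency, and equality to the previous event. All names and default values are assumptions. For brevity it tracks history per event identifier rather than per event type.

```python
_MISSING = object()  # sentinel: no event with this ID admitted yet

class EntryPolicy:
    """Decides whether a signaled event should be logged."""

    def __init__(self, min_severity=0, max_per_window=3, window_s=60.0,
                 suppress_repeats=True):
        self.min_severity = min_severity
        self.max_per_window = max_per_window
        self.window_s = window_s
        self.suppress_repeats = suppress_repeats
        self.disabled_ids = set()
        self._admit_times = {}   # event_id -> times of recent admissions
        self._last_data = {}     # event_id -> data of last admitted event

    def admits(self, event_id, severity, data, now):
        # Cheap checks first: severity threshold and disabled identifiers.
        if severity < self.min_severity or event_id in self.disabled_ids:
            return False
        # Equality to the previously admitted event with this identifier.
        if (self.suppress_repeats
                and self._last_data.get(event_id, _MISSING) == data):
            return False
        # Frequency limit: count admissions inside the sliding window.
        times = [t for t in self._admit_times.get(event_id, [])
                 if now - t < self.window_s]
        self._admit_times[event_id] = times
        if len(times) >= self.max_per_window:
            return False
        times.append(now)
        self._last_data[event_id] = data
        return True

policy = EntryPolicy(min_severity=1, max_per_window=2, window_s=10.0)
```

In this sketch a rejected event does not update the stored history, so only admitted events count toward the frequency limit and the repeat check.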
Detailed Requirements
- All event objects must contain a time stamp, event identifier, and event
  severity.
- An event identifier must be encoded as a number, not a string, because
  strings consume too much of the limited downlink capacity.
- Elf must support three levels of event severity: a "green" level for
  purely informational events, a "yellow" level for warnings, and a "red"
  level for errors.
- Application programmers must be able to define new event types that
  contain application-specific data.
- The contents of an event object must be strongly typed so that downstream
  processing can access the contents in a type-safe manner.
- Elf must define a signaling interface whereby an event occurrence is
  signaled with all the information needed to construct an event object.
- For reasons of runtime efficiency in high-performance applications, at
  least one of the signaling interfaces (if more than one) must be designed
  for speed in ignoring events for which logging is currently disabled.
- Elf must define interfaces for controlling entry policy and retention
  policy.
- It must be possible to change the tunable parameters of a policy at
  run-time, i.e., no source code changes and no recompilations.
- It is desirable to include with a logged event the source location where
  it was signaled. This helps distinguish between events that otherwise have
  the same signature.
- Since brevity is a virtue to most programmers, Elf should provide at least
  one signaling interface for basic events that can be written in a compact
  form. The intent is to make it easy to instrument an application's source
  code, particularly during early design and debugging.
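The signaling requirements (a compact call, a fast path for disabled events, and an optional source location) could be combined along the lines of the sketch below. The function name `signal`, the per-severity enable flags, and the record layout are assumptions. `sys._getframe` is a CPython implementation detail used here only to keep the example short.

```python
import sys
import time

# Hypothetical numeric severity codes (assumed for this sketch).
GREEN, YELLOW, RED = 0, 1, 2

# Entry-policy state reduced to one flag per severity (a simplification).
enabled = [False, True, True]
log = []

def signal(event_id, severity, *data):
    """Compact signaling call: signal(ID, severity, values...)."""
    # Fast path: one list lookup and a branch when logging is disabled.
    if not enabled[severity]:
        return False
    # Slow path only: time stamp and the caller's source location.
    caller = sys._getframe(1)
    log.append({
        "timestamp": time.time(),
        "event_id": event_id,
        "severity": severity,
        "data": data,
        "file": caller.f_code.co_filename,
        "line": caller.f_lineno,
    })
    return True

signal(5, GREEN)          # discarded on the fast path, nothing is built
signal(6, RED, 21.4)      # logged with its source location
```

The costly work, reading the clock and inspecting the call stack, happens only after the enable check passes, which is the property the efficiency requirement asks for.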
Non Requirements
- There is no limit on the number of event types that may be defined.
- There is no requirement for a signaling interface that can be
  conditionally compiled down to zero run-time overhead, i.e., compiled out
  of existence.
- The preceding requirements deliberately do not prescribe or constrain how
  exceptions might be used to signal errors. The use of exceptions for error
  handling is considered an orthogonal issue.
- There is no requirement for Elf to maintain statistics such as the number
  of times that an event condition has been checked.
- There is no requirement for Elf to provide a way to force the occurrence
  of an event, such as for testing purposes.
Use Cases
- A programmer defines an application-specific event class.
- A programmer instruments source code to signal an event if it occurs.
- A running application program signals occurrence of an event.
- An Elf signaling mechanism checks the entry policy.
- Data Management accepts an event object for logging.
- A ground program tabulates events as they arrive, sorting them by
  severity.
- A ground operator adjusts the tunable parameters of a specific policy.
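The last two use cases can be sketched together: a retention policy whose parameters are changed at run time, and a ground-side sort by severity. The class, parameter names, and tuple layout `(timestamp, event_id, severity)` are assumptions for illustration. The tie-breaking rule (drop the least severe, then the oldest) is one possible choice, not a stated requirement.

```python
class RetentionPolicy:
    """Discards logged events by age and by population limit."""

    def __init__(self, max_age_s, max_count):
        self.params = {"max_age_s": max_age_s, "max_count": max_count}

    def set_param(self, name, value):
        # Run-time tuning: no source change or recompilation involved.
        if name not in self.params:
            raise KeyError(name)
        self.params[name] = value

    def prune(self, events, now):
        """Return the events to keep, in chronological order."""
        kept = [e for e in events if now - e[0] <= self.params["max_age_s"]]
        # Over the limit: keep the most severe, then the newest.
        kept.sort(key=lambda e: (e[2], e[0]), reverse=True)
        kept = kept[:self.params["max_count"]]
        return sorted(kept)

def by_severity(events):
    """Ground-side view: most severe first, arrival order within a level."""
    return sorted(events, key=lambda e: e[2], reverse=True)

# Made-up events as (timestamp, event_id, severity) with 0=green .. 2=red.
events = [(0, 1, 0), (10, 2, 2), (20, 3, 0), (30, 4, 1)]
policy = RetentionPolicy(max_age_s=100, max_count=3)
kept_before = policy.prune(events, now=40)
policy.set_param("max_age_s", 15)     # operator tightens the age limit
kept_after = policy.prune(events, now=40)
```

With the initial parameters the oldest green event is dropped to meet the population limit. After the age limit is tightened, only the most recent event survives.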
Last updated by Torsten Layda,
SWX Swiss Exchange,
DesignFest® Webmaster.