Date: January 24, 2006 4:28:28 PM EST
To: secchi_manage NRL 
Subject: SECCHI HK Event Messages in beacon

Folks,

This email describes what the severity levels of event messages mean and which event messages are currently sent down the SPWX beacon. The full list of event messages may be found at http://ares.nrl.navy.mil/public_html/tcvol2/mnemonics/HKEVENTCODE.html.

This is in preparation for a meeting sometime before Thermal Cycling to discuss what action to take for each event. Don will be making the meeting arrangements. Thanks,

-n

Begin forwarded message:

From: Ladd Wheeler
Date: January 13, 2006 5:24:53 PM EST
Subject: Re: sc_event question

Nathan --

Following up on our phone conversation: until we get this documented in the FSW user guide HTML world, the following info might be useful.

There are six levels of severity specified in the event messages.  The levels and their meaning are:
    1: Information -- simply reporting an occurrence (e.g., a command was processed).
    2: Operator Error -- the event was caused by erroneous operator input (e.g., attempting to command a disabled dec heater).
    3: Normal Software Error -- reporting a recoverable software error occurrence (e.g., software bus overflow).
    4: Normal Hardware Error -- reporting a recoverable hardware error occurrence (e.g., failed attempt to write a value to the HKP Master Control Register).
    5: Critical Software Error -- reporting an unrecoverable software error occurrence (e.g., failure in attempting to connect, enable, or disable RAD750 memory scrub interrupt service routine).
    6: Critical Hardware Error -- reporting an unrecoverable hardware error occurrence (e.g., an unexpected checksum was detected for a kernel image or SUROM area).
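
For anyone writing ground tools against these levels, here is a minimal Python sketch of the six severities as an enumeration. The class and function names (EventSeverity, is_critical) are made up for illustration, and it assumes HKEVENTSEV arrives as the plain integers 1 through 6 listed above, which should be checked against the telemetry definition:

```python
from enum import IntEnum


class EventSeverity(IntEnum):
    """The six severity levels, numbered as in the list above.

    Assumption: HKEVENTSEV carries these same integer values.
    """
    INFORMATION = 1
    OPERATOR_ERROR = 2
    NORMAL_SOFTWARE_ERROR = 3
    NORMAL_HARDWARE_ERROR = 4
    CRITICAL_SOFTWARE_ERROR = 5
    CRITICAL_HARDWARE_ERROR = 6


def is_critical(severity):
    """True for the two unrecoverable levels (5 and 6)."""
    return EventSeverity(severity) >= EventSeverity.CRITICAL_SOFTWARE_ERROR


def is_hardware(severity):
    """True for the normal and critical hardware error levels (4 and 6)."""
    return EventSeverity(severity) in (
        EventSeverity.NORMAL_HARDWARE_ERROR,
        EventSeverity.CRITICAL_HARDWARE_ERROR,
    )
```

A value outside 1 through 6 raises ValueError here, which seemed safer for a sketch than silently accepting it.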

Per your request, I researched which event messages are sent down the SpaceWeather beacon.  In the process, I found a couple of event messages that indeed do have differing severities associated with them based on the detected situation.  Below is a list of the SpWx event messages with event number, brief description, and severity:
 101: Reset Due to Processor Watchdog Timeout:  Crit S/w Err
 105: Spurious Processor Reset:                 Crit S/w Err
                                                Crit H/w Err
 111: Board Reset Commanded:                    Info
 112: Transition to Maint Mode Commanded:       Info
 113: Transition to Ops Mode Commanded:         Info
 116: Cold Reset Due to Too Many Warm Resets:   Crit S/w Err
 117: Reset Due to Power Cycle:                 Crit H/w Err
 118: Event Response Reset:                     Crit S/w Err
 850: IP Event Detected:                        Info
 851: IP Bad Code:                              Info
 965: Impending Power Off:                      Info
 972: Unexpected Checksum Value:                Info (unprotek disk)
                                                Crit H/w Err (other)
 999: Critical Telemetry Autonomy Rule Tripped: Info
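
Since the question was whether one HKEVENTCODE can carry more than one HKEVENTSEV, a monitor probably wants a set of severities per code rather than a single value. The Python sketch below transcribes the list above into such a table. The names SPWX_EVENTS, check_event, and multi_severity_codes are hypothetical, this is not how cfgmon is actually implemented, and the integer severity values are assumed to match the 1-6 levels described earlier:

```python
# Severity values assumed to follow the 1-6 levels described earlier.
INFO, CRIT_SW, CRIT_HW = 1, 5, 6

# HKEVENTCODE -> (description, set of severities listed for that code)
SPWX_EVENTS = {
    101: ("Reset Due to Processor Watchdog Timeout", {CRIT_SW}),
    105: ("Spurious Processor Reset", {CRIT_SW, CRIT_HW}),
    111: ("Board Reset Commanded", {INFO}),
    112: ("Transition to Maint Mode Commanded", {INFO}),
    113: ("Transition to Ops Mode Commanded", {INFO}),
    116: ("Cold Reset Due to Too Many Warm Resets", {CRIT_SW}),
    117: ("Reset Due to Power Cycle", {CRIT_HW}),
    118: ("Event Response Reset", {CRIT_SW}),
    850: ("IP Event Detected", {INFO}),
    851: ("IP Bad Code", {INFO}),
    965: ("Impending Power Off", {INFO}),
    972: ("Unexpected Checksum Value", {INFO, CRIT_HW}),
    999: ("Critical Telemetry Autonomy Rule Tripped", {INFO}),
}


def check_event(code, severity):
    """Classify an observed (HKEVENTCODE, HKEVENTSEV) pair against the table."""
    entry = SPWX_EVENTS.get(code)
    if entry is None:
        return "not_spwx"
    _, severities = entry
    return "expected" if severity in severities else "unexpected_severity"


def multi_severity_codes():
    """Event codes listed above with more than one severity."""
    return sorted(c for c, (_, sevs) in SPWX_EVENTS.items() if len(sevs) > 1)
```

With this table, multi_severity_codes() picks out 105 and 972, the two codes whose severity depends on the detected situation.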

Hope this helps.

-- Ladd

At 02:52 PM 1/13/2006 -0500, Ladd Wheeler wrote:
Nathan --

Theoretically, a given event code may have different severity levels depending on the situation.  I can't offhand think of an event code where that actually happens, but my offhand thoughts are not to be taken as infallible.

Please give me a call to chat about some thoughts I have about SpWx event notification.

-- Ladd

At 02:24 PM 1/13/2006 -0500, Nathan Rich (Contractor) wrote:
Ladd,

I'm thinking about how to implement the cfgmon for Event messages. One question I have is what determines the Event Severity (HKEVENTSEV) for an event message? Does each value of HKEVENTCODE have only one possible severity, or can a given event code have different severities? Thanks,

-n
---------------------------------------------------------------
Ladd Wheeler