Difference between revisions of "UM:Event Processing"

From NetXMS Wiki

Revision as of 00:00, 17 May 2012

Event Processing

Event processing is one of the core components of NetXMS. It determines how the monitoring system will react to various events.


Event Processing Overview

The following flowchart outlines event flow inside the monitoring system:

Figure 9: Event flow inside the monitoring system




As the flowchart shows, events can come from various sources: polling processes (status, configuration, discovery, and data collection), SNMP traps, and external applications that submit events directly via the client library. All incoming events go to a single event queue for processing. A special process, called Event Processor, takes events from the queue one by one and matches them against the Event Processing Policy. As a result, alarms may be generated and actions may be executed. If an event has the "write to log" attribute set, it is written to the NetXMS event log at the end of processing.

Although it may seem that processing all events one by one may become a bottleneck in the system, this should not be the case. Event processor is highly optimized, and all potentially long operations (like action execution) are performed by separate processes.
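
The single-queue design described above can be illustrated with a small conceptual model. This is a Python sketch, not NetXMS code: the function `process_events` and its parameters are hypothetical, and a thread pool stands in for the separate processes that run long operations. The point it shows is that the processing loop only matches events and hands actions off, so a slow action does not hold up the queue.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def process_events(events, match, action, workers=4):
    """Take events from a single queue one by one (hypothetical model).

    Matched events have their action submitted to a worker pool, so the
    loop itself never waits for an action to finish.
    """
    q = queue.Queue()
    for event in events:
        q.put(event)

    processed = []   # events in the order the loop handled them
    futures = []     # pending action results, in submission order
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while not q.empty():
            event = q.get()
            if match(event):
                futures.append(pool.submit(action, event))
            processed.append(event)
    return processed, [f.result() for f in futures]
```

For instance, `process_events(["a", "b", "c"], lambda e: e != "b", str.upper)` handles all three events in order and runs the action only for the two that match.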

Event Processing Policy

Actions taken by the event processor for any specific event are determined by a set of rules called Event Processing Policy. Every rule has two parts: a matching part, which determines whether the rule applies to the current event, and an action part, which determines the actions to be taken for matched events. The matching part consists of four fields:

Attribute Description
Source Event's source node. This field can be set to any, which matches any node, or contain a list of nodes, subnets, or containers. If you specify subnet or container, any node within it will be matched.
Event Event code. This field can be set to any, which matches any event, or list of event codes.
Severity Event's severity. This field contains selection of event severities to be matched.
Script Optional matching script written in NXSL. If this field is empty, no additional checks are performed. Otherwise, the event is considered matched only if the script returns a non-zero (TRUE) return code. For more information about the NetXMS scripting language, please consult the chapter NetXMS Scripting Language (NXSL) in this manual.

In the action part you can configure alarm generation, situation update, and the list of actions to be executed. Every rule can also have a free-form textual comment.

Each event passes through all rules in the policy, so if it matches more than one rule, the actions specified in all matched rules are executed. You can change this behavior by setting the Stop Processing flag for a rule. If this flag is set and the rule matches, processing of the current event stops at that rule.
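
The matching and Stop Processing behavior can be sketched as a short conceptual model. This is Python written for illustration only, not NetXMS code: the `Event` and `Rule` classes and the `apply_policy` function are hypothetical, `None` stands for "any", and source matching is simplified to a flat set of node names (containers and subnets are assumed to be already expanded).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Event:
    source: str
    code: str
    severity: str
    message: str = ""


@dataclass
class Rule:
    name: str
    sources: Optional[set] = None      # None models "any" node
    codes: Optional[set] = None        # None models "any" event
    severities: Optional[set] = None   # None models "all severities selected"
    script: Optional[Callable[[Event], bool]] = None  # optional extra check
    stop_processing: bool = False

    def matches(self, event):
        """All four matching fields must accept the event."""
        if self.sources is not None and event.source not in self.sources:
            return False
        if self.codes is not None and event.code not in self.codes:
            return False
        if self.severities is not None and event.severity not in self.severities:
            return False
        if self.script is not None and not self.script(event):
            return False
        return True


def apply_policy(rules, event):
    """Return names of matched rules, honoring the Stop Processing flag."""
    matched = []
    for rule in rules:
        if rule.matches(event):
            matched.append(rule.name)
            if rule.stop_processing:
                break  # later rules never see this event
    return matched
```

In this model, a rule with Stop Processing set hides the event from every rule placed after it, so rule order matters.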

You can create and modify Event Processing Policy using the Event Processing Policy Editor. To open the editor window, press F9, or click Control Panel on the View menu and then click the Event Processing Policy icon in the Control Panel window.


Examples:




The first example rule specifies that two e-mail actions are executed for every major or critical event originating from any node within the "IPSO" container.




The second example rule specifies that for the events NOKIA_CFG_CHANGED, NOKIA_CFG_SAVED, NOKIA_LOW_DISK_SPACE, and NOKIA_NO_DISK_SPACE, originating from any node, the system generates an alarm with text "%m" (which means "use event's message text") and severity equal to the event's severity.

Alarms

Alarms Overview

As a result of event processing, some events can show up as alarms. Usually an alarm represents something that needs the attention of network administrators or network control center operators, for example low free disk space on a server. Every alarm has the following attributes:

Attribute Description
Creation time Time when alarm was created.
Last change time Time when alarm was last changed (for example, acknowledged).
State Alarm can be in one of three states:
    Outstanding: new alarm;
    Acknowledged: a network administrator who sees the alarm may acknowledge it to indicate that somebody is already aware of the problem and working on it;
    Terminated: inactive alarm. When the problem is solved, a network administrator can terminate the alarm. This removes the alarm from the active alarms list, so it is no longer shown in the console, but the alarm record remains in the database.
Message Message text (usually derived from originating event's message text).
Severity Alarm's severity - Normal, Warning, Minor, Major, or Critical.
Source Source node (derived from originating event).
Key Text string used to identify duplicate alarms and for automatic alarm termination.

Generating Alarms

To generate alarms from events, edit the "Alarm" field in the appropriate rule of the Event Processing Policy. The alarm configuration dialog looks like this:

Figure 10: Alarm configuration dialog



Select the Generate new alarm radio button to enable alarm generation from the current rule. In the Message field enter the alarm's text, and in the alarm key field enter the value that will be used for detecting repeated alarms and for automatic alarm termination. In both fields you can use the macros described in the Macros for Event Processing chapter.

You can also configure an additional event to be sent if the alarm stays in the Outstanding state for a given period of time. To enable this, enter the desired number of seconds in the Seconds field and select the event to be sent. Entering a value of 0 for seconds disables sending of the additional event.
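
The timeout setting can be read as a simple condition. The following Python sketch is only a conceptual model: the function `timeout_event_due` and its parameters are hypothetical, and treating an alarm whose age exactly equals the timeout as due is an assumption.

```python
def timeout_event_due(state, seconds_outstanding, timeout_seconds):
    """Decide whether the additional event should be sent (hypothetical model)."""
    if timeout_seconds == 0:
        return False  # a value of 0 disables the additional event
    # Only alarms still in the Outstanding state qualify.
    # Assumption: an age equal to the timeout already counts as due.
    return state == "Outstanding" and seconds_outstanding >= timeout_seconds
```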

Automatic Alarm Termination

You can terminate all active alarms with a given key as a reaction to an event. To do this, select the Terminate alarm radio button in the alarm configuration dialog and enter a value for the alarm key. In that field you can use the macros described in the Macros for Event Processing chapter.
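
The role of the alarm key in duplicate detection and automatic termination can be sketched as follows. This is a conceptual Python model, not NetXMS code: the `Alarm` and `AlarmList` classes are hypothetical, and updating the existing alarm and counting repeats when a duplicate arrives is an assumption about how duplicates are handled. The manual states only that the key identifies duplicate alarms and selects the alarms to terminate.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    key: str
    message: str
    severity: str
    state: str = "Outstanding"
    repeat_count: int = 1  # hypothetical counter, for illustration only


class AlarmList:
    def __init__(self):
        self.alarms = []  # terminated alarms stay here, like records in a database

    def _active(self, key):
        return [a for a in self.alarms
                if a.key == key and a.state != "Terminated"]

    def generate(self, key, message, severity):
        """Create an alarm, or update the active alarm that has the same key."""
        active = self._active(key)
        if active:
            alarm = active[0]
            # Assumption: a duplicate refreshes the existing alarm.
            alarm.message, alarm.severity = message, severity
            alarm.repeat_count += 1
            return alarm
        alarm = Alarm(key, message, severity)
        self.alarms.append(alarm)
        return alarm

    def terminate(self, key):
        """Terminate every active alarm with this key; return how many."""
        active = self._active(key)
        for alarm in active:
            alarm.state = "Terminated"
        return len(active)
```

In this model, a rule for a "problem" event and a rule for the corresponding "recovery" event must produce the same key for the second to terminate the alarm raised by the first.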


Situations

Situations Overview

Situations are a special type of event processing object that allows you to track the current state of your infrastructure and process events accordingly. Each situation has one or more instances, and each instance has one or more attributes. Situation objects allow you to store information about the current situation in attributes and then use this information in event processing. For example, suppose you have one service (service A) that depends on another (service B), and in case of service B failure you wish to get an alarm about the service B failure but not about the consequent service A failure. To accomplish this, you can do the following:

  1. Create situation object named "ServiceStatus";
  2. In event processing policy, for processing of event indicating service B failure, add situation attribute update: update situation "ServiceStatus", instance "Service_B", set attribute "status" to "failed";
  3. In event processing policy, for the rule generating an alarm in case of service A failure, add additional filtering using a script, so that the rule matches only if service B has not failed. Your script may look like the following:
sub main()
{
    s = FindSituation("ServiceStatus", "Service_B");
    if (s != NULL)
    {
        if (s->status == "failed")
            return 0; // Don't match rule
    }
    return 1; // Match rule
}
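
The same three steps can also be modeled outside NetXMS to make the data structure explicit. The following Python sketch is purely illustrative: `SituationStore` and `service_a_rule_matches` are hypothetical names, and the nested dictionary is only an assumed representation of the situation, instance, and attribute hierarchy.

```python
class SituationStore:
    """Hypothetical model: situation -> instance -> attribute -> value."""

    def __init__(self):
        self._data = {}

    def update(self, situation, instance, attribute, value):
        # Instances and attributes are created on first update.
        self._data.setdefault(situation, {}).setdefault(instance, {})[attribute] = value

    def find(self, situation, instance):
        # Returns the attribute dictionary, or None if nothing was recorded.
        return self._data.get(situation, {}).get(instance)


def service_a_rule_matches(store):
    """Mirror of the matching script: skip the rule while service B is failed."""
    s = store.find("ServiceStatus", "Service_B")
    if s is not None and s.get("status") == "failed":
        return False  # don't match rule
    return True  # match rule
```

In this model the filter works only if the service B failure has been recorded before the service A event is evaluated, and it keeps suppressing the rule until some other update changes the "status" attribute.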


Defining Situations

Situations can be configured via the management console. To open the situations editor, select View in the main menu, then Situations. You will see the situations tree. At the top of the tree is an abstract root element. Below it are all defined situations; initially there are none, so you will see only the root element. You can create a situation either by right-clicking the root element and selecting Create from the pop-up menu, or by selecting Create under Situation in the main menu.

The next level of the tree, below situations, contains situation instances. Initially it is empty, but once situations start being updated, you will see the existing instances of each situation.

Updating Situations

Situations can be updated via Event Processing Policy. To update a situation, edit the Situation field in the appropriate rule. The situation update dialog looks like this:


Figure 11: Situation update dialog


You can select the situation to update and enter the instance name and the attributes to be set. In the instance name and in attribute values you can use the same macros as in alarm generation.