Difference between revisions of "UM:Event Processing"

= Event Processing =
{{deprecated}}Information moved to documentation: https://www.netxms.org/documentation/adminguide/event-processing.html

Event processing is one of the core components of NetXMS. It determines how the monitoring system reacts to various events.

== Event Processing Overview ==
The following flowchart outlines event flow inside the monitoring system:
 
'''Figure 9: Event flow inside the monitoring system'''
 
 
<center>[[Image:]]</center>
 
 
As the flowchart shows, events can come from various sources: polling processes (status, configuration, discovery, and data collection), SNMP traps, and external applications that submit events directly via the client library. All incoming events go to a single event queue for processing. A special process, called the ''Event Processor'', takes events from the queue one by one and matches them against the Event Processing Policy. As a result, alarms may be generated and actions may be executed. If an event has the ''write to log'' attribute set, it is written to the NetXMS event log at the end of processing.
 
Although processing all events one by one may look like a potential bottleneck, this should not be the case. The event processor is highly optimized, and all potentially long operations (such as action execution) are performed by separate processes.
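The flow described above can be sketched roughly as follows. This is a simplified Python model, not NetXMS code: the function name <code>run_event_processor</code> and the <code>match_rules</code> and <code>execute_action</code> callbacks are hypothetical, and a thread pool stands in for the separate processes that perform long operations.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def run_event_processor(events, match_rules, execute_action, workers=4):
    """Take events from a single queue one by one; offload slow actions.

    match_rules(event) returns the actions selected by the policy
    (hypothetical callback). execute_action(action, event) performs one
    action and may be slow, so it runs on a worker pool (an assumption
    standing in for separate processes).
    """
    event_queue = queue.Queue()
    for event in events:
        event_queue.put(event)
    event_queue.put(None)  # sentinel marking the end of input

    processed = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            event = event_queue.get()
            if event is None:
                break
            # Matching is cheap and stays on the processing loop.
            for action in match_rules(event):
                # Execution is handed off so the loop is never blocked.
                pool.submit(execute_action, action, event)
            processed.append(event)
    # Leaving the "with" block waits for all submitted actions to finish.
    return processed
```

The point of the design is that the loop only decides what to do, so a slow e-mail or script action does not delay the next event in the queue.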
 
 
== Event Processing Policy ==
Actions taken by the event processor for any specific event are determined by a set of rules called the ''Event Processing Policy''. Every rule has two parts: a matching part, which determines whether the rule applies to the current event, and an action part, which determines the actions to be taken for matched events. The matching part consists of four fields:
 
 
{| class="prettytable"
| ''Source''
| Event's source node. This field can be set to ''any'', which matches any node, or contain a list of nodes, subnets, or containers. If you specify a subnet or container, any node within it is matched.
 
|-
| ''Event''
| Event code. This field can be set to ''any'', which matches any event, or contain a list of event codes.
 
|-
| ''Severity''
| Event's severity. This field contains a selection of event severities to be matched.
 
|-
| ''Script''
| Optional matching script written in NXSL. If this field is empty, no additional checks are performed. Otherwise, the event is considered matched only if the script returns a non-zero (TRUE) value. For more information about the NetXMS scripting language, see the [#_NetXMS_Scripting_Language NetXMS Scripting Language] chapter in this manual.
 
|}
In the action part you can configure alarm generation, situation updates, and a list of actions to be executed. Every rule can also have a free-form textual comment.
 
Each event passes through all rules in the policy, so if it matches more than one rule, the actions specified in all matched rules are executed. You can change this behavior by setting the ''Stop Processing'' flag for a rule. If this flag is set and the rule matches, processing of the current event stops at that rule.
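The matching logic and the ''Stop Processing'' flag can be illustrated with a small Python model. It is only a sketch under stated assumptions: the <code>Event</code> and <code>Rule</code> classes and the <code>process</code> function are hypothetical names, sources are reduced to a flat set of node names (subnets and containers are not modelled), and the matching script is represented by a plain Python callable rather than NXSL.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Event:
    source: str
    code: str
    severity: str
    message: str = ""


@dataclass
class Rule:
    sources: Optional[set] = None      # None models "any" source
    events: Optional[set] = None       # None models "any" event
    severities: Optional[set] = None   # None models "all severities"
    script: Optional[Callable[[Event], bool]] = None  # empty = no check
    actions: list = field(default_factory=list)
    stop_processing: bool = False

    def matches(self, event):
        """All four fields of the matching part must accept the event."""
        if self.sources is not None and event.source not in self.sources:
            return False
        if self.events is not None and event.code not in self.events:
            return False
        if self.severities is not None and event.severity not in self.severities:
            return False
        if self.script is not None and not self.script(event):
            return False
        return True


def process(event, policy):
    """Collect the actions of every matched rule, in policy order."""
    actions = []
    for rule in policy:
        if rule.matches(event):
            actions.extend(rule.actions)
            if rule.stop_processing:
                break  # later rules are not evaluated for this event
    return actions
```

For example, with a policy of three rules where the second one has <code>stop_processing=True</code>, an event that matches all three produces the actions of the first two rules only.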
 
You can create and modify the Event Processing Policy using the '''Event Processing Policy Editor'''. To open the '''Event Processing Policy Editor''' window, press F9, or on the '''View''' menu click '''Control Panel''' to open the '''Control Panel''' window and then click the '''Event Processing Policy''' icon.
 
 
Examples:
 
 
[[Image:]]
 
 
This rule defines that for every major or critical event originating from any node within the "IPSO" container, two e-mail actions should be executed.
 
 
[[Image:]]
 
 
This rule defines that for the events NOKIA_CFG_CHANGED, NOKIA_CFG_SAVED, NOKIA_LOW_DISK_SPACE, and NOKIA_NO_DISK_SPACE, originating from any node, the system should generate an alarm with the text "%m" (which means "use the event's message text") and a severity equal to the event's severity.
 
 
== Alarms ==
=== Alarms Overview ===
As a result of event processing, some events can be shown as ''alarms''. Usually an alarm represents something that needs the attention of network administrators or network control center operators, for example low free disk space on a server. Every alarm has the following attributes:
 
 
 
{| class="prettytable"
| Creation time
| Time when alarm was created.
 
|-
| Last change time
| Time when alarm was last changed (for example, acknowledged).
 
|-
| State
| Alarm can be in one of three states:
 
 
{| class="prettytable"
| Outstanding
| New alarm;
 
|-
| Acknowledged
| When a network administrator sees an alarm, they may ''acknowledge'' it to indicate that somebody is already aware of the problem and working on it;
 
|-
| Terminated
| Inactive alarm. When the problem is solved, the network administrator can terminate the alarm. This removes the alarm from the active alarms list, so it is no longer shown in the console, but the alarm record remains in the database.
 
|}
 
 
 
 
 
|-
| Message
| Message text (usually derived from originating event's message text).
 
|-
| Severity
| Alarm's severity – Normal, Warning, Minor, Major, or Critical.
 
|-
| Source
| Source node (derived from originating event).
 
|-
| Key
| Text string used to identify duplicate alarms and for automatic alarm termination.
 
|}
=== Generating Alarms ===
To generate alarms from events, edit the "Alarm" field in the appropriate rule of the Event Processing Policy. The alarm configuration dialog looks like this:
 
'''Figure 10: Alarm configuration dialog'''
 
[[Image:]]
 
 
Select the '''Generate new alarm''' radio button to enable alarm generation from the current rule. In the '''Message''' field enter the alarm's text, and in the alarm key field enter a value that will be used for repeated alarm detection and automatic alarm termination. In both fields you can use the macros described in the [#_Macros_for_Event_Processing Macros for Event Processing] chapter.
 
You can also configure an additional event to be sent if the alarm stays in the '''Outstanding''' state for a given period of time. To enable this, enter the desired number of seconds in the '''Seconds''' field and select the event to be sent. Entering a value of 0 for seconds disables the additional event.
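The timeout rule can be expressed as a small predicate. This is an illustrative Python sketch only: the function name <code>timeout_event_due</code> is hypothetical, and it assumes the period is counted from the alarm's creation time, which the text above does not state explicitly.

```python
def timeout_event_due(state, created_at, now, timeout_seconds):
    """Return True when the additional event should be sent.

    state           -- current alarm state, e.g. "Outstanding"
    created_at, now -- timestamps in seconds (assumption: the period
                       is measured from alarm creation)
    timeout_seconds -- value of the Seconds field; 0 disables the feature
    """
    if timeout_seconds <= 0:
        return False  # additional event sending is disabled
    if state != "Outstanding":
        return False  # acknowledged or terminated alarms do not trigger it
    return now - created_at >= timeout_seconds
```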
 
 
=== Automatic Alarm Termination ===
You can terminate all active alarms with a given key in reaction to an event. To do this, select the '''Terminate alarm''' radio button in the alarm configuration dialog and enter a value for the alarm key. In that field you can use the macros described in the [#_Macros_for_Event_Processing Macros for Event Processing] chapter.
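The role of the alarm key in both duplicate detection and automatic termination can be sketched as follows. This Python model is an illustration, not the NetXMS implementation: the <code>Alarm</code> and <code>AlarmList</code> classes are hypothetical, and the handling of a repeated alarm (updating the existing alarm and incrementing a counter) is an assumption about what "duplicate detection" does.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    key: str
    message: str
    severity: str
    state: str = "Outstanding"
    repeat_count: int = 1  # assumption: duplicates are counted


class AlarmList:
    def __init__(self):
        self.alarms = []  # terminated alarms stay in the record list

    def _active(self, key):
        return [a for a in self.alarms
                if a.key == key and a.state != "Terminated"]

    def generate(self, key, message, severity):
        """Create a new alarm, or update the active one with the same key."""
        active = self._active(key)
        if active:
            alarm = active[0]
            alarm.message = message
            alarm.severity = severity
            alarm.repeat_count += 1
            return alarm
        alarm = Alarm(key, message, severity)
        self.alarms.append(alarm)
        return alarm

    def terminate(self, key):
        """Terminate all active alarms with the given key; return the count."""
        matched = self._active(key)
        for alarm in matched:
            alarm.state = "Terminated"
        return len(matched)
```

A typical pairing would use the same key in a rule that generates an alarm for a "problem" event and in a rule that terminates it for the matching "recovery" event.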
 
 
== Situations ==
=== Situations Overview ===
Situations are a special type of event processing object that lets you track the current state of your infrastructure and process events according to it. Each situation has one or more instances, and each instance has one or more attributes. Situation objects allow you to store information about the current situation in attributes and then use this information in event processing. For example, suppose one service (service A) depends on another (service B), and when service B fails you want an alarm about the service B failure but not about the consequent service A failure. To accomplish this, you can do the following:
 
# Create situation object named "ServiceStatus";
# In the event processing policy, for the rule processing the event that indicates service B failure, add a situation attribute update: update situation "ServiceStatus", instance "Service_B", set attribute "status" to "failed";
# In the event processing policy, for the rule generating an alarm in case of service A failure, add additional filtering using a script, so that the rule matches only if service B has not failed. Your script may look like the following:
 
 sub main()
 {
     s = FindSituation("ServiceStatus", "Service_B");
     if (s != NULL)
     {
         if (s->status == "failed")
             return 0;  // Don't match rule
     }
     return 1;  // Match rule
 }
 
 
=== Defining Situations ===
Situations can be configured via the management console. To open the situations editor, select '''View''' in the main menu, then '''Situations'''. You will see the situations tree. At the top of the tree is an abstract root element. Below it are all defined situations; initially there are none, so you will see only the root element. You can create a situation either by right-clicking the root element and selecting '''Create''' from the pop-up menu, or by selecting '''Create''' under '''Situation''' in the main menu.
 
The next level in the tree, below situations, contains situation instances. Initially it is empty, but once situations start being updated, you will see the existing instances for each situation.
 
 
=== Updating Situations ===
Situations can be updated via the Event Processing Policy. To update a situation, edit the '''Situation''' field in the appropriate rule. The situation update dialog looks like the following:
 
 
'''Figure 11: Situation update dialog'''
 
[[Image:]]
 
Select the situation to update, and enter the instance name and the attributes to be set. In the instance name and attribute values you can use the same macros as in alarm generation.
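The relationship between situations, instances, and attributes, and the way an update in one rule affects matching in another, can be modelled in a few lines of Python. This is a conceptual sketch only: <code>SituationStore</code> and <code>service_a_rule_matches</code> are hypothetical names, and the model mirrors the "ServiceStatus" example from the Situations Overview rather than any NetXMS API.

```python
from collections import defaultdict


class SituationStore:
    """Situation name -> instance name -> {attribute: value}."""

    def __init__(self):
        self._data = defaultdict(dict)

    def update(self, situation, instance, **attributes):
        """Model of a situation update performed by a policy rule."""
        self._data[situation].setdefault(instance, {}).update(attributes)

    def find(self, situation, instance):
        """Return the instance's attributes, or None if it does not exist."""
        return self._data.get(situation, {}).get(instance)


def service_a_rule_matches(store):
    """Filter for the service A alarm rule: skip it while service B is failed."""
    instance = store.find("ServiceStatus", "Service_B")
    if instance is not None and instance.get("status") == "failed":
        return False  # don't match rule
    return True       # match rule
```

In this model an instance appears only after its first update, which corresponds to the instance level of the situations tree being empty until situations start being updated.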

Latest revision as of 18:13, 13 September 2022

This Wiki is deprecated and we are currently migrating the remaining pages into the product documentation (Admin Guide, NXSL Guide).
