Difference between revisions of "UM:Data Collection"

=== Instance ===
Each DCI has an ''Instance'' attribute, which is a free-form text string passed as the 6<sup>th</sup> parameter to events associated with thresholds. You can use this parameter to distinguish between similar events related to different instances of the same entity. For example, if you have an event generated when a file system is low on free space, you can set the ''Instance'' attribute to the file system's mount point.


=== Threshold Processing ===




As you can see from this flowchart, threshold order is very important. Consider the following example: you have a DCI representing CPU utilization on the node, and you want two different events to be generated: one when CPU utilization exceeds 50%, and another when it exceeds 90%. What happens when you place the threshold "> 50" first and "> 90" second? Table 4 shows the values received from the host and the actions taken by the monitoring system (assuming that all thresholds are initially unarmed):

'''Table 4: Actions taken by monitoring system'''

{| class="wikitable"
|-
! Value !! Action
|-
| 10
| Nothing will happen.
|-
| 55
| When checking the first threshold ("> 50"), the system will find that it is not active, but its condition evaluates to true. So, the system will set the threshold state to "active" and generate the event associated with it.
|-
| 70
| When checking the first threshold ("> 50"), the system will find that it is already active and its condition evaluates to true. So, the system will stop threshold checking and will not take any action.
|-
| 95
| When checking the first threshold ("> 50"), the system will find that it is already active and its condition evaluates to true. So, the system will stop threshold checking and will not take any action.
|}
Please note that the second threshold never actually fires, because it is "masked" by the first threshold. To achieve the desired result, you should place the threshold "> 90" first and the threshold "> 50" second.
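
The masking effect can be reproduced with a short simulation. The following Python sketch is only an illustration of the behaviour described in the table above, not the server's actual implementation: the function name <code>process</code> is hypothetical, only thresholds of the form "> limit" are modelled, and it assumes that an active threshold whose condition becomes false is simply deactivated before the next threshold is checked.

```python
def process(values, limits):
    """Simulate ordered checking of "> limit" thresholds (illustrative only).

    values: polled DCI values, oldest first.
    limits: threshold limits in the configured order.
    Returns a list of (value, limit) pairs, one per generated event.
    """
    active = {limit: False for limit in limits}  # all thresholds start unarmed
    events = []
    for value in values:
        for limit in limits:
            if value > limit:
                if not active[limit]:
                    # Inactive threshold whose condition is true:
                    # activate it and generate its event.
                    active[limit] = True
                    events.append((value, limit))
                # Condition is true: stop checking the remaining thresholds.
                break
            # Condition is false. Assumption: the threshold is deactivated
            # and the next threshold in the list is checked.
            active[limit] = False
    return events


polled = [10, 55, 70, 95]
masked = process(polled, [50, 90])   # "> 50" first, "> 90" second
ordered = process(polled, [90, 50])  # "> 90" first, "> 50" second
print(masked)
print(ordered)
```

With "> 50" placed first, only the value 55 produces an event; with "> 90" placed first, both 55 and 95 produce events, which is the desired result.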


You can disable threshold ordering by checking the '''Always process all thresholds''' checkbox. If it is checked, the system will always process all thresholds.


=== Threshold Configuration ===
First, you have to select which value will be checked:


{| class="wikitable"
|-
| ''last polled value'' || The last value will be used. If the number of polls is set to more than 1, the condition will evaluate to true only if it is true for each individual value of the last ''n'' polls.
|-
| ''average value'' || An average value for the last ''n'' polls will be used (you have to configure the desired number of polls).
|-
| ''mean deviation'' || A mean absolute deviation for the last ''n'' polls will be used (you have to configure the desired number of polls). Additional information on how mean absolute deviation is calculated can be found here: [http://en.wikipedia.org/wiki/Mean_deviation http://en.wikipedia.org/wiki/Mean_deviation].
|-
| ''diff with previous value'' || A delta between the last and previous values will be used. If the DCI data type is string, the system will use 0 if the last and previous values match, and 1 if they don't.
|-
| ''data collection error'' || An indicator of data collection error. Instead of the DCI's value, the system will use 0 if data collection was successful, and 1 if there was a data collection error. You can use this type of threshold to catch situations when the DCI's value cannot be retrieved from the agent.
|}
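
To make the options above concrete, here is a minimal Python sketch of how each checked value could be derived from a DCI's recent history. It is an assumption-laden illustration rather than the product's code: the function names, the <code>kind</code> strings and the <code>error</code> flag are all hypothetical, and the history is assumed to be a list with the most recent value last.

```python
from statistics import mean


def checked_value(kind, history, n=1, error=False):
    """Return the value a threshold would compare (illustrative only).

    kind:    hypothetical selector for the option being modelled.
    history: collected values, most recent last.
    n:       configured number of polls.
    error:   True if the latest collection attempt failed.
    """
    window = history[-n:]  # values from the last n polls
    if kind == "last":
        return history[-1]
    if kind == "average":
        return mean(window)
    if kind == "mean_deviation":
        centre = mean(window)
        # Mean absolute deviation: average distance from the mean.
        return mean(abs(v - centre) for v in window)
    if kind == "diff":
        last, previous = history[-1], history[-2]
        if isinstance(last, str):
            # String DCI: 0 if the values match, 1 if they don't.
            return 0 if last == previous else 1
        return last - previous
    if kind == "error":
        return 1 if error else 0
    raise ValueError(f"unknown kind: {kind}")


def last_values_match(history, n, predicate):
    """For "last polled value" with n > 1: true only if the condition
    holds for each individual value of the last n polls."""
    return all(predicate(v) for v in history[-n:])


print(checked_value("average", [10, 20, 30], n=2))
print(checked_value("diff", ["ok", "failed"]))
print(last_values_match([40, 60, 70], 2, lambda v: v > 50))
```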
 


Second, you have to select a comparison function. Please note that not all functions can be used for all data types. Below is a compatibility table: