Difference between revisions of "UM:Basic Concepts"

1,775 bytes removed ,  18:13, 13 September 2022
{{deprecated}}{{DISPLAYTITLE:Basic Concepts}}
= Overview of System Architecture =


The system is built on a three-tier architecture. The monitoring agents (either NetXMS native agents or SNMP agents) collect information and deliver it to the monitoring server for processing and storage. Network administrators can configure the system and access collected data using the cross-platform Java-based Management Console, the Web Interface, or the Mobile Management Console. The following diagram gives an overview of the NetXMS system architecture.
::[[File:um_architecture.png|thumb|650px|none|alt=Architecture|NetXMS Architecture]]


All collected data and system configuration are stored in an SQL database. You can choose Oracle, Microsoft SQL Server, PostgreSQL, MySQL, or SQLite as your database engine. The database server can run on the same physical machine as the NetXMS server or on a separate server.
The NetXMS web-based user interface is built as a single WAR file and can be deployed to any compatible application server (such as Tomcat or Jetty). It is a separate component and can be installed on the same physical machine as the NetXMS server or on a remote server.
The system has been designed to be easily extensible, so all three tiers (server, agent, and client) have a modular structure. The following figure shows the main software layers of the NetXMS system.


::[[File:Main_Software_Layer.png|thumb|650px|none|alt=Software layers|NetXMS Software Layers]]


All system components use two libraries: the NetXMS Foundation Library and the Communication Library. These libraries provide communication between all system components. The server also has an underlying DB Driver API layer, which provides a uniform database engine interface through database drivers. This approach allows developers to add support for a new database engine in a matter of days without changing or even recompiling the server code.
= Objects =


All network infrastructure monitored by NetXMS is represented inside the monitoring system as a set of objects. Each object represents one physical or logical entity (such as a host or network interface) or a group of them. Objects are organized into a hierarchical structure. The object classes are as follows:
{{Objects}}


{| class="wikitable" style="width: 70%"
! Object Class || Description
|-
| Entire Network || Abstract object representing the root of the IP topology tree. All subnet objects are located under it. The system can have only one object of this class.
|-
| Subnet || Object representing an IP subnet. Objects of this class are typically created automatically by the system to reflect its knowledge of the IP topology.
|-
| Node || Object representing a physical host or network device. These objects can be created either manually by an administrator or automatically during the network discovery process.
|-
| Cluster || Object representing a cluster consisting of two or more hosts.
|-
| Interface || Object representing a network interface of a node. These objects are created automatically by the system during configuration polls.
|-
| Network Service || Object representing a network service running on a node (such as http or ssh).
|-
| VPN Connector || Object representing a VPN tunnel endpoint. Such objects can be created to add VPN tunnels to the network topology known to the NetXMS server.
|-
| Service Root || Abstract object representing the root of your service tree. The system can have only one object of this class.
|-
| Container || Grouping object that can contain nodes, subnets, clusters, conditions, or other containers. With container objects you can build an object tree that represents the logical hierarchy of IT services in your organization.
|-
| Condition || Object representing a complex condition, such as "cpu on node1 is overloaded and node2 is down for more than 10 minutes".
|-
| Template Root || Abstract object representing the root of your template tree.
|-
| Template Group || Grouping object that can contain templates or other template groups.
|-
| Template || Data collection template. See the Data Collection section for more information about templates.
|}


Every object has a set of attributes. Some of them are common to all classes (like id and name), while others depend on the object class: for example, only node objects have the "SNMP community string" attribute.
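
As a rough illustration of the object model, the hierarchy can be sketched as a tree in which every object carries the common attributes plus a class-specific attribute set. This is a minimal Python sketch, not the NetXMS data model: the NetObject class, the numeric ids, the snmp_community key, and the placement of the node under a subnet are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NetObject:
    # Common attributes shared by every object.
    obj_id: int
    name: str
    obj_class: str
    # Class-specific attributes, e.g. an SNMP community string on a node.
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, child: "NetObject") -> "NetObject":
        self.children.append(child)
        return child

    def walk(self, depth: int = 0):
        """Yield (depth, object) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


root = NetObject(1, "Entire Network", "Entire Network")
subnet = root.add(NetObject(2, "10.0.0.0/24", "Subnet"))
node = subnet.add(NetObject(3, "node1", "Node", {"snmp_community": "public"}))
node.add(NetObject(4, "eth0", "Interface"))

# Indented outline of the tree, one line per object.
outline = [("  " * depth) + f"{obj.obj_class}: {obj.name}" for depth, obj in root.walk()]
```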


= DCI (Data Collection Items) =
Every node can have many parameters, such as CPU utilization, amount of free memory, or disk space usage. The management server can collect these parameters, check them for threshold violations, and store them in the database. In NetXMS, parameters configured for collection are called Data Collection Items, or DCI for short. One DCI represents one parameter of a node, and an unlimited number of DCIs can be configured for any node.


Each data collection item has various attributes controlling its handling:
{| class="wikitable" style="width: 70%"
! Attribute || Description
|-
| Description || A free-form text string describing the DCI. It is not used by the server and is intended to help operators understand the collected information.
|-
| Origin || Origin of data (or method of obtaining data). Possible origins are NetXMS agent, SNMP agent, CheckPoint SNMP agent, or Internal (data generated inside the NetXMS server process).
|-
| Name || Name of the parameter of interest, used for making a request to the target node. For a NetXMS agent it is the parameter name, and for an SNMP agent it is an SNMP OID.
|-
| Data Type || Data type of the parameter. Can be one of the following: Integer, Unsigned Integer, 64-bit Integer, 64-bit Unsigned Integer, Float (floating point number), or String. The selected data type affects processing of collected data: for example, you cannot use operations like "less than" or "greater than" on strings.
|-
| Retention Time || Specifies how long collected data must be kept in the database, in days. The minimum retention time is 1 day, and the maximum is unlimited. However, keeping too many collected values for too long will significantly increase your database size and may degrade performance.
|-
| Schedule Type || Type of the collection schedule used; can be either simple or advanced. In simple mode, values are taken from the target at fixed intervals. In advanced mode, a cron-like scheduling table can be used to specify the exact time for polling. This can be useful if, for example, you wish to check a file size every Monday and Friday at 7:00.
|-
| Polling Interval || Interval in seconds between two polls. Applicable only if the simple schedule type is selected.
|-
| Scheduling Table || Cron-like scheduling table for data collection polls. Applicable only if the advanced schedule type is selected.
|-
| Threshold List || List of defined thresholds.
|-
| Instance || A free-form text string, passed as the 6th parameter to events associated with thresholds. You can use this parameter to distinguish similar events related to different instances of the same entity: for example, if you have an event generated when a file system is low on free space, you can set the instance attribute to the file system mount point.
|}
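
To make the relationship between these attributes concrete, the sketch below models a DCI as a plain record and checks the rules stated in the table (minimum retention of 1 day, polling interval for simple schedules, scheduling table for advanced ones). It is a Python illustration only: the DCI class, its field names, and the parameter names are hypothetical, and the schedule entry assumes the scheduling table follows the standard five-field cron layout (minute, hour, day of month, month, day of week).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DCI:
    description: str
    origin: str                 # e.g. "NetXMS agent", "SNMP agent", "Internal"
    name: str                   # agent parameter name or SNMP OID
    data_type: str
    retention_days: int
    schedule_type: str          # "simple" or "advanced"
    polling_interval: Optional[int] = None      # seconds, simple schedule only
    scheduling_table: List[str] = field(default_factory=list)  # advanced only
    instance: str = ""

    def validate(self) -> List[str]:
        """Return a list of rule violations; an empty list means the DCI is consistent."""
        errors = []
        if self.retention_days < 1:
            errors.append("retention time must be at least 1 day")
        if self.schedule_type == "simple" and not self.polling_interval:
            errors.append("simple schedule needs a polling interval")
        if self.schedule_type == "advanced" and not self.scheduling_table:
            errors.append("advanced schedule needs a scheduling table")
        return errors


# Simple schedule: poll every 60 seconds (parameter name is illustrative).
cpu = DCI("CPU utilization", "NetXMS agent", "System.CPU.Usage", "Float",
          retention_days=30, schedule_type="simple", polling_interval=60)

# Advanced schedule: every Monday and Friday at 7:00, assuming cron field order.
file_size = DCI("Log file size", "NetXMS agent", "File.Size(/var/log/messages)",
                "64-bit Unsigned Integer", retention_days=90,
                schedule_type="advanced", scheduling_table=["0 7 * * 1,5"],
                instance="/var/log/messages")

# Deliberately inconsistent record to show the checks firing.
broken = DCI("No schedule details", "Internal", "Example.Parameter", "Integer",
             retention_days=0, schedule_type="simple")
```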


= Thresholds =


Each threshold is a combination of a condition and a pair of events. When the condition becomes true, the associated "activation" event is generated, and when it becomes false again, the "deactivation" event is generated. Thresholds let you take a proactive approach to network management. You can define thresholds for any data collection item that you are monitoring. When setting thresholds, first determine what values would be reasonable. To decide on a threshold value, you need to know what a normal value is and what is out of range. Only you can decide what is normal behavior for a device on your network. Generally, it is recommended that you collect information about a device throughout one complete business cycle before determining the normal high/low range. Consider collecting values such as error rates, retry limits, collisions, throughput, relation rates, and many more. You can also define more than one threshold for a single DCI, which allows you to distinguish between conditions of different severity.
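
The activation and deactivation behavior can be sketched as a small state machine that reports an event only when the condition changes state. This Python example is a simplified illustration, not the server's implementation: the Threshold class and the event names are hypothetical, and real threshold evaluation may involve more options than shown here.

```python
import operator


class Threshold:
    """A condition plus an activation/deactivation event pair (hypothetical model)."""

    OPERATORS = {
        "<": operator.lt,
        "<=": operator.le,
        ">": operator.gt,
        ">=": operator.ge,
        "==": operator.eq,
    }

    def __init__(self, op, value, activation_event, deactivation_event):
        self.compare = self.OPERATORS[op]
        self.value = value
        self.activation_event = activation_event
        self.deactivation_event = deactivation_event
        self.active = False

    def check(self, sample):
        """Return the event to generate for this sample, or None if the state is unchanged."""
        condition_met = self.compare(sample, self.value)
        if condition_met and not self.active:
            self.active = True
            return self.activation_event
        if not condition_met and self.active:
            self.active = False
            return self.deactivation_event
        return None


# Two thresholds on one DCI to distinguish severities (event names are made up).
warning = Threshold(">", 80, "CPU_HIGH", "CPU_NORMAL")
critical = Threshold(">", 95, "CPU_CRITICAL", "CPU_BELOW_CRITICAL")

events = []
for sample in [50, 85, 97, 90, 60]:
    for threshold in (warning, critical):
        event = threshold.check(sample)
        if event is not None:
            events.append((sample, event))
```

With these samples, the warning threshold activates at 85 and deactivates at 60, while the critical threshold activates at 97 and deactivates at 90.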


= Events and Alarms =


Many services within NetXMS gather information and generate events that are forwarded to the NetXMS Event Queue. Events can also be emitted by agents on managed nodes, or by management applications residing on the management station or on specific network nodes. All events are processed by the NetXMS Event Processor one by one, according to the processing rules defined in the Event Processing Policy. As a result of event processing, actions can be taken and the event can be shown as an alarm. NetXMS provides one centralized location, the Alarm Browser, where alarms are visible to your team. You can control which events are considered important enough to show up as alarms. You and your team can then monitor the posted alarms and take appropriate action to preserve the health of your network.
 
Examples of alarms include:
* A critical router exceeded the traffic volume threshold that you configured in Data Collection.
* A shell script that you wrote gathered the specific information you needed and posted it to NetXMS as an event.
* One of your mission-critical servers is running on UPS battery power.
* An SNMP agent on a managed critical server forwarded a trap to NetXMS because the server was overheating and about to fail.
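
The flow from event queue to alarm browser can be sketched as a loop that takes events one at a time and applies the policy rules to each. The Python example below is a simplified illustration under stated assumptions: the rule structure, event fields, and event names are hypothetical, and it assumes every matching rule is applied to an event, which this document does not specify.

```python
from collections import deque

event_queue = deque()   # stands in for the Event Queue
alarm_browser = []      # alarms visible to operators
action_log = []         # actions taken as a result of processing

# Ordered processing rules (a stand-in for the Event Processing Policy).
policy = [
    {
        "match": lambda e: e["name"] == "UPS_ON_BATTERY",
        "alarm": "Server is on UPS battery power",
        "action": "notify on-call",
    },
    {
        "match": lambda e: e["severity"] == "critical",
        "alarm": None,          # this rule takes an action but raises no alarm
        "action": "log",
    },
]


def process_events():
    """Process queued events one by one against the policy rules."""
    while event_queue:
        event = event_queue.popleft()
        for rule in policy:
            if not rule["match"](event):
                continue
            if rule["action"]:
                action_log.append((event["name"], rule["action"]))
            if rule["alarm"]:
                alarm_browser.append((event["source"], rule["alarm"]))


event_queue.append({"name": "UPS_ON_BATTERY", "source": "server1", "severity": "critical"})
event_queue.append({"name": "INTERFACE_UP", "source": "router1", "severity": "normal"})
process_events()
```

Only the first event matches a rule that raises an alarm, so it is the only one that reaches the alarm list; the second event is processed without producing an alarm or an action.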