Basic Concepts

This Wiki is deprecated and we are currently migrating the remaining pages into the product documentation (Admin Guide, NXSL Guide).

Overview of System Architecture

The system is built on a three-tier architecture. The monitoring agents (either NetXMS native agents or SNMP agents) collect information and deliver it to the monitoring server for processing and storage. Network administrators can configure the system and access the collected data using the cross-platform Java-based Management Console, the Web Interface, or the Mobile Management Console. The following diagram gives an overview of the NetXMS system architecture.

NetXMS Architecture

All collected data and the system configuration are stored in an SQL database. You can choose Oracle, Microsoft SQL Server, PostgreSQL, MySQL, or SQLite as your database engine. The database server can be installed on the same physical machine as the NetXMS server or on a separate server. The NetXMS web-based user interface is built as a single war file and can be deployed to any compatible application server (like Tomcat or Jetty). It is a separate component and can be installed on the same physical machine as the NetXMS server or on a remote server. The system has been designed to be easily extendable, so all three tiers (server, agent, and client) have a modular structure. The following figure shows the main software layers of the NetXMS system.

NetXMS Software Layers

All system components use two libraries: the NetXMS Foundation Library and the Communication Library. These libraries provide communication between all system components. The server also has an underlying DB Driver API layer, which provides a uniform database engine interface by using database drivers. This approach allows developers to add support for new database engines in a matter of days without changing or even recompiling the server code. On top of the server core, another interface, the Server Module API, provides a uniform way to create server extensions. The same approach is used in the NetXMS agent, which has a Subagent API on top of the agent core. Access to the NetXMS server is provided via an API available in C and Java versions. All client components of NetXMS (management console, alarm viewer, web interface, and console for Android) use this API. If you wish to write your own client application for NetXMS or need to integrate an existing system with NetXMS, you can use either the C or the Java client library. Please consult the NetXMS Client Library Programmer's Manual for detailed information.
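The driver-layer idea described above can be illustrated with a minimal Python sketch. This is not the NetXMS DB Driver API (which this page does not document); the names `DbDriver`, `SqliteDriver`, and `store_value`, and the `samples` table, are hypothetical and chosen only to show how core code can depend on a uniform interface while a driver supplies the engine-specific part.

```python
import sqlite3
from typing import List, Optional, Protocol, Sequence, Tuple


class DbDriver(Protocol):
    """Hypothetical uniform interface that core code is written against."""

    def connect(self, target: str) -> None: ...

    def execute(self, sql: str, params: Sequence = ()) -> List[Tuple]: ...


class SqliteDriver:
    """One possible driver, backed by Python's built-in sqlite3 module."""

    def __init__(self) -> None:
        self._conn: Optional[sqlite3.Connection] = None

    def connect(self, target: str) -> None:
        self._conn = sqlite3.connect(target)

    def execute(self, sql: str, params: Sequence = ()) -> List[Tuple]:
        if self._conn is None:
            raise RuntimeError("driver is not connected")
        rows = self._conn.execute(sql, tuple(params)).fetchall()
        self._conn.commit()
        return rows


def store_value(db: DbDriver, node: str, value: float) -> None:
    """Core-side code: it only knows the DbDriver interface, not the engine."""
    db.execute("CREATE TABLE IF NOT EXISTS samples (node TEXT, value REAL)")
    db.execute("INSERT INTO samples VALUES (?, ?)", (node, value))


driver = SqliteDriver()
driver.connect(":memory:")
store_value(driver, "node1", 42.0)
rows = driver.execute("SELECT node, value FROM samples")
```

Supporting another engine in this sketch would mean writing one more class with the same two methods; `store_value` would not change.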

Objects

NetXMS represents all monitored network infrastructure as a set of objects. Each object represents one physical or logical entity (like a host or a network interface), or a group of them. Objects are organized into a hierarchical structure. There are 30 different object classes:

| Object Class | Description | Can contain |
|---|---|---|
| Entire Network | Abstract object representing the root of the IP topology tree. All zone and subnet objects are located under it. The system can have only one object of this class. | Zone (if zoning is enabled); Subnet (if zoning is disabled) |
| Zone | Object representing a group of (usually interconnected) IP networks without overlapping addresses. Contains the appropriate subnet objects. | Subnet |
| Subnet | Object representing an IP subnet. Objects of this class are typically created automatically by the system to reflect its knowledge of the IP topology. | Node |
| Node | Object representing a physical host or network device. These objects can be created either manually by an administrator or automatically during the network discovery process. | Interface, Network Service, VPN Connector |
| Cluster | Object representing a cluster consisting of two or more hosts. | Node |
| Interface | Object representing a network interface of a node. These objects are created automatically by the system during configuration polls. | Nothing |
| Network Service | Object representing a network service running on a node (like http or ssh). | Nothing |
| VPN Connector | Object representing a VPN tunnel endpoint. Such objects can be created to add VPN tunnels to the network topology known to the NetXMS server. | Nothing |
| Service Root | Abstract object representing the root of your infrastructure service tree. The system can have only one object of this class. | Cluster, Condition, Container, Mobile Device, Node, Subnet |
| Container | Grouping object which can contain nodes, subnets, clusters, conditions, or other containers. With the help of container objects you can build an object tree that represents the logical hierarchy of IT services in your organization. | Cluster, Condition, Container, Mobile Device, Node, Subnet |
| Condition | Object representing a complicated condition, like "cpu on node1 is overloaded and node2 is down for more than 10 minutes". | Nothing |
| Template Root | Abstract object representing the root of your template tree. | Template, Template Group |
| Template Group | Grouping object which can contain templates or other template groups. | Template, Template Group |
| Template | Data collection template. See the Data Collection section for more information about templates. | Mobile Device, Node |
| Network Map Root | Abstract object representing the root of your network map tree. | Network Map, Network Map Group |
| Network Map Group | Grouping object which can contain network maps or other network map groups. | Network Map, Network Map Group |
| Network Map | Network map. | Nothing |
| Dashboard Root | Abstract object representing the root of your dashboard tree. | Dashboard |
| Dashboard | Dashboard. Can contain other dashboards. | Dashboard |
| Report Root | Abstract object representing the root of your report tree. | Report, Report Group |
| Report Group | Grouping object which can contain reports or other report groups. | Report, Report Group |
| Report | Report object. | Nothing |
| Business Service Root | Abstract object representing the root of your business service tree. The system can have only one object of this class. | Business Service |
| Business Service | Object representing a single business service. Can contain other business services, node links, or service checks. | Business Service, Node Link, Service Check |
| Node Link | Link between a node object and a business service. Used to simplify creation of node-related service checks. | Service Check |
| Service Check | Object used to check business service state. One business service can contain multiple checks. | Nothing |

Every object has a set of attributes. Some of them are common to all objects (like id and name), while others depend on the object class. For example, only node objects have the attribute "SNMP community string".
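The containment rules in the "Can contain" column can be read as a lookup table from parent class to allowed child classes. The following Python sketch models a few of those rules; the `NetObject` class and its fields are hypothetical stand-ins for illustration, not the NetXMS data model or client API.

```python
# A subset of the containment rules listed in the "Can contain" column.
CAN_CONTAIN = {
    "Entire Network": {"Zone", "Subnet"},  # Zone if zoning is enabled, Subnet otherwise
    "Zone": {"Subnet"},
    "Subnet": {"Node"},
    "Node": {"Interface", "Network Service", "VPN Connector"},
    "Cluster": {"Node"},
    "Container": {"Cluster", "Condition", "Container",
                  "Mobile Device", "Node", "Subnet"},
    "Interface": set(),  # "Nothing"
}


class NetObject:
    """Minimal stand-in for a monitored object with common attributes."""

    def __init__(self, obj_id, name, obj_class):
        self.obj_id = obj_id        # common attribute: id
        self.name = name            # common attribute: name
        self.obj_class = obj_class
        self.children = []

    def add_child(self, child):
        """Attach a child only if the parent's class may contain it."""
        allowed = CAN_CONTAIN.get(self.obj_class, set())
        if child.obj_class not in allowed:
            raise ValueError(
                f"{self.obj_class} cannot contain {child.obj_class}")
        self.children.append(child)
        return child


subnet = NetObject(1, "10.0.0.0/24", "Subnet")
node = subnet.add_child(NetObject(2, "node1", "Node"))
node.add_child(NetObject(3, "eth0", "Interface"))
```

Trying to add an Interface directly to the Subnet in this sketch raises `ValueError`, because the rules only allow a Subnet to contain Node objects.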

DCI (Data Collection Items)

Every node can have many parameters, like CPU utilization, amount of free memory, or disk space usage. The management server can collect these parameters, check them for threshold violations, and store them in the database. In NetXMS, parameters configured for collection are called Data Collection Items, or DCIs for short. One DCI represents one parameter of one node, and an unlimited number of DCIs can be configured for any node.

Each data collection item has various attributes controlling its handling:

| Attribute | Description |
|---|---|
| Description | A free-form text string describing the DCI. It is not used by the server and is intended to help operators understand the collected information. |
| Origin | Origin of the data (or the method of obtaining it). Possible origins are NetXMS agent, SNMP agent, CheckPoint SNMP agent, or Internal (data generated inside the NetXMS server process). |
| Name | Name of the parameter of interest, used for making a request to the target node. For a NetXMS agent it is the parameter name, and for an SNMP agent it is an SNMP OID. |
| Data Type | Data type of the parameter. Can be one of the following: Integer, Unsigned Integer, 64-bit Integer, 64-bit Unsigned Integer, Float (floating point number), or String. The selected data type affects processing of the collected data. For example, you cannot use operations like "less than" or "greater than" on strings. |
| Retention Time | How long collected data must be kept in the database, in days. The minimum retention time is 1 day, and the maximum is unlimited. However, keeping too many collected values for too long will significantly increase your database size and may degrade performance. |
| Schedule Type | Type of the collection schedule, either simple or advanced. In simple mode, values are taken from the target at fixed intervals. In advanced mode, a cron-like scheduling table can be used to specify the exact time for polling. This can be useful if, for example, you wish to check a file size every Monday and Friday at 7:00. |
| Polling Interval | Interval in seconds between two polls. Applicable only if the simple schedule type is selected. |
| Scheduling Table | Cron-like scheduling table for data collection polls. Applicable only if the advanced schedule type is selected. |
| Threshold List | List of defined thresholds. |
| Instance | A free-form text string, passed as the 6th parameter to events associated with thresholds. You can use this parameter to distinguish similar events related to different instances of the same entity. For example, if you have an event generated when a file system is low on free space, you can set the instance attribute to the file system mount point. |
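As a rough illustration of how these attributes fit together, the sketch below models a DCI as a Python dataclass and checks the constraints stated in the table (minimum retention of 1 day, polling interval only for the simple schedule, scheduling table only for the advanced one). The class, its field names, and the parameter names in the usage lines are hypothetical. The schedule string `0 7 * * 1,5` assumes the conventional five-field cron layout (minute, hour, day of month, month, day of week); the exact syntax accepted by NetXMS is not specified on this page. The threshold list is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ORIGINS = {"NetXMS agent", "SNMP agent", "CheckPoint SNMP agent", "Internal"}
DATA_TYPES = {"Integer", "Unsigned Integer", "64-bit Integer",
              "64-bit Unsigned Integer", "Float", "String"}


@dataclass
class DataCollectionItem:
    name: str                  # agent parameter name or SNMP OID
    origin: str
    data_type: str
    retention_days: int
    schedule_type: str         # "simple" or "advanced"
    description: str = ""
    polling_interval: Optional[int] = None   # seconds, simple schedule only
    scheduling_table: List[str] = field(default_factory=list)
    instance: str = ""

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the DCI is consistent."""
        problems = []
        if self.origin not in ORIGINS:
            problems.append(f"unknown origin: {self.origin}")
        if self.data_type not in DATA_TYPES:
            problems.append(f"unknown data type: {self.data_type}")
        if self.retention_days < 1:
            problems.append("retention time must be at least 1 day")
        if self.schedule_type == "simple":
            if not self.polling_interval or self.polling_interval <= 0:
                problems.append("simple schedule needs a polling interval")
        elif self.schedule_type == "advanced":
            if not self.scheduling_table:
                problems.append("advanced schedule needs a scheduling table")
        else:
            problems.append(f"unknown schedule type: {self.schedule_type}")
        return problems


# Poll CPU usage every 60 seconds (illustrative values).
cpu = DataCollectionItem(
    name="System.CPU.Usage", origin="NetXMS agent", data_type="Float",
    retention_days=30, schedule_type="simple", polling_interval=60)

# Check a file size every Monday and Friday at 7:00.
file_size = DataCollectionItem(
    name="File.Size(/var/log/app.log)", origin="NetXMS agent",
    data_type="64-bit Unsigned Integer", retention_days=90,
    schedule_type="advanced", scheduling_table=["0 7 * * 1,5"])
```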

Thresholds

Each threshold is a combination of a condition and a pair of events. When the condition becomes true, the associated "activation" event is generated, and when it becomes false again, the "deactivation" event is generated. Thresholds let you take a proactive approach to network management. You can define thresholds for any data collection item that you are monitoring, and you can define more than one threshold for a single DCI, which allows you to distinguish between different severity conditions.

When setting thresholds, first determine what would constitute a reasonable threshold. To decide on a threshold value, you need to know what a normal value is and what is out of range. Only you can decide what is normal behavior for a device on your network. Generally, it is recommended that you collect information about a device throughout one complete business cycle before determining the normal high/low range. Consider collecting values such as error rates, retry limits, collisions, throughput, relation rates, and many more.
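The activation and deactivation behavior can be sketched as a small state machine: an event is produced only when the condition changes state, not on every sample. The Python code below is a simplified illustration under stated assumptions. The `Threshold` class, the event names, and the limits 80 and 95 are hypothetical, and each threshold is evaluated independently, which may differ from how the NetXMS server orders threshold checks.

```python
import operator

OPERATIONS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


class Threshold:
    """A condition plus an activation/deactivation event pair."""

    def __init__(self, operation, value, activation_event, deactivation_event):
        self.check = OPERATIONS[operation]
        self.value = value
        self.activation_event = activation_event
        self.deactivation_event = deactivation_event
        self.active = False

    def update(self, sample):
        """Return the event to generate for this sample, or None."""
        violated = self.check(sample, self.value)
        if violated and not self.active:
            self.active = True
            return self.activation_event
        if not violated and self.active:
            self.active = False
            return self.deactivation_event
        return None  # no state change, so no event


# Two thresholds on one DCI to distinguish severities (hypothetical values).
thresholds = [
    Threshold(">", 80, "CPU_WARNING_ON", "CPU_WARNING_OFF"),
    Threshold(">", 95, "CPU_CRITICAL_ON", "CPU_CRITICAL_OFF"),
]

events = []
for sample in [50, 85, 97, 90, 40]:
    for threshold in thresholds:
        event = threshold.update(sample)
        if event is not None:
            events.append(event)

print(events)
# ['CPU_WARNING_ON', 'CPU_CRITICAL_ON', 'CPU_CRITICAL_OFF', 'CPU_WARNING_OFF']
```

Note that the sample 97 does not repeat the warning activation: the warning threshold was already active, so only the critical threshold changes state.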

Events and Alarms

Many services within NetXMS gather information and generate events that are forwarded to the NetXMS Event Queue. Events can also be emitted by agents on managed nodes, or by management applications residing on the management station or on specific network nodes. The NetXMS Event Processor processes all events one by one, according to the processing rules defined in the Event Processing Policy. As a result of event processing, actions can be taken and the event can be shown as an alarm. NetXMS provides one centralized location, the Alarm Browser, where the alarms are visible to your team. You can control which events are considered important enough to show up as alarms. You and your team can easily monitor the posted alarms and take appropriate actions to preserve the health of your network.

Examples of alarms include:

  • A critical router exceeded the traffic volume threshold that you configured in Data Collection.
  • A shell script that you wrote gathered the specific information you needed and posted it to NetXMS as an event.
  • One of your mission-critical servers is using its UPS battery power.
  • An SNMP agent on a managed critical server forwarded a trap to NetXMS because it was overheating and about to fail.
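The flow from event queue to alarms can be sketched in a few lines of Python. This is only an illustration of the idea that events are taken from a queue one at a time and checked against an ordered list of rules. The `Event` and `Rule` types, the event names, and the choice to apply every matching rule in order are assumptions, not the actual Event Processing Policy semantics.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Event:
    name: str
    source: str
    severity: str


@dataclass
class Rule:
    matches: Callable[[Event], bool]  # condition part of the rule
    create_alarm: bool                # whether a match produces an alarm


def process_events(queue: Deque[Event], policy: List[Rule]) -> List[str]:
    """Process queued events one by one and return the resulting alarms."""
    alarms = []
    while queue:
        event = queue.popleft()       # oldest event first
        for rule in policy:           # rules are checked in policy order
            if rule.matches(event) and rule.create_alarm:
                alarms.append(
                    f"{event.severity}: {event.name} on {event.source}")
    return alarms


queue = deque([
    Event("THRESHOLD_REACHED", "router1", "Critical"),
    Event("NODE_UP", "server2", "Normal"),
])

policy = [
    Rule(matches=lambda e: e.severity == "Critical", create_alarm=True),
    Rule(matches=lambda e: True, create_alarm=False),  # matches all, no alarm
]

alarms = process_events(queue, policy)
print(alarms)  # ['Critical: THRESHOLD_REACHED on router1']
```

Only the critical event becomes an alarm here; the normal-severity event is processed but matches no alarm-creating rule.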