Step-by-step service monitoring

From NetXMS Wiki

Before starting this tutorial, make sure the trusted nodes check will not block data collection between nodes.
See https://wiki.netxms.org/wiki/SG:Security_Issues

Quick fix:
Set "CheckTrustedNodes" to 0 in Server Configuration and restart the server.



First, we check whether the website responds to ping.
This matters because, once the website is added as a node, NetXMS uses ping to check basic node availability during status polls.

Website-monitoring-tutorial-1.PNG

In our case, netxms.org does not respond to ping.
We need to remember that for later.



Next, we create a node for our website.
We make sure to disable all polling sources the website does not support (including ping, since netxms.org doesn't respond to ping).

Website-monitoring-tutorial-2.PNG



We should see our node properly created with "Unknown" status.

Special note: in NetXMS 2.0.4 and older, there was a bug that would create a node down alarm.
Please terminate this alarm on this node before proceeding further in the tutorial if running a version older than 2.0.5 (or 2.1).

Website-monitoring-tutorial-3.PNG

The status is Unknown because we disabled all polling sources, so NetXMS has no way to check whether the node is up or down.
We will fix this later in the tutorial.

Please note that if the node responded to ping (and we therefore had NOT disabled ICMP for status polling), the node's status would be Normal.



I have created a template to hold all my website monitoring.
This is useful when monitoring many websites, since I don't have to create the monitoring DCIs manually for each one.

Website-monitoring-tutorial-4.png

We then go into the Data collection configuration of this template.



We create a new DCI. Make sure Origin is set to NetXMS Agent, press Select, and find ServiceResponseTime.Custom(*).

Website-monitoring-tutorial-5.PNG



Next we will set the template DCI to resolve the node's primary IP for each node that is assigned to this template.
Documentation for template macros here: https://wiki.netxms.org/wiki/UM:Data_Collection#Macros_in_template_items

Website-monitoring-tutorial-6.PNG
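
To make the macro step concrete, here is a minimal Python sketch of the substitution the server performs when the template is applied to a node. It is only an illustration, not NetXMS code. The macro name `%{node_primary_ip}` and the `(target,port)` argument order are my assumptions based on the linked documentation, so check them against the screenshot and the macros page. The `node` record and the function name are hypothetical.

```python
import re


def expand_template_macros(text: str, node: dict) -> str:
    """Replace %{name} macros with values from the node record.

    Unknown macros are left untouched so that mistakes stay visible.
    """
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(node[key]) if key in node else match.group(0)

    return re.sub(r"%\{([A-Za-z_]+)\}", substitute, text)


# Hypothetical node record (192.0.2.10 is a documentation-only address).
node = {"node_name": "netxms.org", "node_primary_ip": "192.0.2.10"}

# Assumed parameter layout: ServiceResponseTime.Custom(target,port)
template_parameter = "ServiceResponseTime.Custom(%{node_primary_ip},80)"
print(expand_template_macros(template_parameter, node))
```

Each node bound to the template gets its own copy of the DCI, with the macro replaced by that node's primary IP.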

We also set the Source node to our NetXMS server.
This means that even though the DCI is configured on the website's node, the NetXMS server collects the data for it.
The end result is that the DCI shows up on the website's node while the data is polled by the NetXMS server, which is what we want in this case.
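
For intuition about what is being measured from the server's side, here is a rough Python sketch that times a TCP connection to the website. It is not how the NetXMS agent implements ServiceResponseTime.Custom. It assumes the check boils down to timing a connection to TCP port 80, and the function name and timeout are hypothetical.

```python
import socket
import time
from typing import Optional


def tcp_response_time_ms(host: str, port: int = 80, timeout: float = 5.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the connection fails."""
    start = time.monotonic()
    try:
        # create_connection resolves the host name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return None


if __name__ == "__main__":
    # The check runs on the monitoring host, not on the website itself.
    print(tcp_response_time_ms("netxms.org", 80))
```

Running this on the monitoring host mirrors the Source node setup: the value belongs to the website, but the machine doing the polling is the server.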



We can now apply our template to our website's node.

Website-monitoring-tutorial-7.png



We should now see the template applied to our website's node, and the node's Last values should show the response time of the website.
I have also added a ping DCI to confirm that the website does not respond to ping.

Website-monitoring-tutorial-8.PNG

We now just need to fix status calculation for the node.

As mentioned previously, if the node does respond to ping (so that node status calculation from its ICMP-based Status poll works fine), we are finished and the following steps are NOT required.

We will create another DCI, which will be used only to check whether the website is up or down.

Website-monitoring-tutorial-9.PNG

Now we need to configure it to return NetXMS status codes for this node. We can do this using a transformation script.
Status codes documented here: https://wiki.netxms.org/wiki/NXSL:NetObj (see Possible values for "status" attribute section)

Website-monitoring-tutorial-10.PNG
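
The actual transformation script is written in NXSL and appears in the screenshot above. The Python sketch below only illustrates the mapping logic. It assumes the check DCI returns 0 when the service answers and a non-zero value otherwise. The status numbers are as I recall them from the NXSL:NetObj page linked above, so verify them there. The names are hypothetical.

```python
from enum import IntEnum


class NodeStatus(IntEnum):
    """Subset of NetXMS object status codes (assumed values, see NXSL:NetObj)."""
    NORMAL = 0
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4
    UNKNOWN = 5


def check_result_to_status(check_result: int) -> NodeStatus:
    """Map a raw service check result to a node status.

    Assumption: 0 means the service responded; anything else is a failure.
    """
    return NodeStatus.NORMAL if check_result == 0 else NodeStatus.CRITICAL


print(int(check_result_to_status(0)))  # service up
print(int(check_result_to_status(2)))  # service down
```

Collapsing every failure into Critical keeps the node status binary. A finer mapping (for example, Warning on an unexpected response) is possible if the check distinguishes failure types.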

Now we just tell NetXMS to use this DCI for status calculation.

Website-monitoring-tutorial-11.PNG



After the next status poll (by default every 1 minute), the node should have Normal status.
The node should also correctly go to Critical status when TCP port 80 doesn't respond.

Website-monitoring-tutorial-12.PNG