
One of my clients has a number of standalone Dell servers running VMWare ESXi 6.5, and they asked me if there was a way to add them to their existing Xymon monitoring system.
After looking around on the Internet, I realized there was no ready-made solution available, so I ended up digging into the internals of ESXi to figure out how we could monitor their install base with Xymon.
Read on to learn how I created a native Xymon monitor for VMWare ESXi that runs as a service directly on the server itself.
Step One: Sending Data to Xymon
The first step was a simple script to monitor CPU utilization on a single ESXi chassis.
To do this I used Python, since it appears to be the only scripting language (other than the shell) that already exists on ESXi 6.5. Fortunately, sending updates to Xymon is as simple as opening a socket to the right port and sending a blob of data in the right format.
So all I had to do was come up with a Python script to send the CPU data to the Xymon server in the usual combo message format that any other Xymon client would use.
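To give a feel for that, here is a minimal sketch of the idea. The server address, client name, and status text below are placeholders; the real script fills them in from the system:

```python
#!/usr/bin/env python
# Minimal sketch: push a single "cpu" status to the Xymon server over TCP.
# The server address, client name, and status text are placeholders.
import socket
import time

XYMON_SERVER = "xymon.example.com"   # assumed Xymon server address
XYMON_PORT = 1984                    # default Xymon listener port
CLIENT_NAME = "esxi01"               # name this host should appear as in Xymon

def send_to_xymon(message):
    """Open a socket to the Xymon server, send the message, and close."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((XYMON_SERVER, XYMON_PORT))
    sock.sendall(message.encode("utf-8"))
    sock.shutdown(socket.SHUT_WR)    # signal end-of-message to the server
    sock.close()

# A basic green cpu status; the real data comes from the host itself.
now = time.strftime("%a %b %d %H:%M:%S %Y")
send_to_xymon("status %s.cpu green %s - CPU OK\n" % (CLIENT_NAME, now))
```

Xymon just reads whatever arrives on port 1984 until the sender closes its side of the connection, so there is no handshake to implement.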
Once that was completed and working, I ran it manually and waited for a .cpu column to show up in Xymon … but it never did.
Turns out the default firewall settings on an ESXi install block pretty much all outbound traffic as well!
As a result, I had to dig into the firewall settings and figure out an elegant way to allow port 1984 outbound.
The short version is that I needed a .xml file that allowed the ESXi server to send traffic to the Xymon server. Once that was in place I re-ran my Python script and the new .cpu test appeared in Xymon.
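For reference, custom firewall rules on ESXi live under /etc/vmware/firewall/, and an outbound rule for port 1984 looks roughly like this (the file name and service id below are just examples):

```xml
<!-- /etc/vmware/firewall/xymon.xml (name and service id are examples) -->
<ConfigRoot>
  <service id="0100">
    <id>xymon</id>
    <rule id="0000">
      <direction>outbound</direction>
      <protocol>tcp</protocol>
      <porttype>dst</porttype>
      <port>1984</port>
    </rule>
    <enabled>true</enabled>
    <required>false</required>
  </service>
</ConfigRoot>
```

After dropping the file in place, running ‘esxcli network firewall refresh’ reloads the ruleset.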
At this point I had my own Xymon monitor for VMWare ESXi script that ran locally.
Step Two: Additional Combo Tests
Now that I had a script that could send data to the Xymon server, I started working on how to populate the other parts of the combo message.
Things like uptime and the process list were pretty easy: the methods the Linux Xymon client uses to populate those columns were pretty close to what ESXi needed, and usually I just had to tweak a couple of flags while the commands stayed the same.
However, other columns such as .memory were completely different (for example, there is no ‘free’ command on ESXi like there is on Linux).
So to pull the rest of the data, the Python script runs various ‘esxcli’ commands and massages the results before sending them to the Xymon server in the proper combo-message format (there is a rough sketch of this after the list below).
In the end though I was able to get these tests working (including graphs):
- .cpu
- .disk
- .memory
- .msgs
- .procs
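Here is the rough shape of the esxcli approach. ‘esxcli hardware memory get’ is a real command, but the parsing and the fields used for the actual tests are simplified here:

```python
# Sketch of the esxcli-based gathering: run a command, parse its
# "Key: Value" output, and use the values when building the combo message.
import subprocess

def esxcli(*args):
    """Run an esxcli command and return its 'Key: Value' lines as a dict."""
    out = subprocess.check_output(("esxcli",) + args).decode("utf-8")
    values = {}
    for line in out.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            values[key.strip()] = value.strip()
    return values

# Example: total physical memory, reported by ESXi in bytes.
mem = esxcli("hardware", "memory", "get")
total_mb = int(mem["Physical Memory"].split()[0]) // (1024 * 1024)
print("Total physical memory: %d MB" % total_mb)
```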
Step Three: Dell Hardware Stats via OpenManage & CIM
The client’s next request was for some way of knowing how the hardware in their Dell servers was doing. In particular, they wanted to know if any of the disks were failing.
To do that we needed to install the Dell OpenManage .vib for ESXi.
Once that was complete, I found that I could query the CIM instances directly on the ESXi CLI (similar to how OpenManage pulls the data remotely).
It took me a while to find the right CIM classes and instances, but then I was able to incorporate the results into the existing Python script and create another special column specifically for the Dell hardware.
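To illustrate the query pattern (the Dell-specific class names were the part that took the digging, so the class and credentials below are just placeholders), something along these lines enumerates sensor instances from the host’s CIMOM, assuming pywbem is available to the script:

```python
# Illustrative CIM query via pywbem; CIM_NumericSensor and the credentials
# are placeholders, since the Dell OpenManage provider exposes its own classes.
import pywbem

conn = pywbem.WBEMConnection(
    "https://localhost:5989",          # the CIMOM running on the ESXi host
    ("root", "password"),              # placeholder credentials
    default_namespace="root/cimv2",
    no_verification=True,              # the host uses a self-signed certificate
)

for inst in conn.EnumerateInstances("CIM_NumericSensor"):
    print(inst["ElementName"], inst["HealthState"])
```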
Now they can see the status of the internal batteries, fans, memory, processors, disks, and the Dell hardware log itself.
Step Four: Running Automatically (and Surviving a Reboot)
The final step was to put the Python script into cron and let it monitor the system automatically.
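On ESXi the root crontab lives at /var/spool/cron/crontabs/root, so the plan was a line along these lines (the interval and script path are placeholders), followed by restarting the busybox crond process so it picks up the change:

```
# appended to /var/spool/cron/crontabs/root; the script path is a placeholder
*/5 * * * * python /opt/xymon/xymon_esxi.py
```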
However, the first issue I ran into was that ESXi doesn’t preserve most files (or even cron entries) across a system reboot!
I found this out after we rebooted the ESXi server for an unrelated reason and everything went purple in Xymon: when I logged in, I found that not only was the cron entry missing, but the entire Xymon directory structure was completely gone.
So now I had to find a way to make the Xymon client a permanent resident on ESXi.
It turns out the only real way to do that is to install it as a .vib package. Fortunately, Virtually Ghetto has a great summary of the basics of making a .vib.
I ended up using the VIB Author tool to create a .vib that allowed me to install my new ESXi Xymon Client as a package that would persist after a reboot.
As a result, once the .vib is installed you can see it as a package in the GUI and actually start and stop it as a service. You can also configure the client name you want the host to appear as in Xymon and specify up to two Xymon servers with custom ports, all from the GUI.
Summary: A Xymon ESXi VIB/Package
In the end it was a really interesting package to put together, and I’m happy with how it turned out.
The client can see the status of the ESXi hypervisor in Xymon, along with the status of their Dell hardware, regardless of which guests they have configured and running. So now both the guests and the underlying hypervisor are in Xymon.
If you have any questions or comments about how I put it all together, please either contact me or add a comment below.