We are using a monitoring solution from BMC Software to monitor our
servers and services, including several SLES servers.
We have run into a problem monitoring a single process on a
SLES 11 SP1 server. This process (the Sybase DB for our ZCM system)
often has high processor utilization. The output of top is (the process id
top - 10:24:26 up 91 days, 8 min, 2 users, load average: 0.00, 0.00, 0.00
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.5%us, 0.3%sy, 0.0%ni, 99.2%id, 0.0%wa, 0.0%hi, 0.0%si,
Mem: 1910912k total, 1825672k used, 85240k free, 81132k buffers
Swap: 2104472k total, 63872k used, 2040600k free, 357408k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3633 root 20 0 1678m 1.0g 1752 S 2 56.2 21193:51 dbsrv12
This shows, in my opinion, the correct CPU utilization. But the BMC
software shows different CPU utilization values. I asked BMC support,
and they told me that they use the following command to monitor the
UNIX95= ps -eo pid,comm,user,pcpu,vsz,etime,args | cat
This gives me the following output:
3633 dbsrv12 root 16.1 1719092 91-00:05:31
The CPU utilization reported by this command differs from the top output. This is a real
problem: when the process's CPU utilization is nearly 100%, the ps command still reports 16.1%,
so the monitoring system never alerts us.
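One possible explanation, which I have not confirmed with either vendor: the procps man page describes pcpu as CPU time used divided by the time the process has been running, so it would be an average over the whole lifetime of the process, whereas top reports usage over its last refresh interval. A rough check with the numbers above, assuming TIME+ is minutes:seconds and etime is days-hours:minutes:seconds (the helper names here are my own, purely for illustration):

```python
def parse_cputime(text):
    """Parse top's TIME+ value ('MMMM:SS' or 'MMMM:SS.hh') into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def parse_etime(text):
    """Parse ps etime ('[[DD-]hh:]mm:ss') into seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    # Pad missing leading fields (hours) with zero.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# Values copied from the top and ps output above.
cpu_seconds = parse_cputime("21193:51")
elapsed_seconds = parse_etime("91-00:05:31")

# Assumption: pcpu = accumulated CPU time / elapsed time since process start.
lifetime_pcpu = 100.0 * cpu_seconds / elapsed_seconds
print(f"lifetime average CPU: {lifetime_pcpu:.1f}%")
```

The result comes out at roughly 16%, in line with the 16.1 that ps prints. If that reading is right, a process that has been running for 91 days would barely move this value even while it is using a full CPU right now.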
BMC support told me that this is the standard command for
monitoring processes in their products and that it works on all Unix systems.
They advised me to ask Novell support about this behaviour.
Does anybody know why the second output does not show the correct