Monitor NTP on the server.



ddgaikwad
29-Oct-2013, 11:04
Hi Guys,

We are using SuSE Linux 11.1 in our environment and all the servers are virtualized on VMware ESX servers.
Recently there was a huge time drift on all the servers, ranging from 1 minute to about 20 minutes.

For some reason we were unable to find the cause of the drift.
So I was wondering: is there any way we can monitor the time drift on all these servers, maybe using a script or by monitoring a certain log file?

Regards,
Dinesh

jmozdzen
29-Oct-2013, 12:09
Hi Dinesh,

> We are using SuSE Linux 11.1 in our environment and all the servers are
> virtualized on VMware ESX servers.
> Recently there was a huge time drift on all the servers, ranging from 1
> minute to about 20 minutes.
>
> For some reason we were unable to find the cause of the drift.
> So I was wondering: is there any way we can monitor the time drift on all
> these servers, maybe using a script or by monitoring a certain log file?

sounds to me like the typical "VM/guest time drift" problem. AFAIR, this happens when the guest runs its own wallclock and depends on accurate interrupts from the host to increment the clock.

The subject line of this thread contains "NTP" - but you don't mention whether you're actually running ntp inside the VMs and how your VMs' wall clock is set up (dependent/host-based).

If you're not (yet) running ntpd inside the VMs: To properly monitor the time drift, you need a "reference point", a system whose clock you trust. Then you need to take a time stamp snapshot on both the reference system and the monitored system and compare the two. Depending on the accuracy you want to achieve, you might do this by running
ssh <vmhost> date +%s
from the reference system against the VM and comparing the result with the output of "date +%s" on the reference system.
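
A minimal sketch of the comparison described above, assuming key-based (passwordless) ssh access from the reference system to each VM. The function names abs_diff and check_drift, the host name in the usage note and the 5-second limit are illustrative and not taken from the thread. Because the ssh round trip adds latency, this approach can only detect drift on the order of seconds:

```shell
#!/bin/sh
# Sketch only: compare a VM's epoch time with the local (trusted) clock.

# abs_diff A B: print the absolute difference of two integers.
abs_diff() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# check_drift HOST MAX_SECONDS
# Returns 0 if the drift is within MAX_SECONDS, 1 if it is larger,
# and 2 if the remote time stamp could not be fetched.
check_drift() {
    host=$1
    max=$2
    # BatchMode avoids hanging on a password prompt when run from cron.
    remote=$(ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" date +%s) || return 2
    local_now=$(date +%s)
    drift=$(abs_diff "$remote" "$local_now")
    echo "$host drift=${drift}s"
    [ "$drift" -le "$max" ]
}
```

Run from cron on the reference system, for example as "check_drift vm01.example.com 5", a non-zero return value could be used to send a mail or write a log entry.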

If you're not after monitoring, but rather want to have a more accurate time source within your VMs, then I suggest running ntpd, with independent wallclocks inside the VMs and a physical server or two (or even a true NTP device) as time source. Still not as accurate as having a physically triggered RTC interrupt, but *much* better than what you have now ;)

Regards,
Jens

jmozdzen
29-Oct-2013, 12:17
Hi Dinesh,


> We are using SuSE Linux 11.1 in our environment

is this really SuSE Linux 11.1? Then first of all I strongly recommend upgrading to openSUSE 12.3, and secondly, allow me to mention that these are the forums for SLES - SuSE Linux Enterprise Server. You'll get responses more targeted towards your installation at forums.opensuse.org (they use the same credentials back-end, so you needn't create a new account there).

If this is indeed SLES, I recommend writing "SLES11SP1" to avoid confusion. The "11.1" notation is used for the community version, while an "SP1" suffix (service pack 1) relates to the commercial products.

SuSE Linux 11.1, suse 11.1 -> openSUSE 11.1
SLED11SP1 -> SuSE Linux Enterprise Desktop 11 SP1
SLES11SP1 -> SuSE Linux Enterprise Server 11 SP1

Regards,
Jens

smflood
29-Oct-2013, 22:49
ddgaikwad wrote:

> We are using SuSE Linux 11.1 in our environment and all the servers are
> virtualized on VMware ESX servers.

As Jens indicated in his second reply "SuSE Linux 11.1" could mean one of
three possible Operating Systems so please post the output from "cat
/etc/*release" and/or clarify.

I'll also note that VMware have now replaced ESX with ESXi.

> Recently there was a huge time drift on all the servers ranging from 1
> minute to about 20 minutes.
>
> But for some reason we were unable to find the cause of the drift.
> So, was wondering is there any way we can monitor the time drift on all
> these servers, may be using a script or monitoring a certain log file?

VMware have a Knowledge Base article which you may find useful for
synchronising time on virtual Linux-based guests - see
http://kb.vmware.com/kb/1006427

HTH.
--
Simon
SUSE Knowledge Partner

ddgaikwad
30-Oct-2013, 10:25
Hi Guys,

First of all sorry for the confusion caused with the version numbers, here is the output from cat:
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1

Now for the issue: I have gone through the documents from VMware as well as Novell and was able to get the clock reset to the correct time of a physical server.
The thing is, our business is looking for a way to monitor the time drift if it occurs.
So I was wondering whether there are any methods, log files or automation that can check that the time on the servers is indeed correct?

Regards,
Dinesh

mikewillis
30-Oct-2013, 11:15
> First of all sorry for the confusion caused with the version numbers, here is the output from cat:
> SUSE Linux Enterprise Server 11 (x86_64)
> VERSION = 11
> PATCHLEVEL = 1

Do you know that general support for SLES 11 SP1 ended in August 2012?


As Jens mentioned earlier, your subject header mentions NTP but your post didn't and your last post doesn't mention NTP either. Have you actually configured the server to use NTP? You can look in YaST > Network Services > NTP Configuration, and/or run


$ pidof ntpd
$ grep -v ^# /etc/ntp.conf | grep ^server | grep -v 127
If the output of 'pidof' is blank, ntpd isn't running. If the output of the second command is blank, ntpd isn't configured to look at an NTP server.
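
The two checks above could be wrapped into one small script, sketched here under a few assumptions: the function names ntp_servers and ntp_status are made up for illustration, and awk is used instead of the grep chain so that only addresses beginning with "127." (such as the local clock driver 127.127.1.0) are skipped, rather than any line containing "127":

```shell
#!/bin/sh
# Sketch only: report whether ntpd is running and has a real server configured.

# ntp_servers FILE: print the address of every "server" line in FILE,
# skipping addresses that start with "127." (local clock / loopback).
ntp_servers() {
    awk '$1 == "server" && $2 !~ /^127\./ { print $2 }' "$1"
}

# ntp_status [FILE]: FILE defaults to /etc/ntp.conf.
# Returns 0 if ntpd is running and at least one server is configured.
ntp_status() {
    conf=${1:-/etc/ntp.conf}
    if ! pidof ntpd >/dev/null 2>&1; then
        echo "ntpd is not running"
        return 1
    fi
    servers=$(ntp_servers "$conf" | tr '\n' ' ')
    if [ -z "$servers" ]; then
        echo "no NTP server configured in $conf"
        return 1
    fi
    echo "ntpd running, servers: $servers"
}
```

Calling "ntp_status" without arguments checks /etc/ntp.conf. Note that this only shows that ntpd is configured, not that it is actually synchronised.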


I have run SLES 11 through SLES 11 SP3 in VMware ESX(i) and have never seen a problem with time drift. I install VMware tools and configure ntpd to look at our local NTP server; I've never specified kernel parameters. If your organisation doesn't run its own NTP server, take a look at http://www.pool.ntp.org/en/

ddgaikwad
30-Oct-2013, 11:39
Hi Mikewillis,

Yes, we are aware of the end of support and are currently planning an upgrade soon.

Yes, we have NTP configured in our environment.
All the servers are pointing to an external time source at the moment and the time on the servers is in sync as well.
The NTP service is also running on the servers just fine.

Before this, the servers were pointed to an internal time source. That time server was decommissioned, after which most of the servers started to drift in time.

So, as a precaution, is there a process that will tell us if a server has drifted too much? Like referring to some other server?

Regards,
Dinesh

smflood
30-Oct-2013, 12:40
On 30/10/2013 09:34, ddgaikwad wrote:

> The thing is that, our business is looking out for a way or a method
> that can be used to monitor the time drift if that occurs.
> So, was wondering if there are any methods or log files or automate, to
> keep a check on the server if indeed time on them is correct?

Do you use Nagios to monitor servers and/or other services? If so, there
are some plugins available to check NTP and time @
http://exchange.nagios.org/directory/Plugins/Network-Protocols/NTP-and-Time

If not, they might be helpful for you to script your own checks.
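
As a rough idea of what such a self-written check might look like, here is a sketch that reads the output of "ntpq -pn" and reports the offset of the currently selected peer (the line marked with "*", offset in milliseconds in the ninth column) using the usual Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and thresholds are illustrative assumptions, not taken from any of the plugins linked above:

```shell
#!/bin/sh
# Sketch only: Nagios-style check of the ntpd offset reported by "ntpq -pn".

# peer_offset_ms: read "ntpq -pn" output on stdin and print the offset (ms)
# of the selected peer. Exits non-zero if no line is marked with "*".
peer_offset_ms() {
    awk '/^\*/ { print $9; found = 1; exit } END { if (!found) exit 1 }'
}

# check_offset WARN_MS CRIT_MS: read "ntpq -pn" output on stdin.
check_offset() {
    warn=$1
    crit=$2
    off=$(peer_offset_ms) || { echo "UNKNOWN - no synchronised peer"; return 3; }
    # Offsets are fractional, so awk does the comparison (sh is integer-only).
    awk -v o="$off" -v w="$warn" -v c="$crit" 'BEGIN {
        a = (o < 0) ? -o : o
        if (a >= c) { print "CRITICAL - offset " o " ms"; exit 2 }
        if (a >= w) { print "WARNING - offset " o " ms"; exit 1 }
        print "OK - offset " o " ms"
        exit 0
    }'
}
```

It would be used as "ntpq -pn | check_offset 100 1000". The UNKNOWN case is worth alerting on as well: if the configured time server disappears, ntpd has no selected peer and therefore reports no meaningful offset, even though the clock may be drifting.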

HTH.
--
Simon
SUSE Knowledge Partner


jmozdzen
30-Oct-2013, 14:52
Hi Dinesh,

> So, as a precaution, is there a process that will tell us if a server has drifted too much? Like referring to some other server?

that's what I outlined in my original answer to your question - run scripts that compare the actual time against a trusted time source. Some management suites already have agents to do this, or run your own script comparing current time stamps...

Regards,
Jens