[Triumf-linux-managers] time2 alias changed

Kelvin Raywood kray at triumf.ca
Tue Oct 30 15:34:26 PDT 2007


The alias time2.triumf.ca has been changed.  See below for reasons.

Old: time2.triumf.ca -> tgate.triumf.ca
New: time2.triumf.ca -> vmhost01.triumf.ca

You will need to restart the network-time protocol (NTP) daemon, ntpd, on 
your machines for them to see the change, since ntpd looks up server names 
only at start-up.  Note that ntpd will still function correctly with time1 
and time3 available, so the restart is not urgent.

To check the status of your ntpd, run:

     /usr/bin/ntpdc -s

This produces an output similar to the following:

      remote           local      st poll reach  delay   offset    disp
=======================================================================
  tgate.triumf.ca 142.90.100.212  16 1024  270 0.00015 -0.781483 0.35295
  LOCAL(0)        127.0.0.1       10   64  377 0.00000  0.000000 0.03053
.lin17.triumf.ca 142.90.100.212   2 1024  377 0.00134 -0.000969 0.12184
*trprint.triumf. 142.90.100.212   2 1024  377 0.00124 -0.000920 0.12175

Note that the first remote server is tgate but its stratum (column 3) is 
16.  This means that it is not responding.  trprint (time1) and lin17 
(time3) are stratum 2.  Both of these sync to a stratum 1 GPS 
time-server.  LOCAL refers to the local-clock which is artificially 
assigned stratum 10 so that it has low priority.
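
If you have several machines to check, the stratum test described above 
can be scripted.  The following is only a sketch: the function name 
unreachable_peers is made up for this example, and it assumes the column 
layout shown above (stratum in column 3, eight columns per peer).

```shell
# Print the names of peers that ntpdc lists at stratum 16
# (i.e. not responding).  Reads "ntpdc -s" style output on stdin.
# Assumes stratum is the third column, as in the listing above.
unreachable_peers() {
    awk 'NF >= 8 && $3 == 16 {
             # drop a leading status character such as "*" or "."
             sub(/^[^A-Za-z0-9]/, "", $1)
             print $1
         }'
}
```

It would be used as "/usr/bin/ntpdc -s | unreachable_peers"; no output 
means that no peer is listed at stratum 16.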

Restart ntpd with:

     service ntpd restart

Then do "ntpdc -s" again.  You may have to recheck a few times until all 
time-servers have responded, but after no more than five seconds the 
output should be similar to:

      remote           local      st poll reach  delay   offset    disp
=======================================================================
  LOCAL(0)        127.0.0.1       10   64    1 0.00000  0.000000 2.81735
  lin17.triumf.ca 142.90.100.212   2   64    1 0.00018 -0.000137 2.81735
  trprint.triumf. 142.90.100.212   2   64    1 0.00015  0.000060 2.81735
  vmhost01.triumf 142.90.100.212   2   64    1 0.00011  0.000002 2.81735

Note that vmhost01 is now a stratum-2 time-server.  If you still see 
tgate in the list, it means you have not used the aliases time1, time2 
and time3 in your ntp config.  Contact me if you need help reconfiguring.
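
A quick way to look for server entries that bypass the aliases is 
sketched below.  The function name non_alias_servers is made up for this 
example, and /etc/ntp.conf is only assumed to be where your config lives; 
pass a different path if it is elsewhere.  Entries in 127.127.x.x (the 
local-clock pseudo-address) are skipped on purpose.

```shell
# List "server" lines in an ntp config that use neither the
# time1/time2/time3 aliases nor a 127.127.x.x reference clock.
# The default path /etc/ntp.conf is an assumption.
non_alias_servers() {
    conf=${1:-/etc/ntp.conf}
    grep -E '^[[:space:]]*server[[:space:]]' "$conf" |
        grep -Ev 'time[123]\.triumf\.ca|127\.127\.'
}
```

Run "non_alias_servers" with no argument to check the default file.  Any 
line it prints (for example "server tgate.triumf.ca") is a candidate for 
replacement with one of the aliases.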

Reasons for the change
======================

As you may recall, we had a catastrophic failure of tgate a couple of 
weeks ago.  This machine primarily provided DNS, DHCP and RADIUS 
services; RADIUS is used for wireless and dial-in authentication. 
Tgate was also a stratum-2 time-server (time2.triumf.ca) and a VRVS 
(video-conferencing) reflector. So obviously we got a temporary 
replacement online ASAP.

We (computing-services) have started to implement a plan for ensuring 
high-availability of the critical services.  This involves running some 
services from virtual-machines which are replicated on two physical 
machines.  A heartbeat between two VMs offering the same service ensures 
that only one is active at any given time.  Each VM needs its own 
IP-address, and the active one also listens on the IP-address associated 
with the service.  DNS and RADIUS are therefore now served from a secondary 
IP-address (tgate) on a VM (tgate-vm1).  DHCP still comes from the 
primary address but since all initial requests are broadcasts, this 
works fine.

But this scheme doesn't work for NTP.  ntpd must listen and respond to 
broadcasts as well as direct queries, so it cannot be bound to a secondary 
IP-address on the same network as the primary IP-address.  Perhaps since 
NTP has built-in high-availability this feature has not been 
implemented.  All responses originate from the primary address, and 
clients that sent their request to the secondary address ignore the replies.

Thus NTP cannot be provided from tgate.triumf.ca anymore and so the 
alias time2.triumf.ca was changed.



