[Triumf-linux-managers] time2 alias changed
Kelvin Raywood
kray at triumf.ca
Tue Oct 30 15:34:26 PDT 2007
The alias time2.triumf.ca has been changed. See below for reasons.
Old: time2.triumf.ca -> tgate.triumf.ca
New: time2.triumf.ca -> vmhost01.triumf.ca
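
If you want to confirm what the alias resolves to from your own machine, something like the following may help. It is only a sketch: it assumes the alias is published as a DNS CNAME and that the standard "host" utility is installed. The helper name alias_target is made up for illustration.

```shell
# Hypothetical helper: read the output of "host NAME" on stdin and print the
# target of any "is an alias for" line, without the trailing dot.
alias_target() {
    awk '/ is an alias for / { sub(/\.$/, "", $NF); print $NF }'
}

# Usage on a machine with DNS access:
#   host time2.triumf.ca | alias_target
```

If this prints tgate.triumf.ca rather than vmhost01.triumf.ca, your resolver is probably still serving a cached copy of the old record.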
You will need to restart the network-time protocol (NTP) daemon on your
machines for them to see the change. Note that ntp will still function
correctly with time1 and time3 available so the restart is not urgent.
To check the status of your ntpd do:
/usr/bin/ntpdc -s
This produces an output similar to the following:
     remote           local      st poll reach  delay   offset    disp
=======================================================================
 tgate.triumf.ca 142.90.100.212  16 1024  270 0.00015 -0.781483 0.35295
 LOCAL(0)        127.0.0.1       10   64  377 0.00000  0.000000 0.03053
.lin17.triumf.ca 142.90.100.212   2 1024  377 0.00134 -0.000969 0.12184
*trprint.triumf. 142.90.100.212   2 1024  377 0.00124 -0.000920 0.12175
Note that the first remote server is tgate but its stratum (column 3) is
16. This means that it is not responding. trprint (time1) and lin17
(time3) are stratum 2. Both of these sync to a stratum 1 GPS
time-server. LOCAL refers to the local-clock which is artificially
assigned stratum 10 so that it has low priority.
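
If you have many machines to check, the stratum test can be scripted. The following is a minimal sketch, not a supported tool: the helper name unresponsive_peers is made up, and it assumes the column layout shown above, with the stratum in column 3.

```shell
# Hypothetical helper: read "ntpdc -s" output on stdin and print the remote
# name of every peer whose stratum (column 3) is 16, i.e. not responding.
# A single leading marker character (such as "*" or ".") is stripped.
unresponsive_peers() {
    awk '$3 == 16 { sub(/^[^A-Za-z0-9]/, "", $1); print $1 }'
}

# Usage on a live machine:
#   /usr/bin/ntpdc -s | unresponsive_peers
```

With the sample output above, this would print only tgate.triumf.ca. No output means every listed server is responding.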
Restart ntpd with:
service ntpd restart
Then do "ntpdc -s" again. You may have to recheck a few times until all
time-servers have responded, but after no more than five seconds the
output should be similar to:
     remote           local      st poll reach  delay   offset    disp
=======================================================================
 LOCAL(0)        127.0.0.1       10   64    1 0.00000  0.000000 2.81735
 lin17.triumf.ca 142.90.100.212   2   64    1 0.00018 -0.000137 2.81735
 trprint.triumf. 142.90.100.212   2   64    1 0.00015  0.000060 2.81735
 vmhost01.triumf 142.90.100.212   2   64    1 0.00011  0.000002 2.81735
Note that vmhost01 is now a stratum-2 time-server. If you still see
tgate in the list, it means you have not used the aliases time1, time2
and time3 in your ntp config. Contact me if you need help reconfiguring.
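
To see whether your config uses the aliases, you can list the "server" lines that name something else. This is a rough sketch under assumptions: the helper name non_alias_servers is made up, the default path /etc/ntp.conf may differ on your system, and entries starting with 127.127. are skipped because that is how ntp.conf refers to local reference clocks.

```shell
# Hypothetical helper: print every "server" entry in an ntp.conf-style file
# that is neither one of the time1/time2/time3 aliases nor a local
# reference clock (127.127.x.x). The default path is an assumption.
non_alias_servers() {
    awk '$1 == "server" &&
         $2 !~ /^time[123](\.triumf\.ca)?$/ &&
         $2 !~ /^127\.127\./ { print $2 }' "${1:-/etc/ntp.conf}"
}

# Usage:
#   non_alias_servers              # checks /etc/ntp.conf
#   non_alias_servers /path/to/ntp.conf
```

Any hostname it prints (for example tgate.triumf.ca) is a candidate for replacement with the matching alias.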
Reasons for the change
======================
As you may recall, we had a catastrophic failure of tgate a couple of
weeks ago. This machine primarily provided DNS, DHCP and radius
services; radius is used for wireless and dial-in authentication.
Tgate was also a stratum-2 time-server (time2.triumf.ca) and a VRVS
(video-conferencing) reflector. So obviously we got a temporary
replacement online ASAP.
We (computing-services) have started to implement a plan for ensuring
high-availability of the critical services. This involves running some
services from virtual-machines which are replicated on two physical
machines. A heartbeat between two VMs offering the same service ensures
that only one is active at any given time. Each VM therefore needs its own
IP-address, and the active one also listens on the IP-address associated
with the service. DNS and radius are now served from a secondary
IP-address (tgate) on a VM (tgate-vm1). DHCP still comes from the
primary address but since all initial requests are broadcasts, this
works fine.
But this scheme doesn't work for NTP. ntpd must listen and respond to
broadcasts as well as direct queries, so it cannot be bound to a secondary
IP-address on the same network as the primary IP-address. Perhaps because
NTP has built-in high-availability, this feature has not been
implemented. All responses originate from the primary address, and
clients that made the request to the secondary address ignore the replies.
Thus NTP cannot be provided from tgate.triumf.ca anymore and so the
alias time2.triumf.ca was changed.
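
The failure described above can be pictured with a toy model. This is not ntpd code, only an illustration of the behaviour reported here: a client accepts a reply only when it comes from the address the request was sent to. The function name and both addresses (taken from the 192.0.2.0/24 documentation range) are made up.

```shell
# Toy model (not ntpd code): succeed only if the reply's source address is
# the same address the client sent its request to.
reply_accepted() {
    queried=$1
    reply_source=$2
    [ "$queried" = "$reply_source" ]
}

# Hypothetical addresses, for illustration only.
primary=192.0.2.10    # the VM's own (primary) address
service=192.0.2.20    # the secondary address tied to the service

# The client queries the service address, but the reply leaves from the
# primary address, so the client drops it.
if reply_accepted "$service" "$primary"; then
    echo "reply accepted"
else
    echo "reply ignored"
fi
```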