What does the output from ntpq -p mean?
This is an explanation of the output of the ntpq command to help troubleshoot NTP problems.
If you run the ntpq -p command and get:
localhost: timed out, nothing received
***Request timed out
It means that the ntpd service is not running.
The most likely reason is that the system time differs from the NTP source by more than 1000 seconds, in which case NTP cannot resynchronise the time.
The ntpd service exits with a message to the system log when the offset exceeds the panic threshold, which is 1000 seconds by default.
e.g.
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: ntpd 4.2.8p16-a (1): Starting
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: Command line: /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: ----------------------------------------------------
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: ntp-4 is maintained by Network Time Foundation,
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: Inc. (NTF), a non-profit 501(c)(3) public-benefit
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: corporation. Support and training for ntp-4 are
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: available at https://www.nwtime.org/support
Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: ----------------------------------------------------
Aug 6 08:29:57 jcf-562-alpha ntpd[89549]: restrict default: KOD does nothing without LIMITED.
Aug 6 08:29:57 jcf-562-alpha ntpd[89549]: restrict ::: KOD does nothing without LIMITED.
Aug 6 08:29:57 jcf-562-alpha ntpd[89549]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): good hash signature
Aug 6 08:29:57 jcf-562-alpha ntpd[89549]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): loaded, expire=2024-06-28T00:00:00Z last=2017-01-01T00:00:00Z ofs=37
Aug 6 08:29:57 jcf-562-alpha ntpd[89549]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): expired 39 days ago
Aug 6 08:33:12 jcf-562-alpha ntpd[89549]: Clock offset exceeds panic threshold.
Aug 6 08:33:12 jcf-562-alpha ntpd[89549]: Set system clock by hand.
The way to fix this time difference is to reboot the host. /etc/rc.conf should contain lines like those below, which resync the time when the host starts:
ntpdate_enable="YES"
ntpdate_flags="-b -u -p 4 -t 30 -v"
ntpdate_hosts="10.1.2.4"
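To confirm that ntpd exited because of the panic threshold, the system log can be searched for the message shown in the example above. The following is a minimal sketch only: the function name find_panic_events is hypothetical, and it assumes the log lines have the same layout as the sample in this article.

```python
import re

# Matches the ntpd syslog message shown in the example log above.
PANIC_PATTERN = re.compile(r"ntpd\[(\d+)\]: Clock offset exceeds panic threshold")


def find_panic_events(log_lines):
    """Return (pid, line) for every log line where ntpd reports a panic exit."""
    events = []
    for line in log_lines:
        match = PANIC_PATTERN.search(line)
        if match:
            events.append((int(match.group(1)), line.rstrip("\n")))
    return events


# Sample lines copied from the log excerpt in this article.
sample = [
    "Aug 6 08:29:57 jcf-562-alpha ntpd[89548]: ntpd 4.2.8p16-a (1): Starting",
    "Aug 6 08:33:12 jcf-562-alpha ntpd[89549]: Clock offset exceeds panic threshold.",
    "Aug 6 08:33:12 jcf-562-alpha ntpd[89549]: Set system clock by hand.",
]

for pid, line in find_panic_events(sample):
    print(pid, line)
```

To scan a real log, pass an open file object instead of the sample list; the location of the log file depends on the host's syslog configuration.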
Options
-p Print a list of the peers known to the server as well as a summary of their state. This is equivalent to the peers interactive command.
-n Output all host addresses in dotted-quad numeric format rather than converting to the canonical host names.
e.g.
remote refid st t when poll reach delay offset jitter
==============================================================================
*10.1.2.4 103.165.180.123 3 u 616 1024 377 0.713 -2.091 1.719
o127.127.22.0 .PPS. 0 l 11 16 377 0.000 0.003 0.001
-10.20.21.13 85.199.214.102 2 u 64 64 373 6.422 1.668 0.143
+10.10.5.30 139.143.45.145 2 u 1 64 377 1.164 -1.658 0.216
+10.10.16.1 81.187.26.174 2 u 21 64 317 1.178 -0.816 0.068
The column headings
REMOTE = The servers and peers specified in the configuration file, from which your host will take time synchronisation
The character that prefixes the remote hostname/IP address means:
Prefix | Meaning |
---|---|
* | Indicates the current synchronisation source. |
# | Indicates that the host is selected for synchronisation, but distance from the host to the server exceeds the maximum value. |
o | Indicates that the host is selected for synchronisation and the PPS signal is in use. |
+ | Indicates the host is included in the final synchronisation selection set. |
x | Indicates that the host is designated a false ticker by the intersection algorithm. |
. | Indicates that the host is culled from the end of the candidate list. |
- | Indicates a host discarded by the clustering algorithm. |
(blank) | Indicates that the source is not synchronised yet, or that the host is discarded due to high stratum and/or failed sanity checks. |
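The prefix character and the columns can also be separated programmatically. The sketch below is illustrative only: the names PeerRow and parse_peer_row are hypothetical, and it assumes each line is passed in unstripped, so that the first character is always the prefix (a space when the prefix is blank) and exactly ten columns follow.

```python
from dataclasses import dataclass

# Prefix characters from the table above; a space stands for "blank".
TALLY_CODES = {
    "*": "current synchronisation source",
    "#": "selected, but distance exceeds the maximum",
    "o": "selected, PPS signal in use",
    "+": "in the final selection set",
    "x": "false ticker",
    ".": "end of the candidate list",
    "-": "discarded by the clustering algorithm",
    " ": "not synchronised yet, or discarded",
}


@dataclass
class PeerRow:
    tally: str
    remote: str
    refid: str
    stratum: int
    peer_type: str
    when: str      # kept as text: may be "-" or carry a unit suffix
    poll: int
    reach: int     # parsed from octal
    delay: float
    offset: float
    jitter: float


def parse_peer_row(line):
    """Split one unstripped peer line into its prefix character and columns."""
    tally, fields = line[0], line[1:].split()
    if tally not in TALLY_CODES or len(fields) != 10:
        raise ValueError(f"unrecognised peer line: {line!r}")
    remote, refid, st, typ, when, poll, reach, delay, offset, jitter = fields
    return PeerRow(tally, remote, refid, int(st), typ, when, int(poll),
                   int(reach, 8), float(delay), float(offset), float(jitter))


row = parse_peer_row("*10.1.2.4 103.165.180.123 3 u 616 1024 377 0.713 -2.091 1.719")
print(row.remote, "->", TALLY_CODES[row.tally])
```

The reach column is converted from octal on purpose, so that it can be treated as a bit pattern later on.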
REFID = the current source of synchronisation for the remote host
REFID codes are used in kiss-o'-death (KoD) packets, the reference identifier field in ntpq and ntpmon billboard displays and log messages.
They consist of a string of four zero-padded ASCII characters.
In practice they are informal and tend to change with time and implementation.
Code | Meaning |
---|---|
DENY | access denied by server. |
INIT | association initialized. |
RATE | rate exceeded. |
TIME | association timeout. |
STEP | step time change: the offset is less than the panic threshold (1000 s) but greater than the step threshold (128 ms), which means NTP needed to make an instantaneous change to your system clock. |
 | Waiting for DNS lookup. |
 | DNS lookup succeeded, no NTP response yet. |
PPS | Pulse Per Second time source, which is usually very accurate but not common. |
 | Waiting for NTS key exchange. |
 | NTS-KE succeeded, no NTP response yet. |
ST = the stratum level of the remote host
T = types available:
Type | Meaning |
---|---|
l | local (such as a GPS clock) |
u | unicast (this is the common type) |
m | multicast |
b | broadcast |
- | netaddr (usually 0) |
WHEN = number of seconds that have passed since the last response from the remote host
POLL = polling interval (in seconds) for the remote host; it varies between the “minpoll” and “maxpoll” values defined in the ntp.conf file
REACH = indicates how successful attempts to reach the server are. This is an 8-bit shift register with the most recent probe in the 2^0 position. The value 001 indicates the most recent probe was answered, while 357 indicates one probe was not answered. The value 377 indicates that all the recent probes have been answered.
DELAY = the round-trip time (in milliseconds) between sending a query to the remote server and receiving its reply.
OFFSET = indicates the time difference (in milliseconds) between the server’s clock and the client’s clock. When this number exceeds 128, the message synchronisation lost appears in the log file.
JITTER = indicates the difference in the offset measurement between two samples. This is an error-bound estimate. Jitter is a primary measure of the network service quality.
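As a rough illustration of how the OFFSET value relates to the thresholds mentioned in this article, the sketch below classifies an offset given in milliseconds. It assumes the default step threshold of 128 ms and the default panic threshold of 1000 seconds; the function name classify_offset and the labels are hypothetical, with "slew" used here to mean a gradual adjustment of the clock.

```python
STEP_THRESHOLD_MS = 128.0             # assumed default step threshold
PANIC_THRESHOLD_MS = 1000.0 * 1000.0  # assumed default panic threshold: 1000 seconds


def classify_offset(offset_ms):
    """Describe what ntpd would be expected to do with this offset."""
    magnitude = abs(offset_ms)  # the sign only shows the direction of the error
    if magnitude > PANIC_THRESHOLD_MS:
        return "panic"  # ntpd exits; the clock must be set another way
    if magnitude > STEP_THRESHOLD_MS:
        return "step"   # instantaneous change to the system clock
    return "slew"       # small offset, corrected gradually


for offset in (-2.091, 500.0, 2_000_000.0):
    print(offset, classify_offset(offset))
```

Both thresholds can be changed in the ntpd configuration, so the constants should be adjusted to match the host being examined.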
REACH Explanation
The column REACH shows if a reference time source could be reached during the last 8 polling intervals, i.e. data could be read from the reference time source, and the reference time source was synchronized.
The value must be interpreted as an 8 bit shift register whose contents are for historical reasons displayed as octal values.
If the NTP daemon has just been started, the value is 0.
Each time a query is successful a '1' is shifted in from the right, so after the daemon has been started the sequence of reach numbers is:
0, 1, 3, 7, 17, 37, 77, 177, 377.
The maximum value 377 means that all the last 8 queries were completed successfully.
After about 10 minutes of running, all the REACH figures should be "377" (binary 11111111), showing 8 successful connections.
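The start-up sequence above can be reproduced with a few lines of code. This is a minimal sketch of an 8-bit shift register, not the ntpd implementation; the function name update_reach is hypothetical.

```python
def update_reach(reach, answered):
    """Shift the 8-bit register left and record the newest poll in the lowest bit."""
    return ((reach << 1) | (1 if answered else 0)) & 0xFF


# Start from 0 (daemon just started) and record eight successful polls.
reach = 0
history = [reach]
for _ in range(8):
    reach = update_reach(reach, True)
    history.append(reach)

# Display the values in octal, as ntpq does.
print(", ".join(format(value, "o") for value in history))
```

Masking with 0xFF is what makes the oldest poll drop out of the history once eight newer polls have been recorded.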
Below is an example of how the REACH octal value can change just because of one single lost response:
REACH | Binary | Interpretation |
---|---|---|
377 | 1111 1111 | Normal, the last 8 "polls" were all good. |
376 | 1111 1110 | The last "poll" was bad. |
375 | 1111 1101 | The previous bad "poll" is now sliding left as the last "poll" was good. |
373 | 1111 1011 | The bad "poll" continues sliding left as each new "poll" is good. |
367 | 1111 0111 | The bad "poll" continues sliding left as each new "poll" is good. |
357 | 1110 1111 | The bad "poll" continues sliding left as each new "poll" is good. |
337 | 1101 1111 | The bad "poll" continues sliding left as each new "poll" is good. |
277 | 1011 1111 | The bad "poll" continues sliding left as each new "poll" is good. |
177 | 0111 1111 | The last 7 "polls" were good. |
377 | 1111 1111 | The one single bad "poll" is gone from the history and the last 8 "polls" were good. |
Notice how a REACH = 177 value looks really bad, but it is still just one single lost connection out of the last 8 attempts.
A REACH = 177 is actually much better than a REACH = 376, since it happened longer ago.
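The comparison between 177 and 376 can be checked by decoding the octal value into its bit history. The sketch below is illustrative only; the function name decode_reach is hypothetical, and it assumes the input is a valid octal string no larger than 377.

```python
def decode_reach(reach_octal):
    """Return (bits, missed, polls_since_last_miss) for an octal REACH string.

    bits is the 8-character binary history, oldest poll first.
    polls_since_last_miss is None when no poll in the history was missed.
    """
    bits = format(int(reach_octal, 8), "08b")
    missed = bits.count("0")
    newest_first = bits[::-1]  # index 0 is now the most recent poll
    since_miss = newest_first.index("0") if missed else None
    return bits, missed, since_miss


for value in ("377", "376", "177"):
    bits, missed, since_miss = decode_reach(value)
    print(value, bits, "missed:", missed, "good polls since last miss:", since_miss)
```

Both 376 and 177 report one missed poll, but 376 has 0 good polls since the miss while 177 has 7, which is why 177 is the healthier value.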