esm module and ntp problem

Brian Smith BSmith at lyrix.com
Mon Jan 13 15:20:00 CST 2003


Hi JP,

Is it possible for you to be there during the backup?  Can you build a
second test system that you can play with during the day for testing
purposes?

When you cat /proc/interrupts, you see this:

root@test3# cat /proc/interrupts
           CPU0
  0:     158011          XT-PIC  timer
  1:          3          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:          0          XT-PIC  usb-ohci
  7:       2524          XT-PIC  eth0
  8:          1          XT-PIC  rtc
 10:       8217          XT-PIC  aacraid, megaraid
 14:          1          XT-PIC  ide0
NMI:          0
ERR:          9

Interrupt 0 is the timer interrupt.  Normally the Linux kernel programs the
timer interrupt to tick at a rate of 100Hz.  I don't know if you can change
the rate. It's probably possible.  If you see that number (158011) increase
at a faster or slower rate than +100 per second, then it means the timer
interrupt isn't being serviced like it should.
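
To put a number on that, here is a minimal sketch (not something I have run
on your box) that pulls the IRQ 0 count out of /proc/interrupts and turns two
samples into a ticks-per-second figure.  The function names timer_count and
ticks_per_second are made up for illustration.  One caveat: the elapsed
seconds should come from an independent reference (a watch, or another
machine), because "sleep" on the affected system may depend on the same timer
ticks and would then hide the problem.

```shell
#!/bin/sh
# Sketch only: estimate the timer interrupt rate from two samples.
# timer_count and ticks_per_second are hypothetical helper names.

# Read /proc/interrupts-style text on stdin and print the CPU0 count for IRQ 0.
timer_count() {
    awk '$1 == "0:" { print $2; exit }'
}

# ticks_per_second BEFORE AFTER SECONDS
# SECONDS should be measured against an independent clock.
ticks_per_second() {
    echo $(( ($2 - $1) / $3 ))
}
```

For example, take a=$(timer_count < /proc/interrupts), wait a measured 60
seconds, take b the same way, and run: ticks_per_second "$a" "$b" 60.  A
result well away from 100 would point at the timer interrupt.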

What happens if you run "hwclock --show"?  (Note there are two dashes, "--".)

root@test3# hwclock --show
Mon 13 Jan 2003 04:09:58 PM EST  0.425677 seconds


The man page on my system doesn't say it, but I think the "0.425677
seconds" is showing the difference between the RTC (hardware) and system
(Linux) clock.

It would be interesting to see if the clock drift is sudden or constant
throughout the backup process.
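
One way to tell sudden from constant drift is to log the system clock next to
the RTC reading at a fixed interval and compare the two columns afterwards.
The following is only a sketch under assumptions: format_sample, log_clocks
and the RTC_CMD override are hypothetical names, and it assumes
"hwclock --show" works on the system (it usually needs root).

```shell
#!/bin/sh
# Sketch only: log the system clock next to the RTC reading at intervals.
# format_sample, log_clocks and RTC_CMD are hypothetical names.

# format_sample SYSTEM_TIME RTC_TIME -> one log line
format_sample() {
    printf 'sys=%s rtc=%s\n' "$1" "$2"
}

# log_clocks LOGFILE INTERVAL_SECONDS COUNT
log_clocks() {
    logfile=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        sys=$(date '+%Y-%m-%d %H:%M:%S')
        # Fall back to a marker if the RTC cannot be read.
        rtc=$(${RTC_CMD:-hwclock --show} 2>/dev/null) || rtc=unavailable
        format_sample "$sys" "$rtc" >> "$logfile"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Something like "log_clocks /tmp/clock.log 10 1000", started just before the
backup, would leave a log where a sudden jump between the sys and rtc columns
is easy to spot.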

Also, since this system is IDE, do you have DMA enabled on the tape drive
or cdrom?  What happens if you remove the cdrom drive from the cable and
switch the tape drive to master?

I hope this spurs some ideas.

- Brian



                                                                                                                   
JP Vossen <vossenjp at netaxs.com>
To:      Brian Smith <BSmith at lyrix.com>
cc:      Linux-Poweredge at dell.com
Subject: Re: esm module and ntp problem
01/13/03 03:34 PM
                                                                                                                   




On Mon, 13 Jan 2003, Brian Smith wrote:

> Very interesting. Did your backup take 2057 seconds to complete?

No, 5139 seconds for the backup and 4395 for the verify.


> Is the system really off by that amount of time or is the ntpd wrong?

It's really off by that much, so NTP dies.  Search the archive of this list
for ntpd -d output from 2 weeks ago.  It's:
           Time OK
           Time OK
           Time OK...
           Time WAY off, NTP dies.


> Is 2057 seconds an even multiple of the number of seconds it took for the
> backup to complete?

No, but 2057 is only this week's offset:

zgrep sanity ../messages*
../messages:Jan 13 07:00:49 drake ntpd[11379]: time correction of 2057 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
../messages.1:Jan  6 07:00:47 drake ntpd[28213]: time correction of 2049 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
../messages.2:Dec 30 06:57:24 drake ntpd[30782]: time correction of 2044 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
../messages.3:Dec 23 06:59:10 drake ntpd[566]: time correction of 2041 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
../messages.4:Dec 16 06:36:40 drake ntpd[19639]: time correction of 1036 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
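
To compare the weekly offsets without reading the whole log line, the numbers
can be pulled out on their own.  This is a minimal sketch: "corrections" is a
made-up function name, and it assumes each ntpd message sits on a single line
in the log file (mail wrapping aside).

```shell
#!/bin/sh
# Sketch only: extract the offset (in seconds) from ntpd sanity-limit messages.
# "corrections" is a hypothetical helper name.

# Read log lines on stdin; print one correction value per matching line.
corrections() {
    sed -n 's/.*time correction of \(-\{0,1\}[0-9][0-9]*\) seconds exceeds sanity limit.*/\1/p'
}
```

Usage would be along the lines of: zgrep -h sanity ../messages* | corrections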


> Something seems to be pausing the clock on the system.

Certainly seems like it.  And it seems like it's afio or the IDS stuff,
but...


> Try creating a script to echo the date/time to a log file during the
> backup every 10 seconds or something.

How would that differ substantively from ntpd -d?


> Try to cat /proc/interrupts and see if the clock interrupt slows down or
> stops (I can't imagine this occurring).

Not quite sure what you mean here.


By the way, the backup is afio (with lots of ugly shell script wrapped
around) to an IDE On-Stream DI-30 on /dev/hdd, with a CD-ROM (not in use)
on hdc.  See the following for somewhat out-of-date details:
http://www.jpsdomain.org/linux/OnStream_DI-30-RedHat_Backup_mini-HOWTO.html


Thanks for thinking about this,
JP
------------------------------|:::======|--------------------------------
JP Vossen, CISSP              |:::======|                jp at jpsdomain.org
My Account, My Opinions       |=========|       http://www.jpsdomain.org/
------------------------------|=========|--------------------------------
"The software said it requires Windows 98 or better, so I installed
Linux..."







