esm module and ntp problem

Brian Smith BSmith at lyrix.com
Tue Jan 14 09:15:01 CST 2003


Hi Mike,

We are looking at this very problem right now on our test systems.  When
you mentioned the ntp and time problems I looked at our test systems and
the ntpd is reporting time steps ranging between 0.4 and 1.6 seconds about
every 10-30 minutes.  This is really bad timekeeping.  I've stopped
(service dellomsa stop) and removed the esm module (rmmod esm) on one of
our test systems to see how it changes the ntp step value for the rest of
the day.  Now, on another test system, we are going to only stop the
dcstor32d and leave the other daemons running along with the esm.o module
in memory to see what happens and if it gives the same results as your
report below.
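
To compare the systems over the rest of the day, the step values can be tallied from the ntpd log rather than read by eye. The following is a minimal sketch, assuming ntpd is run with debugging enabled and writes "step_systime: step ..." lines in the format shown in the debug output quoted further down in this thread. The function names and the default year are illustrative assumptions, not part of ntpd or OMSA.

```python
import re
from datetime import datetime

# Assumed line format, taken from the debug output quoted in this thread:
#   Jan 11 16:18:09 ns0 ntpd: step_systime: step 3.842509 residual 0.000000
STEP_RE = re.compile(
    r"(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+ntpd(?:\[\d+\])?:\s+"
    r"step_systime: step (-?\d+\.\d+)"
)


def parse_steps(lines, year=2003):
    """Return (timestamp, step_seconds) for every step_systime line found."""
    steps = []
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            # Syslog timestamps carry no year, so one has to be supplied.
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            steps.append((stamp, float(match.group(2))))
    return steps


def summarize(steps):
    """Summarize step magnitudes and the gaps between steps in minutes."""
    if not steps:
        return None
    sizes = [abs(step) for _, step in steps]
    gaps = [
        (later[0] - earlier[0]).total_seconds() / 60.0
        for earlier, later in zip(steps, steps[1:])
    ]
    return {
        "count": len(steps),
        "min_step": min(sizes),
        "max_step": max(sizes),
        "gaps_min": gaps,
    }
```

Running summarize(parse_steps(open(logfile))) on each test system, with logfile being wherever the ntpd output ends up, would give the step range and the spacing between steps for a direct comparison with and without dcstor32d.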

Do you know what the function of "dcstor32d" is?   I assume "dcsnmp32d" is
for the snmp support.  And I assume "dcevt32d" is for the event logging in
the /var/log/messages file when the status of something changes.  Does
anyone know what each daemon does?

I am very disappointed to find this problem now because we worked on a
monitoring and query script based on snmpwalk queries to the dell mib to
check and monitor power supply, temperature, and fan statuses, along with a
few other handy queries.  If this time slowdown problem cannot be resolved,
this project will be a waste of time and will make our systems vulnerable
to power supply and fan failures without our knowing.  I never thought to
look at the system time slowing down because of a daemon process.
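
For reference, the checking part of such a script can be sketched as follows. This is only an illustration under stated assumptions: it expects snmpwalk output lines of the form "OID = INTEGER: n" (or "OID = n", or "OID = INTEGER: name(n)"), and it assumes that a status value of 3 means "ok" in the Dell MIB, which should be verified against the MIB files before relying on it. The function names are hypothetical.

```python
import re

# Assumed snmpwalk output shapes (the exact form depends on the SNMP tools
# version and on whether the MIB is loaded):
#   <oid> = INTEGER: 3
#   <oid> = 3
#   <oid> = INTEGER: ok(3)
LINE_RE = re.compile(r"^(\S+)\s*=\s*(?:INTEGER:\s*)?(?:\w+\()?(\d+)\)?\s*$")


def parse_status_lines(text):
    """Return a list of (oid, integer_value) for every line that parses."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


def failing_entries(text, ok_value=3):
    """Return the entries whose status differs from ok_value.

    ok_value=3 is an assumption about the Dell status enumeration;
    check it against the MIB before use.
    """
    return [(oid, value) for oid, value in parse_status_lines(text)
            if value != ok_value]
```

The text passed in would be the captured output of an snmpwalk run against the power supply, temperature, or fan status subtree; an empty result from failing_entries would mean every reported status matched the assumed "ok" value.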

Dell, do you have any response to this time slowdown problem we are
experiencing?

- Brian



                                                                                                                          
Michael Redinger <Michael.Redinger at uibk.ac.at>
Sent by: linux-poweredge-admin at dell.com
01/14/03 08:28 AM

To:      Linux-Poweredge at dell.com
cc:
Subject: Re: esm module and ntp problem

Some news:

This is based on information provided by the Dell Support Hotline (thanks)
and it seems as if I can verify this:

The problem is _not_ the esm module (directly, at least) but dcstor32d.

When I kill dcstor32d (and restart ntpd), things seem to work fine (for
some hours now ...).

(However, if I disable dcstor32d, I basically lose all of OMSA's
functionality ...)


Michael

On Sat, 11 Jan 2003, Michael Redinger wrote:

> On Fri, 10 Jan 2003, JP Vossen wrote:
>
> > On Fri, 10 Jan 2003, Michael Redinger wrote:
> >
> > > Summary:
> > > when esm kernel module is loaded on a 2650, the ntpd regularly loses
> > > synchronization. 100% reproducible for me.
> >
> > I have a PE500SC and am NOT running the esm module, but my ntp loses
> > synchronization every Monday morning just as my backup job is ending.
> > All RH8 errata applied.  Search the list archives for details.  No joy
> > so far... :-(
>
>
> I have several 1550s, 1650s and 2650s here, and I've never seen anything
> similar to your problem ...
>
> Well, my problem seems to be different. However, your mail reminded me to
> add the ntpd -d output (to the web forum page). Thanks.
>
> I now found that a mail from Geoff French (Geoff.French at noaa.gov)
> may describe exactly the same problem:
>
> http://lists.us.dell.com/pipermail/linux-poweredge/2002-December/010629.html
>
>
> A small part of the debugging output is given below.
> If I interpret it correctly, the interesting part is
> "ntp_set_tod: settimeofday: 0: Interrupted system call", then ntpd clears
> the connections and restarts syncing (again completely reproducible for
> me):
>
>
> Jan 11 16:18:09 ns0 ntpd: clock_update: at 1104 assoc 6
> Jan 11 16:18:09 ns0 ntpd: local_clock: assocID 44887 off 3.842509 jit 0.001390 sta 3
> Jan 11 16:18:09 ns0 ntpd: step_systime: step 3.842509 residual 0.000000
> Jan 11 16:18:09 ns0 ntpd: In ntp_set_tod
> Jan 11 16:18:09 ns0 ntpd: ntp_set_tod: settimeofday: 0: Interrupted system call
> Jan 11 16:18:09 ns0 ntpd: ntp_set_tod: Final result: settimeofday: 0: Interrupted system call
> Jan 11 16:18:09 ns0 ntpd: local_clock: mu 908 noi 127846.538 stb 0.000 pol 4 cnt 0
> Jan 11 16:18:09 ns0 ntpd: peer_clear: at 1104 assoc ID 44884
> Jan 11 16:18:09 ns0 ntpd: peer_clear: at 1104 assoc ID 44888
> Jan 11 16:18:09 ns0 ntpd: peer_clear: at 1104 assoc ID 44889
> Jan 11 16:18:09 ns0 ntpd: peer_clear: at 1104 assoc ID 44887
> Jan 11 16:18:09 ns0 ntpd: peer_clear: at 1104 assoc ID 44885
> Jan 11 16:18:09 ns0 ntpd: peer_clear: at 1104 assoc ID 44886
> Jan 11 16:18:09 ns0 ntpd: clear_all: at 1104
> Jan 11 16:18:09 ns0 ntpd: report_event: system event 'event_clock_reset' (0x05) status 'leap_none, sync_unspec, 15 events, event_peer/strat_chg' (0xf4)
> Jan 11 16:18:09 ns0 ntpd: report_event: system event 'event_peer/strat_chg' (0x04) status 'leap_none, sync_unspec, 15 events, event_clock_reset' (0xf5)
>
>
>
>
> Michael
>
>

--
Michael Redinger
Zentraler Informatikdienst (Computer Centre)
Universitaet Innsbruck
Technikerstrasse 13                     Tel.: ++43 512 507 2335
6020 Innsbruck                          Fax.: ++43 512 507 2944
Austria                                 Mail: Michael.Redinger at uibk.ac.at
