any advice to find root cause of "Falling back to HPET" ?

Bond Masuda bond.masuda at jlbond.com
Sat May 22 14:52:53 CDT 2010


Thanks for the reply. I'm pretty sure it is not #2; we have verified that
the software on s7 and s8 is identical.
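
Since 'rpm --verify' cannot see drivers installed outside of RPM, one
possible cross-check is to checksum the kernel module tree on each host and
compare the results. This is only a rough sketch: the function names are
made up for illustration, and the module directory (presumably
/lib/modules/2.6.9-89.0.25.ELsmp on these systems) is an assumption.

```python
import hashlib
import os


def module_manifest(root):
    """Map each file under root (by relative path) to its SHA-1 hex digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha1(fh.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def diff_manifests(a, b):
    """Return paths missing on one side or whose digests differ."""
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))
```

Running module_manifest() on s7 and s8, then feeding both results to
diff_manifests(), would list any module file that differs between the two.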

It is true that the only hardware components we kept when the server
chassis was swapped out were the hard drives. I just don't understand how
a faulty drive could cause "many lost ticks" and those other messages.
Nonetheless, I'm not going to exclude the possibility.
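
To compare how often these messages show up before and after a drive swap,
saved dmesg output could be tallied with something like the following. It is
a minimal sketch, not a tool from this thread: the patterns are taken only
from the messages quoted below, and the function name is hypothetical.

```python
import re
from collections import Counter

# Patterns copied from the dmesg lines quoted in this thread.
PATTERNS = {
    "losing_ticks": re.compile(r"Losing some ticks"),
    "lost_ticks": re.compile(r"many lost ticks"),
    "hpet_fallback": re.compile(r"Falling back to HPET"),
}


def count_timer_warnings(dmesg_text):
    """Count how many lines of dmesg_text match each timer-related pattern."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing the text of a saved 'dmesg' capture from each server to
count_timer_warnings() gives per-message counts that can be compared
between s7 and s8.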

I wish RHEL4 had a smartctl that worked with megasas; I'll have to compile
the latest smartctl to see if SMART data will tell me anything about the
drives. One thing to note is that during the build of these servers 2 weeks
ago, one of the drives on s8 did fail and had to be replaced.
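
Newer smartmontools releases accept a '-d megaraid,N' device type for disks
behind LSI MegaRAID controllers, where N is the disk's ID on the controller.
A sketch of how the per-drive commands might be generated once a newer
smartctl is built; the /dev/sda node and the 0-5 ID range are assumptions
for illustration, not values taken from these servers.

```python
def smartctl_commands(device="/dev/sda", ids=range(6)):
    """Build one 'smartctl -a -d megaraid,N DEVICE' command per disk ID."""
    return [["smartctl", "-a", "-d", "megaraid,%d" % n, device] for n in ids]


# Print the commands rather than running them; each list could be passed
# to subprocess.call() as root once the IDs are known.
for cmd in smartctl_commands():
    print(" ".join(cmd))
```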

-Bond

> -----Original Message-----
> From: guy [mailto:guy.choo.keren at gmail.com]
> Sent: Saturday, May 22, 2010 12:42 PM
> To: Bond Masuda
> Cc: linux-poweredge at lists.us.dell.com
> Subject: Re: any advice to find root cause of "Falling back to HPET" ?
> 
> 
> you can try putting s8's drives inside s7 and see what you get.
> 
> if you get errors, this can be one of:
> 
> 1. the hard disk themselves are faulty.
> 
> or
> 
> 2. there is some different driver on s8 that is not found on s7 (or the
> other way around) and it was installed not via the RPM system (so in RPM
> you won't see a difference) - which is causing the problems.
> 
> --guy
> 
> Bond Masuda wrote:
> > Hello,
> >
> > I'd appreciate any help/advice anyone can provide regarding our issue.
> > I've run out of ideas on this one...
> >
> > We have two identical PowerEdge 2950s, one called s7 and the other s8.
> > Both are web servers running Apache and PHP. We first noticed the
> > problem because our benchmarking showed drastically different results
> > between the two servers. With s7, we were able to get 180 requests/sec
> > while on s8 we only get 35 requests/sec (and now only 15 requests/sec -
> > more on that below). After this, we became aware that almost all tasks
> > were slower on s8 than on s7. Whether CPU bound or I/O bound,
> > everything we tried was slower on s8 (untar'ing archives, running md5
> > hashes, etc).
> >
> > I started digging around. Both servers are identical in terms of software
> > and configuration (other than things like hostname and IP addresses).
> > Both servers are RHEL4U8, kernel-2.6.9-89.0.25.ELsmp, x86_64, exact same
> > packages and exact same versions. I even ran 'rpm --verify' on all
> > packages and didn't find anything unusual on both s7 and s8.
> >
> > The ONLY error messages I'm seeing that are unique to s8 are the
> > following messages in dmesg:
> >
> > Losing some ticks... checking if CPU frequency changed.
> > warning: many lost ticks.
> > Your time source seems to be instable or some driver is hogging interupts
> > rip __do_softirq+0x4d/0xd0
> > Falling back to HPET
> >
> > Some google searching found:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=429010
> >
> > which refers to:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=248488
> >
> > But that seems to refer to problems with virtualization. This is on real
> > hardware.
> >
> > What we don't understand is that s7 does *not* exhibit any slowness nor
> > the messages above, only s8. Again, both are identical.
> >
> > So, thinking this might be a hardware issue, we asked our hosting company
> > to pull the drives out of s8 and replace the entire chassis. After
> > replacing the entire chassis of s8, we are still getting the above
> > messages in dmesg. Not only that, things have gotten worse... our
> > benchmarking (using 'ab') now shows the server can only do 15
> > requests/sec (all these tests were run locally on loopback to avoid any
> > network related issue).
> >
> > Since the chassis was swapped, we feel that it probably isn't a hardware
> > issue. But we have s7 which is configured identically to s8 that doesn't
> > have this issue, so it is hard to say that it is a software issue.
> >
> > Any advice? What can I do to find the root cause?
> >
> > TIA,
> > -Bond
> >
> >
> >
> >
> >
> > _______________________________________________
> > Linux-PowerEdge mailing list
> > Linux-PowerEdge at dell.com
> > https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > Please read the FAQ at http://lists.us.dell.com/faq
> >



