any advice to find root cause of "Falling back to HPET" ?

Bond Masuda bond.masuda at jlbond.com
Sat May 22 14:37:36 CDT 2010


Oh, just as an illustrative example:

on s7,

1162 root at s7:/boot# time md5sum vmlinuz-2.6.9-89.0.25.ELsmp 
c4fb9036c6d660d8b5939bb597c3b8e3  vmlinuz-2.6.9-89.0.25.ELsmp

real	0m0.008s
user	0m0.006s
sys	0m0.001s

and on s8,

1044 root at s8:/boot# time md5sum vmlinuz-2.6.9-89.0.25.ELsmp 
c4fb9036c6d660d8b5939bb597c3b8e3  vmlinuz-2.6.9-89.0.25.ELsmp

real	0m0.070s
user	0m0.051s
sys	0m0.018s

The results are consistent: s8 takes roughly nine times as long as s7 here
(0.070s vs 0.008s real), and pretty much *anything* exhibits the same
slowness on s8.
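
To check that the gap holds across repeated runs rather than a single
sample, something like the following could be used. It is only a sketch:
the function name time_md5 is made up for illustration, and it assumes GNU
date (for %N) and awk are available. Because it reads the system clock, the
figures may themselves be skewed on a host whose time source is misbehaving.

```shell
# time_md5 FILE [RUNS]: hash FILE RUNS times (default 5) and print the
# elapsed wall-clock seconds of each run, one per line.
# Hypothetical helper; assumes GNU date, which supports %N (nanoseconds).
time_md5() {
    file=$1
    runs=${2:-5}
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s.%N)
        md5sum "$file" > /dev/null || return 1
        end=$(date +%s.%N)
        # Let awk do the floating-point subtraction.
        awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
        i=$((i + 1))
    done
}
```

Running "time_md5 /boot/vmlinuz-2.6.9-89.0.25.ELsmp 10" on both s7 and s8
would print ten timings per host for a side-by-side comparison.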
-Bond

> -----Original Message-----
> From: linux-poweredge-bounces at dell.com [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Bond Masuda
> Sent: Saturday, May 22, 2010 12:30 PM
> To: linux-poweredge at lists.us.dell.com
> Subject: any advice to find root cause of "Falling back to HPET" ?
> 
> Hello,
> 
> I'd appreciate any help/advice anyone can provide regarding our issue.
> I've run out of ideas on this one...
> 
> We have two identical PowerEdge 2950s; one is called s7 and the other
> s8. Both are web servers running Apache and PHP. We first noticed the
> problem because our benchmarking showed drastically different results
> between the two servers. With s7, we were able to get 180 requests/sec,
> while on s8 we only get 35 requests/sec (and now only 15 requests/sec -
> more on that below). After this, we became aware that almost all tasks
> on s8 were slower than on s7. Whether CPU bound or I/O bound, everything
> we tried was slower on s8 than on s7 (untar'ing archives, running md5
> hashes, etc).
> 
> I started digging around. Both servers are identical in terms of
> software and configuration (other than things like hostname and IP
> addresses). Both servers are RHEL4U8, kernel-2.6.9-89.0.25.ELsmp,
> x86_64, with the exact same packages at the exact same versions. I even
> ran 'rpm --verify' on all packages and didn't find anything unusual on
> either s7 or s8.
> 
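
Since 'rpm --verify' only checks packaged files, it may also be worth
diffing other state captured from both hosts (package lists, loaded
modules, the kernel command line). A rough sketch follows; the function
name only_in_one and the snapshot file names in the note after it are
hypothetical.

```shell
# only_in_one A B: print lines that appear in just one of the two files,
# prefixed with "< " (only in A) or "> " (only in B).
# Hypothetical helper built on sort and comm.
only_in_one() {
    a=$(mktemp)
    b=$(mktemp)
    sort "$1" > "$a"
    sort "$2" > "$b"
    comm -23 "$a" "$b" | sed 's/^/< /'   # lines unique to A
    comm -13 "$a" "$b" | sed 's/^/> /'   # lines unique to B
    rm -f "$a" "$b"
}
```

For example, saving the output of "rpm -qa" (or "lsmod", or
"cat /proc/cmdline") to s7.txt and s8.txt on the respective hosts and then
running "only_in_one s7.txt s8.txt" would list only the differences. No
output means the two snapshots match.
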
> The ONLY error messages I'm seeing that are unique to s8 are the
> following in dmesg:
> 
> Losing some ticks... checking if CPU frequency changed.
> warning: many lost ticks.
> Your time source seems to be instable or some driver is hogging interupts
> rip __do_softirq+0x4d/0xd0
> Falling back to HPET
> 
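
To see how often these messages show up on each host, the kernel log can
be filtered for them. A minimal sketch, where count_timer_warnings is a
made-up name and the patterns are taken from the messages quoted above:

```shell
# count_timer_warnings: read kernel log text on stdin and print how many
# lines mention lost ticks or the fallback to HPET.
# Hypothetical helper; the patterns come from the dmesg output above.
count_timer_warnings() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so mask the exit status.
    grep -c -E 'Losing some ticks|lost ticks|Falling back to HPET' || true
}
```

Running "dmesg | count_timer_warnings" on s7 and s8 gives one number per
host; feeding it /var/log/messages instead would also cover messages that
have already scrolled out of the dmesg buffer.
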
> Some Google searching found:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=429010
> 
> which refers to:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=248488
> 
> But that seems to refer to problems with virtualization. This is on
> real hardware.
> 
> What we don't understand is that s7 does *not* exhibit any slowness or
> the messages above, only s8. Again, both are identical.
> 
> So, thinking this might be a hardware issue, we asked our hosting
> company to pull the drives out of s8 and replace the entire chassis.
> After the chassis swap, we are still getting the above messages in
> dmesg. Not only that, things have gotten worse... our benchmarking
> (using 'ab') now shows the server can only do 15 requests/sec (all
> these tests were run locally on loopback to avoid any network-related
> issue).
> 
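
For reference, a loopback run of this kind can be scripted roughly as
below. This is a sketch, not the exact invocation used for the numbers
above: the helper names, the default request count and concurrency, and
the URL in the note after it are all assumptions. Only ab's -n (requests)
and -c (concurrency) options and its "Requests per second" output line are
relied on.

```shell
# parse_rps: extract the "Requests per second" figure from ab output
# read on stdin.
parse_rps() {
    awk '/^Requests per second:/ { print $4 }'
}

# bench_local URL [REQUESTS] [CONCURRENCY]: run ab against URL and print
# only the requests/sec figure. Defaults (1000 requests, 10 concurrent)
# are arbitrary illustration values.
bench_local() {
    ab -n "${2:-1000}" -c "${3:-10}" "$1" | parse_rps
}
```

Something like "bench_local http://127.0.0.1/index.php" (hypothetical
URL) run on each server would print a single requests/sec number per host.
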
> Since the chassis was swapped, we feel that it probably isn't a
> hardware issue. But we have s7, which is configured identically to s8
> and doesn't have this issue, so it is hard to say that it is a software
> issue.
> 
> Any advice? What can I do to find the root cause?
> 
> TIA,
> -Bond



