RHEL 4U6 x86_64 freezed by heavy I/O on PE2950
Peter Grandi
pg_dlxpe at dlxpe.for.sabi.co.UK
Tue Apr 1 14:16:35 CDT 2008
[ ... ]
> Just copying a large number of files (about 2500, 800 MB)
That's very few files and very little storage by contemporary
standards, and 100,000 files and 250GB might be more like "large".
(but then I have *on my laptop* 3 filesystems each with more than
200,000 files and a few dozen GB that I regularly copy around to
''unfragment'' them).
> reproducibly renders the server unusable for a period of time [
> ... ] But at times, the operation finishes in 2 seconds! Only
> trying a simple "ls -l" afterwards puts the server in the
> mentioned "frozen" state. This is so because at times I/O
> operations are completely buffered
You have just discovered the obvious :-). That is, yet another
case where some part of the Linux IO subsystem seems to have been
designed with astonishingly misguided naivety (to use euphemisms).
> and apparently the issue arises only when actually flushing to
> disk. This server has 16 GB memory.
It could be the buffering in main RAM, but it could also be the
buffering in the host adapter RAM (of rather dubious value beyond
a small size), or both.
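
To get a rough idea of how much is sitting dirty in main RAM (this says
nothing about what the host adapter cache is holding), one can watch the
'Dirty' and 'Writeback' lines of '/proc/meminfo' while the copy runs. A
minimal sketch in Python, assuming those two fields are present as they
are on 2.6 kernels; the function name 'dirty_kb' is made up for
illustration:

```python
def dirty_kb(meminfo_text):
    """Return the Dirty and Writeback values (in kB) found in /proc/meminfo text."""
    wanted = {"Dirty", "Writeback"}
    found = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Lines look like "Dirty:     812345 kB"; skip anything else.
        if sep and fields and name.strip() in wanted:
            found[name.strip()] = int(fields[0])
    return found


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(dirty_kb(f.read()))
    except OSError as exc:
        print("cannot read /proc/meminfo:", exc)
```

If 'Dirty' climbs to hundreds of MB during the "2 second" copy and then
drains slowly while the machine is unresponsive, the main RAM buffering
is the likely culprit.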
> [ ... ] The latest kernel released by RedHat, 2.6.9-67-0.7ELsmp,
> won't even boot after the initrd phase, it does not find the
> root file system and says: [ ... ]
More likely than a bug, this is because the system was shut down
before it had fully written cached dirty blocks to disk. This may
be for example because they were cached in the host adapter, or
because of lack of use of barriers where supported.
> I don't know how to experiment with vm parameters in this
> version of RedHat (Nahant).
Thanks to your using a recentish RHEL4 you only really need to
worry about the 'sysctl' parameter 'vm/max_queue_depth' (very
curiously it is not yet in RHEL5 I think, but it may have been
added). It is in megabytes and tells the flusher to start writing
out after that many dirty megabytes have accumulated. A value
like 100-500MB, depending on your appetite for risk and the speed
of your storage, may be appropriate.
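
As to how to experiment: 'sysctl' parameters are just files under
'/proc/sys', so 'vm/max_queue_depth' would be
'/proc/sys/vm/max_queue_depth'. A minimal sketch in Python for reading
and setting one; the helper names are made up, the parameter will simply
read as missing on kernels that do not have it, and the value 200 is
only an example from the range above, not a recommendation:

```python
import os


def sysctl_path(name, root="/proc/sys"):
    # 'vm/max_queue_depth' and 'vm.max_queue_depth' name the same file.
    return os.path.join(root, *name.replace(".", "/").split("/"))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if the kernel lacks it."""
    try:
        with open(sysctl_path(name, root)) as f:
            return f.read().strip()
    except OSError:
        return None


def write_sysctl(name, value, root="/proc/sys"):
    """Set a parameter; needs root privileges on a real /proc/sys."""
    with open(sysctl_path(name, root), "w") as f:
        f.write(f"{value}\n")


if __name__ == "__main__":
    name = "vm/max_queue_depth"
    print(name, "=", read_sysctl(name))
    # Uncomment to change it (as root); 200 is an illustrative value only.
    # write_sysctl(name, 200)
```

A change made this way is lost at reboot; to keep it one would add the
equivalent line to '/etc/sysctl.conf'.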
Another thing that you might want to experiment with is the
elevator. Then again the default elevator, CFQ, is fairly good at
not letting one process hog the whole storage subsystem, even if
the one in RHEL4 is a somewhat old version.
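
On kernels that expose it, the elevator in use for a disk can be read
from '/sys/block/<dev>/queue/scheduler', where the active one is shown
in square brackets. I believe the 2.6.9 kernel in RHEL4 only lets you
choose it with the 'elevator=' boot parameter, so this file may well be
absent there. A minimal sketch in Python of reading it, where the device
name 'sda' and the function name are only assumptions for illustration:

```python
def parse_scheduler(line):
    """Split a line like 'noop anticipatory deadline [cfq]' into (active, available)."""
    active = None
    available = []
    for name in line.split():
        # The elevator currently in use is wrapped in square brackets.
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


if __name__ == "__main__":
    dev = "sda"  # hypothetical device name, adjust to your system
    path = f"/sys/block/{dev}/queue/scheduler"
    try:
        with open(path) as f:
            print(parse_scheduler(f.read()))
    except OSError as exc:
        print("no runtime scheduler file:", exc)
```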
> Anyone having the same issue? Anyone having solved it already?
It has been known for years... Even if not to many people I suspect
:-)
Anyhow there is a discussion of some aspects of this here:
http://www.sabi.co.uk/blog/0707jul.html#070701
http://www.sabi.co.uk/blog/0707jul.html#070701b
I have included a patch in the discussion, but RedHat chose a
slightly different way to achieve the same effect (the already
mentioned 'vm/max_queue_depth').
> This is an urgent issue, any hint will be appreciated. In view
> of the fact that all components are supported or certified,
> this case tends to be a quite serious issue, in my opinion.
These expectations are very very funny :-). May the tooth fairy
deliver on them! :-)