On Wed, May 26, 2004 at 10:35:01PM -1000, Ronald L Fox wrote:
> Regarding the current PERC 3/Di failure workaround hypothesis, I
> asked these questions on 5/22 but didn't see a response so I'll ask
> as a separate thread. If there's a Win32-PowerEdge list,
> please point me to the sign-up instructions:

There isn't.  I haven't had a request for such a list; apparently the
forums suffice.

> If this is indeed a PERC 3/Di firmware problem, wouldn't you expect
> to see this failure mode with other OS's?


> Has this problem been reported to affect Win32 systems?

Not that we've been able to tell.

> If not, any theory about why?

It's definitely an I/O pattern which induces the cache flush algorithm
in the firmware to run, to the exclusion of other requests; hence
disabling the caches completely prevents that routine from running.
Best as we can tell, ext3's journal commits are the pathological worst
case for the flush algorithm, particularly commits of many small
updates (say, updatedb running and dirtying a huge number of inodes in
a really short amount of time because the fs is mounted with the
'atime' parameter by default) spaced about 5 seconds apart (again,
the ext3 default commit time).  Essentially, the firmware thinks "hey,
I must be idle, I haven't gotten any commands in quite a while, now's
a good time to do cache maintenance, boy there are a lot of little
changes in here...", then "wham, here's a bunch of new writes coming
from the driver".
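
To check which filesystems on a given box match that pattern, something
like the following might help.  It is only a sketch, not an official
diagnostic: the function name ext3_mounts is made up for illustration,
it assumes the usual /proc/mounts field layout (device, mount point,
type, options), and it simply reports the ext3 defaults described above
(atime updates on, 5-second commit) unless 'noatime' or 'commit=N'
appears in the options.  It reads text only and changes nothing.

```python
def ext3_mounts(mounts_text):
    """Summarise ext3 entries from /proc/mounts-style text.

    Returns a list of (mount_point, atime_updates, commit_seconds).
    Hypothetical helper for illustration only.
    """
    results = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device, mount point, fs type, options, ...
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        options = fields[3].split(",")
        # Assumption: atime updates happen unless 'noatime' is listed.
        atime_updates = "noatime" not in options
        # ext3 commits its journal every 5 seconds unless commit=N is set;
        # commit=0 is treated as "use the default".
        commit_seconds = 5
        for opt in options:
            if opt.startswith("commit="):
                value = int(opt.split("=", 1)[1])
                if value > 0:
                    commit_seconds = value
        results.append((fields[1], atime_updates, commit_seconds))
    return results
```

To use it, pass in the contents of /proc/mounts, e.g.
ext3_mounts(open("/proc/mounts").read()).  Note that the kernel may not
list a commit= option there even when one was given at mount time, in
which case the sketch just reports the 5-second default.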

I'm not familiar with how or when NTFS commits its journal, so I can't
say why it doesn't trigger the firmware's pathological case.

> If so, does the suggested workaround fix the problem under Win32?

If you thought you were having this same problem under Win32, yes,
disabling the caches would solve that too.  If that is really the
case, *please please please* call tech support and report what you're
seeing; that would be new news.


