kernel hanging problem - nfs?...

Wesley T. Perdue wes at greenfieldnetworks.com
Thu Dec 26 19:00:00 CST 2002


Jason,

Thanks for the advice.  My replies are inline.

At 07:49 AM 12/27/2002 +1000, jason andrade wrote:
>On Thu, 26 Dec 2002, Wesley T. Perdue wrote:
>
>> I've got a problem on a PE 2400 running Red Hat 7.1 w/ kernel 2.4.7-10smp; it has two PIII cpus.
>
>an upgrade would be good if that's an option. (redhat 7.3)

It will be in the future, but not right now.

>> The PE 2400 is a home directory and mail server, and therefore sees a lot of nfs traffic to fifteen other servers, eight of which are nodes in a compute cluster.  The PE 2400 has a built-in PERC, and the home dir filesystem is on a RAID-5 array made up of 3x73 GB HDs recently purchased from Dell.  I took the defaults when configuring the array.  I updated the PERC firmware last year to v.2.5.  The root filesystem is on a separate RAID-5 array on the same controller - 3x9 GB, I think.
>
>you could also upgrade the PE bios to A08 and the perc bios to 2.7-1.

I'll keep it under consideration.

>if the home dirs are on 3*73, what is the OS installed on ?

The system has six internal disks: 3x9 GB and 3x73 GB.  There are two RAID-5 arrays, one of the 9 GB disks, and one of the 73 GB disks.  The OS is on the 3x9 array.

>> When nfs traffic gets high, the load (as reported in uptime and top) goes very high (into the tens or twenties), even though there aren't that many active processes.  There are a number of nfsd processes (maybe eight) taking a bit of cpu each.  The machine can feel unresponsive at this time.
>>
>
>you should consider running more than 8 nfsds.  i use 64 myself, but you might want to run at least 16 or 32.

I will definitely do so.
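
One way to check whether eight nfsd threads really are too few is to look at the "th" line in /proc/net/rpc/nfsd on the server.  The following is a minimal sketch, assuming the layout I understand 2.4-era kernels to use: "th", then the thread count, then the number of times all threads were busy at once, then a usage histogram.  That layout, the function names, and the sample numbers are assumptions for illustration, not measurements from this machine; verify the format against your own kernel.

```python
def parse_th_line(text):
    """Return (threads, times_all_busy, histogram) from nfsd stats text.

    Assumed layout: th <threads> <times-all-busy> <histogram values...>
    """
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "th":
            threads = int(fields[1])
            all_busy = int(fields[2])
            histogram = [float(x) for x in fields[3:]]
            return threads, all_busy, histogram
    raise ValueError("no 'th' line found")


def needs_more_threads(text):
    """True if every nfsd thread was busy at the same time at least once."""
    _threads, all_busy, _histogram = parse_th_line(text)
    return all_busy > 0


if __name__ == "__main__":
    # Made-up sample, used only when the real file is missing or unparsable.
    sample = "th 8 594 3733.140 83.850 96.660 0.000 0.000\n"
    try:
        with open("/proc/net/rpc/nfsd") as f:
            stats = f.read()
        parse_th_line(stats)
    except (OSError, ValueError, IndexError):
        stats = sample
    threads, all_busy, _ = parse_th_line(stats)
    print("nfsd threads:", threads)
    print("times all threads were busy:", all_busy)
    print("consider more threads:", needs_more_threads(stats))
```

If the second number keeps climbing while the clients are busy, that would be consistent with the suggestion above to run 16 or 32 nfsds.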

>> Sometimes, the machine will hang for a number of seconds (say, up to 30) -- absolutely nothing responds, not even the console.  When this happens, the disk I/O lights are flashing very busily.  When the server snaps out of it, the load is _very_ high, and then gradually comes down.
>
>are you using rsize=1024,wsize=1024 ?

No, 8192.  Here is a client fstab entry for the server:
nfsserver:/home  /home  nfs rsize=8192,wsize=8192,timeo=14,intr,hard,retry=1000000

>nfsserver:/share /localmount nfs ro,rsize=1024,nolock,hard,bg,intr,retrans=8,nfsvers=3,timeo=10 0 0
>
>i use something like the above (add wsize and remove ro, since you'll want write access)

Thanks for the advice.  I'll look into the unique options you specify.
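
To see exactly which options differ between the two fstab entries quoted above, here is a small sketch that splits the fourth field of each line and compares the results.  The helper names parse_opts and compare are hypothetical; the two entries are copied from this message as-is.

```python
def parse_opts(fstab_line):
    """Parse the mount-options field (4th column) of an fstab line into a dict."""
    fields = fstab_line.split()
    opts = {}
    for item in fields[3].split(","):
        key, sep, value = item.partition("=")
        # Flags such as "hard" or "intr" carry no value; store True for them.
        opts[key] = value if sep else True
    return opts


def compare(mine, theirs):
    """Return (options only in theirs, options present in both with different values)."""
    only_theirs = sorted(set(theirs) - set(mine))
    changed = {k: (mine[k], theirs[k])
               for k in mine if k in theirs and mine[k] != theirs[k]}
    return only_theirs, changed


wes = ("nfsserver:/home  /home  nfs "
       "rsize=8192,wsize=8192,timeo=14,intr,hard,retry=1000000")
jason = ("nfsserver:/share /localmount nfs "
         "ro,rsize=1024,nolock,hard,bg,intr,retrans=8,nfsvers=3,timeo=10 0 0")

only_jason, changed = compare(parse_opts(wes), parse_opts(jason))
print("only in the suggested entry:", only_jason)
print("different values (mine, suggested):", changed)
```

This only lists the differences; whether any of them is related to the hangs is a separate question.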

Regards,
Wes



