Kernel Oops on Debian Etch

Hugues Brunel hugues.brunel at free.fr
Mon Aug 27 11:21:38 CDT 2007


I took some time to analyse this problem.
It appears on five PE860 servers!

All of them have an "Intel(R) Xeon(R) CPU 3040 @ 1.86GHz".
(no problem with the other PowerEdge models)

They all run kernel 2.6.19.2.smp with the grsecurity patch
(but I don't think grsec is the problem).

All have a SAS RAID 0 container.
4 of them use SATA disks.
1 uses SCSI disks.

4 of them run the Etch distribution.
1 runs the Sarge distribution.

Sometimes the crash appears on CPU 0, sometimes on CPU 1.

The first line of the message:
   kernel: BUG: unable to handle kernel paging request at virtual address

Here are the addresses found: fffaf070, fffaf770, fffafd30, fffafff0, 
fffaf570, fffaf670, fffa1b70, fffafab0, fffa1e30, fffa1570, fffafe70, 
fffa1270, fffaf770, fffa1330, fffa1c30, fffa10f0, fffa18b0, fffa1730, 
fffafbb0
=> address fffaf770 appears twice, on 2 different servers!!?? (is this a 
lead worth following??)
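
To look for a pattern in the addresses listed above, here is a minimal Python sketch that groups them by 4 KiB page, reports repeats, and shows each offset modulo 64. The 4 KiB page size is the usual one on 32-bit x86; the function name summarize and the choice of modulo 64 are only illustrative, and whether any regularity it shows is meaningful is not established here.

```python
from collections import Counter

PAGE_SIZE = 0x1000  # 4 KiB pages, the usual size on 32-bit x86

# Fault addresses exactly as listed in the message (19 events).
ADDRESSES = """
fffaf070 fffaf770 fffafd30 fffafff0 fffaf570 fffaf670 fffa1b70
fffafab0 fffa1e30 fffa1570 fffafe70 fffa1270 fffaf770 fffa1330
fffa1c30 fffa10f0 fffa18b0 fffa1730 fffafbb0
""".split()


def summarize(addresses):
    """Group fault addresses by page and look for simple regularities."""
    values = [int(a, 16) for a in addresses]
    # Clear the low 12 bits to get the page each address falls in.
    pages = Counter(v & ~(PAGE_SIZE - 1) for v in values)
    # Addresses that occur more than once in the list.
    repeats = {a: n for a, n in Counter(addresses).items() if n > 1}
    # Distinct values of the address modulo 64.
    low_bits = sorted({v % 64 for v in values})
    return pages, repeats, low_bits


if __name__ == "__main__":
    pages, repeats, low_bits = summarize(ADDRESSES)
    for page, count in sorted(pages.items()):
        print(f"page {page:08x}: {count} faults")
    print("repeated addresses:", repeats)
    print("offsets modulo 64:", [hex(b) for b in low_bits])
```

On this list, every fault falls in one of only two pages (fffa1000 and fffaf000), which may be a more useful clue than the single repeated address.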

The problem appears in many different processes:
sed, chown, awk, expr, gzip, nohup, mysqld, mysqld, named, tr, cut, 
apache2, create-domain.p (webmin process)

I have all the kernel messages for 19 "Oops events"...
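
To compare the 19 events side by side, the fault address, CPU, process and faulting symbol can be pulled out of the syslog lines automatically. The following is a rough Python sketch; it assumes one log message per line (not re-wrapped by a mail client) and the klogd-style "[symbol+offset/size]" format seen in the quoted Oops. The function names parse_oopses and tally are made up for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the Oops lines quoted in this thread (an assumption:
# other kernel or klogd versions may format these lines differently).
FAULT_RE = re.compile(r"paging request at virtual address ([0-9a-f]{8})")
CPU_RE = re.compile(r"kernel: CPU:\s+(\d+)")
PROC_RE = re.compile(r"kernel: Process (\S+) \(pid: (\d+)")
EIP_RE = re.compile(r"kernel: EIP:\s+(?:[0-9a-f]{4}:)?\[(\w+)\+")

SAMPLE = [
    "Aug 27 10:39:31 localhost kernel: BUG: unable to handle kernel paging request at virtual address fffafe70",
    "Aug 27 10:39:31 localhost kernel: CPU:    0",
    "Aug 27 10:39:31 localhost kernel: EIP:    0060:[prep_new_page+162/237]    Not tainted VLI",
    "Aug 27 10:39:31 localhost kernel: Process mysqld (pid: 1892, ti=d2fc0000 task=e361e030 task.ti=d2fc0000)",
    "Aug 27 10:39:31 localhost kernel: EIP: [prep_new_page+162/237]  SS:ESP 0068:d2fc1e68",
]


def parse_oopses(lines):
    """Return one dict per Oops; a 'paging request' line starts a new event."""
    events = []
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            events.append({"address": m.group(1)})
            continue
        if not events:
            continue  # ignore anything before the first Oops
        event = events[-1]
        m = CPU_RE.search(line)
        if m:
            event.setdefault("cpu", int(m.group(1)))
        m = PROC_RE.search(line)
        if m:
            event.setdefault("process", m.group(1))
            event.setdefault("pid", int(m.group(2)))
        m = EIP_RE.search(line)
        if m:
            event.setdefault("symbol", m.group(1))
    return events


def tally(events, key):
    """Count how often each value of `key` occurs across the events."""
    return Counter(e.get(key) for e in events)


if __name__ == "__main__":
    events = parse_oopses(SAMPLE)
    print(events)
    print(tally(events, "symbol"))
```

Fed with the full logs from all five servers, tally(events, "symbol") would show whether the faulting function is always the same one.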

Do you have any ideas?
Thank you very much for any information...

Regards,
Hugues.


Hugues Brunel wrote:
> Hello,
> 
> I use many (60) Dell servers (from PE2650 to PE2950) with Debian Sarge, 
> and now Etch. I have a problem with two dual-core PE860s (one on Sarge and 
> one on Etch).
> 
> Sometimes a process crashes with an Oops message. It happens when the load 
> becomes a little high (but not very high). For example, it sometimes 
> happens when I compile a new kernel. And recently it happened with the 
> MySQL server process.
> 
> It has become critical for me to fix this issue.
> 
> My kernel is Linux-2.6.19.
> 
> Here is the last message (this morning):
> 
> --------------------------------------------------------
> Aug 27 10:39:31 localhost kernel: BUG: unable to handle kernel paging request at virtual address fffafe70
> Aug 27 10:39:31 localhost kernel:  printing eip:
> Aug 27 10:39:31 localhost kernel: c0191a5f
> Aug 27 10:39:31 localhost kernel: *pgd = 3067
> Aug 27 10:39:31 localhost kernel: *pmd = 3067
> Aug 27 10:39:31 localhost kernel: *pte =    0
> Aug 27 10:39:31 localhost kernel: Oops: 0002 [#2]
> Aug 27 10:39:31 localhost kernel: SMP
> Aug 27 10:39:31 localhost kernel: Modules linked in: ipv6 piix ata_generic pata_sil680 generic siimage sg sr_mod ext3 jbd ide_scsi joydev usbhid i2c_i801 i2c_core ide_floppy pcspkr evdev shpchp pci_hotplug psmouse serio_raw iTCO_wdt sd_mod ide_cd cdrom ide_core uhci_hcd ehci_hcd usbcore ata_piix libata tg3 thermal processor fan mptsas mptscsih mptbase scsi_transport_sas scsi_mod
> Aug 27 10:39:31 localhost kernel: CPU:    0
> Aug 27 10:39:31 localhost kernel: EIP:    0060:[prep_new_page+162/237]    Not tainted VLI
> Aug 27 10:39:31 localhost kernel: EFLAGS: 00010246   (2.6.19.2.fullsave.smp-grsec #1)
> Aug 27 10:39:31 localhost kernel: eax: 00000000   ebx: c19295c0   ecx: 00000064   edx: 00000034
> Aug 27 10:39:31 localhost kernel: esi: c19295c0   edi: fffafe70   ebp: 00000000   esp: d2fc1e68
> Aug 27 10:39:31 localhost kernel: ds: 0068   es: 0068   ss: 0068
> Aug 27 10:39:31 localhost kernel: Process mysqld (pid: 1892, ti=d2fc0000 task=e361e030 task.ti=d2fc0000)
> Aug 27 10:39:31 localhost kernel: Stack: fffaf000 000280d2 00000000 00000001 00000246 c0388f00 c19295c0 c0191fae
> Aug 27 10:39:31 localhost kernel:        d798e6c0 00000002 00000000 00000000 c0388f00 00000002 c03895a0 00000044
> Aug 27 10:39:31 localhost kernel:        c01920e1 000280d2 00000044 00000000 c03895a0 00000000 000280d2 c03895a0
> Aug 27 10:39:31 localhost kernel: Call Trace:
> Aug 27 10:39:31 localhost kernel:  [buffered_rmqueue+268/293] <0> [get_page_from_freelist+133/161] <0> [__alloc_pages+80/646] <0> [do_anonymous_page+69/335] <0> [__handle_mm_fault+313/554] <0> [do_page_fault+574/1466] <0> [do_page_fault+0/1466] <0> [error_code+57/64] <0> [__inet6_check_established+630/890] <0> =======================
> Aug 27 10:39:31 localhost kernel: Code: 00 00 00 0f b6 4c 24 08 31 ed d3 e0 39 c5 7d 3e 89 de ba 0d 00 00 00 89 f0 e8 d3 31 fd ff b9 00 04 00 00 89 04 24 31 c0 8b 3c 24 <f3> ab 83 c6 20 8b 04 24 ba 0d 00 00 00 45 e8 30 32 fd ff b8 01
> Aug 27 10:39:31 localhost kernel: EIP: [prep_new_page+162/237]  SS:ESP 0068:d2fc1e68
> Aug 27 10:39:31 localhost kernel:  <6>note: mysqld[1892] exited with preempt_count 1
> --------------------------------------------------------
> 
> I suspect a kernel bug but I can't find any information on Google!??
> 
> Do you have any ideas that could help me?
> 
> Regards,
> 
> Hugues Brunel.
> -- 
> Technical Director
> FullSave SAS
> 05 62 24 34 18
> 06 63 56 06 73
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
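
The quoted Oops can be read a little further with plain arithmetic. The sketch below (Python, with illustrative function names) decodes the "Oops: 0002" error code using the standard x86 page-fault bit layout, and checks whether the registers are consistent with an interrupted "rep stos" (the <f3> ab bytes at the faulting instruction). The inputs come from the log: the bytes "b9 00 04 00 00" load ecx with 0x400 dwords, and the first stack word fffaf000 is taken as the start of the buffer because "8b 3c 24" loads edi from the top of the stack. Treating these as one page-zeroing loop is an interpretation, not something the log states.

```python
def decode_oops_error_code(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def rep_stos_progress(base, edi, ecx_left, total_dwords):
    """Check EDI/ECX against a `rep stos` of 4-byte words that began at `base`.

    Returns (dwords already written, EDI expected from that count,
    whether the expected EDI matches the reported one).
    """
    done = total_dwords - ecx_left
    expected_edi = base + 4 * done
    return done, expected_edi, expected_edi == edi


if __name__ == "__main__":
    # "Oops: 0002" from the quoted log.
    print(decode_oops_error_code(0x0002))

    # Values from the quoted log: first stack word, edi, ecx, and the
    # 0x400 count loaded by the instruction bytes b9 00 04 00 00.
    done, expected, ok = rep_stos_progress(
        base=0xFFFAF000, edi=0xFFFAFE70, ecx_left=0x64, total_dwords=0x400
    )
    print(f"{done} dwords written, expected edi {expected:08x}, match: {ok}")
```

Read this way, the fault is a kernel-mode write to a page that is not present (in line with "*pte = 0"), and the registers agree with a 4096-byte clear that had already written 3696 bytes to the same page before faulting. Why the mapping would disappear partway through is the open question.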


