[Poweredgec-tools] C6145: BIOS 3.3.2 breaks IB card.

John Hanks john.hanks at usu.edu
Mon Jan 13 15:05:23 CST 2014


I've had a chance to follow up on this with some more testing, which
coincided with racking 8 new C6145 nodes that arrived with BIOS 3.3.2 and
on which the Mellanox cards work fine. I re-upgraded one of the earlier
test servers to 3.3.2 and the IB card failed to initialize, just as in
previous tests. I then rebooted the server, went into the BIOS setup and
reset to the Optimal Defaults, then reset the BMC parameters and selected
force PXE boot and PXE only in the boot options. (Pretty sure those are the
only things we have ever modified.) Rebooted and voilà! Working Mellanox
cards. So whatever the problem is, resetting the BIOS to defaults seems to
address it.
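
For anyone who wants to check a batch of nodes for the same failure, here
is a minimal sketch in Python that scans dmesg text for the two symptoms
seen in the log quoted below: the "command ... timed out" lines and the
final "probe of ... failed" line. The function name find_mlx4_failures and
the layout of the result are hypothetical choices for illustration; the
patterns are based only on the messages in this thread, so other driver
versions may word things differently.

```python
import re

# Patterns modeled on the mlx4_core messages quoted in this thread.
# Other driver versions may use different wording (assumption).
PROBE_FAILED = re.compile(r"mlx4_core: probe of (\S+) failed with error (-?\d+)")
CMD_TIMEOUT = re.compile(r"mlx4_core (\S+): command (0x[0-9a-fA-F]+) timed out")


def find_mlx4_failures(dmesg_text):
    """Map each PCI address to its probe error code and timeout count.

    Returns a dict such as {"0000:04:00.0": {"error": -16, "timeouts": 2}}.
    Devices with no timeout or probe-failure lines are not listed.
    """
    failures = {}
    for line in dmesg_text.splitlines():
        timeout = CMD_TIMEOUT.search(line)
        if timeout:
            entry = failures.setdefault(
                timeout.group(1), {"error": None, "timeouts": 0}
            )
            entry["timeouts"] += 1
            continue
        probe = PROBE_FAILED.search(line)
        if probe:
            entry = failures.setdefault(
                probe.group(1), {"error": None, "timeouts": 0}
            )
            entry["error"] = int(probe.group(2))
    return failures
```

To use it on a live node, pass it the output of dmesg, for example
find_mlx4_failures(subprocess.run(["dmesg"], capture_output=True,
text=True).stdout). An empty dict means neither symptom was found. It says
nothing about the cause, only whether the card failed to probe.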

jbh


On Wed, Jan 1, 2014 at 4:12 PM, John Hanks <john.hanks at usu.edu> wrote:

> Hi,
>
> In the process of troubleshooting a problem I decided to upgrade one of
> my C6145 nodes to the latest BIOS, 3.3.2. After the upgrade my Mellanox IB
> card (Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR /
> 10GigE] ) was no longer recognized. Downgrading the BIOS back to 3.0.0
> fixed the problem. dmesg looks like this when the BIOS version is 3.3.2:
>
> mlx4_core: Mellanox ConnectX core driver v1.1 (Dec, 2011)
> mlx4_core: Initializing 0000:04:00.0
>   alloc irq_desc for 24 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: PCI INT A -> GSI 24 (level, low) -> IRQ 24
> mlx4_core 0000:04:00.0: setting latency timer to 64
>   alloc irq_desc for 114 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 114 for MSI/MSI-X
>   alloc irq_desc for 115 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 115 for MSI/MSI-X
>   alloc irq_desc for 116 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 116 for MSI/MSI-X
>   alloc irq_desc for 117 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 117 for MSI/MSI-X
>   alloc irq_desc for 118 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 118 for MSI/MSI-X
>   alloc irq_desc for 119 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 119 for MSI/MSI-X
>   alloc irq_desc for 120 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 120 for MSI/MSI-X
>   alloc irq_desc for 121 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 121 for MSI/MSI-X
>   alloc irq_desc for 122 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 122 for MSI/MSI-X
>   alloc irq_desc for 123 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 123 for MSI/MSI-X
>   alloc irq_desc for 124 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 124 for MSI/MSI-X
>   alloc irq_desc for 125 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 125 for MSI/MSI-X
>   alloc irq_desc for 126 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 126 for MSI/MSI-X
>   alloc irq_desc for 127 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 127 for MSI/MSI-X
>   alloc irq_desc for 128 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 128 for MSI/MSI-X
>   alloc irq_desc for 129 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 129 for MSI/MSI-X
>   alloc irq_desc for 130 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 130 for MSI/MSI-X
>   alloc irq_desc for 131 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 131 for MSI/MSI-X
>   alloc irq_desc for 132 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 132 for MSI/MSI-X
>   alloc irq_desc for 133 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 133 for MSI/MSI-X
>   alloc irq_desc for 134 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 134 for MSI/MSI-X
>   alloc irq_desc for 135 on node -1
>   alloc kstat_irqs on node -1
> mlx4_core 0000:04:00.0: irq 135 for MSI/MSI-X
> mlx4_core 0000:04:00.0: command 0x23 timed out (go bit not cleared)
> mlx4_core 0000:04:00.0: Failed to initialize queue pair table, aborting.
> mlx4_core 0000:04:00.0: command 0x23 timed out (go bit not cleared)
> mlx4_core 0000:04:00.0: Failed to initialize queue pair table, aborting.
> mlx4_core 0000:04:00.0: PCI INT A disabled
> mlx4_core: probe of 0000:04:00.0 failed with error -16
>
> I've tested this on multiple nodes and it has been consistent. I am
> talking to a Dell support person about this along with the original issue
> (flaky LSI SAS card); I'm just posting this here as an FYI for any C6145
> owners.
>
> Thanks,
>
> jbh
>