[Linux-PowerEdge] C6100 and multiple K20 GPU card issue

Sat Oct 18 05:30:00 CDT 2014

We have a C6100 blade attached via iPASS to a C410x PCI-e expansion 
chassis fitted with two K20 GPU cards, and this works fine. But if I fit 
three or more K20 cards to the same bus, the kernel reports (via dmesg) 
that the PCI-e base address registers (BARs) cannot be assigned:

[    0.573153] pci 0000:04:00.0: BAR 13: can't assign io (size 0xe000)
[    0.573236] pci 0000:05:08.0: BAR 14: can't assign mem (size 0x200000)
[    0.573320] pci 0000:05:08.0: BAR 15: can't assign mem pref (size 0x200000)
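
To check this across several boots or card counts, the failures can be 
pulled out of dmesg with a short script. What follows is only a rough 
sketch in Python: the function names find_bar_failures and 
total_unassigned are made up for illustration, and the regular expression 
assumes the kernel keeps the exact message format shown above.

```python
import re
from collections import defaultdict

# Matches kernel messages of the form:
#   pci 0000:05:08.0: BAR 14: can't assign mem pref (size 0x200000)
# "mem pref" is listed before "mem" so the longer alternative is tried first.
BAR_FAIL = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): can't assign (?P<kind>io|mem pref|mem) "
    r"\(size (?P<size>0x[0-9a-f]+)\)"
)


def find_bar_failures(dmesg_text):
    """Return a list of (device, bar, kind, size_in_bytes) tuples."""
    failures = []
    for line in dmesg_text.splitlines():
        m = BAR_FAIL.search(line)
        if m:
            failures.append(
                (m["dev"], int(m["bar"]), m["kind"], int(m["size"], 16))
            )
    return failures


def total_unassigned(failures):
    """Sum the unassigned sizes per resource kind (io, mem, mem pref)."""
    totals = defaultdict(int)
    for _dev, _bar, kind, size in failures:
        totals[kind] += size
    return dict(totals)


# The three lines quoted above, used as sample input.
SAMPLE = """\
[    0.573153] pci 0000:04:00.0: BAR 13: can't assign io (size 0xe000)
[    0.573236] pci 0000:05:08.0: BAR 14: can't assign mem (size 0x200000)
[    0.573320] pci 0000:05:08.0: BAR 15: can't assign mem pref (size 0x200000)
"""

if __name__ == "__main__":
    for dev, bar, kind, size in find_bar_failures(SAMPLE):
        print(f"{dev} BAR {bar}: {kind}, {size} bytes")
```

On a live system the same functions could be fed the saved output of 
dmesg instead of SAMPLE.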

With five K20 cards fitted, lspci lists all five as present, but 
lspci -vv shows memory regions marked '<ignored>' for each GPU card:

         Region 0: Memory at c1000000 (32-bit, non-prefetchable) [size=16M]
         Region 1: Memory at <ignored> (64-bit, prefetchable)
         Region 3: Memory at <ignored> (64-bit, prefetchable)

(With one or two GPU cards fitted, all regions report valid memory 
addresses rather than '<ignored>'.) The nVidia driver reports:

[   17.028050] NVRM: This PCI I/O region assigned to your NVIDIA device is 
[   17.028050] NVRM: BAR1 is 0M @ 0x0 (PCI:0000:14:00.0)
[   17.028052] NVRM: The system BIOS may have misconfigured your GPU.
[   17.028056] nvidia: probe of 0000:14:00.0 failed with error -1
[   17.028241] NVRM: The NVIDIA probe routine failed for 1 device(s).
[   17.028243] NVRM: None of the NVIDIA graphics adapters were 
[   17.028244] [drm] Module unloaded
[   17.028314] NVRM: NVIDIA init module failed!
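
When comparing configurations with different numbers of cards, the 
'<ignored>' regions in lspci -vv output could be listed per device with 
something like the following. This is a minimal sketch, not a tested tool: 
the function name ignored_regions is hypothetical, the device header line 
in SAMPLE is an illustrative stand-in (only the Region lines are taken 
from the output quoted earlier), and it assumes lspci's usual layout of an 
unindented header line per device followed by indented detail lines.

```python
import re

# An indented "Region N: Memory at <addr>" or "Region N: I/O ports at <addr>" line.
REGION = re.compile(
    r"^\s+Region (?P<idx>\d+): (?:Memory|I/O ports) at (?P<addr>\S+)"
)


def ignored_regions(lspci_text):
    """Map each device slot to the region numbers reported as <ignored>."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        # An unindented, non-empty line starts a new device; its first
        # field is the slot, e.g. "14:00.0".
        if line and not line[0].isspace():
            current = line.split()[0]
            continue
        m = REGION.match(line)
        if m and current is not None and m["addr"] == "<ignored>":
            result.setdefault(current, []).append(int(m["idx"]))
    return result


# The header line below is a made-up placeholder for illustration.
SAMPLE = """\
14:00.0 3D controller: (illustrative header line)
\tRegion 0: Memory at c1000000 (32-bit, non-prefetchable) [size=16M]
\tRegion 1: Memory at <ignored> (64-bit, prefetchable)
\tRegion 3: Memory at <ignored> (64-bit, prefetchable)
"""

if __name__ == "__main__":
    for slot, regions in ignored_regions(SAMPLE).items():
        print(f"{slot}: regions {regions} have no address assigned")
```

A device that appears in the result has at least one BAR without an 
address, which matches the cards the driver refuses to probe here.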

I've updated the BIOS of the C6100 to the latest available version, 1.71, 
but the problem remains. If I fit three of the older M2090 GPU cards 
instead of the K20 cards, all three work: the BARs are assigned and the 
nVidia driver loads.

This looks like a C6100 BIOS issue. The system has 24 GB of memory fitted 
and is currently running Ubuntu 14.04, but the problem also occurs with 
other versions of Ubuntu and with CentOS, SuSE, etc. The same problem 
occurs on the other C6100 blade fitted to this chassis.

Does anyone have any suggestions? Is this a known limitation of the 
C6100/K20 combination?


