Upgrade issue: missing PATHs to EMC LUNs

Ujjwal_Rajbhandari at Dell.com
Mon Mar 17 14:53:17 CDT 2008


Hi,

 

From the dmesg output, it looks like two drivers are being loaded for
your QLA card.

The two drivers are looking for two different cards, QLA2100 and QLA2200.
If one of the drivers is removed, all of the available paths should be
registered with the kernel. Then, when PowerPath is run, all the available
paths per device should be active and available.
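One way to see which QLogic modules the node is being asked to load is to scan modprobe.conf for alias lines that name a qla* module. The following is only a rough sketch: the sample file contents and the helper name qla_aliases are hypothetical, since the actual modprobe.conf was not posted in this thread.

```python
import re

# Hypothetical modprobe.conf excerpt; the real file is not shown in this thread.
SAMPLE_CONF = """\
alias eth0 e1000
alias scsi_hostadapter megaraid_mbox
alias scsi_hostadapter1 qla2100
alias scsi_hostadapter2 qla2200
alias scsi_hostadapter3 qla2xxx
"""

# Matches lines of the form: alias <name> <module>, where module starts with "qla".
ALIAS_RE = re.compile(r"^\s*alias\s+(\S+)\s+(qla\S+)\s*$")


def qla_aliases(conf_text):
    """Return (alias, module) pairs for every alias line naming a qla* module."""
    pairs = []
    for line in conf_text.splitlines():
        match = ALIAS_RE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


if __name__ == "__main__":
    for alias, module in qla_aliases(SAMPLE_CONF):
        print(f"{alias} -> {module}")
```

If more than one card-specific module shows up for a single HBA model, that is a hint about which entry to review before rebooting.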

 

Ujjwal

 

From: oraclesolutions-bounces at lists.us.dell.com
[mailto:oraclesolutions-bounces at lists.us.dell.com] On Behalf Of
Poovathoor, Thomas K
Sent: Wednesday, March 12, 2008 12:31 PM
To: oraclesolutions at lists.us.dell.com
Cc: Hardy, Kate
Subject: FW: Upgrade issue: missing PATHs to EMC LUNs 

 

Hopefully we can get an answer today on the issue we are facing, so that
we can meet today's deadline for upgrading the RAC nodes one at a time.

 

________________________________

From: Poovathoor, Thomas K 
Sent: Wednesday, March 12, 2008 11:16 AM
To: 'Dustin_Yeilding at Dell.com'
Cc: 'Kate_Hardy at Dell.com'
Subject: Upgrade issue: 

 

Issue: We had multiple issues after one of the RAC nodes was upgraded.
Please see the notes below; the missing paths to the EMC LUNs (#2) after
the U5 upgrade are still not resolved.

 

1. We have found the resolution to one of the problems, CRS not starting
up correctly. We initially suspected CRS/ASM itself, but it turned out
that modprobe.conf was not upgraded properly: the file was replaced
without the critical bond0 entries, and a number of other entries were
also missing. This may be a Linux upgrade issue, but it would be
beneficial to mention it in the Dell document so that people are aware of
this pitfall. We wasted a day on this!!
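A quick way to catch this kind of loss is to compare a pre-upgrade copy of modprobe.conf with the post-upgrade file and list the entries that disappeared. This is a minimal sketch, assuming a backup copy was kept; the bond0 option values and the other sample lines are hypothetical placeholders, not taken from our servers.

```python
def missing_entries(old_text, new_text):
    """Return non-comment lines found in old_text but absent from new_text.

    Whitespace is normalized so that spacing differences are not reported.
    The original order of the old file is preserved.
    """
    def entries(text):
        result = []
        for line in text.splitlines():
            normalized = " ".join(line.split())
            if normalized and not normalized.startswith("#"):
                result.append(normalized)
        return result

    new_set = set(entries(new_text))
    return [entry for entry in entries(old_text) if entry not in new_set]


# Hypothetical file contents, for illustration only.
OLD = """\
alias eth0 e1000
alias bond0 bonding
options bond0 miimon=100 mode=1
alias scsi_hostadapter qla2xxx
"""
NEW = """\
alias eth0 e1000
alias scsi_hostadapter qla2xxx
"""

if __name__ == "__main__":
    for entry in missing_entries(OLD, NEW):
        print("missing after upgrade:", entry)
```

In practice the two strings would be read from the backup and from /etc/modprobe.conf before CRS is started.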

 

2. Another problem we are seeing post-upgrade is that PowerPath is
showing only one path to the LUNs. We see the following in the dmesg
output; however, another server recently upgraded to the same versions
did not encounter any of these problems. (This first node was rebooted
after the modprobe.conf file was replaced in the step above.)

 

...found this in dmesg:

qla2100: disagrees about version of symbol qla2x00_remove_one

qla2100: Unknown symbol qla2x00_remove_one

qla2100: disagrees about version of symbol qla2x00_probe_one

qla2100: Unknown symbol qla2x00_probe_one

qla2200: disagrees about version of symbol qla2x00_remove_one

qla2200: Unknown symbol qla2x00_remove_one

qla2200: disagrees about version of symbol qla2x00_probe_one

qla2200: Unknown symbol qla2x00_probe_one

QLogic Fibre Channel HBA Driver
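To summarize messages like the ones above, the dmesg text can be grouped by module and symbol. The sketch below only parses the two message forms quoted in this mail; the helper name unresolved_symbols is hypothetical. A "disagrees about version of symbol" message generally means the module was built against a different version of the code that exports that symbol, so the grouping shows which modules could not be loaded.

```python
import re
from collections import defaultdict

# The messages quoted in this mail.
DMESG = """\
qla2100: disagrees about version of symbol qla2x00_remove_one
qla2100: Unknown symbol qla2x00_remove_one
qla2100: disagrees about version of symbol qla2x00_probe_one
qla2100: Unknown symbol qla2x00_probe_one
qla2200: disagrees about version of symbol qla2x00_remove_one
qla2200: Unknown symbol qla2x00_remove_one
qla2200: disagrees about version of symbol qla2x00_probe_one
qla2200: Unknown symbol qla2x00_probe_one
QLogic Fibre Channel HBA Driver
"""

# Captures the module name and the symbol it failed to resolve.
PATTERN = re.compile(
    r"^(\w+): (?:disagrees about version of symbol|Unknown symbol) (\w+)"
)


def unresolved_symbols(text):
    """Map each module name to the sorted list of symbols it could not resolve."""
    found = defaultdict(set)
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            found[match.group(1)].add(match.group(2))
    return {module: sorted(symbols) for module, symbols in found.items()}


if __name__ == "__main__":
    for module, symbols in unresolved_symbols(DMESG).items():
        print(module, "->", ", ".join(symbols))
```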

 

Output from upgraded and non-upgraded RAC nodes: 

 

1. Upgraded machine:

 

Pseudo name=emcpowera
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600A2DA5E0D2B8DDA11 [LUN 4]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdf       SP A1     active  alive      0      0

 

Pseudo name=emcpowerb
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600B6958B1B2B8DDA11 [LUN 6]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdd       SP A1     active  alive      0      0

 

 

2. Machine that is not yet upgraded:

 

Pseudo name=emcpowera
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600A2DA5E0D2B8DDA11 [LUN 4]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdf       SP A1     active  alive      0      0
   1 qla2xxx                   sdk       SP B0     active  alive      0      0
   2 qla2xxx                   sdp       SP A0     active  alive      0      0
   2 qla2xxx                   sdu       SP B1     active  alive      0      0

 

Pseudo name=emcpowerb
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600B6958B1B2B8DDA11 [LUN 6]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdd       SP A1     active  alive      0      0
   1 qla2xxx                   sdi       SP B0     active  alive      0      0
   2 qla2xxx                   sdn       SP A0     active  alive      0      0
   2 qla2xxx                   sds       SP B1     active  alive      0      0
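The comparison between the two nodes can be automated by counting the path rows under each pseudo device in the PowerPath listings above. The sketch below is only an illustration: the helper names are hypothetical, the samples are trimmed copies of the emcpowera listings from this mail, and the expected count of four paths is an assumption taken from the node that is not yet upgraded (two HBAs, each seeing both SPs).

```python
import re

# Trimmed emcpowera listing from the upgraded node.
UPGRADED = """\
Pseudo name=emcpowera
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
   1 qla2xxx                   sdf       SP A1     active  alive      0      0
"""

# Trimmed emcpowera listing from the node that is not yet upgraded.
NOT_UPGRADED = """\
Pseudo name=emcpowera
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
   1 qla2xxx                   sdf       SP A1     active  alive      0      0
   1 qla2xxx                   sdk       SP B0     active  alive      0      0
   2 qla2xxx                   sdp       SP A0     active  alive      0      0
   2 qla2xxx                   sdu       SP B1     active  alive      0      0
"""

PSEUDO_RE = re.compile(r"^Pseudo name=(\S+)")
# HBA number, HW path, sd device, SP port, mode, state.
PATH_RE = re.compile(r"^\s*\d+\s+\S+\s+(sd[a-z]+)\s+(SP [AB]\d+)\s+(\S+)\s+(\S+)")


def paths_per_device(text):
    """Map each pseudo device to its list of (sd device, SP port) path rows."""
    devices = {}
    current = None
    for line in text.splitlines():
        match = PSEUDO_RE.match(line)
        if match:
            current = match.group(1)
            devices[current] = []
            continue
        match = PATH_RE.match(line)
        if match and current is not None:
            devices[current].append((match.group(1), match.group(2)))
    return devices


def degraded(devices, expected=4):
    """Return the pseudo devices that have fewer than `expected` paths."""
    return sorted(name for name, paths in devices.items() if len(paths) < expected)


if __name__ == "__main__":
    print("upgraded node, degraded:", degraded(paths_per_device(UPGRADED)))
    print("other node, degraded:", degraded(paths_per_device(NOT_UPGRADED)))
```

A check like this could be run on the first node after reconnecting the FC cable and before taking the second node down.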

 

 

 

Can someone please help us resolve the second issue? We are holding off
on the second node upgrade and on production until we fix the first
node's issues.

 

Appreciate everyone's diligent effort and willingness to help out. 

 

 

Our Upgrade Path: From Linux 4 U1 and Oracle 10.2.0.1 to Linux 4 U5
and Oracle 10.2.0.3

 

 

Ref Document Used:
http://www.dell.com/downloads/global/solutions/Migration_from_Elbrus_2.0_to_3.0_Final_Web_%20Ready.pdf

 

 

Detailed Steps (already discussed with Oracle development and the TAM,
but listed here for background information):

 

       Our goal is to minimize downtime and limit risk/issues with the
upgrade. There is a common interest in performing the upgrade on one node
at a time so that the service won't be completely unavailable to the
users all at once (we are doing this now with the later production
upgrade in mind).

 

      This is what we are mapping out to do: 

 

      1.    Shut down DB, ASM, CRS, PowerPath, etc. on the first node,
TSTORA1 (while TSTSORA2 is fully operational and running).

      2.    Unplug the FC cable on the first node only.

      3.    Upgrade the OS to Update 5 on the first node only.

      4.    Upgrade PowerPath and naviagent (if needed), again on the first
node alone at this time. Reconnect the FC cable on the first node.

      5.    Once the above steps are completed, the DBAs will check and
bring up the DB on the first node with the old version of Oracle and
verify that it joined the cluster correctly. Then we will be ready to
bring down the second server's DB and repeat steps 1-4 there.

      6.    Ensure the second server's DBs are down and repeat the above
steps.

      7.    Once steps 1-5 are completed on the second node, the Oracle
upgrade can begin.

      8.    Bring down the DB, ASM, and CRS on both servers and upgrade
Oracle.

      9.    Bring both servers' CRS services and DBs back online in the
proper way and test.
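The ordering in the steps above can be written down as a simple checklist generator, which makes it easier to confirm that only one node is down at any point before the Oracle upgrade itself. This is just a sketch of the sequence described in this mail; the function name rolling_plan and the step wording are illustrative and do not come from any tool.

```python
# Per-node steps (steps 1-5 above), run on one node at a time.
NODE_STEPS = [
    "shut down DB, ASM, CRS and PowerPath",
    "unplug the FC cable",
    "upgrade the OS to Update 5",
    "upgrade PowerPath and naviagent if needed, then reconnect the FC cable",
    "start the DB on the old Oracle version and verify it rejoins the cluster",
]

# Cluster-wide steps (steps 8-9 above), run only after every node is done.
CLUSTER_STEPS = [
    "bring down DB, ASM and CRS on all nodes and upgrade Oracle",
    "bring CRS and the DB back online on all nodes and test",
]


def rolling_plan(nodes):
    """Return an ordered list of (target, step) pairs for a one-node-at-a-time upgrade."""
    plan = []
    for node in nodes:
        for step in NODE_STEPS:
            plan.append((node, step))
    for step in CLUSTER_STEPS:
        plan.append(("all", step))
    return plan


if __name__ == "__main__":
    for number, (target, step) in enumerate(rolling_plan(["TSTORA1", "TSTSORA2"]), 1):
        print(f"{number:2d}. [{target}] {step}")
```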

 

      Let me know your thoughts on this. 

 

Tom Poovathoor
