FW: Upgrade issue: missing PATHs to EMC LUNs
Poovathoor, Thomas K
Poovathoor at uthscsa.edu
Wed Mar 12 12:31:23 CDT 2008
Hopefully we can get an answer on this issue today so that we can meet
today's deadline for upgrading the RAC nodes one at a time.
________________________________
From: Poovathoor, Thomas K
Sent: Wednesday, March 12, 2008 11:16 AM
To: 'Dustin_Yeilding at Dell.com'
Cc: 'Kate_Hardy at Dell.com'
Subject: Upgrade issue:
Issue: We had multiple issues after one of the RAC nodes was upgraded.
Please see the notes below; the missing paths to the EMC LUNs after the
U5 upgrade (#2) are still not resolved.
1. We have found the resolution to one of the problems, CRS not starting
up correctly. We initially suspected CRS/ASM itself, but it turned out
that modprobe.conf was not upgraded properly: the file was replaced
without the critical bond0 entries, and a number of other entries were
also missing. This may be a Linux upgrade issue, but it would be
beneficial to mention this potential problem in the Dell document so
people are aware of the pitfall. We wasted a day on this!!
2. Another problem we are seeing after the upgrade is that PowerPath
shows only one path to the LUNs. We see the following in dmesg; however,
another server recently upgraded to the same versions did not encounter
any of these problems. (This first node was rebooted after the
modprobe.conf file was replaced in the step above.)
...found this in dmesg:
qla2100: disagrees about version of symbol qla2x00_remove_one
qla2100: Unknown symbol qla2x00_remove_one
qla2100: disagrees about version of symbol qla2x00_probe_one
qla2100: Unknown symbol qla2x00_probe_one
qla2200: disagrees about version of symbol qla2x00_remove_one
qla2200: Unknown symbol qla2x00_remove_one
qla2200: disagrees about version of symbol qla2x00_probe_one
qla2200: Unknown symbol qla2x00_probe_one
QLogic Fibre Channel HBA Driver
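A minimal sketch of how the two symptoms above could be checked mechanically after an upgrade. It assumes a pre-upgrade copy of modprobe.conf was kept (the helper names, the sample bond0 lines, and the idea of a backup file are illustrative assumptions, not something taken from the Dell document) and that dmesg lines carry no timestamp prefix, as in the excerpt above.

```python
import re


def missing_config_lines(backup_text, current_text):
    """Return non-comment lines present in the backup but absent from the current file."""
    def significant(text):
        return [ln.strip() for ln in text.splitlines()
                if ln.strip() and not ln.strip().startswith("#")]
    current = set(significant(current_text))
    return [ln for ln in significant(backup_text) if ln not in current]


# Matches kernel messages such as
# "qla2100: disagrees about version of symbol qla2x00_remove_one"
MISMATCH = re.compile(r"^(\S+): disagrees about version of symbol (\S+)")


def symbol_mismatches(dmesg_text):
    """Map each module name to the sorted list of symbols it disagreed about."""
    found = {}
    for line in dmesg_text.splitlines():
        m = MISMATCH.match(line.strip())
        if m:
            found.setdefault(m.group(1), set()).add(m.group(2))
    return {mod: sorted(syms) for mod, syms in found.items()}


if __name__ == "__main__":
    # Hypothetical sample data: the real modprobe.conf contents were not posted.
    backup = "alias bond0 bonding\noptions bond0 miimon=100 mode=1\nalias eth0 e1000\n"
    current = "alias eth0 e1000\n"
    print(missing_config_lines(backup, current))

    dmesg = (
        "qla2100: disagrees about version of symbol qla2x00_remove_one\n"
        "qla2100: Unknown symbol qla2x00_remove_one\n"
        "qla2200: disagrees about version of symbol qla2x00_probe_one\n"
    )
    print(symbol_mismatches(dmesg))
```

In practice the inputs would be read from the saved and current /etc/modprobe.conf and from the output of dmesg. A non-empty result from either function suggests the node needs attention before it rejoins the cluster.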
Output from upgraded and non-upgraded RAC nodes:
1. Output from the upgraded machine:
Pseudo name=emcpowera
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600A2DA5E0D2B8DDA11 [LUN 4]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdf       SP A1     active  alive      0      0
Pseudo name=emcpowerb
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600B6958B1B2B8DDA11 [LUN 6]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdd       SP A1     active  alive      0      0
2. Output from the machine that is not yet upgraded:
Pseudo name=emcpowera
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600A2DA5E0D2B8DDA11 [LUN 4]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdf       SP A1     active  alive      0      0
   1 qla2xxx                   sdk       SP B0     active  alive      0      0
   2 qla2xxx                   sdp       SP A0     active  alive      0      0
   2 qla2xxx                   sdu       SP B1     active  alive      0      0
Pseudo name=emcpowerb
CLARiiON ID=APM00054602362 [10GRAC_TST]
Logical device ID=6006016058DE1600B6958B1B2B8DDA11 [LUN 6]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP A
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdd       SP A1     active  alive      0      0
   1 qla2xxx                   sdi       SP B0     active  alive      0      0
   2 qla2xxx                   sdn       SP A0     active  alive      0      0
   2 qla2xxx                   sds       SP B1     active  alive      0      0
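The difference between the two outputs above can be made explicit by parsing the path rows and comparing them per pseudo device. This is only a sketch: the function names are hypothetical, the regular expressions are written against the output as pasted here rather than any documented PowerPath format, and it assumes each pseudo name (emcpowera, emcpowerb) maps to the same LUN on both nodes, which is the case in these listings.

```python
import re

PSEUDO = re.compile(r"^Pseudo name=(\S+)")
# Path row, e.g. "1 qla2xxx sdf SP A1 active alive 0 0"
PATH = re.compile(r"^\s*\d+\s+\S+\s+(sd\w+)\s+SP\s+([AB]\d+)\s+(\w+)\s+(\w+)")


def paths_by_device(text):
    """Return {pseudo device: [(sd device, SP port, state), ...]}."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = PSEUDO.match(line)
        if m:
            current = m.group(1)
            devices[current] = []
            continue
        m = PATH.match(line)
        if m and current is not None:
            devices[current].append((m.group(1), "SP " + m.group(2), m.group(4)))
    return devices


def missing_paths(reference, candidate):
    """List SP ports seen on the reference node but not on the candidate, per device."""
    result = {}
    for dev, ref_paths in reference.items():
        seen = {port for _, port, _ in candidate.get(dev, [])}
        lost = [port for _, port, _ in ref_paths if port not in seen]
        if lost:
            result[dev] = lost
    return result


if __name__ == "__main__":
    # Path rows for emcpowera copied from the two listings above.
    upgraded = "Pseudo name=emcpowera\n1 qla2xxx sdf SP A1 active alive 0 0\n"
    not_upgraded = (
        "Pseudo name=emcpowera\n"
        "1 qla2xxx sdf SP A1 active alive 0 0\n"
        "1 qla2xxx sdk SP B0 active alive 0 0\n"
        "2 qla2xxx sdp SP A0 active alive 0 0\n"
        "2 qla2xxx sdu SP B1 active alive 0 0\n"
    )
    print(missing_paths(paths_by_device(not_upgraded), paths_by_device(upgraded)))
```

For the listings above this reports that the upgraded node has lost SP B0, SP A0 and SP B1 for each LUN, so only the SP A1 path on HBA 1 survives and nothing is visible through HBA 2.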
Can someone please help us resolve the second issue? We are holding off
on the second node's upgrade and on production until the first node's
issues are fixed.
We appreciate everyone's diligent effort and willingness to help out.
Our Upgrade Path: From Linux 4 U1 and Oracle 10.2.0.1 to Linux 4 U5
and Oracle 10.2.0.3
Ref Document Used:
http://www.dell.com/downloads/global/solutions/Migration_from_Elbrus_2.0_to_3.0_Final_Web_%20Ready.pdf
Detailed Steps (already discussed with Oracle development and the TAM,
but listed here for background information):
Our goal is to minimize downtime and limit the risk of issues with the
upgrade. We want to perform the upgrade one node at a time so that the
service is never completely unavailable to users (we are doing it this
way now with the later production upgrade in mind).
This is what we are mapping out to do:
1. Shut down DB, ASM, CRS, PowerPath, etc. on the first node, TSTORA1
(while TSTORA2 is fully operational and running).
2. Unplug the FC cable on the first node only.
3. Upgrade the OS to Update 5 on the first node only.
4. Upgrade PowerPath and naviagent (if needed), again on the first node
only at this time. Reconnect the FC cable on the first node.
5. Once the above steps are completed, the DBAs will check and bring up
the DB on the first node with the old version of Oracle and verify that
it joined the cluster fine, etc. Then we will be ready to bring down the
second server's DB etc. and repeat steps 1-4 there.
6. Ensure the second server's DBs are down and repeat the above steps.
7. Confirm that steps 1-5 are completed on the second node.
8. Now the Oracle upgrade begins: bring down the DB, ASM, and CRS on
both servers and upgrade.
9. Bring both servers' CRS services and DBs back online in the proper
way and test.
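The one-node-at-a-time part of the plan (steps 1-7) can be sketched as a small model that tracks which nodes are still serving at each point. The action strings are shorthand for the steps above, not real commands, and the second node name is assumed to be TSTORA2.

```python
# Node names from the plan; the second name is assumed to be TSTORA2.
NODES = ["TSTORA1", "TSTORA2"]

# Per-node OS phase (steps 1-5). These labels are shorthand, not commands.
OS_PHASE = [
    "stop DB/ASM/CRS/PowerPath",
    "unplug FC cable",
    "upgrade OS to Update 5",
    "upgrade PowerPath/naviagent if needed",
    "reconnect FC cable",
    "start DB on old Oracle version",
    "verify cluster membership",
]


def rolling_plan(nodes):
    """Yield (node, action, nodes still serving) for the rolling OS phase."""
    up = set(nodes)
    for node in nodes:
        for action in OS_PHASE:
            if action.startswith("stop"):
                up.discard(node)   # node leaves service for its upgrade
            elif action.startswith("start"):
                up.add(node)       # node rejoins before the next one goes down
            yield node, action, sorted(up)


if __name__ == "__main__":
    for node, action, serving in rolling_plan(NODES):
        # The point of the rolling approach: never zero serving nodes here.
        assert serving, "no node left serving during " + action
        print(node, "|", action, "| serving:", ", ".join(serving))
```

Under this model only the Oracle upgrade itself (step 8), which takes DB, ASM and CRS down on both servers, requires a full outage.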
Let me know your thoughts on this.
Tom Poovathoor