Exadata Critical Patch for 11.2.2.3.x through 11.2.2.4.1

December 29, 2011

Oracle has released a critical patch for Exadata storage server versions 11.2.2.3.x through 11.2.2.4.1.  While 11.2.2.4.1 was released last week, a few one-off patches from 11.2.2.4.0 didn't seem to make it into that release.  Oracle has since released 11.2.2.4.2 (patch #13513611, supplemental note #1388400.1).  Like 11.2.2.4.1, this release patches several outstanding issues.  Here's the list of bugs fixed, from the readme for 11.2.2.4.2:

12764521        INFINIBAND DIAG COMMANDS (LIKE IBDIAGNET AND IBNETDISCOVER) ARE NOT WORKING
13083530        10 GB-E BONDED INTERFACES FAILING- EXADATA
13410353        AFTER UPGRADE TO 11.2.2.4 INFINIBAND CMDS IBDIAGNET, IBNETDISCOVER NOT WORKING
13489032        CHECKHWNFWPROFILE DOES NOT DETECT FAILED FLASH FDOM
13489445        ORA-600 [OSSMISC:OSSMISC_TIMER] WHEN NTPD DETECTED 6 MILLISECOND TIME DIFFERENCE
13512932        FIX INSTALLED WORKAROUND FOR NTP UPDATE BUG 13489445

As you can see, the previously mentioned bugs have been fixed.  Another bug, fixed in 11.2.2.4.1, could be an issue for anybody running 11.2.2.3.x through 11.2.2.4.0: bug 13454147 can cause a cell with an uptime of 6 months or more to lose its flash cache.  Fortunately, Oracle has released a critical patch (13517481) that covers these issues in case you can't quickly upgrade to 11.2.2.4.2.  I wouldn't advise running 11.2.2.4.2 for at least a couple of weeks anyway; I always advise clients to wait that long for the early adopters to weed out any major issues.
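If you want a quick read on which cells are exposed to bug 13454147, uptime is the thing to look at. Here is a rough sketch that takes the output of dcli running uptime and prints any host at or past a threshold. The function name flag_long_uptime, the cell_group file, and the 180-day default (my approximation of six months) are assumptions for illustration, not something taken from the patch readme.

```shell
# Print "host days" for every host whose uptime is at or above a threshold.
# Reads "host: <uptime output>" lines, as dcli prints them, on stdin.
# $1 = threshold in days (defaults to 180, an assumed stand-in for six months).
flag_long_uptime() {
  threshold="${1:-180}"
  awk -v t="$threshold" '
    {
      host = $1
      sub(/:$/, "", host)          # strip the trailing colon dcli adds
      days = 0                     # hosts up for less than a day stay at 0
      for (i = 2; i < NF; i++)
        if ($i == "up" && ($(i + 2) == "days," || $(i + 2) == "day,"))
          days = $(i + 1)
      if (days + 0 >= t) print host, days
    }'
}

# Example (cell_group is an assumed dcli group file listing only the cells):
# dcli -l root -g cell_group uptime | flag_long_uptime 180
```

Picking a threshold well below six months leaves some room to schedule the fix before a cell gets close to the limit.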

Applying the critical patch only takes a minute, and doesn't take the storage servers or database instances offline.  After it's done, a restart of cellsrv needs to be scheduled, but that can be done in a rolling fashion.  Read on for an example of applying this patch.  As always, do not apply any patch to a production system before appropriately testing against a non-production system!

According to the documentation for patch 13517481, the following bugs are fixed:

Bug       Description
--------  -------------------------------------------------------------------
13454147  Flash cards go offline after 6 months of uptime
12886507  IDT switch in PCI riser resets causing missing flash cards
12626126  Temporary IO stall caused by drive medium errors causes cell reboot
13489445  CELLSRV crash if NTPD interrupted and time drifts back too far
13083530  10GbE network interfaces shutdown

The installation is very quick and easy.  Copy the patch zip file to a staging directory on the first database server (I used /u01/stage/patches/11.2.2.4.1_supplemental), unzip it, and cd into the 13517481 directory that it creates.

[enkdb01:root] /root
> cd /u01/stage/patches/11.2.2.4.1_supplemental/

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental
> ls
p13517481_112100_Linux-x86-64.zip

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental
> unzip p13517481_112100_Linux-x86-64.zip
Archive:  p13517481_112100_Linux-x86-64.zip
   creating: 13517481/
  inflating: 13517481/fixpciidt_12886507
  inflating: 13517481/10gig_rxusecs0
  inflating: 13517481/README.txt
  inflating: 13517481/install.sh
  inflating: 13517481/fix_flash_links.sh  

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental
> cd 13517481/

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental/13517481
> ls
10gig_rxusecs0  fix_flash_links.sh  fixpciidt_12886507  install.sh  README.txt

Next, copy the all_group file (the dcli group file that lists every database server and storage cell) from root's home directory into the patch directory, and verify that SSH equivalence works. If every host responds, run the patch check:

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental/13517481
> cp ~/all_group .

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental/13517481
> dcli -l root -g all_group hostname
enkdb01: enkdb01.enkitec.com
enkdb02: enkdb02.enkitec.com
enkcel01: enkcel01.enkitec.com
enkcel02: enkcel02.enkitec.com
enkcel03: enkcel03.enkitec.com

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental/13517481
> ./install.sh -g all_group check

Additional details in patch13517481.log

Perform check using dcli on all systems in all_group: enkdb01 enkdb02 enkcel01 enkcel02 enkcel03 

Completed check on all systems.

Screen output captured from all systems and placed in patch13517481.log

Log files from all systems collected and placed in the current directory /u01/stage/patches/11.2.2.4.1_supplemental/13517481
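The dcli hostname test above is easy to eyeball on a quarter rack, but on a larger system a missing host is easy to overlook. As a small sketch, the function below compares the hosts in the group file against the hosts that actually answered. The name missing_hosts is made up for this post, and it assumes one hostname per line in the group file and the "host: output" layout shown above.

```shell
# List hosts named in a dcli group file that did not answer a dcli command.
# $1 = group file (one host per line), stdin = dcli output ("host: ..." lines).
missing_hosts() {
  # Hosts that produced at least one line of output
  answered=$(awk -F: '{ print $1 }' | sort -u)
  # Walk the group file, skipping blank lines, and report the silent hosts
  grep -v '^[[:space:]]*$' "$1" | sort -u | while read -r host; do
    printf '%s\n' "$answered" | grep -qxF "$host" || echo "$host"
  done
}

# Example:
# dcli -l root -g all_group hostname | missing_hosts all_group
```

No output means every host in the group file responded.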

Check the logs that were created (patch13517481.log holds the summary, and there is also a log for each server). If everything passes, apply the patch. Note that the patch applies only the fixes that each server needs, based on its type. On the V2 Exadata shown here, it does not apply the 10GbE fix, because V2 systems do not have 10GbE capability.

[enkdb01:root] /u01/stage/patches/11.2.2.4.1_supplemental/13517481
> ./install.sh -g all_group apply

Additional details in patch13517481.log

Perform apply using dcli on all systems in all_group: enkdb01 enkdb02 enkcel01 enkcel02 enkcel03 

Completed apply on all systems.

Screen output captured from all systems and placed in patch13517481.log

Log files from all systems collected and placed in the current directory /u01/stage/patches/11.2.2.4.1_supplemental/13517481

Each run appends to the log, so you can check it to make sure that the fixes were applied successfully. Here are the relevant contents of our patch13517481.log file:

2011-12-29 07:13:22 CST main: ====================================================================
2011-12-29 07:20:49 CST main: Running ./install.sh with options ACTION=apply, BUGFIX=ALL, GROUPFILE=all_group
2011-12-29 07:20:49 CST main: Perform apply using dcli on all systems in all_group: enkdb01 enkdb02 enkcel01 enkcel02 enkcel03
2011-12-29 07:20:49 CST main: Verifying SSH setup and free space for all systems
2011-12-29 07:20:49 CST main: SSH validation for enkdb01 passed
2011-12-29 07:20:49 CST main: Free space validation for enkdb01 passed
2011-12-29 07:20:49 CST main: SSH validation for enkdb02 passed
2011-12-29 07:20:49 CST main: Free space validation for enkdb02 passed
2011-12-29 07:20:50 CST main: SSH validation for enkcel01 passed
2011-12-29 07:20:50 CST main: Free space validation for enkcel01 passed
2011-12-29 07:20:50 CST main: SSH validation for enkcel02 passed
2011-12-29 07:20:50 CST main: Free space validation for enkcel02 passed
2011-12-29 07:20:50 CST main: SSH validation for enkcel03 passed
2011-12-29 07:20:50 CST main: Free space validation for enkcel03 passed
2011-12-29 07:20:50 CST main: Create working directory /tmp/patch13517481_122911072049 on all systems
2011-12-29 07:20:50 CST main: Distribute patch files to all systems
2011-12-29 07:20:51 CST main: Execute './install.sh -b ALL apply' on all systems
2011-12-29 07:20:54 CST main: Completed apply on all systems.
2011-12-29 07:20:54 CST main: Screen output captured from all systems and placed in patch13517481.log
2011-12-29 07:20:54 CST main: enkdb01:
2011-12-29 07:20:54 CST main: enkdb01: Additional details in /tmp/patch13517481_122911072049/patch13517481.log
2011-12-29 07:20:54 CST main: enkdb01:
2011-12-29 07:20:54 CST main: enkdb01: Fix for bug 13454147 (NoFlash six months) - apply
2011-12-29 07:20:54 CST main: enkdb01: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb01:
2011-12-29 07:20:54 CST main: enkdb01: Fix for bug 12886507 (IDT switch reset) - apply
2011-12-29 07:20:54 CST main: enkdb01: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb01:
2011-12-29 07:20:54 CST main: enkdb01: Fix for bug 13489445 (NTPD CELLSRV crash) - apply
2011-12-29 07:20:54 CST main: enkdb01: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb01:
2011-12-29 07:20:54 CST main: enkdb01: Fix for bug 13083530 (10GbE shutdown) - apply
2011-12-29 07:20:54 CST main: enkdb01: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb01:
2011-12-29 07:20:54 CST main: enkdb01: Fix for bug 12626126 (IO stall cell reboot) - apply
2011-12-29 07:20:54 CST main: enkdb01: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb01:
2011-12-29 07:20:54 CST main: enkdb02:
2011-12-29 07:20:54 CST main: enkdb02: Additional details in /tmp/patch13517481_122911072049/patch13517481.log
2011-12-29 07:20:54 CST main: enkdb02:
2011-12-29 07:20:54 CST main: enkdb02: Fix for bug 13454147 (NoFlash six months) - apply
2011-12-29 07:20:54 CST main: enkdb02: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb02:
2011-12-29 07:20:54 CST main: enkdb02: Fix for bug 12886507 (IDT switch reset) - apply
2011-12-29 07:20:54 CST main: enkdb02: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb02:
2011-12-29 07:20:54 CST main: enkdb02: Fix for bug 13489445 (NTPD CELLSRV crash) - apply
2011-12-29 07:20:54 CST main: enkdb02: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb02:
2011-12-29 07:20:54 CST main: enkdb02: Fix for bug 13083530 (10GbE shutdown) - apply
2011-12-29 07:20:54 CST main: enkdb02: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb02:
2011-12-29 07:20:54 CST main: enkdb02: Fix for bug 12626126 (IO stall cell reboot) - apply
2011-12-29 07:20:54 CST main: enkdb02: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkdb02:
2011-12-29 07:20:54 CST main: enkcel01:
2011-12-29 07:20:54 CST main: enkcel01: Additional details in /tmp/patch13517481_122911072049/patch13517481.log
2011-12-29 07:20:54 CST main: enkcel01:
2011-12-29 07:20:54 CST main: enkcel01: Fix for bug 13454147 (NoFlash six months) - apply
2011-12-29 07:20:54 CST main: enkcel01: Fix for bug 13454147 - apply SUCCESS
2011-12-29 07:20:54 CST main: enkcel01:
2011-12-29 07:20:54 CST main: enkcel01: Fix for bug 12886507 (IDT switch reset) - apply
2011-12-29 07:20:54 CST main: enkcel01: Fix not needed for Exadata version 11.2.2.4.0.110929 - no action taken
2011-12-29 07:20:54 CST main: enkcel01:
2011-12-29 07:20:54 CST main: enkcel01: Fix for bug 13489445 (NTPD CELLSRV crash) - apply
2011-12-29 07:20:54 CST main: enkcel01: ACTION REQUIRED - Fix for bug 13489445 applied and will become active at next CELLSRV restart
2011-12-29 07:20:54 CST main: enkcel01: Fix for bug 13489445 - apply SUCCESS
2011-12-29 07:20:54 CST main: enkcel01:
2011-12-29 07:20:54 CST main: enkcel01: Fix for bug 13083530 (10GbE shutdown) - apply
2011-12-29 07:20:54 CST main: enkcel01: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkcel01:
2011-12-29 07:20:54 CST main: enkcel01: Fix for bug 12626126 (IO stall cell reboot) - apply
2011-12-29 07:20:54 CST main: enkcel01: Fix not needed for Exadata version 11.2.2.4.0.110929
2011-12-29 07:20:54 CST main: enkcel01:
2011-12-29 07:20:54 CST main: enkcel02:
2011-12-29 07:20:54 CST main: enkcel02: Additional details in /tmp/patch13517481_122911072049/patch13517481.log
2011-12-29 07:20:54 CST main: enkcel02:
2011-12-29 07:20:54 CST main: enkcel02: Fix for bug 13454147 (NoFlash six months) - apply
2011-12-29 07:20:54 CST main: enkcel02: Fix for bug 13454147 - apply SUCCESS
2011-12-29 07:20:54 CST main: enkcel02:
2011-12-29 07:20:54 CST main: enkcel02: Fix for bug 12886507 (IDT switch reset) - apply
2011-12-29 07:20:54 CST main: enkcel02: Fix not needed for Exadata version 11.2.2.4.0.110929 - no action taken
2011-12-29 07:20:54 CST main: enkcel02:
2011-12-29 07:20:54 CST main: enkcel02: Fix for bug 13489445 (NTPD CELLSRV crash) - apply
2011-12-29 07:20:54 CST main: enkcel02: ACTION REQUIRED - Fix for bug 13489445 applied and will become active at next CELLSRV restart
2011-12-29 07:20:54 CST main: enkcel02: Fix for bug 13489445 - apply SUCCESS
2011-12-29 07:20:54 CST main: enkcel02:
2011-12-29 07:20:54 CST main: enkcel02: Fix for bug 13083530 (10GbE shutdown) - apply
2011-12-29 07:20:54 CST main: enkcel02: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkcel02:
2011-12-29 07:20:54 CST main: enkcel02: Fix for bug 12626126 (IO stall cell reboot) - apply
2011-12-29 07:20:54 CST main: enkcel02: Fix not needed for Exadata version 11.2.2.4.0.110929
2011-12-29 07:20:54 CST main: enkcel02:
2011-12-29 07:20:54 CST main: enkcel03:
2011-12-29 07:20:54 CST main: enkcel03: Additional details in /tmp/patch13517481_122911072049/patch13517481.log
2011-12-29 07:20:54 CST main: enkcel03:
2011-12-29 07:20:54 CST main: enkcel03: Fix for bug 13454147 (NoFlash six months) - apply
2011-12-29 07:20:54 CST main: enkcel03: Fix for bug 13454147 - apply SUCCESS
2011-12-29 07:20:54 CST main: enkcel03:
2011-12-29 07:20:54 CST main: enkcel03: Fix for bug 12886507 (IDT switch reset) - apply
2011-12-29 07:20:54 CST main: enkcel03: Fix not needed for Exadata version 11.2.2.4.0.110929 - no action taken
2011-12-29 07:20:54 CST main: enkcel03:
2011-12-29 07:20:54 CST main: enkcel03: Fix for bug 13489445 (NTPD CELLSRV crash) - apply
2011-12-29 07:20:54 CST main: enkcel03: ACTION REQUIRED - Fix for bug 13489445 applied and will become active at next CELLSRV restart
2011-12-29 07:20:54 CST main: enkcel03: Fix for bug 13489445 - apply SUCCESS
2011-12-29 07:20:54 CST main: enkcel03:
2011-12-29 07:20:54 CST main: enkcel03: Fix for bug 13083530 (10GbE shutdown) - apply
2011-12-29 07:20:54 CST main: enkcel03: Fix not applicable to this system
2011-12-29 07:20:54 CST main: enkcel03:
2011-12-29 07:20:54 CST main: enkcel03: Fix for bug 12626126 (IO stall cell reboot) - apply
2011-12-29 07:20:54 CST main: enkcel03: Fix not needed for Exadata version 11.2.2.4.0.110929
2011-12-29 07:20:54 CST main: enkcel03:
2011-12-29 07:20:54 CST main: Log files from all systems collected and placed in the current directory /u01/stage/patches/11.2.2.4.1_supplemental/13517481
2011-12-29 07:20:54 CST main: Log file enkdb01_patch13517481.log from enkdb01
2011-12-29 07:20:54 CST main: Log file enkdb02_patch13517481.log from enkdb02
2011-12-29 07:20:54 CST main: Log file enkcel01_patch13517481.log from enkcel01
2011-12-29 07:20:54 CST main: Log file enkcel02_patch13517481.log from enkcel02
2011-12-29 07:20:54 CST main: Log file enkcel03_patch13517481.log from enkcel03
2011-12-29 07:20:54 CST main: Remove working directory /tmp/patch13517481_122911072049 on all systems
2011-12-29 07:20:54 CST main: Exiting
2011-12-29 07:20:54 CST main: ====================================================================
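Reading a log like this line by line gets old on a full rack. Here is a minimal sketch that boils it down to a count of successful fixes per host plus any follow-up items. The function name summarize_patch_log is my own, and the parsing assumes the exact line layout shown above (timestamp, time zone, "main:", then the host name). Because the log is appended on every run, the counts cover every apply recorded in the file.

```shell
# Summarize an install.sh log: successful fixes per host, then any
# "ACTION REQUIRED" lines. $1 = path to patch13517481.log
summarize_patch_log() {
  log="$1"
  echo "Fixes applied per host:"
  # Field 5 is the host name followed by a colon
  grep 'apply SUCCESS' "$log" |
    awk '{ sub(/:$/, "", $5); n[$5]++ } END { for (h in n) print "  " h, n[h] }' |
    sort
  echo "Follow-up needed:"
  # Drop the timestamp prefix, keep "host: ACTION REQUIRED ..."
  grep 'ACTION REQUIRED' "$log" | sed 's/^.* main: /  /'
}

# Example:
# summarize_patch_log patch13517481.log
```

Hosts where no fix applied, such as the database servers here, simply do not show up in the first list.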

Note that the NTP fix will not take effect until the next cellsrv restart.  This is another critical bug, so do not forget to bounce cellsrv sometime after applying the patch.  Remember that cellsrv bounces can be done in a rolling fashion, but should probably be scheduled for a window when system activity is low.
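For the rolling restart, here is a sketch of one way to do it, one cell at a time. It is not taken from the patch readme. It uses the CellCLI commands "list griddisk attributes name,asmdeactivationoutcome" and "alter cell restartServices cellsrv", and it refuses to touch a cell unless every grid disk reports Yes for asmdeactivationoutcome. The function names and the CELL_SSH hook (there so the loop can be tried without real cells) are hypothetical. Test it on a non-production system first.

```shell
# Print grid disks that ASM could not safely lose right now.
# Reads "name asmdeactivationoutcome" rows (cellcli output) on stdin.
unsafe_griddisks() {
  awk 'NF >= 2 && $2 != "Yes" { print $1 }'
}

# Restart cellsrv on each cell named on the command line, in order.
# Stops at the first cell that is unreachable or reports an unsafe grid disk.
rolling_cellsrv_restart() {
  ssh_cmd=${CELL_SSH:-ssh -n -l root}   # CELL_SSH is a hypothetical test hook
  for cell in "$@"; do
    out=$($ssh_cmd "$cell" \
      'cellcli -e "list griddisk attributes name,asmdeactivationoutcome"') || return 1
    bad=$(printf '%s\n' "$out" | unsafe_griddisks)
    if [ -n "$bad" ]; then
      echo "$cell: not safe to restart, check grid disks: $bad" >&2
      return 1
    fi
    $ssh_cmd "$cell" 'cellcli -e "alter cell restartServices cellsrv"' || return 1
    echo "$cell: cellsrv restarted"
  done
}

# Example:
# rolling_cellsrv_restart enkcel01 enkcel02 enkcel03
```

The grid disk check on the next cell also acts as a gate: if the disks on the cell that was just restarted are not back online yet, the loop stops instead of pressing on, and you can rerun it for the remaining cells once ASM has caught up.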
