When version 11.2.3.2.1 of the Exadata Storage Server software was released, I was a little excited. There were numerous one-off patches for the previous release, 11.2.3.2.0, which was the first version to support the Exadata X3, write-back flash cache, UEK on the X#-2 systems, etc. With that many large changes introduced in one version, bugs in the .0 release were to be expected. Fortunately, Oracle was quick to fix many of those issues, but the result was several separate patches to update the cellsrv software.
I was working with a colleague last week when we were ready to apply this patch to a customer's Exadata system. Everything went off without a hitch, upgrading from 11.2.2.4.2 straight to 11.2.3.2.1. We even applied the patch to the customer's quarter rack in rolling mode, which took under 6 hours to complete. After everything was back up and running, we took an archive log backup using RMAN. For this customer, we back everything up to NFS because the backups won't fit within the FRA, and they don't want to leave backups inside the production system. We were greeted with a strange error when we tried to kick off the backup job in RMAN:
RMAN> run {
2> ALLOCATE CHANNEL DISK1 DEVICE TYPE DISK;
3> BACKUP DATABASE FORMAT '/mnt/nfs/actest_%U';
4> RELEASE CHANNEL DISK1;
5> }
using target database control file instead of recovery catalog
allocated channel: DISK1
channel DISK1: SID=397 instance=ACTEST1 device type=DISK
Starting backup at 13-02-28 21:38
channel DISK1: starting full datafile backup set
channel DISK1: specifying datafile(s) in backup set
input datafile file number=00007 name=+DATA/actest/datafile/tanel_bigfile.325.808412931
input datafile file number=00006 name=+DATA/actest/datafile/ts_data.380.779860027
input datafile file number=00001 name=+DATA/actest/datafile/system.367.779029515
input datafile file number=00002 name=+DATA/actest/datafile/sysaux.368.779029555
input datafile file number=00003 name=+DATA/actest/datafile/undotbs1.369.779029595
input datafile file number=00004 name=+DATA/actest/datafile/undotbs2.371.779029649
input datafile file number=00005 name=+DATA/actest/datafile/users.372.779029687
channel DISK1: starting piece 1 at 13-02-28 21:38
released channel: DISK1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on DISK1 channel at 02/28/2013 21:38:37
ORA-19504: failed to create file "/mnt/nfs/actest_1jo34pas_1_1"
ORA-27044: unable to write the header block of file
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 3
It didn't matter what we were backing up, only that the destination was NFS. This backup job had worked fine prior to the patch (we took a backup immediately before the maintenance window), but we had applied both a database bundle patch (this database was 11.2.0.2) and the latest storage server patch (11.2.3.2.1), which updates the Linux OS to OEL 5.8 and introduces the Oracle Unbreakable Enterprise Kernel into the mix.
We checked the mount options to make sure that everything was ok, and saw that it was:
[enkdb01:oracle:ACTEST1] /u01/app/oracle/product/11.2.0.3/dbhome_2/rdbms/lib
> mount | grep "/mnt/nfs"
192.168.12.22:/export/nfs on /mnt/nfs type nfs (rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,nfsvers=3,timeo=600,actimeo=0,addr=192.168.12.22)
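If you want to script that check, here is a minimal sketch. It assumes the option list from the mount output above is the one you want to enforce (it is simply what this system uses), and check_nfs_opts is a name I made up for illustration. Note that some kernels report vers=3 rather than nfsvers=3, so adjust the list to match what your mount command prints.

```shell
# Hypothetical helper: verify that one line of `mount` output carries every
# expected NFS option. The expected list mirrors the options shown above.
check_nfs_opts() {
    line=$1
    # Extract the comma-separated option list between the parentheses
    opts=$(printf '%s\n' "$line" | sed -n 's/.*(\(.*\)).*/\1/p')
    missing=""
    for want in rw bg hard nointr rsize=32768 wsize=32768 tcp nfsvers=3 timeo=600 actimeo=0; do
        # Wrap in commas so that each option must match exactly
        case ",$opts," in
            *",$want,"*) ;;
            *) missing="$missing $want" ;;
        esac
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Something like check_nfs_opts "$(mount | grep '/mnt/nfs')" would run it against the live mount.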
After poking around a bit, we opened a service request, which was answered pretty quickly by Oracle support. It turns out that there is a known bug with the NFS driver included in the version of the UEK packaged with 11.2.3.2.1. Oracle provided 3 possible fixes, which I'll detail below. The fixes were:
- Enable direct NFS
- Switch to the non-UEK version of the kernel included in 11.2.3.2.1
- Apply a kernel patch using ksplice
Enable direct NFS
First, I'll cover enabling direct NFS. I actually have a blog post in the works, but to give you a quick once-over on it, here goes. Direct NFS is an NFS client built into the Oracle database software that works independently of the kernel's NFS driver. Any operations that go through the database (RMAN, Data Pump, etc.) will use this client, which is optimized for database I/O. Processes that use direct NFS talk to the NFS server from user space rather than going through the kernel's NFS client. Because direct NFS does not use the buggy kernel NFS driver for backup operations, the bug is avoided. It goes without saying that if your databases are interacting with NFS, you should use direct NFS. There is no penalty for doing so, and it's really easy to do. Shut down your database instance, and relink for direct NFS:
[enkdb01:oracle:ACTEST1] /home/oracle
> cd $ORACLE_HOME/rdbms/lib
[enkdb01:oracle:ACTEST1] /u01/app/oracle/product/11.2.0.3/dbhome_2/rdbms/lib
> make -f ins_rdbms.mk dnfs_on
rm -f /u01/app/oracle/product/11.2.0.3/dbhome_2/lib/libodm11.so; cp /u01/app/oracle/product/11.2.0.3/dbhome_2/lib/libnfsodm11.so /u01/app/oracle/product/11.2.0.3/dbhome_2/lib/libodm11.so
That's it. If you see the following line in your alert log on instance startup, you're on your way:
Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0
Once you start to access the NFS devices, you can query v$dnfs_servers to ensure that the database is using direct NFS.
SYS:ACTEST1> select * from v$dnfs_servers;
         ID SVRNAME        DIRNAME            MNTPORT     NFSPORT       WTMAX       RTMAX
----------- -------------- -------------- ----------- ----------- ----------- -----------
          1 192.168.12.22  /export/nfs            939        2049           0           0
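For a file-level cross-check that doesn't need a running instance, the make output above suggests a simple test: dnfs_on copies libnfsodm11.so over libodm11.so, so the two files should be identical afterwards. The sketch below assumes exactly that and nothing more; dnfs_linked is a hypothetical name, and the library file names are taken from the 11.2 output shown earlier.

```shell
# Hypothetical helper: succeed if the ODM library in an Oracle home is a
# byte-for-byte copy of the direct NFS ODM library (what `make dnfs_on`
# produced in the output above).
dnfs_linked() {
    home=$1
    # cmp -s is silent; non-zero if the files differ or one is missing
    cmp -s "$home/lib/libodm11.so" "$home/lib/libnfsodm11.so"
}
```

For example: dnfs_linked "$ORACLE_HOME" && echo "direct NFS library in place".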
Switch to non-UEK
The next option provided by Oracle support was to switch from the Unbreakable Enterprise Kernel included with 11.2.3.2.1 to the Red Hat compatible kernel. To do this, follow the instructions in the readme for the patch (patch #14522699).
First, remove the problematic UEK packages:
[enkdb01:root] /root
> yum remove exadata-sun-computenode kernel-uek-devel kernel-uek-debuginfo kernel-uek-doc kernel-uek-debuginfo-common kernel-uek-headers
Loaded plugins: security
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package exadata-sun-computenode.x86_64 0:11.2.3.2.1.130109-1 set to be erased
---> Package kernel-uek-debuginfo.x86_64 0:2.6.32-400.11.1.el5uek set to be erased
---> Package kernel-uek-debuginfo-common.x86_64 0:2.6.32-400.11.1.el5uek set to be erased
---> Package kernel-uek-devel.x86_64 0:2.6.32-400.11.1.el5uek set to be erased
---> Package kernel-uek-doc.noarch 0:2.6.32-400.11.1.el5uek set to be erased
---> Package kernel-uek-headers.x86_64 0:2.6.32-400.11.1.el5uek set to be erased
--> Processing Dependency: kernel-headers for package: glibc-headers
--> Processing Dependency: kernel-headers >= 2.2.1 for package: glibc-headers
--> Running transaction check
---> Package glibc-headers.x86_64 0:2.5-81.el5_8.7 set to be erased
--> Processing Dependency: glibc-headers for package: glibc-devel
--> Processing Dependency: glibc-headers = 2.5-81.el5_8.7 for package: glibc-devel
--> Processing Dependency: glibc-headers for package: glibc-devel
--> Processing Dependency: glibc-headers = 2.5-81.el5_8.7 for package: glibc-devel
--> Running transaction check
---> Package glibc-devel.i386 0:2.5-81.el5_8.7 set to be erased
--> Processing Dependency: glibc-devel >= 2.2.90-12 for package: gcc
--> Processing Dependency: glibc-devel >= 2.2.90-12 for package: compat-gcc-34
---> Package glibc-devel.x86_64 0:2.5-81.el5_8.7 set to be erased
--> Running transaction check
---> Package compat-gcc-34.x86_64 0:3.4.6-4 set to be erased
--> Processing Dependency: compat-gcc-34 = 3.4.6-4 for package: compat-gcc-34-c++
---> Package gcc.x86_64 0:4.1.2-52.el5_8.1 set to be erased
--> Processing Dependency: gcc = 4.1.2-52.el5_8.1 for package: gcc-c++
--> Running transaction check
---> Package compat-gcc-34-c++.x86_64 0:3.4.6-4 set to be erased
---> Package gcc-c++.x86_64 0:4.1.2-52.el5_8.1 set to be erased
--> Finished Dependency Resolution
Dependencies Resolved
========================================================================================================================================================
Package Arch Version Repository Size
========================================================================================================================================================
Removing:
exadata-sun-computenode x86_64 11.2.3.2.1.130109-1 installed 95 k
kernel-uek-debuginfo x86_64 2.6.32-400.11.1.el5uek installed 1.0 G
kernel-uek-debuginfo-common x86_64 2.6.32-400.11.1.el5uek installed 163 M
kernel-uek-devel x86_64 2.6.32-400.11.1.el5uek installed 45 M
kernel-uek-doc noarch 2.6.32-400.11.1.el5uek installed 33 M
kernel-uek-headers x86_64 2.6.32-400.11.1.el5uek installed 2.2 M
Removing for dependencies:
compat-gcc-34 x86_64 3.4.6-4 installed 12 M
compat-gcc-34-c++ x86_64 3.4.6-4 installed 84 M
gcc x86_64 4.1.2-52.el5_8.1 installed 9.9 M
gcc-c++ x86_64 4.1.2-52.el5_8.1 installed 7.5 M
glibc-devel i386 2.5-81.el5_8.7 installed 4.9 M
glibc-devel x86_64 2.5-81.el5_8.7 installed 7.0 M
glibc-headers x86_64 2.5-81.el5_8.7 installed 1.9 M
Transaction Summary
========================================================================================================================================================
Remove 13 Package(s)
Reinstall 0 Package(s)
Downgrade 0 Package(s)
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Erasing : kernel-uek-debuginfo 1/13
Erasing : kernel-uek-debuginfo-common 2/13
Erasing : gcc 3/13
Erasing : glibc-headers 4/13
Erasing : compat-gcc-34 5/13
Erasing : glibc-devel 6/13
Erasing : glibc-devel 7/13
Erasing : compat-gcc-34-c++ 8/13
Erasing : kernel-uek-headers 9/13
Erasing : gcc-c++ 10/13
Erasing : kernel-uek-devel 11/13
Erasing : exadata-sun-computenode 12/13
Erasing : kernel-uek-doc 13/13
Removed:
exadata-sun-computenode.x86_64 0:11.2.3.2.1.130109-1 kernel-uek-debuginfo.x86_64 0:2.6.32-400.11.1.el5uek
kernel-uek-debuginfo-common.x86_64 0:2.6.32-400.11.1.el5uek kernel-uek-devel.x86_64 0:2.6.32-400.11.1.el5uek
kernel-uek-doc.noarch 0:2.6.32-400.11.1.el5uek kernel-uek-headers.x86_64 0:2.6.32-400.11.1.el5uek
Dependency Removed:
compat-gcc-34.x86_64 0:3.4.6-4 compat-gcc-34-c++.x86_64 0:3.4.6-4 gcc.x86_64 0:4.1.2-52.el5_8.1 gcc-c++.x86_64 0:4.1.2-52.el5_8.1
glibc-devel.i386 0:2.5-81.el5_8.7 glibc-devel.x86_64 0:2.5-81.el5_8.7 glibc-headers.x86_64 0:2.5-81.el5_8.7
Complete!
Next, modify the exclude line in /etc/yum.conf to include kernel-uek-headers:
exclude=up2date kernel-uek-headers
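If you have to repeat this on several compute nodes, the edit can be scripted. This is only a sketch: add_yum_exclude is a made-up name, it assumes an exclude= line already exists (as it did here), and it relies on GNU sed's -i option. It keeps a .bak copy and does nothing if the package is already listed.

```shell
# Hypothetical helper: append a package to the existing exclude= line of a
# yum configuration file, without adding it twice.
add_yum_exclude() {
    conf=$1
    pkg=$2
    # Split the current exclude list into one name per line and look for
    # an exact match; if found, the file is left untouched.
    if grep -E '^exclude=' "$conf" | tr ' ' '\n' | sed 's/^exclude=//' | grep -qxF "$pkg"; then
        return 0
    fi
    if grep -q '^exclude=' "$conf"; then
        cp -p "$conf" "$conf.bak"
        sed -i "s/^exclude=.*/& $pkg/" "$conf"
    else
        echo "no exclude= line found in $conf; edit it by hand" >&2
        return 1
    fi
}
```

On the node this would be called as add_yum_exclude /etc/yum.conf kernel-uek-headers.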
Stop and disable CRS on the node:
[enkdb01:root] /root
> crsctl stop crs
[enkdb01:root] /root
> crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.
Install the exadata-sun-computenode-non-uek package from the yum repository.
[enkdb01:root] /root
> yum --enablerepo=exadata_dbserver_11.2.3.2.1_x86_64_base install exadata-sun-computenode-non-uek
Loaded plugins: security
exadata_dbserver_11.2.3.2.1_x86_64_base | 1.9 kB 00:00
Excluding Packages in global exclude list
Finished
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package exadata-sun-computenode-non-uek.x86_64 0:11.2.3.2.1.130109-1 set to be updated
exadata_dbserver_11.2.3.2.1_x86_64_base/filelists_db | 680 kB 00:00
--> Processing Dependency: gcc = 4.1.1-52.el5_8 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: glibc-kernheaders = 3.0-46 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: kernel-headers = 2.6.18-308.24.1.0.1.el5 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: gcc = 4.1.2-52.el5_8.1 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: ofa-2.6.18-308.24.1.0.1.el5 = 1.5.1-4.0.58 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: glibc-devel = 2.5-81.el5_8.7 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: glibc-headers = 2.5-81.el5_8.7 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: gcc-c++ = 4.1.1-52.el5_8 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: gcc-c++ = 4.1.2-52.el5_8.1 for package: exadata-sun-computenode-non-uek
--> Processing Dependency: megaraid_sas-2.6.18-308.24.1.0.1.el5 = v00.00.06.12-ORCL.2 for package: exadata-sun-computenode-non-uek
--> Running transaction check
---> Package gcc.x86_64 0:4.1.2-52.el5_8.1 set to be updated
---> Package gcc-c++.x86_64 0:4.1.2-52.el5_8.1 set to be updated
---> Package glibc-devel.x86_64 0:2.5-81.el5_8.7 set to be updated
---> Package glibc-headers.x86_64 0:2.5-81.el5_8.7 set to be updated
---> Package kernel-headers.x86_64 0:2.6.18-308.24.1.0.1.el5 set to be updated
---> Package megaraid_sas-2.6.18-308.24.1.0.1.el5.x86_64 0:v00.00.06.12-ORCL.2 set to be updated
---> Package ofa-2.6.18-308.24.1.0.1.el5.x86_64 0:1.5.1-4.0.58 set to be updated
--> Finished Dependency Resolution
Dependencies Resolved
========================================================================================================================================================
Package Arch Version Repository Size
========================================================================================================================================================
Installing:
exadata-sun-computenode-non-uek x86_64 11.2.3.2.1.130109-1 exadata_dbserver_11.2.3.2.1_x86_64_base 47 k
Installing for dependencies:
gcc x86_64 4.1.2-52.el5_8.1 exadata_dbserver_11.2.3.2.1_x86_64_base 5.3 M
gcc-c++ x86_64 4.1.2-52.el5_8.1 exadata_dbserver_11.2.3.2.1_x86_64_base 3.8 M
glibc-devel x86_64 2.5-81.el5_8.7 exadata_dbserver_11.2.3.2.1_x86_64_base 2.4 M
glibc-headers x86_64 2.5-81.el5_8.7 exadata_dbserver_11.2.3.2.1_x86_64_base 597 k
kernel-headers x86_64 2.6.18-308.24.1.0.1.el5 exadata_dbserver_11.2.3.2.1_x86_64_base 1.4 M
megaraid_sas-2.6.18-308.24.1.0.1.el5 x86_64 v00.00.06.12-ORCL.2 exadata_dbserver_11.2.3.2.1_x86_64_base 287 k
ofa-2.6.18-308.24.1.0.1.el5 x86_64 1.5.1-4.0.58 exadata_dbserver_11.2.3.2.1_x86_64_base 846 k
Transaction Summary
========================================================================================================================================================
Install 8 Package(s)
Upgrade 0 Package(s)
Total download size: 15 M
Is this ok [y/N]: y
Downloading Packages:
--------------------------------------------------------------------------------------------------------------------------------------------------------
Total 4.5 GB/s | 15 MB 00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : kernel-headers 1/8
Installing : glibc-headers 2/8
Installing : glibc-devel 3/8
Installing : ofa-2.6.18-308.24.1.0.1.el5 4/8
Installing : megaraid_sas-2.6.18-308.24.1.0.1.el5 5/8
Current initrd saved in /boot/initrd-2.6.18-308.24.1.0.1.el5.img.bak.066-2200.11.2.3.2.1.130109
initrd image created: /boot/initrd-2.6.18-308.24.1.0.1.el5.img
Installing : gcc 6/8
Installing : gcc-c++ 7/8
Installing : exadata-sun-computenode-non-uek 8/8
Installed:
exadata-sun-computenode-non-uek.x86_64 0:11.2.3.2.1.130109-1
Dependency Installed:
gcc.x86_64 0:4.1.2-52.el5_8.1 gcc-c++.x86_64 0:4.1.2-52.el5_8.1
glibc-devel.x86_64 0:2.5-81.el5_8.7 glibc-headers.x86_64 0:2.5-81.el5_8.7
kernel-headers.x86_64 0:2.6.18-308.24.1.0.1.el5 megaraid_sas-2.6.18-308.24.1.0.1.el5.x86_64 0:v00.00.06.12-ORCL.2
ofa-2.6.18-308.24.1.0.1.el5.x86_64 0:1.5.1-4.0.58
Complete!
[enkdb01:root] /root
>
Remote broadcast message (Thu Mar 7 22:00:43 2013):
Exadata post install steps started.
It may take up to 2 minutes.
The db node will be rebooted upon successful completion.
This will cause the compute node to reboot. When it has come up, check to make sure that the server booted to the non-UEK version of the kernel:
[enkdb01:root] /root
> uname -a
Linux enkdb01.enkitec.com 2.6.18-308.24.1.0.1.el5 #1 SMP Tue Dec 4 16:00:29 PST 2012 x86_64 x86_64 x86_64 GNU/Linux
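The same check is easy to script if you want to confirm it on every node. A minimal sketch, assuming (as in the kernel versions shown in this post) that UEK release strings contain "uek" and the Red Hat compatible ones do not; is_uek is a name I picked for illustration.

```shell
# Hypothetical helper: succeed if a kernel release string looks like a UEK
# kernel, based purely on the "uek" marker seen in the versions above.
is_uek() {
    case "$1" in
        *uek*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example: is_uek "$(uname -r)" && echo "still running UEK".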
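Remove the unused UEK RPMs, and install the 32-bit glibc-devel RPM (glibc-devel.i386). It was removed as a dependency in the first step, and the non-UEK package only pulls the x86_64 version back in.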
[enkdb01:root] /root
> yum remove kernel-uek kernel-uek-firmware
Loaded plugins: security
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package kernel-uek.x86_64 0:2.6.32-300.7.2.el5uek set to be erased
--> Processing Dependency: kernel-uek = 2.6.32-300.7.2.el5uek for package: megaraid_sas-2.6.32-300.7.2.el5uek
--> Processing Dependency: kernel-uek = 2.6.32-300.7.2.el5uek for package: ofa-2.6.32-300.7.2.el5uek
---> Package kernel-uek.x86_64 0:2.6.32-300.7.3.el5uek set to be erased
--> Processing Dependency: kernel-uek = 2.6.32-300.7.3.el5uek for package: ofa-2.6.32-300.7.3.el5uek
--> Processing Dependency: kernel-uek = 2.6.32-300.7.3.el5uek for package: megaraid_sas-2.6.32-300.7.3.el5uek
---> Package kernel-uek.x86_64 0:2.6.32-400.1.1.el5uek set to be erased
--> Processing Dependency: kernel-uek = 2.6.32-400.1.1.el5uek for package: ofa-2.6.32-400.1.1.el5uek
---> Package kernel-uek.x86_64 0:2.6.32-400.11.1.el5uek set to be erased
--> Processing Dependency: kernel-uek = 2.6.32-400.11.1.el5uek for package: ofa-2.6.32-400.11.1.el5uek
---> Package kernel-uek-firmware.noarch 0:2.6.32-400.11.1.el5uek set to be erased
--> Running transaction check
---> Package megaraid_sas-2.6.32-300.7.2.el5uek.x86_64 0:v00.00.06.12-ORCL.2 set to be erased
---> Package megaraid_sas-2.6.32-300.7.3.el5uek.x86_64 0:v00.00.06.12-ORCL.2 set to be erased
---> Package ofa-2.6.32-300.7.2.el5uek.x86_64 0:1.5.1-4.0.58 set to be erased
---> Package ofa-2.6.32-300.7.3.el5uek.x86_64 0:1.5.1-4.0.58 set to be erased
---> Package ofa-2.6.32-400.1.1.el5uek.x86_64 0:1.5.1-4.0.58 set to be erased
---> Package ofa-2.6.32-400.11.1.el5uek.x86_64 0:1.5.1-4.0.58 set to be erased
--> Finished Dependency Resolution
Dependencies Resolved
==========================================================================================================================================================================
Package Arch Version Repository Size
==========================================================================================================================================================================
Removing:
kernel-uek x86_64 2.6.32-300.7.2.el5uek installed 81 M
kernel-uek x86_64 2.6.32-300.7.3.el5uek installed 81 M
kernel-uek x86_64 2.6.32-400.1.1.el5uek installed 82 M
kernel-uek x86_64 2.6.32-400.11.1.el5uek installed 82 M
kernel-uek-firmware noarch 2.6.32-400.11.1.el5uek installed 5.3 M
Removing for dependencies:
megaraid_sas-2.6.32-300.7.2.el5uek x86_64 v00.00.06.12-ORCL.2 installed 1.1 M
megaraid_sas-2.6.32-300.7.3.el5uek x86_64 v00.00.06.12-ORCL.2 installed 1.1 M
ofa-2.6.32-300.7.2.el5uek x86_64 1.5.1-4.0.58 installed 3.4 M
ofa-2.6.32-300.7.3.el5uek x86_64 1.5.1-4.0.58 installed 3.4 M
ofa-2.6.32-400.1.1.el5uek x86_64 1.5.1-4.0.58 installed 3.4 M
ofa-2.6.32-400.11.1.el5uek x86_64 1.5.1-4.0.58 installed 3.4 M
Transaction Summary
==========================================================================================================================================================================
Remove 11 Package(s)
Reinstall 0 Package(s)
Downgrade 0 Package(s)
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Erasing : kernel-uek 1/11
Erasing : ofa-2.6.32-400.1.1.el5uek 2/11
Erasing : kernel-uek 3/11
Erasing : ofa-2.6.32-300.7.2.el5uek 4/11
Megaraid path valid. Proceeding.
Erasing : megaraid_sas-2.6.32-300.7.3.el5uek 5/11
Erasing : ofa-2.6.32-300.7.3.el5uek 6/11
Erasing : ofa-2.6.32-400.11.1.el5uek 7/11
Erasing : kernel-uek 8/11
Erasing : kernel-uek-firmware 9/11
Megaraid path valid. Proceeding.
Warning. /lib/modules/2.6.32-300.7.2.el5uek/kernel/drivers/scsi/megaraid/megaraid_sas.ko does not match file from the uninstalling rpm
Warning. Leave megaraid_sas driver as is
+ exit 0
Erasing : megaraid_sas-2.6.32-300.7.2.el5uek 10/11
Erasing : kernel-uek 11/11
Removed:
kernel-uek.x86_64 0:2.6.32-300.7.2.el5uek kernel-uek.x86_64 0:2.6.32-300.7.3.el5uek kernel-uek.x86_64 0:2.6.32-400.1.1.el5uek
kernel-uek.x86_64 0:2.6.32-400.11.1.el5uek kernel-uek-firmware.noarch 0:2.6.32-400.11.1.el5uek
Dependency Removed:
megaraid_sas-2.6.32-300.7.2.el5uek.x86_64 0:v00.00.06.12-ORCL.2 megaraid_sas-2.6.32-300.7.3.el5uek.x86_64 0:v00.00.06.12-ORCL.2
ofa-2.6.32-300.7.2.el5uek.x86_64 0:1.5.1-4.0.58 ofa-2.6.32-300.7.3.el5uek.x86_64 0:1.5.1-4.0.58
ofa-2.6.32-400.1.1.el5uek.x86_64 0:1.5.1-4.0.58 ofa-2.6.32-400.11.1.el5uek.x86_64 0:1.5.1-4.0.58
Complete!
[enkdb01:root] /root
> yum --enablerepo=exadata_dbserver_11.2.3.2.1_x86_64_base install glibc-devel.i386
Loaded plugins: security
Excluding Packages in global exclude list
Finished
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package glibc-devel.i386 0:2.5-81.el5_8.7 set to be updated
--> Finished Dependency Resolution
Dependencies Resolved
==========================================================================================================================================================================
Package Arch Version Repository Size
==========================================================================================================================================================================
Installing:
glibc-devel i386 2.5-81.el5_8.7 exadata_dbserver_11.2.3.2.1_x86_64_base 2.1 M
Transaction Summary
==========================================================================================================================================================================
Install 1 Package(s)
Upgrade 0 Package(s)
Total download size: 2.1 M
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : glibc-devel 1/1
Installed:
glibc-devel.i386 0:2.5-81.el5_8.7
Complete!
Finally, relink the Oracle homes, re-enable and start CRS, and try backing up your database.
[enkdb01:root] /root
> crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
[enkdb01:root] /root
> crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
Ksplice on the UEK
The last option is the most interesting, in my opinion: applying an online kernel patch using ksplice. If you're not familiar with ksplice, it is a magical little piece of software acquired by Oracle that allows administrators to apply kernel patches to the kernel running in memory, eliminating the need for a reboot (generally). Because ksplice isn't fully supported on Exadata, we have a few limitations on what can be done, but this is an interesting glimpse into what may be coming in future releases. This patch (provided by Oracle Support) is a single ksplice patch for this bug. To install, unzip the patch, copy the patch file to the uptrack directory, and apply the patch. All of the following must be done as root.
First, make sure that there aren't any patches installed:
[enkdb01:root] /tmp/ksplice_patch/uptrack
> ./ksplice-view
[enkdb01:root] /tmp/ksplice_patch/uptrack
>
Next, apply the patch - this takes about 3 seconds to apply:
[enkdb01:root] /tmp/ksplice_patch/uptrack
> ./ksplice-apply ksplice-9j3o3021.tar.gz
Done!
Check that the patch has applied successfully:
[enkdb01:root] /tmp/ksplice_patch/uptrack
> ./ksplice-view
9j3o3021: Ksplice test suite: oracle-uek5-2.6.32-400.11.1.el5uek-amd64
That's all there is to it. This will need to be applied on both nodes, and since it's not a full installation of ksplice, the patch will have to be reapplied upon each reboot of the compute node. If you really want to, you could create a script to apply the patch when the server boots up. On full installations of ksplice, the kernel will be "respliced" upon each reboot.
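Here is a rough sketch of what such a boot-time script could look like. It only uses the two commands shown above (ksplice-view and ksplice-apply from the patch's uptrack directory) and assumes, based on the output above, that ksplice-view prints one line per applied patch starting with the patch ID and a colon. The function name is made up, and how you hook it into the boot sequence (rc.local or otherwise) is up to you; I haven't tested this against anything but the behavior shown in this post.

```shell
# Hypothetical helper: apply a ksplice patch tarball unless ksplice-view
# already lists its ID. Both tools are run from the patch directory, the
# same way they were run by hand above.
reapply_ksplice() {
    dir=$1       # uptrack directory from the patch, e.g. /tmp/ksplice_patch/uptrack
    tarball=$2   # e.g. ksplice-9j3o3021.tar.gz
    # Derive the patch ID from the file name: ksplice-<id>.tar.gz
    id=${tarball#ksplice-}
    id=${id%.tar.gz}
    if (cd "$dir" && ./ksplice-view) | grep -q "^$id:"; then
        echo "patch $id already applied"
        return 0
    fi
    (cd "$dir" && ./ksplice-apply "$tarball")
}
```

Called as root, for instance: reapply_ksplice /tmp/ksplice_patch/uptrack ksplice-9j3o3021.tar.gz. Keep the patch somewhere more permanent than /tmp if you plan to run it at every boot.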
Hi Andy,
Really helpful blog, I don’t think you are going to be alone in hitting this one!
I take it the uptrack command comes with the patch, rather than native on the Exadata? Did you have to get the patch specially from Oracle support, or is it generally available? i.e. do you have a patch number?
cheers,
jason.
Yes, the uptrack command is not the standard ksplice uptrack command. It is included in the patch provided by Oracle. Unfortunately, there’s no full support of ksplice yet.
I installed the exact same patch last week. The ksplice patching went indeed really easy.
I have also been looking into using our local yum repo server for the ksplice offline client, but had to abandon it due to a lack of disk space (and time).
Have you done some investigations into it?
Also, I have been investigating backup performance, and it seems that without dNFS you really need to enable async I/O via the filesystemio_options parameter to get some real performance (but I had to hand the environment back before I could do some formal comparison testing).
Dnfs is using async without having to specify it in the filesystemio_options parameter.
I’m just starting to play around with ksplice. I’ve seen some demos of it lately, and it’s pretty cool.
Nice find, I just ran into the same issue this week at a customer's site. I always thought that you still needed a separate license for using ksplice?
Because every Exadata has “Premier Support for Systems,” it includes licenses for OEL, which covers the ksplice license (I think). I hate answering licensing questions, but here’s my source – https://oss.oracle.com/ksplice/docs/ksplice-quickstart.pdf
Ah thanks, ksplice is awesome 🙂
There is a patch; see note ID 1532488.1 on Metalink.
Solution
1. Install patch 16432033: EXADATA COMPUTE NODE 11.2.3.2.1 BASE REPOSITORY ISO WITH FIXES
This repository ISO image contains only the Oracle Exadata release 11.2 Latest channel which includes 11.2.3.2.1 Base channel plus fixes for bugs 16263472, 16046497 and 15956690. If RPMs from additional channels are required, then they must be obtained via other methods. This patch supersedes the contents of the ISO contained in patch 15991297. Customers already on 11.2.3.2.1 should consider applying this updated patch. Refer to the patch README for more information.
Yes, Oracle has released a newer ISO file. The version of the software is the same, with the exception of the timestamp at the end. Here’s info from a server with the older 11.2.3.2.1 patch installed:
Kernel version: 2.6.32-400.11.1.el5uek #1 SMP Thu Nov 22 03:29:09 PST 2012 x86_64
Image version: 11.2.3.2.1.130109
Image activated: 2013-02-20 21:48:21 -0800
Image status: success
System partition on device: /dev/mapper/VGExaDb-LVDbSys1
Here’s info from a server with the later patch:
Kernel version: 2.6.32-400.21.1.el5uek #1 SMP Wed Feb 20 01:35:01 PST 2013 x86_64
Image version: 11.2.3.2.1.130302
Image activated: 2013-05-06 13:11:32 -0600
Image status: success
System partition on device: /dev/mapper/VGExaDb-LVDbSys1
Note the difference between the kernel versions (2.6.32-400.11.1 vs 2.6.32-400.21.1) and the timestamp on the image version (11.2.3.2.1.130109 vs 11.2.3.2.1.130302).
Thanks. I had already forgotten about the ksplice patch since we applied it a year ago. After searching through Google I found your post and remembered that we did the ksplice patch long ago. Thanks again.