As part of an ongoing project, I've been performing a fair number of upgrades to 19c on Exadata systems. Several of those systems are virtualized, running Oracle VM (based on Xen). I've previously mentioned Oracle's oedacli tool, which can be used to make upgrades easier, and it's been very useful once you are familiar with it.
Grid Infrastructure upgrades with oedacli are broken into several tasks that can be executed separately, which gives you the flexibility to schedule the tasks that actually bounce the cluster for a specific time. The three steps of an upgrade with oedacli are:
- ADD_HOME - validates the system for use with 19c, unpacks the gold image files, reconfigures the guest config files, and mounts the new home on the guests
- CONFIG_HOME - runs gridSetup.sh to configure the new home
- RUN_ROOTSCRIPT - executes rootupgrade.sh on each node and runs the config tools script
We began running oedacli to perform the upgrade on our clusters using the April 2020 OEDA release. Step 1 completed without any issues, but step 2 failed almost immediately with the following error:
    oedacli> deploy actions
    Deploying Action ID : 2 UPGRADE CLUSTER GIVERSION=19.6.0.0.200114 GIHOMELOC=/u01/app/19.0.0.0/grid WHERE CLUSTERNAME=exa1v1 STEPNAME=CONFIG_HOME
    Deploying UPGRADE CLUSTER
    Upgrading Cluster
    Configuring new clusterware home at /u01/app/19.0.0.0/grid
    Running Cluster Verification Utility for upgrade readiness..
    Relinking binaries with RDS /u01/app/19.0.0.0/grid
    ERROR: Command: ORACLE_HOME=/u01/app/19.0.0.0/grid; export ORACLE_HOME;cd /u01/app/19.0.0.0/grid/rdbms/lib;make -f ins_rdbms.mk rac_on;make -f ins_rdbms.mk ikfod;make -f ins_rdbms.mk ipc_rds ioracle ORACLE_HOME=/u01/app/19.0.0.0/grid; produced null output exa1db01v1.example.com with exit status 2
    bash: line 0: cd: /u01/app/19.0.0.0/grid/rdbms/lib: Permission denied
    make: ins_rdbms.mk: No such file or directory
    make: *** No rule to make target `ins_rdbms.mk'. Stop.
    make: ins_rdbms.mk: No such file or directory
    make: *** No rule to make target `ins_rdbms.mk'. Stop.
    make: ins_rdbms.mk: No such file or directory
    make: *** No rule to make target `ins_rdbms.mk'. Stop.
Well, that's not great. The good news is that it's pretty obviously a permissions error. I logged in to the VM as the oracle user and tried to change into the new home so I could test the relink command by hand, and got the same error:
    [oracle@exa1db01v1 ~]$ cd /u01/app/19.0.0.0/grid
    -bash: cd: /u01/app/19.0.0.0/grid: Permission denied
We can definitely see a permissions issue here. If I go back as root and check the directory, I can see that the files are there with proper ownership and access:
    [root@enkx4db01 ~]# ls -al /u01/app/19.0.0.0/grid
    total 312
    drwxr-xr-x 65 oracle oinstall 4096 May 31 15:10 .
    drwxr-xr-x 10 oracle oinstall 4096 May 31 15:09 ..
    drwxr-xr-x 2 oracle oinstall 4096 Apr 18 2019 addnode
    drwxr-xr-x 10 oracle oinstall 4096 Apr 17 2019 assistants
    drwxr-xr-x 2 oracle oinstall 12288 Apr 18 2019 bin
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 cha
    drwxr-xr-x 4 oracle oinstall 4096 Apr 18 2019 clone
    drwxr-xr-x 10 oracle oinstall 4096 Apr 18 2019 crs
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 css
    drwxr-xr-x 7 oracle oinstall 4096 Apr 17 2019 cv
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 dbjava
    drwxr-xr-x 2 oracle oinstall 4096 Apr 17 2019 dbs
    drwxr-xr-x 5 oracle oinstall 4096 Apr 18 2019 deinstall
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 demo
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 diagnostics
    drwxr-xr-x 13 oracle oinstall 4096 Apr 17 2019 dmu
    -rw-r--r-- 1 oracle oinstall 852 Aug 18 2015 env.ora
    drwxr-xr-x 6 oracle oinstall 4096 Apr 17 2019 evm
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 gpnp
    -rwxr-x--- 1 oracle oinstall 3294 Mar 8 2017 gridSetup.sh
    drwxr-xr-x 4 oracle oinstall 4096 Apr 17 2019 has
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 hs
    drwxr-xr-x 10 oracle oinstall 4096 Apr 18 2019 install
    drwxr-xr-x 2 oracle oinstall 4096 Apr 17 2019 instantclient
    drwxr-x--- 13 oracle oinstall 4096 Apr 18 2019 inventory
    drwxr-xr-x 8 oracle oinstall 4096 Apr 18 2019 javavm
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 jdbc
    drwxr-xr-x 6 oracle oinstall 4096 Apr 18 2019 jdk
    drwxr-xr-x 2 oracle oinstall 4096 Apr 17 2019 jlib
    drwxr-xr-x 10 oracle oinstall 4096 Apr 17 2019 ldap
    drwxr-xr-x 4 oracle oinstall 16384 Apr 18 2019 lib
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 md
    drwxr-xr-x 10 oracle oinstall 4096 Apr 17 2019 network
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 nls
    drwxr-x--- 14 oracle oinstall 4096 Apr 12 2019 OPatch
    drwxr-xr-x 3 oracle oinstall 4096 Apr 18 2019 .opatchauto_storage
    drwxr-xr-x 8 oracle oinstall 4096 Apr 17 2019 opmn
    drwxr-xr-x 4 oracle oinstall 4096 Apr 17 2019 oracore
    drwxr-xr-x 6 oracle oinstall 4096 Apr 17 2019 ord
    drwxr-xr-x 4 oracle oinstall 4096 Apr 17 2019 ords
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 oss
    drwxr-xr-x 8 oracle oinstall 4096 Apr 18 2019 oui
    drwxr-xr-x 4 oracle oinstall 4096 Apr 17 2019 owm
    drwxr-xr-x 7 oracle oinstall 4096 Apr 18 2019 .patch_storage
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 perl
    drwxr-xr-x 6 oracle oinstall 4096 Apr 17 2019 plsql
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 precomp
    drwxr-xr-x 2 oracle oinstall 4096 Apr 17 2019 QOpatch
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 qos
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 racg
    drwxr-xr-x 13 oracle oinstall 4096 Apr 18 2019 rdbms
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 relnotes
    drwxr-xr-x 7 oracle oinstall 4096 Apr 17 2019 rhp
    -rwx------ 1 oracle oinstall 405 Apr 18 2019 root.sh
    -rwx------ 1 oracle oinstall 490 Apr 17 2019 root.sh.old
    -rw-r----- 1 oracle oinstall 10 Apr 17 2019 root.sh.old.1
    -rwx------ 1 oracle oinstall 414 Apr 18 2019 rootupgrade.sh
    -rwxr-x--- 1 oracle oinstall 628 Sep 3 2015 runcluvfy.sh
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 sdk
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 slax
    drwxr-xr-x 4 oracle oinstall 4096 Apr 18 2019 sqlpatch
    drwxr-xr-x 6 oracle oinstall 4096 Apr 18 2019 sqlplus
    drwxr-xr-x 6 oracle oinstall 4096 Apr 17 2019 srvm
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 suptools
    drwxr-xr-x 4 oracle oinstall 4096 Apr 17 2019 tomcat
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 ucp
    drwxr-xr-x 7 oracle oinstall 4096 Apr 17 2019 usm
    drwxr-xr-x 2 oracle oinstall 4096 Apr 17 2019 utl
    -rw-r----- 1 oracle oinstall 500 Feb 6 2013 welcome.html
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 wlm
    drwxr-xr-x 3 oracle oinstall 4096 Apr 17 2019 wwg
    drwxr-xr-x 5 oracle oinstall 4096 Apr 17 2019 xag
    drwxr-x--- 6 oracle oinstall 4096 Apr 17 2019 xdk
The ownership is what I would expect (oracle:oinstall), because rootupgrade.sh hasn't been run to change any permissions yet. I checked the log file and could see that oedacli had logged in to the VMs as root and run "/bin/chown -R oracle:oinstall /u01/app/19.0.0.0/grid":
    [ RunCommand:218] Node exa1db01v1.example.com appears to be okay, going to run command /bin/chown -R oracle:oinstall /u01/app/19.0.0.0/grid
    [ RunCommand:531] ##EXEC## |/bin/chown -R oracle:oinstall /u01/app/19.0.0.0/grid|exa1db01v1.example.com|root|
    [ RunCommand:329] ##RUNC## |/bin/chown -R oracle:oinstall /u01/app/19.0.0.0/grid|exa1db01v1.example.com|root| New or not cached
    [ RunCommand:183] Ran commands, elapsed time = 2002 mS
    [ KommandOutput:104] ====== Output from node exa1db01v1.example.com ======
    [ KommandOutput:106] Command = exa1db01v1.example.com | root | /bin/chown -R oracle:oinstall /u01/app/19.0.0.0/grid
    [ KommandOutput:108] Ret code = <0> from node exa1db01v1.example.com
    [ KommandOutput:113] ## Output Start
    [ EsCommonUtils:1368] Command: /bin/chown -R oracle:oinstall /u01/app/19.0.0.0/grid produced null output but executed successfully on exa1db01v1.example.com
    [ KommandOutput:116] ====== End Output from node exa1db01v1.example.com Ret code0 ======
If that's the case, why can't the oracle user run the relink? The issue lies one directory up from the actual GI home. /u01/app/19.0.0.0 is owned by root:root with permissions of 750, so the oracle user can't traverse it to reach the home underneath:
    [root@exa1db01v1 ~]# ls -al /u01/app/
    total 72
    drwxr-xr-x 11 root oinstall 4096 May 22 19:22 .
    drwxr-xr-x 7 root oinstall 4096 Nov 16 2018 ..
    drwxr-xr-x 3 root oinstall 4096 Aug 9 2018 12.2.0.1
    drwxr-x--- 3 root root 4096 May 9 19:22 19.0.0.0 <---------permissions don't allow Oracle to access directory
    drwxrwxr-x 17 oracle oinstall 4096 Jan 13 10:55 oracle
    drwxrwx--- 6 oracle oinstall 4096 May 21 23:53 oraInventory
    [root@exa1db01v1 ~]# ls -al /u01/app/19.0.0.0/
    total 12
    drwxr-x--- 3 root root 4096 May 9 19:22 .
    drwxr-xr-x 11 root oinstall 4096 May 9 19:22 ..
    drwxr-xr-x 71 root root 4096 May 9 19:21 grid <---------permissions are ok
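The listing above was found by hand, but the same check can be scripted. What follows is a minimal sketch, assuming GNU coreutils `stat` (as found on Oracle Linux); the function name `list_blocked_dirs` is made up for illustration. It walks up from a path and prints every directory whose mode lacks the "other" execute (search) bit. That is a simplification: it ignores owner and group membership and any ACLs, so treat a hit as a hint rather than a verdict.

```shell
#!/bin/sh
# Hypothetical helper: list directories on the way to "$1" that users
# outside the owning user/group cannot traverse.
# Simplification: only the "other" permission bits are inspected.
list_blocked_dirs() {
  dir=$1
  while [ "$dir" != "/" ] && [ "$dir" != "." ]; do
    if [ -d "$dir" ]; then
      # %A prints the symbolic mode, e.g. drwxr-x---
      perms=$(stat -c '%A' "$dir")
      case $perms in
        *[xt]) ;;                    # last bit is x or t: others may traverse
        *) echo "$dir $perms" ;;     # otherwise report the directory
      esac
    fi
    dir=$(dirname "$dir")
  done
}

# Path taken from this post; prints nothing if every directory is traversable
# (or if the path does not exist on the machine running the sketch).
list_blocked_dirs /u01/app/19.0.0.0/grid
```

On the system described here, this would be expected to report /u01/app/19.0.0.0 with drwxr-x---. The `namei -l` command from util-linux gives a similar per-component view of a path without any scripting.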
On this system, the umask for the root user is set to 027, rather than the default of 022. These types of changes are fairly common on systems that require hardening, particularly systems that use the Exadata STIG scripts (MOS note #2181944.1) for hardening.
When the GI home directory was created, the command used was "mkdir -p /u01/app/19.0.0.0/grid", which created the missing parent directory as well. The parent's permissions followed the umask of the root user (777 masked by 027 leaves 750), which prevents non-root users from accessing it.
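The effect is easy to reproduce away from a real system. This is a small sketch, assuming GNU coreutils `stat` and using a throwaway directory from `mktemp` (the variable `demo` is a stand-in for /u01/app, not something from the original environment):

```shell
#!/bin/sh
# Sketch: show how mkdir -p under umask 027 creates a 750 parent directory.
# "demo" is a hypothetical scratch directory standing in for /u01/app.
demo=$(mktemp -d)

# Use a subshell so the umask change does not leak into the calling shell.
(
  umask 027
  mkdir -p "$demo/19.0.0.0/grid"
)

# The intermediate directory gets 777 masked by 027, i.e. 750.
parent_mode=$(stat -c '%a' "$demo/19.0.0.0")
echo "parent mode: $parent_mode"

rm -rf "$demo"
```

With the default umask of 022, the same command would leave the parent at 755, which is why this only tends to surface on hardened systems.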
Once the cause was identified, a simple "chmod 755 /u01/app/19.0.0.0" fixed the problem, and we were able to restart the GI upgrade at the CONFIG_HOME step.
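On other hardened systems, one possible way to sidestep the problem is to create the parent directory with an explicit mode before the home is laid down. I haven't tested this against oedacli itself, so take it as a sketch of the mkdir behaviour only; it assumes GNU coreutils, and `base` is a hypothetical scratch directory standing in for /u01/app:

```shell
#!/bin/sh
# Sketch: pre-create the parent with an explicit mode so that a later
# "mkdir -p" under a restrictive umask cannot leave it at 750.
# "base" is a hypothetical scratch directory standing in for /u01/app.
base=$(mktemp -d)

(
  umask 027
  # -m sets the mode explicitly (as chmod would), so the umask is not applied.
  mkdir -m 755 "$base/19.0.0.0"
  # The parent already exists, so -p leaves its mode untouched.
  mkdir -p "$base/19.0.0.0/grid"
)

fixed_parent=$(stat -c '%a' "$base/19.0.0.0")
echo "parent mode: $fixed_parent"

rm -rf "$base"
```

Note that combining the two as "mkdir -p -m 755 parent/child" would not help: -m applies only to the final directory, while any intermediate directories still follow the umask.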