OEDA Virtualized Cluster Discovery With SSH Keys

May 7, 2020

As part of the Oracle Exadata Deployment Assistant ("OEDA"), Oracle includes a command line utility to read and modify the XML files used for deployment of an Exadata cluster.  Typical use cases are to install additional Oracle database versions, or to create multiple databases before deployment.  There are several additional features included for virtualized clusters, particularly the ability to simplify upgrading Grid Infrastructure.

In many cases, the original XML used for the deployment is still available, and that's all you need to complete the upgrade.  For some older clusters, the original XML may not utilize the same internal format as the most current OEDA tools, or there may have been other changes performed on the cluster over the years - I have some systems where new nodes have been added, additional clusters built, and there isn't a single consistent file for the entire system.  The good news is that you can use the oedacli utility to discover the virtualized clusters running on an Exadata, and it will generate an updated XML file for you.

To do this, just log in to dom0 on one of the Exadata hosts, and download the latest OEDA package (my version is 19.3.6, which was released in April 2020).  Once the software is unpacked, perform the following steps:

mkdir /root/discovered
./oedacli
discover es hostnames='db01,db02,cel01,cel02,cel03' location=/root/discovered

From there, OEDA will discover any running virtual machines, connect to them, and run a full discovery.  This should work without a problem if you still have all of the passwords set to the default values.  In most cases, the passwords have been changed from the defaults at some point.  The OEDA script utilizes the expect command to enter passwords for SSH commands, so you can easily modify the passwords it will use via the genPasswordHash.sh script.  This functionality is a bit limited, though, in that it expects the same password for all clusters on the system.  What if each cluster has a different password?  This seems like a good use for SSH keys.  Fortunately, there is a way to utilize SSH keys with OEDA cluster discovery.

While oedacli offers the ability to generate new SSH keys for authentication with each cluster, I preferred to use existing keys that were already configured from the root account on dom0 of the first compute node.  The process to perform discovery using SSH keys is pretty easy:

  1. Create an SSH key pair if one doesn't already exist
  2. Add the SSH public key to the authorized_keys file for both oracle/grid and root accounts
  3. Perform discovery
  4. Remove SSH key access

If your root account doesn't already have an SSH key pair created, you can create one with:

ssh-keygen -t rsa

Press Enter at each prompt to accept the defaults, which creates a key with no passphrase.  This matters for oedacli, because a key protected by a passphrase will not work with the silent installation/discovery process.
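
If you would rather script this step, here is a minimal sketch that creates the key only when one is missing.  The ensure_keypair name is my own and is not part of OEDA; -N '' (empty passphrase) and -f (output file) are standard ssh-keygen options.

```shell
#!/bin/sh
# Create a passphrase-less RSA key pair only if one is not already present.
# ensure_keypair is a hypothetical helper name, not part of OEDA.
ensure_keypair() {
    key="${1:-$HOME/.ssh/id_rsa}"
    if [ -f "$key" ]; then
        echo "key pair already exists: $key"
        return 0
    fi
    keydir=$(dirname "$key")
    mkdir -p "$keydir" && chmod 700 "$keydir" || return 1
    # -N '' sets an empty passphrase; -q suppresses the banner output
    ssh-keygen -q -t rsa -N '' -f "$key"
}

# Usage:
#   ensure_keypair            # defaults to ~/.ssh/id_rsa
```

Checking for an existing key first avoids overwriting a key that other hosts may already trust.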

From there, make sure that you have three dcli group files: one listing all of the virtualized guests (~/vm_group), one listing the physical compute nodes (~/dbs_group), and one listing the storage servers (~/cell_group).  The contents of my files are:

[root@enkx4db03 ~]# cat vm_group
enkx4db03c01
enkx4db03c02
enkx4db03c04
enkx4db03c05
enkx4db04c01
enkx4db04c02
enkx4db04c04
enkx4db04c05

[root@enkx4db03 ~]# cat dbs_group
enkx4db03
enkx4db04

[root@enkx4db03 ~]# cat cell_group
enkx4cel05
enkx4cel06
enkx4cel07
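
Before pushing keys, it can help to confirm that each group file exists and contains the number of hosts you expect.  This is only a sketch; check_group is a hypothetical helper of mine, and it assumes one hostname per line as in the files above.

```shell
#!/bin/sh
# Sanity-check a dcli group file before using it.
# check_group is a hypothetical helper, not a dcli or OEDA command.
check_group() {
    group_file="$1"
    if [ ! -s "$group_file" ]; then
        echo "missing or empty: $group_file" >&2
        return 1
    fi
    # Count only non-blank lines so a stray empty line does not inflate the total
    count=$(grep -c '[^[:space:]]' "$group_file")
    echo "$group_file: $count hosts"
}

# Usage:
#   for f in ~/vm_group ~/dbs_group ~/cell_group; do check_group "$f" || break; done
```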

You can use dcli to add the SSH key to a user account with the -k flag.  If the key is not already in the account's authorized_keys file, you will be prompted for the password.  Note that the password isn't saved, so it must be entered for each host.  Run dcli against the guests once for root and once for each software owner (oracle, grid), then once each for root on the storage servers and the physical compute nodes.  The root runs look like this:

[root@enkx4db03 ~]# dcli -l root -g ~/vm_group -k
root@enkx4db03c01's password:
root@enkx4db03c02's password:
root@enkx4db03c04's password:
root@enkx4db03c05's password:
root@enkx4db04c01's password:
root@enkx4db04c02's password:
root@enkx4db04c04's password:
root@enkx4db04c05's password:
enkx4db03c01: ssh key added
enkx4db03c02: ssh key added
enkx4db03c04: ssh key added
enkx4db03c05: ssh key added
enkx4db04c01: ssh key added
enkx4db04c02: ssh key added
enkx4db04c04: ssh key added
enkx4db04c05: ssh key added

[root@enkx4db03 ~]# dcli -l root -g cell_group -k
root@enkx4cel05's password:
root@enkx4cel06's password:
root@enkx4cel07's password:
enkx4cel05: ssh key added
enkx4cel06: ssh key added
enkx4cel07: ssh key added

[root@enkx4db03 ~]# dcli -l root -g ~/dbs_group -k
root@enkx4db03's password:
root@enkx4db04's password:
enkx4db03: ssh key added
enkx4db04: ssh key added
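
The software owner accounts on the guests need the same treatment as root.  A small wrapper along these lines could cover all of them in one pass; push_keys is a name I made up, and it uses only the dcli flags shown above (-l, -g, -k).  Not every cluster has a separate grid owner, so adjust the user list to match yours.

```shell
#!/bin/sh
# Push the current user's public key to several accounts on a group of hosts.
# push_keys is a hypothetical wrapper around the dcli invocation used above.
push_keys() {
    group_file="$1"; shift
    for user in "$@"; do
        echo "== $user =="
        # dcli prompts per host for this user's password if the key is not yet authorized
        dcli -l "$user" -g "$group_file" -k || return 1
    done
}

# Usage (drop "grid" if your clusters have no separate grid owner):
#   push_keys ~/vm_group oracle grid
```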

Now for the tricky part - oedacli will not just use the default key in /root/.ssh.  It expects a separate key pair in the WorkDir directory for each user and host in the discovery.  The expected naming format is id_rsa.<hostname>.<user>[.pub].  An example for enkx4db03c01 would be to have files named id_rsa.enkx4db03c01.oracle and id_rsa.enkx4db03c01.oracle.pub.  Since we have the dcli group files, we can easily create those files without much fuss.  In the example below, my OEDA is unzipped to /EXAVMIMAGES/onecommand/2020_apr/linux-x64.  Modify the OEDA_WORKDIR variable to match where your WorkDir is:

[root@enkx4db03 ~]# for hosts in `cat ~/vm_group`; \
do \
export OEDA_WORKDIR=/EXAVMIMAGES/onecommand/2020_apr/linux-x64/WorkDir; \
cp ~/.ssh/id_rsa $OEDA_WORKDIR/id_rsa.$hosts.oracle; \
cp ~/.ssh/id_rsa $OEDA_WORKDIR/id_rsa.$hosts.grid; \
cp ~/.ssh/id_rsa $OEDA_WORKDIR/id_rsa.$hosts.root; \
cp ~/.ssh/id_rsa.pub $OEDA_WORKDIR/id_rsa.$hosts.oracle.pub; \
cp ~/.ssh/id_rsa.pub $OEDA_WORKDIR/id_rsa.$hosts.grid.pub; \
cp ~/.ssh/id_rsa.pub $OEDA_WORKDIR/id_rsa.$hosts.root.pub; \
done

[root@enkx4db03 ~]# for hosts in `cat ~/cell_group`; \
do \
export OEDA_WORKDIR=/EXAVMIMAGES/onecommand/2020_apr/linux-x64/WorkDir; \
cp ~/.ssh/id_rsa $OEDA_WORKDIR/id_rsa.$hosts.root; \
cp ~/.ssh/id_rsa.pub $OEDA_WORKDIR/id_rsa.$hosts.root.pub; \
done

[root@enkx4db03 ~]# for hosts in `cat ~/dbs_group`; \
do \
export OEDA_WORKDIR=/EXAVMIMAGES/onecommand/2020_apr/linux-x64/WorkDir; \
cp ~/.ssh/id_rsa $OEDA_WORKDIR/id_rsa.$hosts.root; \
cp ~/.ssh/id_rsa.pub $OEDA_WORKDIR/id_rsa.$hosts.root.pub; \
done
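
The three loops differ only in the group file and the list of users, so they can be folded into one function.  The following is a sketch rather than anything OEDA provides: stage_keys and the SSH_KEY override variable are names I introduced for illustration, and the explicit chmod is a precaution of mine.

```shell
#!/bin/sh
# Copy one key pair into WorkDir under the id_rsa.<hostname>.<user>[.pub] names.
# stage_keys and SSH_KEY are hypothetical names used for this sketch.
stage_keys() {
    workdir="$1"; group_file="$2"; shift 2
    key="${SSH_KEY:-$HOME/.ssh/id_rsa}"
    while read -r host; do
        [ -n "$host" ] || continue      # skip blank lines in the group file
        for user in "$@"; do
            cp "$key"     "$workdir/id_rsa.$host.$user"     || return 1
            cp "$key.pub" "$workdir/id_rsa.$host.$user.pub" || return 1
            chmod 600 "$workdir/id_rsa.$host.$user"         # keep the private copy private
        done
    done < "$group_file"
}

# Usage:
#   OEDA_WORKDIR=/EXAVMIMAGES/onecommand/2020_apr/linux-x64/WorkDir
#   stage_keys "$OEDA_WORKDIR" ~/vm_group   root oracle grid
#   stage_keys "$OEDA_WORKDIR" ~/cell_group root
#   stage_keys "$OEDA_WORKDIR" ~/dbs_group  root
```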

You should now have a set of SSH key pairs for each account on your Exadata rack.  We can now run the discovery to create new OEDA XML files.  Launch oedacli, enable SSH key authentication, and run discovery:

[root@enkx4db03 ~]# cd /EXAVMIMAGES/onecommand/2020_apr/linux-x64
[root@enkx4db03 linux-x64]# ./oedacli
oedacli> set sshkeys enable=true
oedacli> discover es hostnames='enkx4db03,enkx4db04,enkx4cel05,enkx4cel06,enkx4cel07' location=/EXAVMIMAGES/onecommand/2020_apr/linux-x64/discovery

OEDA will now connect to each of the hosts and discover the existing software installations, patch versions, and ASM diskgroup configurations.  The location specified will have an XML file for each individual cluster, as well as a full XML file containing each of the clusters.  You will also be able to see an installation template in HTML format, and a new checkip script.

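Finally, you can remove the SSH keys using the dcli command with the --unkey option.  The root removals are shown here; if you also pushed the key to the oracle and grid accounts on the guests, repeat the vm_group command with -l oracle and -l grid: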

[root@enkx4db03 ~]# dcli -l root -g ~/vm_group --unkey
enkx4db03c01: ssh key dropped
enkx4db03c02: ssh key dropped
enkx4db03c04: ssh key dropped
enkx4db03c05: ssh key dropped
enkx4db04c01: ssh key dropped
enkx4db04c02: ssh key dropped
enkx4db04c04: ssh key dropped
enkx4db04c05: ssh key dropped

[root@enkx4db03 ~]# dcli -l root -g ~/dbs_group --unkey
enkx4db03: ssh key dropped
enkx4db04: ssh key dropped

[root@enkx4db03 ~]# dcli -l root -g cell_group --unkey
enkx4cel05: ssh key dropped
enkx4cel06: ssh key dropped
enkx4cel07: ssh key dropped
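
The copies of the private key staged in WorkDir are still on disk at this point.  The post's steps don't cover removing them, so treat this as an optional extra: a sketch that deletes the id_rsa.<hostname>.* files for every host in a group file.  unstage_keys is a hypothetical helper name.

```shell
#!/bin/sh
# Remove staged key copies from WorkDir once discovery is finished.
# unstage_keys is a hypothetical helper; it only deletes id_rsa.<hostname>.* files
# for hosts listed in the given group file.
unstage_keys() {
    workdir="$1"; group_file="$2"
    while read -r host; do
        [ -n "$host" ] || continue      # skip blank lines in the group file
        # The dot after the hostname keeps enkx4db03 from matching enkx4db03c01
        rm -f "$workdir"/id_rsa."$host".*
    done < "$group_file"
}

# Usage:
#   OEDA_WORKDIR=/EXAVMIMAGES/onecommand/2020_apr/linux-x64/WorkDir
#   for g in ~/vm_group ~/dbs_group ~/cell_group; do unstage_keys "$OEDA_WORKDIR" "$g"; done
```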

Here is the full output of my OEDA discovery - you may see that it reports that there are no database homes on certain clusters, but it still completed discovery of all running objects in the rack.

[root@enkx4db03 linux-x64]# ./oedacli
oedacli> set sshkeys enable=true
oedacli> discover es hostnames='enkx4db03,enkx4db04,enkx4cel05,enkx4cel06,enkx4cel07' location=/EXAVMIMAGES/onecommand/2020_apr/linux-x64/discovery
Discovering nodes [enkx4db03, enkx4db04, enkx4cel05, enkx4cel06, enkx4cel07]...
...Running Software Discovery on enkx4db03c01.enkitec.local
Discovering software on: enkx4db03c01.enkitec.local
Discovering cluster details on node: enkx4db03c01.enkitec.local on cluster c0_clusterHome
Discovering Database details on node: enkx4db03c01.enkitec.local for clusterId c0_clusterHome
No Database found for database home /u01/app/oracle/product/19.0.0.0/dbhome_1 on enkx4db03c01.enkitec.local
No Database found for database home /u01/app/oracle/product/12.2.0.1/dbhome_1 on enkx4db03c01.enkitec.local
ERROR: No databaseHomes discovered on enkx4db03c01.enkitec.local
Done Software Discovery on enkx4db03c01.enkitec.local
...Running Software Discovery on enkx4db03c02.enkitec.local
Discovering software on: enkx4db03c02.enkitec.local
Discovering cluster details on node: enkx4db03c02.enkitec.local on cluster c1_clusterHome
Discovering Database details on node: enkx4db03c02.enkitec.local for clusterId c1_clusterHome
No Database found for database home /u01/app/oracle/product/19.0.0.0/dbhome_3 on enkx4db03c02.enkitec.local
No Database found for database home /u01/app/oracle/product/11.2.0.4/dbhome_2 on enkx4db03c02.enkitec.local
No Database found for database home /u01/app/oracle/product/19.0.0.0/dbhome_3 on enkx4db03c02.enkitec.local
Done Software Discovery on enkx4db03c02.enkitec.local
...Running Software Discovery on enkx4db03c05.enkitec.local
Discovering software on: enkx4db03c05.enkitec.local
Discovering cluster details on node: enkx4db03c05.enkitec.local on cluster c2_clusterHome
Discovering Database details on node: enkx4db03c05.enkitec.local for clusterId c2_clusterHome
No Database found for database home /u01/app/oracle/product/11.2.0.4/dbhome_1 on enkx4db03c05.enkitec.local
No Database found for database home /u01/app/oracle/product/11.2.0.4/dbhome_1 on enkx4db03c05.enkitec.local
No Database found for database home /u01/app/oracle/product/12.2.0.1/dbhome_1 on enkx4db03c05.enkitec.local
No Database found for database home /u01/app/oracle/product/12.2.0.1/dbhome_1 on enkx4db03c05.enkitec.local
Done Software Discovery on enkx4db03c05.enkitec.local
...Running Software Discovery on enkx4db03c04.enkitec.local
Discovering software on: enkx4db03c04.enkitec.local
Discovering cluster details on node: enkx4db03c04.enkitec.local on cluster c3_clusterHome
Discovering Database details on node: enkx4db03c04.enkitec.local for clusterId c3_clusterHome
No Database found for database home /u01/app/oracle/product/19.0.0.0/dbhome_1 on enkx4db03c04.enkitec.local
Done Software Discovery on enkx4db03c04.enkitec.local
...Running Software Discovery on enkx4db04c04.enkitec.local
Discovering software on: enkx4db04c04.enkitec.local
Done Software Discovery on enkx4db04c04.enkitec.local
...Running Software Discovery on enkx4db04c01.enkitec.local
Discovering software on: enkx4db04c01.enkitec.local
Done Software Discovery on enkx4db04c01.enkitec.local
...Running Software Discovery on enkx4db04c02.enkitec.local
Discovering software on: enkx4db04c02.enkitec.local
Done Software Discovery on enkx4db04c02.enkitec.local
...Running Software Discovery on enkx4db04c05.enkitec.local
Discovering software on: enkx4db04c05.enkitec.local
Done Software Discovery on enkx4db04c05.enkitec.local
Discovering local disks ....
Discovering switches...
Discovering racks...
Writing Engineered System preconf : /EXAVMIMAGES/onecommand/2020_apr/linux-x64/discovery/Discovered-preconf_rack_0.csv
Creating databasemachine.xml for EM discovery
Done Creating databasemachine.xml for EM discovery
Creating databasemachine.xml for EM discovery
Done Creating databasemachine.xml for EM discovery
Creating databasemachine.xml for EM discovery
Done Creating databasemachine.xml for EM discovery
Creating databasemachine.xml for EM discovery
Done Creating databasemachine.xml for EM discovery
Writing platinum file : /EXAVMIMAGES/onecommand/2020_apr/linux-x64/discovery/Discovered-platinum.csv

Creating Installation template /EXAVMIMAGES/onecommand/2020_apr/linux-x64/discovery/Discovered-InstallationTemplate.html...
Created Installation template /EXAVMIMAGES/onecommand/2020_apr/linux-x64/discovery/Discovered-InstallationTemplate.html
Writing checkip validation script : /EXAVMIMAGES/onecommand/2020_apr/linux-x64/discovery/Discovered-checkip.sh

Validating Engineered System....
Rack Descripton: X4-2 Quarter Rack HC 4TB
..Cluster Name: enkx4c04
..Cluster Node List:[enkx4db03c04.enkitec.local, enkx4db04c04.enkitec.local]
..Storage Server List:[enkx4cel05, enkx4cel06, enkx4cel07]
..Cluster Software Details..
....Cluster Home:/u01/app/19.0.0.0/grid
....Cluster Version:19.5.0.0.191015
....Cluster Scan Name:enkx4c04-scan
..Cluster Owner/Group details..
....Owner:oracle
....Groups:[oinstall, dba]
..Storage Details..
....Disk Group:DATAC4, Size:504G, DiskGroup Type:DATA
Database Home Details..
Warning: No database homes found..
.......
..Cluster Name: enkx4vm1
..Cluster Node List:[enkx4db03c01.enkitec.local, enkx4db04c01.enkitec.local]
..Storage Server List:[enkx4cel05, enkx4cel06, enkx4cel07]
..Cluster Software Details..
....Cluster Home:/u01/app/19.0.0.0/grid
....Cluster Version:19.6.0.0.200114
....Cluster Scan Name:enkx4c01-scan
..Cluster Owner/Group details..
....Owner:oracle
....Groups:[oinstall, dba]
..Storage Details..
....Disk Group:DATAC1, Size:3078G, DiskGroup Type:DATA
....Disk Group:RECOC1, Size:1026G, DiskGroup Type:RECO
Database Home Details..
Warning: No database homes found..
.......
..Cluster Name: enkx4vm2
..Cluster Node List:[enkx4db03c02.enkitec.local, enkx4db04c02.enkitec.local]
..Storage Server List:[enkx4cel05, enkx4cel06, enkx4cel07]
..Cluster Software Details..
....Cluster Home:/u01/app/19.0.0.0/grid
....Cluster Version:19.3.1.0.0
....Cluster Scan Name:enkx4c02-scan
..Cluster Owner/Group details..
....Owner:grid
....Groups:[oinstall, asmdba, asmoper, asmadmin]
..Storage Details..
....Disk Group:DATAC2, Size:1800G, DiskGroup Type:DATA
....Disk Group:RECOC2, Size:1008G, DiskGroup Type:RECO
Database Home Details..
...Database Home Location:/u01/app/oracle/product/19.0.0.0/dbhome_3
...Database Home Version:19.3.1.0.0
...Database Software Owner:oracle
...Groups:[oinstall, asmdba, dba, racoper]
...Databases:[cdb19: Db Node List:[enkx4db03c02, enkx4db04c02]]
...Database Home Location:/u01/app/oracle/product/12.1.0.2/dbhome_1
...Database Home Version:12.1.0.2.190416
...Database Software Owner:oracle
...Groups:[oinstall, asmdba, dba, racoper]
...Databases:[dbm03: Db Node List:[enkx4db03c02, enkx4db04c02]]
.......
..Cluster Name: enkx4vm5
..Cluster Node List:[enkx4db03c05.enkitec.local, enkx4db04c05.enkitec.local]
..Storage Server List:[enkx4cel05, enkx4cel06, enkx4cel07]
..Cluster Software Details..
....Cluster Home:/u01/app/12.2.0.1/grid
....Cluster Version:12.2.0.1.171017
....Cluster Scan Name:enkx4c05-scan
..Cluster Owner/Group details..
....Owner:oracle
....Groups:[oinstall, dba]
..Storage Details..
....Disk Group:DATAC6, Size:1026G, DiskGroup Type:DATA
....Disk Group:RECOC6, Size:504G, DiskGroup Type:RECO
Database Home Details..
Warning: No database homes found..
.......
oedacli>

Now that the discovery is complete, we can move on to upgrading the virtualized clusters using oedacli.
