Oracle has released a critical patch for storage server versions 126.96.36.199.x through 188.8.131.52.1. While 184.108.40.206.1 was released last week, a few one-off patches from 220.127.116.11.0 didn’t seem to make it into that release. Oracle has since released 18.104.22.168.2 (patch #13513611, supplemental note #1388400.1). Like 22.214.171.124.1, this release patches several outstanding issues. Here’s the list of bugs fixed, from the readme for 126.96.36.199.2:
12764521 INFINIBAND DIAG COMMANDS (LIKE IBDIAGNET AND IBNETDISCOVER) ARE NOT WORKING
13083530 10 GB-E BONDED INTERFACES FAILING- EXADATA
13410353 AFTER UPGRADE TO 188.8.131.52 INFINIBAND CMDS IBDIAGNET, IBNETDISCOVER NOT WORKING
13489032 CHECKHWNFWPROFILE DOES NOT DETECT FAILED FLASH FDOM
13489445 ORA-600 [OSSMISC:OSSMISC_TIMER] WHEN NTPD DETECTED 6 MILLISECOND TIME DIFFERENCE
13512932 FIX INSTALLED WORKAROUND FOR NTP UPDATE BUG 13489445
As you can see, the previously mentioned bugs have been fixed. There’s another bug that was fixed in 184.108.40.206.1 that could be an issue for anybody running 220.127.116.11.x through 18.104.22.168.0. This bug (13454147) can remove the flashcache from a cell that has an uptime of 6 months or greater. Fortunately, Oracle has released a patch that includes the fixes for these critical issues, in the event that you can’t quickly upgrade to 22.214.171.124.2. I wouldn’t advise running that version for at least a couple of weeks; I always advise clients to wait that long for the early adopters to weed out any major issues.
Applying the critical patch only takes a minute, and doesn’t take the storage servers or database instances offline. After it’s done, a restart of cellsrv needs to be scheduled, but that can be done in a rolling fashion. Read on for an example of applying this patch. As always, do not apply any patch to a production system before appropriately testing against a non-production system!
As business has picked up since OpenWorld (didn’t think that was possible, but that’s another story for another day), we have been seeing more customers adopt or seriously look at Exadata as an option for new hardware implementations. While many will complain that there isn’t enough room for customization in the rigid process of configuring an Exadata system, there are still many possibilities to make your Exadata your own, whether it’s during the initial configuration phase or shortly thereafter. Of course, some of these modifications can be difficult to implement after the system is up and running with users logging in. I’m planning on starting a series of posts on a couple of the hot-button topics in Exadata configuration: ASM diskgroup layout (the topic for today), role-separated vs. standard authentication, and so on. As these topics have no right answers, I’m more than open to a dialogue where you may disagree. On to the good stuff!
A Quick Primer – The Exadata Storage Architecture
Ok…so we’re looking at Exadata specifically in this post. In the examples listed below, we’ll discuss a quarter rack, since it’s the easiest to diagram. To expand to half or full racks, just adjust the number of cells (7 for a half rack, 14 for a full rack) and disks (84 and 168, respectively) accordingly. To see the relationship between the compute nodes (database servers), Infiniband switches, and storage servers, refer to Figure 1:
Figure 1 – Exadata Infiniband/Storage Connectivity
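The rack sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 12 disks per cell is inferred from the half and full rack figures (84 / 7 = 168 / 14 = 12), and the quarter rack cell count of 3 is an assumption, since it isn't stated in the text above. The names DISKS_PER_CELL, CELLS_PER_RACK, and disk_count are made up for this example.

```python
# Disks per storage cell, inferred from the figures in the post
# (84 disks / 7 cells = 168 disks / 14 cells = 12).
DISKS_PER_CELL = 12

# Storage cells per rack size. The half and full rack counts come from the
# post; the quarter rack count of 3 is an assumption for this sketch.
CELLS_PER_RACK = {"quarter": 3, "half": 7, "full": 14}


def disk_count(rack: str) -> int:
    """Return the number of physical disks for the given rack size."""
    return CELLS_PER_RACK[rack] * DISKS_PER_CELL


if __name__ == "__main__":
    for rack, cells in CELLS_PER_RACK.items():
        print(f"{rack:>7} rack: {cells:2d} cells, {disk_count(rack):3d} disks")
```

Under those assumptions, the quarter rack works out to 36 disks, which is the number to keep in mind when reading the diskgroup layouts.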
In part 1 of this series, we took a look inside the ODA to see what the OS was doing. Here, we’ll dig a little further into the disk and storage architecture with regard to the hardware and ASM.
There have been a lot of questions about the storage layout of the shared disks. We’ll start at the controllers and work our way down to the disks. First, there are 2 dual-ported LSI SAS controllers in each of the system controllers (SCs). Each controller is connected to a SAS expander located on the system board. Each of these SAS expanders connects to 12 of the hard disks on the front of the ODA. The disks are dual-ported SAS, so each disk is connected to an expander on each of the SCs. Below is a diagram of the SAS connectivity on the ODA (Note: all diagrams are collected from public ODA documentation, as well as various ODA-related support notes available on My Oracle Support).
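For readers who prefer code to cabling diagrams, the connectivity described above can be modelled as a rough Python sketch. It is not an Oracle tool, and the component names (SC0, ctrl0, exp0) and disk numbering are hypothetical. It assumes 24 disk slots (2 expanders x 12 disks) and that each disk hangs off the same-numbered expander on both SCs, then checks that every disk keeps at least one path when any single SC, controller, or expander is lost.

```python
SCS = ("SC0", "SC1")   # the two system controllers
DISKS = range(24)      # assumed: 2 expanders x 12 disks each


def paths(disk):
    """Return one (SC, controller, expander) path per SC for a disk."""
    # Assumption: disks 0-11 sit behind expander 0, disks 12-23 behind
    # expander 1, and each controller feeds exactly one expander.
    idx = 0 if disk < 12 else 1
    return [(sc, f"{sc}/ctrl{idx}", f"{sc}/exp{idx}") for sc in SCS]


def components():
    """Return every SC, controller, and expander that appears in a path."""
    found = set()
    for disk in DISKS:
        for path in paths(disk):
            found.update(path)
    return sorted(found)


def reachable_after_failure(failed):
    """True if every disk still has a path that avoids the failed component."""
    return all(
        any(failed not in path for path in paths(disk))
        for disk in DISKS
    )


if __name__ == "__main__":
    for comp in components():
        status = "ok" if reachable_after_failure(comp) else "DISKS LOST"
        print(f"lose {comp:<10} -> {status}")
```

Because the disks are dual-ported, every disk has one path through each SC, so no single component in this model cuts a disk off completely.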
From the diagram, you can see the relationship between the SAS controllers, SAS expanders, and SAS drives on the front end. If you look at the columns of disks, the first 2 columns are serviced by one expander, while the third and fourth columns are serviced by the other expander. What the diagram refers to as “Controller-0” and “Controller-1” are actually the independent SCs in the X4370M2. What this shows is that you can lose any of the following components in the diagram and your database will continue to run (assuming RAC is in use):