We’ve had a few weeks to play around with the ODA in our office, and I’ve been able to crack it open and get into the software and hardware that powers it.
For starters, the system runs a new model of Sun Fire – the X4370 M2. The 4U chassis is basically 2 separate 2U blades (Oracle is calling them system controllers – SCs) that have direct attached storage on the front. Here’s a listing of the hardware in each SC:
|Sun X4370M2 System Controller Components (2 SCs per X4370M2)|
|CPU||2x 6-core Intel Xeon X5675 3.06GHz|
|Memory||96GB 1333MHz DDR3|
|Network||2x 10GbE (SFP+) PCIe card, 4x 1GbE PCIe card, 2x 1GbE onboard|
|Internal Storage||2x 500GB SATA for operating system, 1x 4GB USB internal|
|RAID Controller||2x SAS-2 LSI HBA|
|Shared Storage||20x 600GB 3.5″ SAS 15,000 RPM hard drives, 4x 73GB 3.5″ SSDs|
|External Storage||2x external MiniSAS ports|
|Operating System||Oracle Enterprise Linux 5.5 x86-64|
Pictures of a real live ODA after the break.
If you’re anything like me, the first thing you wanted to know about the ODA is what’s inside? Follow along as we walk through the hardware involved.
What’s an SC?
The SC is essentially Oracle’s term for the blades that sit inside the Sun Fire X4370 M2.
From this view, the back of the SC is at the bottom. At the top (front) are 2 connections that plug into the chassis of the X4370 M2. The SCs slide out from the back of the chassis. Looking closer at the back of the SC, we have the external connections:
From the back, you can see the PCI cards on the left, and the onboard ports on the right. In the middle are the fan modules. On the left, we have 4 gigabit ethernet ports, 2 10GbE (SFP+) ports, and the 2 external SAS connections. On the right are the dual gigabit ethernet (onboard) ports, the serial and network ports for the ILOM, and your standard VGA and USB ports. Above the onboard ports are the 500GB SATA hard drives used for the operating system.
As mentioned above, there are 2 Seagate 500GB serial ATA hard drives that are used for the operating system:
Along with the serial ATA drives is a 4GB USB flash stick that can be used to create a bootable rescue installation of Oracle Enterprise Linux. Also, this drive is used for some firmware updates.
As for the disk controllers, each SC has 2 LSI controllers. One is in the internal PCIe slot, and the other is in a standard PCIe slot. Both are SAS9211-8i controllers.
One of the many things that I found interesting on the box was that OEL 5.5 was installed, not one of the newer releases. Also, the server is running the Red Hat Compatible Kernel, and not the Unbreakable Enterprise Kernel (UEK), which is the default on newer releases of OEL 5.
[root@xlnxrda01 ~]# uname -a
Linux xlnxrda01 2.6.18-184.108.40.206.1.el5 #1 SMP Tue Jan 4 16:26:54 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@xlnxrda01 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
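If you want to check which kernel flavor a box is running without reading the release string by eye, here is a rough sketch. It rests on two assumptions that are mine, not something the ODA documents: that the Red Hat compatible kernel on OEL 5 is a 2.6.18 build, and that UEK builds carry "uek" in the release string. Early UEK builds did not always have that suffix, so anything else is reported as unknown. The function name `kernel_flavor` is made up for this example.

```shell
# Heuristic classification of an OEL 5 kernel release string.
# Assumptions: the Red Hat compatible kernel is 2.6.18-based, and UEK
# builds contain "uek" in the release. Anything else is "unknown".
kernel_flavor() {
  case "$1" in
    *uek*)     echo "UEK" ;;
    2.6.18-*)  echo "RHCK" ;;
    *)         echo "unknown" ;;
  esac
}

# Classify the running kernel.
kernel_flavor "$(uname -r)"
```

On the ODA shown above, the 2.6.18 release string would land in the RHCK branch.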
As for the storage, Oracle has removed one of my biggest peeves with the Exadata storage servers. On Exadata, the operating system resides on 30GB partitions on the first 2 hard disks. Because of this, a 30GB griddisk has to be created on the remaining 10 disks, which becomes the DBFS_DG diskgroup (formerly SYSTEMDG). With Exadata, this diskgroup becomes the location for OCR/voting files, unless DATA or RECO is high redundancy. In that case, DBFS_DG is just wasted space. Anyways, going back to the ODA (that was the point of this post, wasn’t it?), this problem is no longer present thanks to the 2 500GB 2.5″ SATA drives in the back. These drives utilize software RAID (just like the Exadata storage servers), but don’t take advantage of the active/inactive partition scheme:
[root@patty ~]# parted /dev/sda print
Model: ATA SEAGATE ST95001N (scsi)
Disk /dev/sda: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End    Size   Type     File system  Flags
 1      32.3kB  107MB  107MB  primary  ext3         boot, raid
 2      107MB   500GB  500GB  primary               raid

Information: Don't forget to update /etc/fstab, if necessary.

[root@patty ~]# parted /dev/sdb print
Model: ATA SEAGATE ST95001N (scsi)
Disk /dev/sdb: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End    Size   Type     File system  Flags
 1      32.3kB  107MB  107MB  primary  ext3         boot, raid
 2      107MB   500GB  500GB  primary               raid

Information: Don't forget to update /etc/fstab, if necessary.

[root@patty ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1 sda1
      104320 blocks [2/2] [UU]
md1 : active raid1 sdb2 sda2
      488279488 blocks [2/2] [UU]
unused devices: <none>

[root@patty ~]# df -h
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot   30G  7.4G   21G  27% /
/dev/md0                             99M   17M   77M  18% /boot
/dev/mapper/VolGroupSys-LogVolOpt    59G  5.8G   50G  11% /opt
/dev/mapper/VolGroupSys-LogVolU01    97G  188M   92G   1% /u01
tmpfs                                48G     0   48G   0% /dev/shm
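Since the OS disks are a plain software RAID 1 pair, it's worth keeping an eye on the status field in /proc/mdstat. In the output above, both arrays show [UU], meaning both members are up; an underscore in that field marks a missing or failed member. Here is a minimal sketch of a check, assuming the mdstat layout shown above. The function name `degraded_md_arrays` is made up for this example.

```shell
# Print the name of every md array whose member-status field (e.g. [UU])
# contains an underscore, which marks a missing or failed member.
# Reads /proc/mdstat-formatted text on stdin.
degraded_md_arrays() {
  awk '
    /^md[0-9]+ :/ { name = $1 }               # remember the current array
    name != "" && match($0, /\[[U_]+\]/) {    # status field such as [UU] or [U_]
      if (index(substr($0, RSTART, RLENGTH), "_")) print name
      name = ""
    }
  '
}

# On a live system:
#   degraded_md_arrays < /proc/mdstat
```

For the output above this prints nothing, since both md0 and md1 are fully synced.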
Let’s look at the LVM setup a little closer. We have a single physical volume (/dev/md1) backing a 465GB volume group (VolGroupSys). From here, we have LogVolRoot (30GB), LogVolOpt (60GB), LogVolU01 (100GB), and LogVolSwap (24GB), for 214GB allocated in total. That leaves us with more than 250GB free on each SC for either adding new filesystems, growing existing filesystems, or taking LVM snapshots.
[root@patty ~]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               VolGroupSys
  PV Size               465.66 GB / not usable 3.44 MB
  Allocatable           yes
  PE Size (KByte)       32768
  Total PE              14901
  Free PE               8053
  Allocated PE          6848
  PV UUID               Q99xBf-AMdf-so7d-VF8J-cIjB-cHeQ-cbWzSD

[root@patty ~]# vgdisplay
  --- Volume group ---
  VG Name               VolGroupSys
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                4
  Open LV               4
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               465.66 GB
  PE Size               32.00 MB
  Total PE              14901
  Alloc PE / Size       6848 / 214.00 GB
  Free  PE / Size       8053 / 251.66 GB
  VG UUID               96cHoA-qhpG-A2hr-tbG1-c1oW-k9lI-MGURFU

[root@patty ~]# lvdisplay
  --- Logical volume ---
  LV Name                /dev/VolGroupSys/LogVolRoot
  VG Name                VolGroupSys
  LV UUID                5qpcEM-aPPA-hGGf-GBBs-nwRi-HUNN-qrsIQp
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                30.00 GB
  Current LE             960
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

  --- Logical volume ---
  LV Name                /dev/VolGroupSys/LogVolOpt
  VG Name                VolGroupSys
  LV UUID                Ge110F-s9oB-Eqak-muo0-3yWn-GEuN-klI1Rs
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                60.00 GB
  Current LE             1920
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

  --- Logical volume ---
  LV Name                /dev/VolGroupSys/LogVolU01
  VG Name                VolGroupSys
  LV UUID                lDBl2R-ZpX7-QZxJ-t8wI-DKde-kANX-zPf3XP
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                100.00 GB
  Current LE             3200
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

  --- Logical volume ---
  LV Name                /dev/VolGroupSys/LogVolSwap
  VG Name                VolGroupSys
  LV UUID                Z2shPb-Dbe6-oVIR-hq2z-nDHs-xfNW-43miCW
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                24.00 GB
  Current LE             768
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3
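The free and allocated sizes in the vgdisplay output above are just extent counts multiplied by the 32MB extent size. Here is a small sketch of that arithmetic; the helper name `pe_to_gb` is made up for this example. The commented commands after it show how the free space could be used. They are illustrations only: the sizes and the snapshot name are hypothetical, and the resize step assumes /u01 is an ext3 filesystem, which the output above doesn't actually confirm.

```shell
# Convert a count of LVM physical extents to GB, given the extent size in MB.
pe_to_gb() {
  awk -v pe="$1" -v mb="$2" 'BEGIN { printf "%.2f\n", pe * mb / 1024 }'
}

pe_to_gb 8053 32    # free extents      -> 251.66
pe_to_gb 6848 32    # allocated extents -> 214.00

# Hypothetical uses of the free extents (run as root, sizes are examples):
#   grow /u01 by 20GB, then resize the filesystem (assumes ext3):
#     lvextend -L +20G /dev/VolGroupSys/LogVolU01
#     resize2fs /dev/VolGroupSys/LogVolU01
#   take a 10GB snapshot of the root volume:
#     lvcreate -s -L 10G -n LogVolRootSnap /dev/VolGroupSys/LogVolRoot
```

Both results match the "Free PE / Size" and "Alloc PE / Size" lines that vgdisplay reports.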
The ODA has 8 (6 GbE and 2 10GbE) physical ethernet ports available to you, along with 2 internal fibre ports that are used for a built-in cluster interconnect. Here’s the output of running “ethtool” on the internal NICs:
[root@patty ~]# ethtool eth0
Settings for eth0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000001 (1)
        Link detected: yes

[root@patty ~]# ethtool eth1
Settings for eth1:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000001 (1)
        Link detected: yes
Surprisingly, these NICs aren’t bonded, so we have 2 separate cluster interconnects (eth0 and eth1), which means that we also have 2 HAIP devices. Also, eth2 and eth3 (the onboard NICs) were used to create a bond (bond0) for the public traffic.
[patty:oracle:+ASM1] /home/oracle > oifcfg getif
eth0  192.168.16.0  global  cluster_interconnect
eth1  192.168.17.0  global  cluster_interconnect
bond0  192.168.8.0  global  public
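If you want to pull the interconnect or public interfaces out of that listing in a script, here is a quick sketch. It assumes the four-column layout shown above (interface, subnet, scope, role); the function name `interfaces_by_role` is made up for this example.

```shell
# List the interfaces registered with a given role in `oifcfg getif` output.
# Expects lines of the form: <interface> <subnet> <scope> <role>
interfaces_by_role() {
  awk -v role="$1" 'NF >= 4 && $4 == role { print $1 }'
}

# On a cluster node, as the Grid Infrastructure owner:
#   oifcfg getif | interfaces_by_role cluster_interconnect
```

Against the output above, the cluster_interconnect role would list eth0 and eth1, and the public role would list bond0.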
Note that the default configuration of the ODA doesn’t include a management network, like the Exadata does. That doesn’t mean that you can’t set up a management network, just that it’s not part of the initial setup process.
That’s a quick overview of what’s inside the ODA. The next piece in the series will go into a little more detail on the disks, as well as the Oracle configuration.