Archive for the ‘Exadata’ Category.

Exadata Book

Here’s some white Exaddata text

Well, I guess it’s official. A couple of weeks ago I committed to write an Exadata book for Apress, along with my intrepid co-author Randy Johnson. For those of you who don’t know Randy, he’s a very experienced Oracle Guy with a wealth of knowledge, particularly around RAC. I think the two of us make a pretty good team - making up for each others weaknesses (oh wait, I should say we have “Complimentary Skill Sets” - yeah that sounds better).

Anyway, it turns out that writing a book is a lot of work! The way Tom Kyte turns them out I thought it must be pretty easy, but I’ve always been a little overconfident. So I’m starting to realize that I may not have time to do as many blog posts as I might like. But I must say that I am really excited about the subject matter! So I think it will be worth the effort. By the way, that’s not the official cover art (or even the official title as far as I know). I just hacked that together with a Adobe Illustrator. ;-)

As I have said many times, I think the Exadata storage software will usher in a whole new era in relational databases. Not just for Oracle, because you know the other guys will be trying to follow in their footsteps. But I think Oracle is miles ahead at this point, and they own their own hardware platform. Oracle claims that it has been their most successful product launch ever and I believe it. They are starting to pop up like weeds. It will be very interesting to see what the future holds for this platform. To be honest, I think we’re just seeing the tip of the iceberg at this point. Anyway, I appreciate the guys at Apress having the faith in us to take on this project. I hope we don’t disappoint them (I don’t think we will).

Notes on Applying Exadata Bundle Patch (BP5)

Randy Johnson has done a brief post after applying BP5 on our Exadata Lab machine. Looks like it went pretty smoothly with the exception of a problem with DBFS and some misleading comments in the README file regarding using the RDS protocol (both of which we had in play). Here’s a link to his post:

Exadata Bundle Patch 5 Gotcha’s

Running Oracle Exadata V2 on Dell Hardware

Well we had to give it a shot.

So we created an Oracle Exadata Storage Server Software CELLBOOT USB flash drive. I’m not kidding, that’s what the Oracle/Sun guys decided to call it. They didn’t even use an acronym in the manual (I guess “ESSSCB USB FD” doesn’t roll off the tongue much better than the whole thing anyway). We used the make_cellboot_usb utility to create the thing off one of our storage servers, which by the way was not that easy to do, since the USB ports are in the back of the 4275’s and they are not easy to get to with all the cabling that’s back there. Anyway, once we had the little bugger created we pulled it out of the back of the rack and booted a Dell Latitude D630 off of it. Here’s a picture:

Notice the thumb drive is all lit up like a Christmas tree.

Here is a close up of the screen (in case your eyes are going bad like mine):

So we tried a couple of different options but eventually got to this screen:

Notice the ERROR line in the middle of the screen. Somebody wisely put a check in the boot procedure to verify the machine type, presumably if it’s not a Sun 4170 it will throw an error. We thought about hacking the system but decided not to at this point as we had real work to do. (maybe later when we’ve got nothing else to do)

Oracle Exadata - Storage Indexes

Wow! - I was stunned a few days ago by Exadata’s Storage Indexes. I was doing a little testing to see what could be offloaded and what couldn’t (more on that later). I have a 384 million row table I was using on our Exadata Quarter Rack test system. A single threaded full scan with no where clause on the table takes about 24 seconds (ho hum - it’s amazing how quickly we become numbed to the outstanding performance ). So imagine my surprise when I decided to check and see how many nulls I had in a column and the result came back in .07 seconds. Wow! I thought it was a bug! Turns out it was the Storage Indexes. Alright already, I’ll show you some output from the system (by the way, as usual I used a couple of scripts: fsx.sql and mystats.sql):

SYS@LABRAT1> select count(*) from kso.skew3;
 
  COUNT(*)
----------
 384000048
 
1 row selected.
 
Elapsed: 00:00:24.06
SYS@LABRAT1> /
 
  COUNT(*)
----------
 384000048
 
1 row selected.
 
Elapsed: 00:00:23.94
 
SYS@LABRAT1> set timing off
SYS@LABRAT1> @mystats
Enter value for name: %storage%
 
NAME                                                                             VALUE
---------------------------------------------------------------------- ---------------
cell physical IO bytes saved by storage index                                        0
 
1 row selected.
 
SYS@LABRAT1> set timing on
SYS@LABRAT1> select count(*) from kso.skew3 where col1 is null;
 
  COUNT(*)
----------
        12
 
1 row selected.
 
Elapsed: 00:00:00.07
SYS@LABRAT1> set timing off
SYS@LABRAT1> @fsx
Enter value for sql_text: select count(*) from kso.skew3 where col1 is null
Enter value for sql_id: 
Enter value for inst_id: 
 
 
 INST SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO    AVG_PIO AVG_PX OFFLOADABLE IO_SAVED_% SQL_TEXT
----- ------------- ------ ---------- ---------- ------------- ------------ ---------- ------ ----------- ---------- ----------------------------------------
    1 0u1q4b7puqz6g      0 2684249835          5           .09    1,956,226  1,956,219      0 Yes             100.00 select count(*) from kso.skew3 where col
 
1 row selected.
 
SYS@LABRAT1> @mystats
Enter value for name: %storage%
 
NAME                                                                             VALUE
---------------------------------------------------------------------- ---------------
cell physical IO bytes saved by storage index                              16012763136
 
1 row selected.

So apparently Storage Indexes are NULL aware. Very cool! This may have repercussions regarding design and implementation decisions. There are systems that don’t use NULLs in order to insure that they can access records via B-Tree indexes (which as you’re aware do not store NULLs). SAP for example uses a single space character instead of NULLs.

Continue reading ‘Oracle Exadata - Storage Indexes’ »

Exadata and Parallel Queuing

Over the years Oracle has added many enhancements in order to allow individual SQL statements to take full advantage of multiprocessor computers. A few months ago Cary Millsap did a talk where he recalled the presentation Larry Ellison did when Oracle first announced the Parallel Query feature. During Larry’s demo he had a multiprocessor computer all to himself. I don’t remember how many processors it had, but I remember he had some kind of graphic showing individual CPU utilization on one screen while he fired up a parallel query on another screen. The monitoring screen lit up like a Christmas tree. Every one of the CPU’s was pegged during his demo. When Cary was telling the story he said that he had wondered at the time what would have happened if there had been other users on the system during the demo. Their experience would probably not have been a good one. I remember having the exact same thought.

Oracle’s parallel capabilities have been a great gift but they have also been a curse because controlling the beast in an environment where there are multiple users trying to share the resources is pretty difficult. There have been many attempts at coming up with a reasonable way of throttling big parallel statements along the way. But to date, I think this technology has only been used effectively in batch processing environments and large data warehouses where consuming the whole machine’s resources is acceptable due to the relatively low degree of concurrency required by those environments.

So why did I mention Exadata in the title of this post? Well I believe that one of the most promising aspects of Exadata is it’s potential with regard to running mixed work loads (OLTP and DW) on the same database, without crippling one or the other. In order to do that, Oracle needs some mechanism to separate the workloads. Resource Manager is an option in this area, but it doesn’t go far enough in controlling throughput on parallel queries. This new queuing mechanism should be a great help in that regard. So let’s review the options:

Parallel Adaptive Multi User (the old way)
This ability to automatically downgrade the degree of parallelism based on what’s happening on the system when a query kicks off is actually a powerful mechanism and is the best approach we’ve had prior to 11g Release 2. The downside of this approach is that parallelized statements can have wildly varying execution times. As you can imagine, a statement that gets 32 slaves one time and then gets downgraded to serial execution the next time will probably not make the user very happy. The argument for this type of approach is that stuff is going to run slower if the system is busy regardless of what you do. And that users expect it to run slower when the system is busy. The first part of that statement may be true but I don’t believe the second part is (at least in most cases). The bigger problem with the downgrade mechanism though is that the the decision about how many slaves to use is based on a single point in time (the point when the parallel statement starts). And once the degree of parallelism (DOP) is set for a statement it can not be changed. That execution of the statement will run with the number of slaves it got to start with, even if additional resources become available. So consider the statement that takes a minute with 32 slaves that gets downgraded to serial due to a momentarily high load. And say that 10 seconds after it starts the system load drops back to more normal levels. Unfortunately, the serialized statement will continue to run for nearly 30 minutes with it’s single process, even though on average the system is not busier than usual.

Parallel Queuing (the new way)
Now let’s compare that with the new mechanism introduced in 11gR2 that allows parallel statements to be queued in a First In - First Out fashion. This mechanism separates (presumably) long running parallel queries from the rest of the workload. The mechanics are pretty simple. Turn the feature on. Set a target number of parallel slaves (parallel_servers_target). Run stuff. If a statement tries to start that requires exceeding the target, it will be queued until the required number of slaves become available.

The Parallel Queuing feature is controlled by a hidden parameter called “_parallel_statement_queuing”. A value of TRUE turns it on and FALSE turns it off. FALSE is the default by the way. This parameter is not documented but is set automatically when the PARALLEL_DEGREE_POLICY parameter is set to AUTO. Unfortunately, PARALLEL_DEGREE_POLICY is one of those parameters that controls more than one thing. When set to AUTO it also turns on Automatic DOP calculation. This feature calculates a DOP for each statement regardless of whether any objects have been decorated with a parallel setting. The result is that all kinds of statements are run in parallel, even if no objects have been specifically defined with a parallel degree setting. This is truly automatic parallel processing because the database decides what to run in parallel and with how many slaves. On top of that, by default, the slaves may be spread across multiple nodes in a RAC database (this can be disabled by setting PARALLEL_FORCE_LOCAL to TRUE). Finally, AUTO is supposed to enable “In Memory Parallel Query”. This poorly named feature refers to 11gR2’s ability to make use of the SGA for parallel query, as opposed to using direct reads exclusively. Note: I haven’t actually seen this kick in yet, which is probably good, since Exadata Offloading depends on direct reads. I haven’t seen it kick in on non-Exadata databases either though.

Unfortunately this combination of features is a little like the wild west with things running in parallel all over the place. But the ability to queue parallel statements does provide some semblance of order. And to be fair, there are a number of parameters that can be set to control how the calculations are performed. Anyway, here’s a brief synopsis of parameter changes caused by the various settings PARALLEL_DEGREE_POLICY.

Continue reading ‘Exadata and Parallel Queuing’ »