Archive for the ‘11gR2’ Category.

EHCC and the GET_COMPRESSION_TYPE function

Well I turned in the HCC chapter on the Exadata book last week and of course as is usually the case, I immediately ran across something kind of cool on the topic I just finished writing about. (we still have several editing passes though, so I can add it later). Anyway, although I don’t have time to post much these days, I thought this one would be a quick little snippet. So here it is.

The Compression Advisor is part of the DBMS_COMPRESSION package. Specifically it is the GET_COMPRESSION_RATIO procedure. This procedure is worthy of a separate post but I won’t discuss it here except to say that as of 11.2.0.2 you can use it to test HCC compression ratios on non-Exadata platforms. That’s pretty cool, but what I wanted to tell you about is a handy little function in the same package called GET_COMPRESSION_TYPE. This function can tell you exactly what level of compression has been applied to a single row. This can come in handy for investigating the inner workings of  HCC (or OLTP or BASIC compression for that matter).

As you probably already know, HCC is only applied to records loaded via direct path writes. Any updates cause rows to be migrated out of that storage format into blocks flagged for OLTP compression. Of course OLTP compression on a block only kicks in when a block is “full”. On top of this, altering a table to change it’s compression does not actually change the storage format of any existing records (unless you use the MOVE keyword). So you could load some data and then change the designation (say from QUERY LOW to QUERY HIGH). Rows that are inserted after the change will be stored in the new format (assuming the records are loaded via direct path writes of course). So why am I telling you all this. Well, because I ran across a statement in some Oracle documentation that said you can check to see what compression method a table is stored with by looking at the COMPRESS_FOR column in the DBA_TABLES view. This column does reveal what the table designation is. However, the setting actually only tells you how rows inserted in the future will be compressed. It tells you absolutely nothing about the way current rows are stored.

As for the mechanics, it appears that each row has a bitmask associated with it showing what compression format is being used. So I wrote a little script to give me what I want to see (check_row_comp.sql) using the DBMS_COMPRESSION.GET_COMPRESSION_TYPE function. Here’s an example of its use.

Continue reading ‘EHCC and the GET_COMPRESSION_TYPE function’ »

EHCC Mechanics – Proof that whole CU’s are not decompressed

I saw an interesting post recently where Greg Rahn talked about HCC mechanics. He claimed that an update to a record stored in HCC format did not require decompressing the whole Compression Unit (CU) which consist of several Oracle blocks. I’m assuming by this he meant that all the records contained in the CU did not get written back to storage in a non-HCC format due to a single record being updated. Greg then showed an example proving row migration occurred for an updated record. He didn’t show that the other records had not been decompressed though. So since I was already working on an HCC chapter for the upcoming Apress Exadata book, I thought I would take time off from the book writing to post this (hopefully the editors will forgive me).

Here’s the recipe: Basically we’ll update a single row, see that its rowid has changed, veify that we can still get to the record via its original rowid, and check to see if the TABLE FETCH CONTINUED ROW statistic gets updated when we we access the row via its original rowid, thus proving basic row migration (this is what Greg has already shown). Then we’ll look at block dumps for the original and new block to see what’s there.

Continue reading ‘EHCC Mechanics – Proof that whole CU’s are not decompressed’ »

Licensing Requirements for SQL Profiles

Jonathan Lewis reminded me on my last post that using SQL Profiles (because they are part of the SQL Tuning Advisor) requires a license for Oracle Tuning Pack (which requires a license for the Diagnostics Pack). He also mentioned that Baselines did not require any additional license (at least creating and using Baselines on SQL statements). It’s been a while since I worked on a database that didn’t have both packs, but frankly I wasn’t sure I had a good handle of what was allowed and what wasn’t. So I thought it might be worthwhile to check. There is an easy way to check by the way. I did a post a while back on Tuning Pack and Diagnostic Pack license requirements for running AWR and how to check what was allowed and what wasn’t using the CONTROL_MANAGEMENT_PACK_ACCESS parameter. Here’s a link to the post:

Oracle Management Packs

Here’s an example using the same technique to show that SQL Profiles are indeed disabled by turning off the Diagnostic and Tuning Packs (at least on 11.2.02).

Continue reading ‘Licensing Requirements for SQL Profiles’ »

Interaction Between Baselines and SQL Profiles

In a previous post (GATHER_PLAN_STATISTICS) I mentioned that SQL Profiles and Baselines can both be applied to a single statement. In this case, it appears that the hints are merged. The Notes section of the XPLAN output shows that both the Baseline and the Profile are in effect. I wanted to prove to myself that the hints from both the Profile and Baseline were indeed applied. So here’s my simple test case:

Basic Design For the Test:

  1. Use a Baseline to make a statement do something it wouldn’t normally do
  2. Add a Profile that makes the same statement do something else it wouldn’t normally do
  3. Verify that the statement now does both “thingies”
> !sql
sqlplus "/ as sysdba"

SQL*Plus: Release 11.2.0.1.0 Production on Tue Feb 2 20:09:53 2010

Copyright (c) 1982, 2009, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SYS@LAB112> -- first the set up
SYS@LAB112> -- 32M row table with an index on col1
SYS@LAB112> !cat avgskewi.sql
select avg(pk_col) from kso.skew
where col1 = 136133
/

SYS@LAB112> @avgskewi

AVG(PK_COL)
-----------
   15636133

1 row selected.

Elapsed: 00:00:00.00
SYS@LAB112>
SYS@LAB112> @find_sql
Enter value for sql_text: %where col1 = 136133%
Enter value for sql_id: 

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
84q0zxfzn5u6s      0 3723858078          3           .00           36 select avg(pk_col) from kso.skew where col1 = 136133

1 row selected.

Elapsed: 00:00:00.12
SYS@LAB112>
SYS@LAB112> @dplan
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no: 

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  84q0zxfzn5u6s, child number 0
-------------------------------------
select avg(pk_col) from kso.skew where col1 = 136133

Plan hash value: 3723858078

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |       |       |    35 (100)|          |
|   1 |  SORT AGGREGATE              |           |     1 |    24 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| SKEW      |    35 |   840 |    35   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SKEW_COL1 |    35 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"=136133)


20 rows selected.

Elapsed: 00:00:00.03
SYS@LAB112>
SYS@LAB112> -- so the index is used as expected - does 36 lio's and completes in < 1/100 of a second
SYS@LAB112> -- now let's make it do something it wouldn't normally do (with a Baseline)
SYS@LAB112> -- one way is to create a Profile, create a Baseline on top, drop the Profile
SYS@LAB112>
SYS@LAB112>
SYS@LAB112> 
SYS@LAB112> @create_1_hint_sql_profile
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for profile_name (PROFILE_sqlid_MANUAL): PROFILE_84q0zxfzn5u6s_GPS
Enter value for category (DEFAULT): 
Enter value for force_matching (false): 
Enter value for hint: full(skew@sel$1)
Profile PROFILE_84q0zxfzn5u6s_FULL created.

PL/SQL procedure successfully completed.

SYS@LAB112> @avgskewi

AVG(PK_COL)
-----------
   15636133

Elapsed: 00:00:12.71
SYS@LAB112>
SYS@LAB112> @find_sql
Enter value for sql_text: 
Enter value for sql_id: 84q0zxfzn5u6s

no rows selected

Elapsed: 00:00:00.18
SYS@LAB112>
SYS@LAB112> -- run again, SPM makes you run it twice ...   
SYS@LAB112> 
SYS@LAB112> @avgskewi

AVG(PK_COL)
-----------
   15636133

Elapsed: 00:00:07.32
SYS@LAB112>
SYS@LAB112> @find_sql
Enter value for sql_text: 
Enter value for sql_id: 84q0zxfzn5u6s

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
84q0zxfzn5u6s      0  568322376          1          7.31      162,301 select avg(pk_col) from kso.skew where col1 = 136133

Elapsed: 00:00:00.10
SYS@LAB112>
SYS@LAB112> @dplan
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no: 

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  84q0zxfzn5u6s, child number 0
-------------------------------------
select avg(pk_col) from kso.skew where col1 = 136133

Plan hash value: 568322376

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       | 28360 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    24 |            |          |
|*  2 |   TABLE ACCESS FULL| SKEW |    35 |   840 | 28360   (1)| 00:05:41 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COL1"=136133)

Note
-----
   - SQL profile PROFILE_84q0zxfzn5u6s_FULL used for this statement


23 rows selected.

Elapsed: 00:00:00.05
SYS@LAB112>
SYS@LAB112> -- so it's now doing a full table scan, 162K lio's and takes several seconds
SYS@LAB112> 
SYS@LAB112> -- now create the Baseline on the statement that's already using a Profile
SYS@LAB112> 
SYS@LAB112> @create_baseline
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for plan_hash_value: 568322376
Enter value for fixed (NO): 
Enter value for enabled (YES): 
Enter value for plan_name (SQL_sqlid_planhashvalue): 
sql_id: 84q0zxfzn5u6s
plan_hash_value: 568322376
fixed: NO
enabled: YES
plan_name: SQL_84q0zxfzn5u6s_568322376
sql_handle: SYS_SQL_94dc89c011141f02
Baseline SQL_84q0zxfzn5u6s_568322376 created.

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
SYS@LAB112>
SYS@LAB112> @avgskewi                       

AVG(PK_COL)
-----------
   15636133

Elapsed: 00:00:07.44
SYS@LAB112> @avgskewi

AVG(PK_COL)
-----------
   15636133

Elapsed: 00:00:07.52
SYS@LAB112>
SYS@LAB112> -- obviously still doing the full table scan, but let's check anyway
SYS@LAB112> 
SYS@LAB112> @dplan
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no: 

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  84q0zxfzn5u6s, child number 0
-------------------------------------
select avg(pk_col) from kso.skew where col1 = 136133

Plan hash value: 568322376

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       | 28360 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    24 |            |          |
|*  2 |   TABLE ACCESS FULL| SKEW |    35 |   840 | 28360   (1)| 00:05:41 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COL1"=136133)

Note
-----
   - SQL profile PROFILE_84q0zxfzn5u6s_FULL used for this statement
   - SQL plan baseline SQL_84Q0ZXFZN5U6S_568322376 used for this statement


24 rows selected.

Elapsed: 00:00:00.03
SYS@LAB112>
SYS@LAB112> -- let's check hints in Profile and Baseline
SYS@LAB112> 
SYS@LAB112> @sql_profile_hints
Enter value for profile_name: PROFILE_84q0zxfzn5u6s_FULL

HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
full(skew@sel$1)

1 rows selected.

Elapsed: 00:00:00.10
SYS@LAB112>
SYS@LAB112> @baseline_hints
Enter value for baseline_plan_name: SQL_84Q0ZXFZN5U6S_568322376

OUTLINE_HINTS
------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
DB_VERSION('11.2.0.1')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "SKEW"@"SEL$1")

6 rows selected.

Elapsed: 00:00:00.09
SYS@LAB112>
SYS@LAB112> -- so the Baseline has inherited the full hint
SYS@LAB112> 
SYS@LAB112> -- now let's drop the original Profile and add another non-standard hint to see if they are merged
SYS@LAB112> 
SYS@LAB112> @drop_sql_profile
Enter value for profile_name: PROFILE_84q0zxfzn5u6s_FULL

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.02
SYS@LAB112>
SYS@LAB112> @create_1_hint_sql_profile
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for profile_name (PROFILE_sqlid_MANUAL): PROFILE_84q0zxfzn5u6s_GPS
Enter value for category (DEFAULT): 
Enter value for force_matching (false): 
Enter value for hint: GATHER_PLAN_STATISTICS
Profile PROFILE_84q0zxfzn5u6s_GPS created.

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.05
SYS@LAB112> @avgskewi

AVG(PK_COL)
-----------
   15636133

1 row selected.

Elapsed: 00:00:07.51
SYS@LAB112> /

AVG(PK_COL)
-----------
   15636133

1 row selected.

Elapsed: 00:00:08.22
SYS@LAB112>
SYS@LAB112> -- if that worked we should still have the full table scan from the Baseline, and the extended stats from the Profile with GATHER_PLAN_STATISTICS
SYS@LAB112> 
SYS@LAB112> !cat dplan_allstats.sql
set lines 180
select * from table(dbms_xplan.display_cursor('&sql_id','&child_no','allstats  +peeked_binds'))
/

SYS@LAB112> @dplan_allstats
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no: 

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  84q0zxfzn5u6s, child number 1
-------------------------------------
select avg(pk_col) from kso.skew where col1 = 136133

Plan hash value: 568322376

----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      2 |        |      2 |00:00:15.69 |     324K|    324K|
|   1 |  SORT AGGREGATE    |      |      2 |      1 |      2 |00:00:15.69 |     324K|    324K|
|*  2 |   TABLE ACCESS FULL| SKEW |      2 |     35 |     64 |00:00:12.02 |     324K|    324K|
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COL1"=136133)

Note
-----
   - SQL profile PROFILE_84q0zxfzn5u6s_GPS used for this statement
   - SQL plan baseline SQL_84Q0ZXFZN5U6S_568322376 used for this statement


24 rows selected.

Elapsed: 00:00:00.03
SYS@LAB112>
SYS@LAB112> -- notice that the XPLAN output has A-Rows and A-Time columns - means GATHER_PLAN_STATISTICS was used 
SYS@LAB112> -- otherwise it would have thrown a warning message 
SYS@LAB112>
SYS@LAB112> -- so that worked - it merged the hints! 
SYS@LAB112> 
SYS@LAB112> -- quick verify
SYS@LAB112> 
SYS@LAB112> @sql_profile_hints
Enter value for profile_name: PROFILE_84q0zxfzn5u6s_GPS

HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
GATHER_PLAN_STATISTICS

1 rows selected.

Elapsed: 00:00:00.04

SYS@LAB112> @baseline_hints
Enter value for baseline_plan_name: SQL_84Q0ZXFZN5U6S_568322376

OUTLINE_HINTS
-----------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
DB_VERSION('11.2.0.1')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "SKEW"@"SEL$1")

6 rows selected.

Elapsed: 00:00:00.09
SYS@LAB112>
SYS@LAB112> -- one more test, drop the Profile and the extended stats should go away
SYS@LAB112> 
SYS@LAB112> @drop_sql_profile
Enter value for profile_name: PROFILE_84q0zxfzn5u6s_GPS

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.02
SYS@LAB112> @avgskewi

AVG(PK_COL)
-----------
   15636133

1 row selected.

Elapsed: 00:00:07.20
SYS@LAB112> /

AVG(PK_COL)
-----------
   15636133

1 row selected.

Elapsed: 00:00:07.91
SYS@LAB112> @dplan_allstats
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no: 

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  84q0zxfzn5u6s, child number 1
-------------------------------------
select avg(pk_col) from kso.skew where col1 = 136133

Plan hash value: 568322376

--------------------------------------------
| Id  | Operation          | Name | E-Rows |
--------------------------------------------
|   0 | SELECT STATEMENT   |      |        |
|   1 |  SORT AGGREGATE    |      |      1 |
|*  2 |   TABLE ACCESS FULL| SKEW |     35 |
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COL1"=136133)

Note
-----
   - SQL plan baseline SQL_84Q0ZXFZN5U6S_568322376 used for this statement
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level


26 rows selected.

Elapsed: 00:00:00.03
SYS@LAB112>
SYS@LAB112> -- yep no more extended stats (warning message), but the full table scan is still working
SYS@LAB112> 
SYS@LAB112> -- let's put it back the way it was
SYS@LAB112>
SYS@LAB112> @baselines
Enter value for sql_text: 
Enter value for name: 
Enter value for plan_name: SQL_84Q0ZXFZN5U6S_568322376

SQL_HANDLE                     PLAN_NAME                      SQL_TEXT                                           ENABLED ACC FIX LAST_EXECUTED
------------------------------ ------------------------------ -------------------------------------------------- ------- --- --- ----------------
SYS_SQL_94dc89c011141f02       SQL_84Q0ZXFZN5U6S_568322376    select avg(pk_col) from kso.skew                   YES     YES NO  02-feb-10 20:19

1 row selected.

Elapsed: 00:00:00.04
SYS@LAB112>
SYS@LAB112> @drop_baseline
Enter value for sql_handle: SYS_SQL_94dc89c011141f02
Enter value for plan_name: SQL_84Q0ZXFZN5U6S_568322376

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SYS@LAB112> @avgskewi

AVG(PK_COL)
-----------
   15636133

1 row selected.

Elapsed: 00:00:00.01
SYS@LAB112> /

AVG(PK_COL)
-----------
   15636133

1 row selected.

Elapsed: 00:00:00.00
SYS@LAB112> @dplan_allstats
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no: 

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  84q0zxfzn5u6s, child number 0
-------------------------------------
select avg(pk_col) from kso.skew where col1 = 136133

Plan hash value: 3723858078

-----------------------------------------------------------
| Id  | Operation                    | Name      | E-Rows |
-----------------------------------------------------------
|   0 | SELECT STATEMENT             |           |        |
|   1 |  SORT AGGREGATE              |           |      1 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SKEW      |     35 |
|*  3 |    INDEX RANGE SCAN          | SKEW_COL1 |     35 |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"=136133)

Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level


26 rows selected.

Elapsed: 00:00:00.03
SYS@LAB112>
SYS@LAB112> -- back to using the index and no stats

As usual I used a bunch of my scripts. I believe all the scripts referenced can be found in one of these two zip files:

Controlling Execution Plans Zip File

My Favorite Scripts 2010 zip file

I have a few thoughts as to why the developers decided to implement these two features this way. Remember that SQL Profiles were designed to be generated by the SQL Tuning Advisor which does statistical analysis on a query and potentially produces hints to correct calculations that the optimizer would otherwise get wrong. So it’s possible that the developers decided they wanted these types of statistical correction hints to be combined with plans already being “enforced” by Baselines. I must say though that I don’t really think these two constructs would work well together in most cases as the hints may well end up working against each other. And since Baselines are aware of the plan they are trying to reproduce, anything that actually alters the plan would basically disable all the hints associated with the Baseline. Although I expect that the plan output would probably still say the Baseline had been used. – I haven’t tested that though – so much to do, so little time.

I’m in the Wrong Business

Well I stayed at home today to do some writing on a book project while one of my sons and one of my daughters and my wife went to the Fort Worth Natural Science Museum. So I was feeling a little overworked and like I was missing out because I’ve been slaving away writing “scholarly technical material”. Then I got a text from my son. Just a picture that’s all.

I’m sure these two books will probably sell a few orders of magnitude more copies than any book that I contribute to. Kind of puts it all in perspective. I think next weekend I’ll go to the zoo with them.

Exadata and Parallel Queuing

Over the years Oracle has added many enhancements in order to allow individual SQL statements to take full advantage of multiprocessor computers. A few months ago Cary Millsap did a talk where he recalled the presentation Larry Ellison did when Oracle first announced the Parallel Query feature. During Larry’s demo he had a multiprocessor computer all to himself. I don’t remember how many processors it had, but I remember he had some kind of graphic showing individual CPU utilization on one screen while he fired up a parallel query on another screen. The monitoring screen lit up like a Christmas tree. Every one of the CPU’s was pegged during his demo. When Cary was telling the story he said that he had wondered at the time what would have happened if there had been other users on the system during the demo. Their experience would probably not have been a good one. I remember having the exact same thought.

Oracle’s parallel capabilities have been a great gift but they have also been a curse because controlling the beast in an environment where there are multiple users trying to share the resources is pretty difficult. There have been many attempts at coming up with a reasonable way of throttling big parallel statements along the way. But to date, I think this technology has only been used effectively in batch processing environments and large data warehouses where consuming the whole machine’s resources is acceptable due to the relatively low degree of concurrency required by those environments.

So why did I mention Exadata in the title of this post? Well I believe that one of the most promising aspects of Exadata is it’s potential with regard to running mixed work loads (OLTP and DW) on the same database, without crippling one or the other. In order to do that, Oracle needs some mechanism to separate the workloads. Resource Manager is an option in this area, but it doesn’t go far enough in controlling throughput on parallel queries. This new queuing mechanism should be a great help in that regard. So let’s review the options:

Parallel Adaptive Multi User (the old way)
This ability to automatically downgrade the degree of parallelism based on what’s happening on the system when a query kicks off is actually a powerful mechanism and is the best approach we’ve had prior to 11g Release 2. The downside of this approach is that parallelized statements can have wildly varying execution times. As you can imagine, a statement that gets 32 slaves one time and then gets downgraded to serial execution the next time will probably not make the user very happy. The argument for this type of approach is that stuff is going to run slower if the system is busy regardless of what you do. And that users expect it to run slower when the system is busy. The first part of that statement may be true but I don’t believe the second part is (at least in most cases). The bigger problem with the downgrade mechanism though is that the the decision about how many slaves to use is based on a single point in time (the point when the parallel statement starts). And once the degree of parallelism (DOP) is set for a statement it can not be changed. That execution of the statement will run with the number of slaves it got to start with, even if additional resources become available. So consider the statement that takes a minute with 32 slaves that gets downgraded to serial due to a momentarily high load. And say that 10 seconds after it starts the system load drops back to more normal levels. Unfortunately, the serialized statement will continue to run for nearly 30 minutes with it’s single process, even though on average the system is not busier than usual.

Parallel Queuing (the new way)
Now let’s compare that with the new mechanism introduced in 11gR2 that allows parallel statements to be queued in a First In – First Out fashion. This mechanism separates (presumably) long running parallel queries from the rest of the workload. The mechanics are pretty simple. Turn the feature on. Set a target number of parallel slaves (parallel_servers_target). Run stuff. If a statement tries to start that requires exceeding the target, it will be queued until the required number of slaves become available.

The Parallel Queuing feature is controlled by a hidden parameter called “_parallel_statement_queuing”. A value of TRUE turns it on and FALSE turns it off. FALSE is the default by the way. This parameter is not documented but is set automatically when the PARALLEL_DEGREE_POLICY parameter is set to AUTO. Unfortunately, PARALLEL_DEGREE_POLICY is one of those parameters that controls more than one thing. When set to AUTO it also turns on Automatic DOP calculation. This feature calculates a DOP for each statement regardless of whether any objects have been decorated with a parallel setting. The result is that all kinds of statements are run in parallel, even if no objects have been specifically defined with a parallel degree setting. This is truly automatic parallel processing because the database decides what to run in parallel and with how many slaves. On top of that, by default, the slaves may be spread across multiple nodes in a RAC database (this can be disabled by setting PARALLEL_FORCE_LOCAL to TRUE). Finally, AUTO is supposed to enable “In Memory Parallel Query”. This poorly named feature refers to 11gR2’s ability to make use of the SGA for parallel query, as opposed to using direct reads exclusively. Note: I haven’t actually seen this kick in yet, which is probably good, since Exadata Offloading depends on direct reads. I haven’t seen it kick in on non-Exadata databases either though.

Unfortunately this combination of features is a little like the wild west with things running in parallel all over the place. But the ability to queue parallel statements does provide some semblance of order. And to be fair, there are a number of parameters that can be set to control how the calculations are performed. Anyway, here’s a brief synopsis of parameter changes caused by the various settings PARALLEL_DEGREE_POLICY.

Continue reading ‘Exadata and Parallel Queuing’ »

Exadata Offload – The Secret Sauce

The “Secret Sauce” for Exadata is its ability to offload processing to the storage tier. Offloading and Smart Scan are two terms that are used somewhat interchangeably. Offloading is a more generic term that means doing work at the storage tier that would otherwise have to be done on the database tier (this can include work that is not related to executing queries such as optimization of incremental backups). Smart Scans on the other hand are the access mechanism used to offload query processing tasks. For example, storage servers can apply predicate filters at the storage layer, instead of shipping every possible block back to the database server(s). Another thing that happens with Smart Scans is that the volume of data returned can be further reduced by column projection (i.e. if you only select 1 column from a 100 column table, there is no need to return the other 99 columns). Offloading is geared to long running queries that access a large amount of data. Offloading only works if Oracle decides to use its direct path read mechanism. Direct path reads have traditionally been done by parallel query slaves, but can also be done by serial queries. In fact, as of 11g, Oracle has changed the decision making process resulting in more aggressive use of serial direct path reads. I’ve seen this feature described both as “serial direct path reads” and “adaptive direct path reads”.

I’ll digress here a bit to discuss this feature since direct path reads are critical to Exadata Offloading. Direct path reads do not load blocks into Oracle’s buffer cache. Instead, the data is returned directly to the PGA of the process requesting the data. This means that the data does not have to be in Oracle block format. That means no 8K block that is only partially filled, that may only have a record or two that you’re interested in, containing every column including ones you don’t want, and with additional header information – needs to be shipped back up from the storage layer. Instead, a much more compact result set containing only the columns requested and hopefully only the rows you need are returned. As I said, direct path reads are traditionally used by parallel query slaves. They are also used in a few other instances such as LOB access and sorts that spill over into TEMP. So the ability to use direct path reads is very important to the Exadata platform and thus the changes to the make them more attractive in 11g. Here are a few links to info on the subject of serial direct path reads:

  1. Doug Burns has a good post on 11g serial direct path reads.
  2. Alex Fatkulin has a very good post on some of the factors controlling adaptive direct path reads.
  3. There is a note on MOS (793845.1) on changes in 11g in the heuristics to choose between direct path reads or reads through the buffer cache.
  4. You may also find MOS note (50415.1) on misleading nature of “direct path read” wait events of interest.

Also be aware that direct path reads are only available for full scans (tables or indexes). So any statement that uses an index range scan to get to a row in a table via a rowid will not use this mechanism. Also keep in mind that direct path requires extra processing to ensure that all blocks on disk are current – (i.e. an object level check point), so frequently modified tables will suffer some overhead before direct path reads can be initiated.

I must say that I think the changes to the heuristics in 11g may be a little on the aggressive side for non-Exadata platforms (the changes may well be driven by Exadata). And by the way, serial direct path reads are not always faster than the normal reads that go through the buffer cache. Dion Cho has a good post on a performance problem due to serial direct path reads kicking in on one node of an 11g RAC environment (not Exadata). The node doing the direct path reads was running the query much slower than the node using the normal buffer cache reads. He also has a post on turning off serial direct path reads.

But enough about the direct path reads stuff, on to the Offloading. One of the first things I wanted to know when I got my first look at a system running on Exadata was whether a particular query was eligible for offloading and if so, how much of the expected i/o was saved by the feature. So of course I wrote a little script to show me that. Turns out there is plenty of info in V$SQL to see what’s going on. I called the script fsx.sql (short for Find_Sql_eXadata). Here’s a little demo:

Continue reading ‘Exadata Offload – The Secret Sauce’ »

What is Exadata?

Well I’ve been holed up playing with an Exadata V2 machine for the past several weeks. Wow. Very interesting technology.

I must say that I believe the concept of offloading SQL processing to the storage layer is a game changer and I wouldn’t be at all surprised to see this as a standard feature a few years from now. What that means for other storage vendors is unclear at this point. So for this first post on the topic let me just describe the configuration (and some potential upgrades).

The basic architecture consists of a set of database severs and a set of storage servers.

Database Servers:

  • Sun x4170 (1RU 64x server)
  • 2 – Quad-core Intel Xeon E5540 2.53GHz processors
  • 72G Ram (18x4G Dimms – max of 144G using 8G DIMMs)
  • Dual-Port QDR InfiniBand Host Channel Adapter
  • HBA with 512MB Battery Backed Write Cache (only for internal disks???)
  • 4 – 146G internal drives (SAS 10,000 RPM)
  • dual hot swappable power supplies
  • no spare/empty slots!

Here’s what the Database Servers look like:

Storage Servers:

  • Sun x4275 (2RU 64x server)
  • 2 – Quad-core Intel Xeon E5540 (2.53GHz) processors
  • 24G Ram
  • Dual-Port QDR InfiniBand Host Channel Adapter
  • HBA with 512MB Battery Backed Write Cache (only for internal disks???)
  • dual hot swappable power supplies
  • 4 – 96G Sun Flash PCIe Cards (total of 384 GB)
  • 12 – 600 GB 15,000 RPM SAS or 2 TB 7,200 RPM SATA

Continue reading ‘What is Exadata?’ »

Oracle 11gR2 Now Available for Solaris

I just noticed that 11g Release 2 for the Solaris Operating System is now available for download:

Oracle 11g R2 Download Page

It didn’t lag the Linux release by too long!

Upgrading to 11gR2 – DOUG Presentation Materials

I re-did my Upgrade to 11gR2 talk at the DOUG meeting in Dallas this afternoon. (I originally did it at the Cowboys Stadium for an Oracle Tech Day around Halloween). I promised that I’d post a link to the presentation, so here it is (just click on the image):

It’s the same presentation as the one from the original talk, although I got twice as much time to do it this time (worked out a lot better). Here’s a link to the post about that original Tech Day event with a bunch of pictures of the stadium, including a couple of Jerry’s data center: Cowboy Stadium Pictures