
Archive for the ‘Oracle’ Category.

I’ll Gladly Pay You Tuesday for a Hamburger Today

Those of you who are old enough to remember disco dancing will probably remember this guy. Wimpy (J. Wellington Wimpy actually) was a lazy, gluttonous character in the old Popeye cartoons.

I was reminded of Wimpy last week by a performance problem that occurred on an 11g database I was looking at (I'll tell you why in a minute). So here's what happened. Several statements were running much more slowly than usual (tens of seconds instead of less than a second). So I started poking around and here's what I saw:

(note that I cleaned this up a bit to protect the innocent)

(also note that many of the scripts I use have been documented in other posts – so if I haven’t shown the SQL for a particular script, try searching for it on this site)

SQL> @find_sql_acs
Enter value for sql_text:
Enter value for sql_id: 3dhvsp0duttat
Enter value for is_bind_aware:

SQL_ID         CHILD PLAN_HASH_VALUE IOB IBS IBA ISH      EXECS ROWS_PROCESSED  AVG_ETIME    AVG_CPU    AVG_PIO    AVG_LIO SQL_TEXT
------------- ------ --------------- --- --- --- --- ---------- -------------- ---------- ---------- ---------- ---------- ------------------
3dhvsp0duttat      0      3289235442 N   N   N   Y            9             46       0.02       0.00       1.11        272 SELECT row_order, 
3dhvsp0duttat      1      3289235442 N   N   N   Y           13             48       0.08       0.01      12.62        546 SELECT row_order, 
3dhvsp0duttat      2      4076066623 N   N   N   Y           15             31      38.07       6.55  27,176.53  2,426,778 SELECT row_order, 


SQL> @dplan
Enter value for sql_id: 3dhvsp0duttat
Enter value for child_no: 0

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3dhvsp0duttat, child number 0
-------------------------------------
SELECT row_order, ...

Plan hash value: 3289235442

--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                   | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                        |       |       |     5 (100)|          |       |       |
|*  1 |  VIEW                                 |                        |     1 |   586 |     5  (20)| 00:00:01 |       |       |
|*  2 |   WINDOW SORT PUSHED RANK             |                        |     1 |   258 |     5  (20)| 00:00:01 |       |       |
|*  3 |    FILTER                             |                        |       |       |            |          |       |       |
|   4 |     PARTITION RANGE ITERATOR          |                        |     1 |   258 |     4   (0)| 00:00:01 |   KEY |   KEY |
|*  5 |      TABLE ACCESS BY LOCAL INDEX ROWID| REALLY_BIG_TABLE       |     1 |   258 |     4   (0)| 00:00:01 |   KEY |   KEY |
|*  6 |       INDEX RANGE SCAN                | GOOD_INDEX             |     1 |       |     3   (0)| 00:00:01 |   KEY |   KEY |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(...)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "DATE_COL")<=:25)
   3 - filter((TO_DATE(:1,:2)<=TO_DATE(:3,:4) AND :24<=:25))
   5 - filter(...)
   6 - access("COL1" LIKE :21 AND "DATE_COL">=TO_DATE(:1,:2) AND "DATE_COL"<=TO_DATE(:3,:4))

SQL> @dplan
Enter value for sql_id: 3dhvsp0duttat
Enter value for child_no: 2

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  3dhvsp0duttat, child number 2
-------------------------------------
SELECT row_order, ...

Plan hash value: 4076066623

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                  | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                       |       |       |     1 (100)|       |       |
|*  1 |  VIEW                                 |                       |     1 |   586 |     0   (0)|       |       |
|*  2 |   WINDOW NOSORT STOPKEY               |                       |     1 |   574 |     0   (0)|       |       |
|*  3 |    FILTER                             |                       |       |       |            |       |       |
|   4 |     PARTITION RANGE ITERATOR          |                       |     1 |   574 |     0   (0)|   KEY |   KEY |
|*  5 |      TABLE ACCESS BY LOCAL INDEX ROWID| REALLY_BIG_TABLE      |     1 |   574 |     0   (0)|   KEY |   KEY |
|*  6 |       INDEX RANGE SCAN                | BAD_INDEX             |     1 |       |     0   (0)|   KEY |   KEY |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("ROW_ORDER">=:24 AND "ROW_ORDER"<=:25))
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "DATE_COL")<=:25)
   3 - filter((TO_DATE(:1,:2)<=TO_DATE(:3,:4) AND :24<=:25))
   5 - filter(...)
   6 - access("DATE_COL">=TO_DATE(:1,:2) AND "DATE_COL"<=TO_DATE(:3,:4))

SQL> -- Uh oh - cost of 1, no records expected
SQL>
SQL> @child_load_time
Enter value for sql_text:
Enter value for sql_id: 3dhvsp0duttat

SQL_ID         CHILD  PLAN_HASH LAST_LOAD_TIME       EXECS     AVG_ETIME    AVG_PIO      AVG_LIO SQL_TEXT
------------- ------ ---------- --------------- ---------- ------------- ---------- ------------ -----------------------------------------
3dhvsp0duttat      0 3289235442 2009-05-01/12:04:20      9          0.02          1          272 SELECT row_order, ... 
3dhvsp0duttat      1 3289235442 2009-05-05/19:43:14     13          0.08         13          546
3dhvsp0duttat      2 4076066623 2009-05-07/16:49:36     15         38.07     27,177    2,426,778


SQL> -- so it looks like maybe something bad happened with stats on 5/7
SQL> -- let's make use of the diff_table_stats_in_history procedure to see if we can spot what happened

SQL> !date
Tue May 12 11:59:22 CDT 2009

SQL> !cat diff_table_stats.sql
select report, maxdiffpct from
table(dbms_stats.diff_table_stats_in_history('&owner','&table_name',
systimestamp-&days_ago))
/

SQL> @diff_table_stats.sql
Enter value for owner: XXX
Enter value for table_name: REALLY_BIG_TABLE
Enter value for days_ago: 5

REPORT                                                                           MAXDIFFPCT
-------------------------------------------------------------------------------- ----------
###############################################################################

STATISTICS DIFFERENCE REPORT FOR:
.................................

TABLE         : REALLY_BIG_TABLE
OWNER         : XXX
SOURCE A      : Statistics as of 07-MAY-09 11.46.36.000000 AM -05:00
SOURCE B      : Current Statistics in dictionary
PCTTHRESHOLD  : 10
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TABLE / (SUB)PARTITION STATISTICS DIFFERENCE:
.............................................

OBJECTNAME                  TYP SRC ROWS       BLOCKS     ROWLEN     SAMPSIZE
...............................................................................

P2009W19                    P   A   0          0          0          0
                                B   4001801    187810     342        4001801
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

COLUMN STATISTICS DIFFERENCE:
.............................

COLUMN_NAME     SRC NDV     DENSITY    HIST NULLS   LEN  MIN   MAX   SAMPSIZ
...............................................................................


                               PARTITION: P2009W19
                               ...................

COL1            A   0       0          NO   0       0      0
                B   105136  .000009511 NO   2565391 4    302E3 6E303 1436410
COL2            A   0       0          NO   0       0      0
                B   669696  .000001493 NO   1011264 9    2A2A2 554E4 2990537
COL3            A   0       0          NO   0       0      0
                B   988     .001012145 NO   1113316 4    30303 39393 2888485
COL4            A   0       0          NO   0       0      0
                B   669696  .000037461 YES  0       11   2A2A2 554E4 5540
...

SQL> -- Sure enough, stats were created when the partition was empty, ouch!

So a couple of things to note here:

1. the table is partitioned
2. dbms_stats.diff_table_stats_in_history is smart enough to show us partition stats (nice)
3. on May 7th the stats changed dramatically
4. this statement was not re-parsed due to the stats changing

So what's the fix? Well, since stats had been generated on this partition every night for the last 5 days, the fix was just to get the optimizer to re-parse my query (and any other affected statements). In 11g this is very easy, as there is a built-in (pseudo-documented) function to flush a single SQL statement from the shared pool (see my previous post – Flush A Single SQL Statement – for details). However, I wasn't sure how many statements were affected (more on that in a minute). But certainly any statement that touched that partitioned table and got parsed on May 7th was almost certain to have a bad plan. So eventually we decided to take the brute force approach and just flush the entire shared pool. (I know, not elegant, but fairly effective.) Note that there is no guarantee that flushing the shared pool will clear all the statements. In fact, on a busy system you're almost guaranteed that it won't flush every statement. But it did clear up most of the issues. So with the pressure off (and the system back to running pretty well) I had time to write a little script to look for SQL statements with a cost of 1. (I guess I could have looked for statements parsed on May 7th that accessed my partitioned table, but I didn't.)
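For reference, the single-statement flush relies on DBMS_SHARED_POOL.PURGE. Here's a minimal sketch of what a flush_sql.sql script can look like (not necessarily identical to the one I use, and it assumes execute privilege on the package):

-- flush_sql.sql (sketch) - purge a single cursor from the shared pool
declare
  l_name varchar2(100);
begin
  select address || ',' || hash_value
    into l_name
    from v$sqlarea
   where sql_id = '&sql_id';
  sys.dbms_shared_pool.purge(l_name, 'C');  -- 'C' = cursor
end;
/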

Here’s a bit of the SQL*Plus Session:

SQL>-- initially we flushed a couple directly
SQL>
SQL> @flush_sql
Enter value for sql_id: 3dhvsp0duttat

PL/SQL procedure successfully completed.

SQL> -- so we can find all of these guys and flush them - or just flush the whole shared pool
SQL> -- flushing the shared pool is easier!
SQL>
SQL> alter system flush shared_pool;

System altered.

SQL> -- but there are still no guarantees they are all cleaned up, let's check
SQL>
SQL> !cat find_sql_by_cost.sql
select distinct parsing_schema_name, sa.sql_id, sp.plan_hash_value, cost
from v$sql_plan sp, v$sqlarea sa
where sp.sql_id = sa.sql_id
and parsing_schema_name like nvl('&parsing_user',parsing_schema_name)
and cost like nvl(to_number('&cost'),to_number(cost))
and sp.plan_hash_value like nvl('&plan_hash_value',sp.plan_hash_value)
and parent_id is null
order by 4 desc
/

SQL> @find_sql_by_cost
Enter value for parsing_user:
Enter value for cost: 1
Enter value for plan_hash_value: 407%

PARSING_SCHEMA_NAME            SQL_ID        PLAN_HASH_VALUE       COST
------------------------------ ------------- --------------- ----------
XXXX_PROD                      02t2tt28xp72m      4076066623          1
XXXX_PROD                      5byphafyx78j6      4076066623          1

SQL> -- note I did a little checking here to make sure they were suffering from the same issue
SQL> -- then I cleaned them up
SQL> 
SQL>
SQL> @flush_sql
Enter value for sql_id: 02t2tt28xp72m

PL/SQL procedure successfully completed.

SQL> @find_sql
Enter value for sql_text:
Enter value for sql_id: 02t2tt28xp72m

no rows selected

SQL> @flush_sql
Enter value for sql_id: 5byphafyx78j6

PL/SQL procedure successfully completed.

SQL> @find_sql
Enter value for sql_text:
Enter value for sql_id: 5byphafyx78j6

no rows selected

So the short term fix worked. The statements picked up the correct plans after re-parsing. Now let's talk about the long term fix. Why did the partition not have stats on it? Turns out the partitioning scheme in place was range by date with a new partition for each week. Each partition goes from Thursday through the following Wednesday. Several partitions had been created in advance but, as you may have guessed by now, nothing was done about the stats. Meaning that the partitions had values of 0 for the number of rows. Since the gather stats job was kicking in every evening, a whole day's worth of data could be loaded into the partition before the stats would be updated. So that means that 24 hours' worth of activity would occur with "bad" stats every week. Couple that with the default of NO_INVALIDATE=TRUE on the stats gathering and you get statements that may hang around for extended periods of time, even if the stats change dramatically (0 rows to a few million is dramatic, right?). So we manually set the stats on the pre-created partitions using a combination of dbms_stats.copy_table_stats, dbms_stats.set_table_stats, and dbms_stats.set_column_stats (see Dion Cho's excellent post for details and example code – and also this post on the Optimizermagic Blog for further info). I should also mention that Greg Rahn has several good posts on 11g Statistics Enhancements here, here, and here.
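As a rough illustration of the copy_table_stats piece (the partition names here are hypothetical, and the real fix also involved set_table_stats and set_column_stats as described in the linked posts):

begin
  dbms_stats.copy_table_stats(
    ownname     => 'XXX',
    tabname     => 'REALLY_BIG_TABLE',
    srcpartname => 'P2009W18',    -- hypothetical: the most recent fully loaded partition
    dstpartname => 'P2009W19');   -- the pre-created, still empty partition
end;
/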

So why was I reminded of Wimpy? Because Wimpy’s favorite saying was “I’ll gladly pay you Tuesday for a hamburger today”. And the way this system was set up, every Tuesday (er, Thursday), we were paying the price for our bad/missing statistics. OK, so it’s a stretch, but I like the pictures of Wimpy!

Please feel free to leave me a comment letting me know how you feel …

… about the pictures of Wimpy.

Oracle Tech Day Presentation – 11g (Upgrade?)

I did a little impromptu talk yesterday at an Oracle Tech Day. It was kind of a funny situation. They had a speaker that couldn’t make it, so they called me at about 1:30 to see if I could talk at 2:00. Luckily my office is very close to the Oracle office, so I said sure.

The topic was supposed to be something about upgrading to 11g, and I had done a talk a month or so ago (at a Dallas Oracle Users Group meeting) about some of my favorite 11g features. So you know how it works, when you've got a hammer, everything starts to look like a nail. So I used the PowerPoint from that earlier talk and tried to tie in the upgrade theme as best I could. At about 3:05 I looked down at my watch and realized my hour was up (everyone knows that standard presentations are one hour). So I politely asked the audience if I should be wrapping up and they said, no, you're supposed to go until 4:00. Ouch!

Anyway, it wasn’t too painful, thanks to some good questions. Thanks to everyone who put up with the last minute sub. I promised to post the presentation materials. So here’s a link: 11g Presentation Materials

Extremes

Last week I got to look at a version 9 Statspack Report from a database where the buffer cache was set to 32M. That’s right, 32M. I don’t think you can buy a thumb drive that small. They were using 16K blocks too, so what’s that:

       (32*1024*1024)/(16*1024) = 2,048 blocks

Not very many blocks for that database. Needless to say it wasn’t working very well for the multi-hour batch job they were trying to run.

Then over the weekend someone sent an AWR report from a version 10 database and it had a 60G buffer cache. That’s the biggest one I’ve ever seen!

So it was a pretty interesting week (from a geeky technical standpoint anyway)!

It reminded me of Muggsy Bogues guarding Shawn Bradley, or Manute Bol, or MJ, or just about anybody.

Anyone seen a bigger (or smaller) buffer cache recently?

I’d love to hear about it.

AWR DBtime script

Here's a handy little script for showing when the database has been most busy. I wrote this one because I wanted to be able to identify the workload characteristics of an instance when it was at its "busiest", so we could model our testing strategy "correctly". So the first part of that problem was to identify when the instance was "busiest". The DBtime in the AWR reports seemed like a good indicator of "busiest", so there you go. The script calculates the DBtime for each snapshot using the lag function and then reports the top 30. The script could easily be modified to calculate across multiple snapshots if desired, to give the busiest day for example, but I'll leave that as an exercise for the reader. By the way, the calculation came from the 11g Statspack report. The AWR report code is wrapped, but the Statspack report code is not. So I just modified the query to use the AWR tables instead of the Statspack tables and added the lag function. This script works on 10gR2 as well.

Here’s the script: dbtime.sql
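The heart of it is just a LAG over the cumulative 'DB time' statistic in DBA_HIST_SYS_TIME_MODEL. Here's a rough sketch of the core query (it leaves out the join to DBA_HIST_SNAPSHOT for the begin timestamp, and the linked script may differ in the details):

select * from (
  select lag(snap_id) over (partition by dbid, instance_number order by snap_id) begin_snap,
         snap_id end_snap,
         instance_number inst,
         round((value - lag(value) over (partition by dbid, instance_number order by snap_id))
               / 1000000 / 60, 2) dbtime
  from   dba_hist_sys_time_model
  where  stat_name = 'DB time'
  order  by dbtime desc nulls last
)
where rownum <= 30;
-- note: a delta that spans an instance restart will come out negative,
-- so a more careful version would filter those out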
And an example of its use:

SQL*Plus: Release 11.1.0.7.0 - Production on Tue May 5 10:08:58 2009

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> @dbtime
Enter value for instance_number: 
Enter value for begin_snap_id: 
Enter value for end_snap_id: 

BEGIN_SNAP   END_SNAP BEGIN_TIMESTAMP                                INST      DBTIME
---------- ---------- ---------------------------------------- ---------- -----------
      8205       8206 27-APR-09 09.21.22.022 AM                         1       32.87
      8207       8208 27-APR-09 09.40.50.404 AM                         1       32.63
      8245       8246 28-APR-09 10.00.58.734 PM                         1        5.65
      8389       8390 04-MAY-09 10.00.26.563 PM                         1        5.62
      8325       8326 02-MAY-09 06.00.06.252 AM                         1        4.81
      8220       8221 27-APR-09 10.00.06.667 PM                         1        4.28
      8317       8318 01-MAY-09 10.00.18.928 PM                         1        3.14
      8230       8231 28-APR-09 08.00.17.951 AM                         1        2.80
      8293       8294 30-APR-09 10.00.56.784 PM                         1        2.37
      8349       8350 03-MAY-09 06.00.28.355 AM                         1        2.35
      8244       8245 28-APR-09 09.00.51.632 PM                         1        2.26
      8232       8233 28-APR-09 10.00.32.610 AM                         1        1.97
      8305       8306 01-MAY-09 10.00.07.398 AM                         1        1.93
      8292       8293 30-APR-09 09.00.50.806 PM                         1        1.86
      8268       8269 29-APR-09 09.00.29.463 PM                         1        1.82
      8269       8270 29-APR-09 10.00.35.137 PM                         1        1.80
      8388       8389 04-MAY-09 09.00.20.561 PM                         1        1.80
      8353       8354 03-MAY-09 10.00.51.838 AM                         1        1.32
      8316       8317 01-MAY-09 09.00.12.973 PM                         1        1.31
      8329       8330 02-MAY-09 10.00.29.899 AM                         1        1.22
      8348       8349 03-MAY-09 05.00.22.374 AM                         1        1.18
      8213       8214 27-APR-09 03.00.16.481 PM                         1        1.17
      8365       8366 03-MAY-09 10.00.02.746 PM                         1        1.17
      8357       8358 03-MAY-09 02.00.15.346 PM                         1        1.08
      8341       8342 02-MAY-09 10.00.40.654 PM                         1        1.04
      8337       8338 02-MAY-09 06.00.17.054 PM                         1        1.03
      8204       8205 27-APR-09 09.20.32.516 AM                         1        1.00
      8361       8362 03-MAY-09 06.00.39.298 PM                         1        1.00
      8202       8203 27-APR-09 09.17.02.145 AM                         1        0.97
      8333       8334 02-MAY-09 02.00.53.480 PM                         1        0.97

30 rows selected.

And here’s the output of the AWR report for the top period.


WORKLOAD REPOSITORY report for

DB Name         DB Id    Instance     Inst Num Startup Time    Release     RAC
------------ ----------- ------------ -------- --------------- ----------- ---
LAB111        1385947011 LAB111              1 22-Apr-09 14:19 11.1.0.7.0  NO

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
homer            Linux IA (32-bit)                  16     4       4      31.73

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:      8205 27-Apr-09 09:21:22        25       1.8
  End Snap:      8206 27-Apr-09 09:27:33        23       1.8
   Elapsed:                6.19 (mins)
   DB Time:               32.87 (mins)

Oracle 11g SQL Plan Management – SQL Plan Baselines

Note, this post applies to 11gR1 (specifically 11.1.0.7).

11g has a new method of dealing with plan instability (i.e. the tendency of a statement to flip-flop between plans). The goal has always been to provide the best performance, but until 11g there was no mechanism whose design goal was to eliminate backwards movement ("performance regressions" as the Oracle documentation calls them). That is, not allowing a statement to switch to a plan that is significantly slower than the one it has already been executing. This new mechanism depends on a construct called a Baseline. So for this post I am not going to focus on the process of evolving plans to prevent "performance regressions", but rather on the Baselines themselves. (Hopefully I'll get to the plan management part soon.)

Here’s a little excerpt from the 11gR1 documentation:

Each time a SQL statement is compiled, the optimizer first uses a cost-based search method to build a best-cost plan, then tries to find a matching plan in the SQL plan baseline. If a match is found, the optimizer will proceed using this plan. Otherwise, it evaluates the cost of each accepted plan in the SQL plan baseline and selects the plan with the lowest cost. The best-cost plan found by the optimizer that does not match any plans in the plan history for the SQL statement represents a new plan, and is added as a non-accepted plan to the plan history. The new plan is not used until it is verified to not cause a performance regression. However, if a change in the system (such as a dropped index) causes all accepted plans to become non-reproducible, the optimizer will select the best-cost plan. Thus, the presence of a SQL plan baseline causes the optimizer to use conservative plan selection strategy for the SQL statement.

Clear as mud?

Now my description:

Baselines are the latest evolution in Oracle's efforts to allow a plan to be locked. The previous efforts were Outlines and SQL Profiles. Both of those were based on the idea that hints could be applied to limit the optimizer's choices down to one. That approach seems a little flawed. Why not just save the plan and be done with it? I believe that's what Baselines are intended to do. Unfortunately, they don't appear to do it yet. But they do have an advantage over Outlines and SQL Profiles in that they at least save the plan_hash_value, so they know whether they are reproducing the correct plan or not. (See my previous post, Do Baselines Use Hints? – Take 2.) And hopefully they will evolve to the point where the actual plan is saved along with the hash value – we'll see.

So anyway, here’s some basic housekeeping stuff on Baselines:

  • Baselines will be used by default in 11gR1, if they exist. There is a parameter to control whether they are used or not (OPTIMIZER_USE_SQL_PLAN_BASELINES). It is set to TRUE by default (a quick check of the relevant parameters is sketched just after this list).
  • Baselines will not be created by default in 11gR1. So, much like with the older Outlines or SQL Profiles, you must do something to create them.
  • There is a view called DBA_SQL_PLAN_BASELINES that exposes the Baselines that have been created.
  • Just like Outlines and SQL Profiles, Baselines apply to all instances in a RAC environment (they are not localized to a specific instance).

Baselines can be created a couple of different ways. They can be created automatically by the database as it’s running by setting the OPTIMIZER_CAPTURE_SQL_PLAN_BASELINES parameter. They can also be created for statements in a SQL Tuning Set. But I’m not going to talk about either one of those approaches in this post, as I don’t think they really make much sense in the case where you want to use a Baseline to lock a plan for a specific statement. Instead, I am going to discuss creating a baseline for a statement that is in the shared pool via the DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE function.

All the function needs is a SQL_ID and a PLAN_HASH_VALUE. Optionally a parameter can be used to define the baseline as FIXED. If it's FIXED then it gets priority over any other Baselines for that statement, except other FIXED Baselines. Confused? Well, it's not exactly the most straightforward setup. I'm kind of a simple guy, so at this point I'm thinking one FIXED Baseline is enough.

Before you get too bored, here’s a quick example:

SQL> select avg(pk_col) from kso.little_skew where col1 = 99320;

AVG(PK_COL)
-----------
     628922

SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
4dd2qjxbjav6u      0 2709260180          1           .02           89 select avg(pk_col) from kso.little_skew where col1 = 99320

SQL> @dplan
Enter value for sql_id: 4dd2qjxbjav6u
Enter value for child_no: 0

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4dd2qjxbjav6u, child number 0
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 99320

Plan hash value: 2709260180

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE              |                  |     1 |     9 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| LITTLE_SKEW      |     1 |     9 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | LITTLE_SKEW_COL1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"=99320)

20 rows selected.

SQL> !cat create_baseline.sql
var ret number
exec :ret := dbms_spm.load_plans_from_cursor_cache(-
    sql_id=>'&sql_id', -
    plan_hash_value=>&plan_hash_value,-
    fixed=>'&fixed');

SQL> @create_baseline
Enter value for sql_id: 4dd2qjxbjav6u
Enter value for plan_hash_value: 2709260180
Enter value for fixed: NO

PL/SQL procedure successfully completed.

SQL> !cat baselines.sql
set lines 155
col sql_text for a50 trunc
col last_executed for a28
col enabled for a7
col plan_hash_value for a16
col last_executed for a16
select spb.sql_handle, spb.plan_name, spb.sql_text,
spb.enabled, spb.accepted, spb.fixed,
to_char(spb.last_executed,'dd-mon-yy HH24:MI') last_executed
from
dba_sql_plan_baselines spb
where spb.sql_text like nvl('%'||'&sql_text'||'%',spb.sql_text)
and spb.sql_handle like nvl('&name',spb.sql_handle)
and spb.plan_name like nvl('&plan_name',spb.plan_name)
/

SQL> @baselines
Enter value for sql_text: %little%
Enter value for name:
Enter value for plan_name: 

SQL_HANDLE                     PLAN_NAME                      SQL_TEXT                                           ENABLED ACC FIX LAST_EXECUTED
------------------------------ ------------------------------ -------------------------------------------------- ------- --- --- ----------------
SYS_SQL_b77256b604589dec       SYS_SQL_PLAN_04589dec2fdfd157  select avg(pk_col) from kso.little_skew where col1 YES     YES NO

SQL> select avg(pk_col) from kso.little_skew where col1 = 99320;

AVG(PK_COL)
-----------
     628922

SQL> select avg(pk_col) from kso.little_skew where col1 = 99320;

AVG(PK_COL)
-----------
     628922

SQL> @find_sql
Enter value for sql_text:
Enter value for sql_id: 4dd2qjxbjav6u

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
4dd2qjxbjav6u      0 2709260180          1           .02           89 select avg(pk_col) from kso.little_skew where col1 = 99320
4dd2qjxbjav6u      1 2709260180          1           .00            5 select avg(pk_col) from kso.little_skew where col1 = 99320

SQL> @dplan
Enter value for sql_id: 4dd2qjxbjav6u
Enter value for child_no: 1

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4dd2qjxbjav6u, child number 1
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 99320

Plan hash value: 2709260180

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE              |                  |     1 |     9 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| LITTLE_SKEW      |     1 |     9 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | LITTLE_SKEW_COL1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"=99320)

Note
-----
   - SQL plan baseline SYS_SQL_PLAN_04589dec2fdfd157 used for this statement

24 rows selected.

So that's a quick example of how easy it is to create a Baseline. Once a Baseline is created it behaves much like an Outline or SQL Profile, in that it will be applied to any SQL statement where the normalized text matches. It's interesting to note that even though Outlines are way outdated in 11g, they still take precedence over Baselines. So if you have an Outline and a Baseline on the same statement, the Outline will be used.
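Here's a minimal sketch of how you might verify that yourself (the outline name is hypothetical, and it assumes the CREATE ANY OUTLINE privilege and the same test table as above):

-- create an outline on the statement that already has a Baseline,
-- enable stored outlines, re-run the statement, and check the Note
-- section of the plan to see which mechanism was actually used
create or replace outline little_skew_ol
on select avg(pk_col) from kso.little_skew where col1 = 99320;

alter session set use_stored_outlines = true;

select avg(pk_col) from kso.little_skew where col1 = 99320;

-- make sure serveroutput is off so the previous cursor is really your statement
select * from table(dbms_xplan.display_cursor(null, null, 'TYPICAL'));
-- the Note section should now mention the outline rather than the SQL plan baseline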

Do SQL Plan Baselines Use Hints? – Take 2

Warning – This post is almost pure speculation!

You may have read a previous post where I said I thought Baselines didn't use hints. Well, I've changed my mind. I no longer think that Baselines store an actual plan for a SQL statement. (See my previous post – Do SQL Plan Baselines Use Hints?) I never could find a place where the plan data was stored (only the plan_hash_value). So I had a closer look at some more 10053 trace files. Here's a quick rundown of what I found. First, the setup:

SQL> @flush_pool

System altered.

SQL> select avg(pk_col) from kso.little_skew where col1 = 1;

AVG(PK_COL)
-----------
 310999.357

SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
9df852xw86wxt      0 3498336203          1           .18        3,437 select avg(pk_col) from kso.little_skew where col1 = 1

SQL> @dplan
Enter value for sql_id: 9df852xw86wxt
Enter value for child_no: 0

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9df852xw86wxt, child number 0
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 1

Plan hash value: 3498336203

----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |       |       |   907 (100)|          |
|   1 |  SORT AGGREGATE    |             |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS FULL| LITTLE_SKEW |   620K|  5449K|   907   (2)| 00:00:11 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COL1"=1)


19 rows selected.

SQL> select /*+ index (a little_skew_col1) */ avg(pk_col) from kso.little_skew a where col1 = 1;

AVG(PK_COL)
-----------
 310999.357

SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
9df852xw86wxt      0 3498336203          1           .18        3,437 select avg(pk_col) from kso.little_skew where col1 = 1
fbnx1bws6n4xu      0 2709260180          1           .86        5,278 select /*+ index (a little_skew_col1) */ avg(pk_col) from ks
                                                                      o.little_skew a where col1 = 1


SQL> @dplan
Enter value for sql_id: fbnx1bws6n4xu
Enter value for child_no: 0

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  fbnx1bws6n4xu, child number 0
-------------------------------------
select /*+ index (a little_skew_col1) */ avg(pk_col) from
kso.little_skew a where col1 = 1

Plan hash value: 2709260180

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |       |       | 20835 (100)|          |
|   1 |  SORT AGGREGATE              |                  |     1 |     9 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| LITTLE_SKEW      |   620K|  5449K| 20835   (1)| 00:04:11 |
|*  3 |    INDEX RANGE SCAN          | LITTLE_SKEW_COL1 |   620K|       |  3213   (1)| 00:00:39 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("COL1"=1)


21 rows selected.

SQL> -- need to create baseline for original statement to get SQL_HANDLE
SQL>
SQL> !cat create_baseline.sql
var ret number
exec :ret := dbms_spm.load_plans_from_cursor_cache(-
    sql_id=>'&sql_id', -
    plan_hash_value=>&plan_hash_value,-
    fixed=>'&fixed');




SQL> @create_baseline
Enter value for sql_id: 9df852xw86wxt
Enter value for plan_hash_value: 3498336203
Enter value for fixed: NO

PL/SQL procedure successfully completed.

SQL> -- now add the other plan as a fixed Baseline
SQL>
SQL> !cat create_baseline2.sql
var ret number
exec :ret := dbms_spm.load_plans_from_cursor_cache(-
    sql_id=>'&sql_id', -
    plan_hash_value=>&plan_hash_value,-
    sql_handle=>'&sql_handle',-
    fixed=>'&fixed');


SQL> @create_baseline2
Enter value for sql_id: fbnx1bws6n4xu
Enter value for plan_hash_value: 2709260180
Enter value for sql_handle: SYS_SQL_19fcc3effcce06c0
Enter value for fixed: YES

PL/SQL procedure successfully completed.

SQL> @baselines
Enter value for sql_text: 
Enter value for name: 
Enter value for plan_name: 

SQL_HANDLE                     PLAN_NAME                      SQL_TEXT                                           ENABLED ACC FIX LAST_EXECUTED
------------------------------ ------------------------------ -------------------------------------------------- ------- --- --- ----------------
SYS_SQL_19fcc3effcce06c0       SYS_SQL_PLAN_fcce06c08b1420d2  select avg(pk_col) from kso.little_skew where col1 YES     YES NO
SYS_SQL_19fcc3effcce06c0       SYS_SQL_PLAN_fcce06c02fdfd157  select avg(pk_col) from kso.little_skew where col1 YES     YES YES

SQL> select avg(pk_col) from kso.little_skew where col1 = 1;

AVG(PK_COL)
-----------
 310999.357

SQL> /        

AVG(PK_COL)
-----------
 310999.357

SQL> -- check to see if it's using the fixed Baseline
SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
9df852xw86wxt      0 3498336203          2           .19        3,279 select avg(pk_col) from kso.little_skew where col1 = 1
fbnx1bws6n4xu      0 2709260180          1           .86        5,278 select /*+ index (a little_skew_col1) */ avg(pk_col) from ks
                                                                      o.little_skew a where col1 = 1


SQL> 
SQL> @dplan
Enter value for sql_id: 9df852xw86wxt
Enter value for child_no: 0

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9df852xw86wxt, child number 0
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 1

Plan hash value: 3498336203

----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |       |       |   907 (100)|          |
|   1 |  SORT AGGREGATE    |             |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS FULL| LITTLE_SKEW |   620K|  5449K|   907   (2)| 00:00:11 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COL1"=1)


19 rows selected.

SQL> -- no it's not using the Baseline, why not?

So we created a Fixed Baseline on a statement that used a hint to force an index based access path, attached it to another statement with no hints, and then re-ran the un-hinted statement. It did not pick up the Fixed Baseline. The next step was to have a look at a 10053 trace file. Here are some excerpts from the 10053 trace file.


...

SPM: statement found in SMB

...

******************************************
----- Current SQL Statement for this session (sql_id=9df852xw86wxt) -----
select avg(pk_col) from kso.little_skew where col1 = 1
*******************************************

...

SPM: cost-based plan found in the plan baseline, planId = 2333352146
SPM: fixed planId's of plan baseline are: 803197271
SPM: using qksan to reproduce, cost and select accepted plan, sig = 1872586980585244352
SPM: reproduce plan round 1, using session OFE=11010007
SPM: using qksan to reproduce accepted plan, planId = 803197271
Registered qb: SEL$1 0xf6d872ec (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$1 nbfros=1 flg=0
    fro(0): flg=4 objn=90376 hint_alias="LITTLE_SKEW"@"SEL$1"

DOP: Automatic degree of parallelism is disabled: Parameter.
PM: Considering predicate move-around in query block SEL$1 (#0)
**************************
Predicate Move-Around (PM)
**************************
OPTIMIZER INFORMATION

******************************************
----- Current SQL Statement for this session (sql_id=7039h2744jjrv) -----
/* SQL Analyze(110,0) */ select avg(pk_col) from kso.little_skew where col1 = 1
*******************************************

...

SPM: planId in plan baseline = 803197271, planId of reproduced plan = 2333352146
------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : SYS
  plan_baseline signature  : 1872586980585244352
  plan_baseline plan_id    : 803197271
  plan_baseline hintset    :
    hint num  1 len 27 text: IGNORE_OPTIM_EMBEDDED_HINTS
    hint num  2 len 37 text: OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
    hint num  3 len 22 text: DB_VERSION('11.1.0.7')
    hint num  4 len  8 text: ALL_ROWS
    hint num  5 len 22 text: OUTLINE_LEAF(@"SEL$1")
    hint num  6 len 57 text: INDEX_RS_ASC(@"SEL$1" "A"@"SEL$1" ("LITTLE_SKEW"."COL1"))
SPM: generated non-matching plan:
----- Explain Plan Dump -----
----- Plan Table -----

============
Plan Table
============
-----------------------------------------+-----------------------------------+
| Id  | Operation           | Name       | Rows  | Bytes | Cost  | Time      |
-----------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT    |            |       |       |   907 |           |
| 1   |  SORT AGGREGATE     |            |     1 |     9 |       |           |
| 2   |   TABLE ACCESS FULL | LITTLE_SKEW|  605K | 5449K |   907 |  00:00:11 |
-----------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - filter("COL1"=1)

Content of other_xml column
===========================
  db_version     : 11.1.0.7
  parse_schema   : SYS
  plan_hash      : 3498336203
  plan_hash_2    : 2333352146
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
      DB_VERSION('11.1.0.7')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "LITTLE_SKEW"@"SEL$1")
    END_OUTLINE_DATA
  */

------- END SPM Plan Dump -------


So it appears that the Baseline is found in the SMB (SQL Management Base). The optimizer then appears to try to reproduce the plan in a separate session using the stored hints. It then announces that it came up with a different plan and spits out the plan it generated. So it appears that the plan_hash_value is just used as a comparison, to see if the optimizer came up with the same plan as the one that was in place when the Baseline was generated. If not, it falls back to the regular (no Baseline) behavior. Not at all what I had hoped for!

It may be that 11gR1 is an intermediate step and that a future version (11gR2 maybe, I hope) will actually store the plan details instead of just the plan hash value. But as I said in the intro, this is complete speculation on my part.
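For anyone who wants to poke at this themselves, here's a sketch of one way to get a 10053 (optimizer) trace for a session; the exact method I used may have differed. Keep in mind that 10053 only traces a hard parse, so the statement can't already be in the shared pool.

alter session set events '10053 trace name context forever, level 1';

select avg(pk_col) from kso.little_skew where col1 = 1;

alter session set events '10053 trace name context off';

-- the trace file shows up in the diag trace directory (user_dump_dest)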

Hidden SQL – why can’t I find my SQL Text?

I recently ran into an interesting issue that drove me crazy for a couple of days! A very frequently executed statement on a database was generating some waits on "SQL*Net more data from client". ASH shows the statement, as does the V$SESSION view, but the statement never shows up in the V$SQL view nor in the AWR tables. Here's a brief rundown of how the investigation went.

The reason for looking at this in the first place was that the application was timing out for brief periods of time. So a quick look at an AWR report for the time period showed a significant amount of time being spent waiting on the “SQL*Net more data from client” event. This event was taking a much larger slice of the pie than we saw during periods of “normal” activity. Here’s a little excerpt from one of the AWR reports:

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
DB CPU                                            2,045          57.7
SQL*Net more data from client       165,760         868      5   24.5 Network
db file sequential read              58,083         449      8   12.7 User I/O
log file sync                        33,656          85      3    2.4 Commit
direct path read                     22,026          42      2    1.2 User I/O


...

                                                    % of Waits
                                 -----------------------------------------------
                           Total
Event                      Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
-------------------------- ----- ----- ----- ----- ----- ----- ----- ----- -----
SQL*Net break/reset to cli  833   95.1    .6    .2          .1   3.6    .4
SQL*Net message to client  7537K 100.0    .0
SQL*Net more data from cli  165K  87.5    .4    .1    .0    .0   7.1   4.8
SQL*Net more data to clien  410K 100.0    .0

Obviously this is not good. This is on an 11g system so the AWR report provides a nice little histogram section on the wait events. The histogram indicates that most of these events are under 1ms, but a significant percentage are in the 16ms – 1sec range. So the question was, what could be causing those waits.

No problem, just check the Active Session History, find the sql_id for the statement(s) waiting on this event, and then look in V$SQL for the text. But not so fast bucko!

SQL> !cat find_sql_waiting_on_event.sql
select event, sql_id, count(*),
avg(time_waited) avg_time_waited
from v$active_session_history
where event like nvl('&event','%more data from%')
group by event, sql_id
order by event, 3
/

SQL> @find_sql_waiting_on_event
Enter value for event: 

EVENT                          SQL_ID          COUNT(*)
------------------------------ ------------- ----------
SQL*Net more data from client  dkc2kbj7kdsu7         21
SQL*Net more data from client                        62
SQL*Net more data from client  40du2847gh6sq        156
SQL*Net more data from client  7vtb7h3kpfat0       2965


SQL> -- Ah, there's the culprit, 7vtb7h3kpfat0
SQL>
SQL> @find_sql
Enter value for sql_text:
Enter value for sql_id: 7vtb7h3kpfat0
Enter value for plan_hash_value:

no rows selected

SQL> -- hmm, must have aged out of the shared pool (seems unlikely since ASH has it so many times)
SQL> -- but maybe it's in the AWR tables
SQL>
SQL> @find_sql_awr
Enter value for sql_text:
Enter value for sql_id: 7vtb7h3kpfat0

no rows selected

SQL> -- nope, not there either

So I tried a couple of things to see if I could prove to myself that it was never in V$SQL (and thus never in the AWR tables). First I tried coloring the statement with the dbms_workload_repository.add_colored_sql procedure. (See Dion Cho’s post on this feature for more info). No luck! Then I tried using Tanel Poder’s sample.sql script (more info on that here) to run a bunch of queries against the V$SQLAREA view to see if the statement ever showed up. It did not! I then traced (10046) a bunch of the sessions that were executing the statement, hoping that the trace files would have the statement text. They did not, nor was the SQL_ID even contained in any of the trace files!
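For reference, coloring a statement so that AWR captures it at every snapshot is just a one-liner; a quick sketch (it only helps if the statement actually shows up in V$SQL, which this one never did):

begin
  dbms_workload_repository.add_colored_sql(sql_id => '7vtb7h3kpfat0');
end;
/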

Turns out it is never in the shared pool and it is never in the AWR tables. But it is captured by ASH and it shows up in the SQL_ID column in V$SESSION. Here’s a little more output from the system:

SQL> select username, sid, serial#, status, sql_id, event, seconds_in_wait
  2  from v$session
  3  where username like nvl('&username',username)
  4  and sid like nvl('&sid',sid)
  5  and sql_id='7vtb7h3kpfat0'
  6* order by username, sid, serial#;

Enter value for username:
Enter value for sid:
 
USERNAME                     SID    SERIAL# STATUS   SQL_ID        EVENT                          SECONDS_IN_WAIT
------------------------- ------ ---------- -------- ------------- ------------------------------ ---------------
XX_OWNER                    4661       4929 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                823
XX_OWNER                    4676       3835 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                  0
XX_OWNER                    4682       9170 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                 16
XX_OWNER                    4699      11522 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                537
XX_OWNER                    4712       9592 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                 93
XX_OWNER                    4717       6659 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                814
XX_OWNER                    4727      25467 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                155
XX_OWNER                    4739       8796 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                 17
XX_OWNER                    4755       7327 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                  9
XX_OWNER                    4800       4002 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                 11
XX_OWNER                    4826      15699 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                153
XX_OWNER                    4838       6283 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                 68
XX_OWNER                    4849      11468 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client               1052
XX_OWNER                    5089       8610 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                  1
XX_OWNER                    5091       7709 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                 31
XX_OWNER                    5168       4728 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                585
XX_OWNER                    5185       8298 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                157
XX_OWNER                    5186       6969 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client               1197
XX_OWNER                    5210      10830 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                  0
XX_OWNER                    5232       6509 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                  1
XX_OWNER                    5251      12231 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                155
XX_OWNER                    5400       5217 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                697
XX_OWNER                    5423      12903 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                  5
XX_OWNER                    5434      14154 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                191
XX_OWNER                    5441      33222 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                  3
XX_OWNER                    5442       5669 INACTIVE 7vtb7h3kpfat0 SQL*Net message from client                217

26 rows selected.

SQL> -- so there are a lot of sessions that execute our statement and then wait
SQL> -- (sometimes for a long time) before the user requests another action
SQL>
SQL> -- Using Tanel's sample.sql script - does about 1000 queries against V$SQLAREA in a sec or so
SQL>
SQL> @sample sql_text v$sqlarea "sql_id = '7vtb7h3kpfat0'" 1000

no rows selected

SQL> -- doesn't look like we'll ever catch it in the shared pool

Grasping at straws, I started looking for any views that had a sql_id and/or sql_text field, and I found one that I hadn’t thought to look in – V$OPEN_CURSOR. Ah, that’s probably a good place to look since surely a cursor must be created for this statement. Sure enough, this turned out to provide the necessary clue.

SQL> select distinct * from v$open_cursor
  2  where rownum < 10
  3* and sql_id like '%fat%';

SADDR               SID USER_NAME                      ADDRESS          HASH_VALUE SQL_ID        SQL_TEXT                                  LAST_SQL_
---------------- ------ ------------------------------ ---------------- ---------- ------------- ----------------------------------------- ---------
SQL_EXEC_ID
-----------
0000000210ED3980   4750 XX_OWNER                       000000021AFEB748 3847695136 7vtb7h3kpfat0 table_4_9_fbd5_0_0_0

00000002111EC2E8   5043 SIGNON                         0000000219E02E80 1527693606 6pfatm1dhxh96 SELECT  u.app_user_id, u.username, u.pass
                                                                                                 word, u.first_name,

000000021163BDA8   5451 XX_OWNER                       000000021AFEB748 3847695136 7vtb7h3kpfat0 table_4_9_fbd5_0_0_0

00000002114067E0   5242 XX_OWNER                       000000021AFEB748 3847695136 7vtb7h3kpfat0 table_4_9_fbd5_0_0_0

00000002114372F0   5260 SIGNON                         0000000219E02E80 1527693606 6pfatm1dhxh96 SELECT  u.app_user_id, u.username, u.pass
                                                                                                 word, u.first_name,

0000000210F06FD8   4769 XX_OWNER                       000000021AFEB748 3847695136 7vtb7h3kpfat0 table_4_9_fbd5_0_0_0

0000000211439E38   5261 SIGNON                         0000000219E02E80 1527693606 6pfatm1dhxh96 SELECT  u.app_user_id, u.username, u.pass
                                                                                                 word, u.first_name,

00000002111E97A0   5042 SIGNON                         0000000219E02E80 1527693606 6pfatm1dhxh96 SELECT  u.app_user_id, u.username, u.pass
                                                                                                 word, u.first_name,

000000021161E190   5440 XX_OWNER                       000000021AFEB748 3847695136 7vtb7h3kpfat0 table_4_9_fbd5_0_0_0


9 rows selected.

So the SQL_ID was present in the V$OPEN_CURSOR view and there was a value in the SQL_TEXT field. Unfortunately, it didn't provide the text of the original statement, but at least I now had something to work with. A quick Google for table_4_9_fbd5_0_0_0 (which is what was in the SQL_TEXT field) turned up nothing, of course. But a little digging turned up this post on Ask Tom. You'll have to read down a ways in that post to find that the real culprit was LOBs. Also, see this post on the Oracle-l list (which is what I originally found that pointed me to the Ask Tom article).

So it appears that the issue is that LOBs are being accessed, which causes Oracle to generate a new SQL_ID for the LOB access part of the execution. Apparently, these types of statements do not get exposed via the V$SQL and V$SQLAREA views and thus do not get captured by AWR. There are other types of statements that exhibit this behavior as well, by the way. The Tom Kyte post talks about nested tables, for example. So back to the question: what statement generated the SQL_ID with the missing sql_text in the first place? Well, the first clue came from the Tom Kyte post. Apparently the middle bit of the text in the V$OPEN_CURSOR view is a hex representation of the object_id of the object that is being accessed (in our case, "fbd5"). So if we convert it to decimal we can find the object that contains the LOB, like so:

SQL> select owner, object_name, object_type
  2  from dba_objects
  3  where object_id = (select to_number('&hex_value','XXXXXX') from dual);
Enter value for hex_value: fbd5

OWNER                          OBJECT_NAME                    OBJECT_TYPE
------------------------------ ------------------------------ -------------------
XX_OWNER                       XXX_MY_TABLE                   TABLE

So now I know that I'm looking for a statement that is accessing XXX_MY_TABLE (note: names have been changed to protect the innocent). But how to find the statement? That's the question that was never answered in the Tom Kyte post. It was at about this point, while I was filling my coffee cup, that I had one of those ah-ha moments (usually I have them when I'm doing something else - not even actively thinking about the problem). Anyway, I got to thinking about the PREV_SQL_ID column in the V$SESSION view. If a statement that accesses a LOB immediately generates a new SQL_ID for the LOB access part of the execution, perhaps the original statement's SQL_ID would be in the PREV_SQL_ID field. And that's exactly what I found.


SQL> select username, sid, serial#, status, sql_id, prev_sql_id, event, seconds_in_wait
  2  from v$session
  3  where username like nvl('&username',username)
  4  and sid like nvl('&sid',sid)
  5  and sql_id='7vtb7h3kpfat0'
  6* order by username, sid, serial#;

Enter value for username:
Enter value for sid:


USERNAME                     SID    SERIAL# STATUS   SQL_ID        PREV_SQL_ID   EVENT                          SECONDS_IN_WAIT
------------------------- ------ ---------- -------- ------------- ------------- ------------------------------ ---------------
XX_OWNER                    4665       8707 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client               1235
XX_OWNER                    4745       9383 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                  1
XX_OWNER                    4792      23304 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client               1235
XX_OWNER                    4816      11098 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                  0
XX_OWNER                    4844       3050 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                773
XX_OWNER                    4907       8184 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                232
XX_OWNER                    4959      35967 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                633
XX_OWNER                    5154      33701 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                 25
XX_OWNER                    5230      12792 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                510
XX_OWNER                    5248      49715 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                456
XX_OWNER                    5249      32300 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                147
XX_OWNER                    5282      21331 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                182
XX_OWNER                    5285        386 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                185
XX_OWNER                    5295        792 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                 40
XX_OWNER                    5416      49780 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                  7
XX_OWNER                    5419      22645 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client                890
XX_OWNER                    5442       6200 INACTIVE 7vtb7h3kpfat0 78jdubkb65p2w SQL*Net message from client               1326

17 rows selected.

SQL> @find_sql_long
Enter value for sql_id: 78jdubkb65p2w

SQL_TEXT
----------------------------------------------------------------------
SELECT  XYZ_COLUMN, BLAH ... FROM XXX_MY_TABLE ...


SQL> @find_sql
Enter value for sql_text:
Enter value for sql_id: 78jdubkb65p2w

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ -----------------------------------------
78jdubkb65p2w      0  147781538  9,534,632         .0006           11 SELECT  XYZ_COLUMN, BLAH
78jdubkb65p2w      1  147781538        191         .0005           11

One last comment (and then I'll shut up):

I was surprised to find that the statement was short - only a few lines long. I was expecting a really long string that would not fit in a single SQL*Net packet. After all, we are waiting on the "SQL*Net more data FROM client" event, right? That event is supposed to be the time waited for an additional packet from the client when Oracle knows that more is coming. Clearly that was not the case here. Rather, the session is waiting because Oracle is sending data back to the client that takes more than one packet (i.e. the LOB). You would expect that time to be logged under the "SQL*Net more data TO client" event in this case. However, it appears that time is logged to that event only until the packet is turned over to the network. Oracle then starts logging time to the "SQL*Net more data FROM client" event while it waits for the next response from the client. Therefore, the actual network transfer time is clocked to the wrong event. You can understand why the developers made this choice, because Oracle has no way of knowing how long it actually takes a packet to get to the client; it only knows how long the round trip takes. So maybe they should have named the event slightly differently, like "SQL*Net more data from client including the entire round trip network time" or something. Anyway, Tanel Poder has a post that briefly mentions this quirk of the SQL*Net events.

Note: this is somewhat speculative on my part as I have not tried to prove exactly where the "TO" event stops and the "FROM" event begins. But I am pretty sure Tanel is correct on this point.
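
By the way, if you want to watch how the time spreads across the various SQL*Net events for a session that is dragging a LOB back across the network, something like this will do it. Just a rough sketch - plug in the SID of the session you're interested in:

select event, total_waits, round(time_waited_micro/1000000,3) seconds_waited
from v$session_event
where sid = &sid
and event like 'SQL*Net%'
order by time_waited_micro desc;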

So there you have it. Just a little story about wandering through the weeds on the way to enlightenment.

=============================

Note: This part was added after Dion Cho commented that he wasn't able to reproduce the behavior, so I gave it a shot (I got close). Here's the output:

> sqlplus / as sysdba

SQL*Plus: Release 11.1.0.7.0 - Production on Mon Apr 20 14:59:25 2009

Copyright (c) 1982, 2008, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> -- session (SID 124) connected as KSO issues "select xml_clob from test_lob"

...


SQL> -- what's going on?
SQL> @sessions
Enter value for username: 
Enter value for sid: 

USERNAME                   SID    SERIAL# STATUS   SQL_ID        EVENT                          SECONDS_IN_WAIT
------------------------- ---- ---------- -------- ------------- ------------------------------ ---------------
KSO                        124       2402 ACTIVE   cu98u9ukjqj19 SQL*Net message from client                  0
SYS                        128       3580 ACTIVE   bsbmyw5thg8qf SQL*Net message to client                    0

SQL> -- KSO is running cu98u9ukjqj19, what's that doing?
SQL> @find_sql
Enter value for sql_text: 
Enter value for sql_id: cu98u9ukjqj19

no rows selected

SQL> -- so that's probably the LOB access, let's check 
SQL> select * from v$open_cursor where sql_id like 'cu98u9ukjqj19';

SADDR     SID USER_NAME               ADDRESS  HASH_VALUE SQL_ID        SQL_TEXT                              LAST_SQL_ SQL_EXEC_ID
-------- ---- ----------------------- -------- ---------- ------------- ------------------------------------- --------- -----------
DB1A69C8  124 KSO                     E0BD4C0C 2770027561 cu98u9ukjqj19 table_4_9_19a1d_0_0_0



SQL> -- yep, LOB access, let's see if we can find the original statement
SQL> @sessionsp
Enter value for username: 
Enter value for sid: 

USERNAME                   SID    SERIAL# STATUS   SQL_ID        PREV_SQL_ID   EVENT                          SECONDS_IN_WAIT
------------------------- ---- ---------- -------- ------------- ------------- ------------------------------ ---------------
KSO                        124       2402 ACTIVE   cu98u9ukjqj19 9babjv8yq8ru3 direct path read                             0
SYS                        128       3580 ACTIVE   grdgy5dcmqd0f 9babjv8yq8ru3 SQL*Net message to client                    0

SQL> -- says prev statement is 9babjv8yq8ru3
SQL> @find_sql
Enter value for sql_text: 
Enter value for sql_id: 9babjv8yq8ru3

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
9babjv8yq8ru3      1          0         39           .01           54 BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;

SQL> -- oops, SQL*Plusism - need to turn off serveroutput

/*
-- go to other session here and set serveroutput off and re-run select statement
*/

SQL> -- wait a little while until the statement finishes 
SQL> @sessionsp
Enter value for username: 
Enter value for sid: 

USERNAME                   SID    SERIAL# STATUS   SQL_ID        PREV_SQL_ID   EVENT                          SECONDS_IN_WAIT
------------------------- ---- ---------- -------- ------------- ------------- ------------------------------ ---------------
KSO                        124       2402 INACTIVE               djqbj30qjyngw SQL*Net message from client                 74
SYS                        128       3580 ACTIVE   grdgy5dcmqd0f 9babjv8yq8ru3 SQL*Net message to client                    0

SQL> -- note that the SQL_ID field has been wiped after the statement completes - need to look a little closer at this behavior
SQL> -- the java application I was looking at in production did not cause this behavior - i.e. the SQL_ID field would not be cleared
SQL> -- so in my test case I couldn't catch the SQL_ID for the LOB access and the PREV_SQL_ID for the original statement at the same time
SQL> -- although the PREV_SQL_ID field does end up containing the value for the original statement
 
SQL> @find_sql
Enter value for sql_text: 
Enter value for sql_id: djqbj30qjyngw

SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
djqbj30qjyngw      0  825212230          2          2.61       53,065 select xml_clob from test_lob


Do SQL Plan Baselines Use Hints?

I posted this weekend on SQL Profiles, and there were a couple of follow up comments that got me wondering about the internal mechanism used by Baselines. Since this got a little wordy, I decided to make it a new post as opposed to just a comment. The documentation appears to indicate that Baselines actually store a plan, as opposed to the earlier mechanisms used by both Outlines and SQL Profiles, which are hint based. But the DBA_SQL_PLAN_BASELINES view exposes a column called OTHER_XML which contains, you guessed it, hints. (By the way, this column also exists in the V$SQL_PLAN view, where it exposes hints for every plan in the shared pool.) So anyway, the question was posed, “Do SQL Plan Baselines use hints?”. The answer is … I don’t think so. Here’s why:

  1. The DBA_SQL_PLAN_BASELINES view is based on SQLOBJ$, SQLOBJ$DATA, and SQLOBJ$AUXDATA. These tables contain data for both SQL Profiles and Baselines (but not for Outlines). The SQLOBJ$ table has a column (PLAN_ID) which is populated with a plan hash value (but not the one exposed in V$SQL – more on that later). And now the interesting bit: the entries for the Baselines have a value in the PLAN_ID column while the entries for the SQL Profiles don’t (see the quick query after this list if you want to check this on your own system). Hmmmm…
  2. 10053 trace files have a section labeled “Final query after transformations:”. For statements using both Outlines and SQL Profiles, the “Final Query” includes hint text. For both fixed and non-fixed Baselines, the “Final Query” does not include hints.
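
Here’s a quick way to eyeball that last point. Keep in mind this is just a sketch against an undocumented SYS table, so describe SQLOBJ$ first and adjust the column names if yours differ:

select o.name, o.obj_type, o.plan_id
from sys.sqlobj$ o
where o.name in (select name from dba_sql_profiles
                 union all
                 select plan_name from dba_sql_plan_baselines)
order by o.obj_type, o.name;

The Baseline plans are the rows with a value in PLAN_ID; the SQL Profile rows aren’t.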

So here are some excerpts from three 10053 trace files. They were all created on the same 11.1.0.7 database. The first file was created for a statement using an Outline, the second for a statement using a SQL Profile, and the third for a statement using a Baseline.

===================================================================================
10053 level 1 for statement using an Outline
===================================================================================

...

---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$1 nbfros=1 flg=0
    fro(0): flg=4 objn=79200 hint_alias="SKEW"@"SEL$1"

SPM: statement not found in SMB

...


******************************************
----- Current SQL Statement for this session (sql_id=0r09xac72gvj1) -----
select /* test1 */ avg(pk_col) from kso.skew where col1 > 0
*******************************************

...

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT /*+ FULL ("SKEW") */ AVG("SKEW"."PK_COL") "AVG(PK_COL)" FROM "KSO"."SKEW" "SKEW" WHERE "SKEW"."COL1">0

...

Starting SQL statement dump

user_id=0 user_name=SYS module=sqlplus@homer (TNS V1-V3) action=
sql_id=0r09xac72gvj1 plan_hash_value=568322376 problem_type=3
----- Current SQL Statement for this session (sql_id=0r09xac72gvj1) -----
select /* test1 */ avg(pk_col) from kso.skew where col1 > 0
sql_text_length=60
sql=select /* test1 */ avg(pk_col) from kso.skew where col1 > 0
----- Explain Plan Dump -----
----- Plan Table -----

============
Plan Table
============
--------------------------------------+-----------------------------------+
| Id  | Operation           | Name    | Rows  | Bytes | Cost  | Time      |
--------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT    |         |       |       |   43K |           |
| 1   |  SORT AGGREGATE     |         |     1 |    11 |       |           |
| 2   |   TABLE ACCESS FULL | SKEW    |   31M |  336M |   43K |  00:09:54 |
--------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - filter("COL1">0)

Content of other_xml column
===========================
  db_version     : 11.1.0.7
  parse_schema   : SYS
  plan_hash      : 568322376
  plan_hash_2    : 1429740808
  outline        : KSOTEST1
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
      DB_VERSION('11.1.0.7')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "SKEW"@"SEL$1")
    END_OUTLINE_DATA
  */

Optimizer state dump:
Compilation Environment Dump
optimizer_mode_hinted               = true
optimizer_features_hinted           = 11.1.0.7

...

===================================================================================
10053 level 1 for statement using a manually generated SQL Profile
===================================================================================

...

---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$1 nbfros=1 flg=0
    fro(0): flg=4 objn=79200 hint_alias="SKEW"@"SEL$1"

SPM: statement found in SMB
SPM: current statement has no SQL plan baseline, sig = 4814772129047009103

...

******************************************
----- Current SQL Statement for this session (sql_id=399m90n8jzpu6) -----
select avg(pk_col) from kso.skew
*******************************************

...

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT /*+ INDEX_FFS ("SKEW" "SYS_C0015494") */ AVG("SKEW"."PK_COL") "AVG(PK_COL)" FROM "KSO"."SKEW" "SKEW"

...

SPM: statement in SMB but no plan baseline, sig = 4814772129047009103
Starting SQL statement dump

user_id=0 user_name=SYS module=sqlplus@homer (TNS V1-V3) action=
sql_id=399m90n8jzpu6 plan_hash_value=-2082116200 problem_type=3
----- Current SQL Statement for this session (sql_id=399m90n8jzpu6) -----
select avg(pk_col) from kso.skew
sql_text_length=33
sql=select avg(pk_col) from kso.skew
----- Explain Plan Dump -----
----- Plan Table -----

============
Plan Table
============
---------------------------------------------+-----------------------------------+
| Id  | Operation              | Name        | Rows  | Bytes | Cost  | Time      |
---------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT       |             |       |       |   19K |           |
| 1   |  SORT AGGREGATE        |             |     1 |     6 |       |           |
| 2   |   INDEX FAST FULL SCAN | SYS_C0015494|   31M |  183M |   19K |  00:04:58 |
---------------------------------------------+-----------------------------------+
Predicate Information:
----------------------

Content of other_xml column
===========================
  db_version     : 11.1.0.7
  parse_schema   : SYS
  plan_hash      : 2212851096
  plan_hash_2    : 4073867208
  sql_profile    : PROFILE_399m90n8jzpu6
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
      DB_VERSION('11.1.0.7')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX_FFS(@"SEL$1" "SKEW"@"SEL$1" ("SKEW"."PK_COL"))
    END_OUTLINE_DATA
  */

Optimizer state dump:
Compilation Environment Dump
optimizer_mode_hinted               = true
optimizer_features_hinted           = 11.1.0.7

...

===================================================================================
10053 level 1 for statement using a Baseline
===================================================================================

...

---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$1 nbfros=1 flg=0
    fro(0): flg=4 objn=79200 hint_alias="SKEW"@"SEL$1"

SPM: statement found in SMB
DOP: Automatic degree of parallelism is disabled: Parameter.
PM: Considering predicate move-around in query block SEL$1 (#0)
**************************
Predicate Move-Around (PM)
**************************
OPTIMIZER INFORMATION

******************************************
----- Current SQL Statement for this session (sql_id=0qa98gcnnza7h) -----
select avg(pk_col) from kso.skew
where col1 > 0
*******************************************

...

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT AVG("SKEW"."PK_COL") "AVG(PK_COL)" FROM "KSO"."SKEW" "SKEW" WHERE "SKEW"."COL1">0

...

SPM: cost-based plan found in the plan baseline, planId = 1429740808
SPM: cost-based plan was successfully matched, planId = 1429740808
Starting SQL statement dump

user_id=0 user_name=SYS module=sqlplus@homer (TNS V1-V3) action=
sql_id=0qa98gcnnza7h plan_hash_value=568322376 problem_type=3
----- Current SQL Statement for this session (sql_id=0qa98gcnnza7h) -----
select avg(pk_col) from kso.skew
where col1 > 0
sql_text_length=48
sql=select avg(pk_col) from kso.skew
where col1 > 0
----- Explain Plan Dump -----
----- Plan Table -----

============
Plan Table
============
--------------------------------------+-----------------------------------+
| Id  | Operation           | Name    | Rows  | Bytes | Cost  | Time      |
--------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT    |         |       |       |   43K |           |
| 1   |  SORT AGGREGATE     |         |     1 |    11 |       |           |
| 2   |   TABLE ACCESS FULL | SKEW    |   31M |  336M |   43K |  00:09:54 |
--------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - filter("COL1">0)

Content of other_xml column
===========================
  db_version     : 11.1.0.7
  parse_schema   : SYS
  plan_hash      : 568322376
  plan_hash_2    : 1429740808
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
      DB_VERSION('11.1.0.7')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "SKEW"@"SEL$1")
    END_OUTLINE_DATA
  */

Optimizer state dump:
Compilation Environment Dump
optimizer_mode_hinted               = false
optimizer_features_hinted           = 0.0.0

...

A couple of final thoughts:

These are very simple examples, so they may not be representative of more complicated statements. I’ve had a quick look at some more realistic examples though, and the basic info I’ve presented seems to hold for them as well. It is clear, however, that a ton of work is still going on when the statement is parsed. I’m assuming that is due to the optimizer’s attempts to see if a potentially better plan might exist, even with a fixed Baseline.

It appears that every statement is checked early in the parsing process to see if it is stored in the SMB (SQL Management Base) which is where Baselines are stored (and apparently where SQL Profiles are stored as well).

The 10053 trace shows a couple of plan hash values (PLAN_HASH and PLAN_HASH_2). The one labeled PLAN_HASH is the one found in the V$SQL view; the one labeled PLAN_HASH_2 is the one found in the SQLOBJ$ table. I’m not sure how to relate these two yet.

I’m hopeful that fixed Baselines will provide the ability to absolutely lock a plan for a statement, as opposed to the hint based approaches. Not that those haven’t worked pretty well, but actually storing a plan, instead of hints that try to limit the optimizer’s choices (if indeed that’s what’s actually happening), seems a much better approach.

That’s all for now. I’ve been working with Baselines in a production 11.1.0.7 database for a few weeks now and have been wanting to do a post on my observations but keep getting distracted. Hopefully in the next few days I will get a chance to do that.

Please let me know if you can shed any further light on Baselines. I’d be very interested to hear about any experiences with them.

SQL Profiles

Well I was wrong! SQL Profiles are better than Outlines. For a while now I have been saying that I didn’t like SQL Profiles because they were less stable than Outlines. It turns out that what I really didn’t like was the OPT_ESTIMATE hint used by the SQL Profiles that are created by the SQL Tuning Advisor. I just didn’t know it.

Let me back up for a minute. I posted about Oracle’s SQL Tuning Advisor a while back. It’s a feature that was added to Oracle in version 10g. It basically looks at a SQL statement and tries to come up with a better execution plan than the one the optimizer has picked. Since it is allowed as much time as it wants to do its analysis, the advisor can sometimes find better approaches. That’s because it can actually validate the optimizer’s original estimates by running various steps in a given plan and comparing the actual results to the estimates. When it’s all done, if it has found a better plan, it offers to implement that new plan via a SQL Profile. Those offered Profiles often have a lightly documented hint (OPT_ESTIMATE) that allows the optimizer’s estimates for various operations to be scaled – essentially it’s a fudge factor. The problem with this hint is that, far from locking a plan in place, it is locking an empirically derived fudge factor in place. This still leaves the optimizer with a lot of flexibility when it comes to choosing a plan. It also sets up a commonly occurring situation where the fudge factors stop making sense as the statistics change. Thus the observation that SQL Profiles tend to sour over time.


I have to give credit to Randolf Geist for making me take a second look at SQL Profiles. He commented on my Outlines post last week and recommended I give his post on SQL Profiles a look. I did, and it really got me thinking. One of the things I liked best about the post was that he created a couple of scripts to pull the existing hints for a statement from the shared pool or the AWR tables, and create a SQL Profile from those hints using the DBMS_SQLTUNE.IMPORT_SQL_PROFILE procedure. This makes perfect sense because the hints are stored with every plan (that’s what DBMS_XPLAN uses to spit them out if you ask for them). Unfortunately this procedure is only lightly documented. He also had a nice script for pulling the hints from the V$SQL_PLAN table, which I have made use of as well.

So as always I have created a few scripts (borrowing from Randolf mostly).

create_sql_profile.sql – uses cursor from the shared pool
create_sql_profile_awr.sql – uses AWR tables
sql_profile_hints.sql – shows the hints in a SQL Profile for 10g
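
The core idea behind these scripts is pretty simple: pull the Outline Data hints for a chosen child cursor out of OTHER_XML and feed them to DBMS_SQLTUNE.IMPORT_SQL_PROFILE. Here’s a rough sketch of that approach (not the actual scripts – it assumes you have access to the SYS.SQLPROF_ATTR type, the profile name is just a placeholder, and I’ve left out all the error handling):

declare
  ar_profile_hints sys.sqlprof_attr;
  cl_sql_text      clob;
begin
  -- pull the outline hints for the chosen cursor out of other_xml
  select extractvalue(value(d), '/hint') as outline_hints
    bulk collect into ar_profile_hints
  from xmltable('/*/outline_data/hint'
         passing (select xmltype(other_xml) as xmlval
                  from v$sql_plan
                  where sql_id = '&&sql_id'
                  and child_number = &&child_no
                  and other_xml is not null)) d;

  -- grab the full text of the statement
  select sql_fulltext
    into cl_sql_text
  from v$sql
  where sql_id = '&&sql_id'
  and child_number = &&child_no;

  -- create the profile from those hints
  dbms_sqltune.import_sql_profile(
    sql_text    => cl_sql_text,
    profile     => ar_profile_hints,
    category    => 'DEFAULT',
    name        => 'PROFILE_&&sql_id',
    force_match => false);
end;
/

The AWR flavor would be the same idea against the DBA_HIST_SQL_PLAN and DBA_HIST_SQLTEXT views.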

So here’s a little example:
Note: unstable_plans.sql and awr_plan_stats.sql are discussed here: Unstable Plans (Plan Instability)

SQL> @unstable_plans
Enter value for min_stddev:
Enter value for min_etime:

SQL_ID        SUM(EXECS)   MIN_ETIME   MAX_ETIME   NORM_STDDEV
------------- ---------- ----------- ----------- -------------
0qa98gcnnza7h          4       42.08      208.80        2.8016

SQL> @awr_plan_stats
Enter value for sql_id: 0qa98gcnnza7h

SQL_ID        PLAN_HASH_VALUE        EXECS          ETIME    AVG_ETIME        AVG_LIO
------------- --------------- ------------ -------------- ------------ --------------
0qa98gcnnza7h       568322376            3          126.2       42.079      124,329.7
0qa98gcnnza7h      3723858078            1          208.8      208.796   28,901,466.0

SQL> @create_sql_profile_awr
Enter value for sql_id: 0qa98gcnnza7h
Enter value for plan_hash_value: 568322376
Enter value for category:
Enter value for force_matching:

PL/SQL procedure successfully completed.

SQL> @sql_profiles
Enter value for sql_text:
Enter value for name: PROFIL%

NAME                           CATEGORY        STATUS   SQL_TEXT                                                               FOR
------------------------------ --------------- -------- ---------------------------------------------------------------------- ---
PROFILE_0qa98gcnnza7h          DEFAULT         ENABLED  select avg(pk_col) from kso.skew                                       NO

SQL> set echo on
SQL> @sql_profile_hints
SQL> set lines 155
SQL> col hint for a150
SQL> select attr_val hint
  2  from dba_sql_profiles p, sqlprof$attr h
  3  where p.signature = h.signature
  4  and name like ('&profile_name')
  5  order by attr#
  6  /
Enter value for profile_name: PROFILE_0qa98gcnnza7h

HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "SKEW"@"SEL$1")

SQL> @sql_hints_awr
SQL> select
  2  extractvalue(value(d), '/hint') as outline_hints
  3  from
  4  xmltable('/*/outline_data/hint'
  5  passing (
  6  select
  7  xmltype(other_xml) as xmlval
  8  from
  9  dba_hist_sql_plan
 10  where
 11  sql_id = '&sql_id'
 12  and plan_hash_value = &plan_hash_value
 13  and other_xml is not null
 14  )
 15  ) d;
Enter value for sql_id: 0qa98gcnnza7h
Enter value for plan_hash_value: 568322376

OUTLINE_HINTS
-----------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "SKEW"@"SEL$1")

A couple of additional points:

  • Outlines and SQL Profiles both take the same approach to controlling execution plans. They both attempt to force the optimizer down a certain path by applying hints behind the scenes. This is in my opinion an almost impossible task. The more complex the statement, the more difficult the task becomes. The newest kid on the block in this area (in 11g) is called a baseline and while it doesn’t abandon the hinting approach altogether, it does at least store the plan_hash_value – so it can tell if it regenerated the correct plan or not.
  • It does not appear that Outlines are being actively pursued by Oracle development anymore. So while they still work in 11g, they are becoming a little less reliable (and they were a bit quirky to begin with).
  • SQL Profiles have the ability to replace literals with bind variables similar to the cursor_sharing parameter. This means you can have a SQL Profile that will match multiple statements which use literals without having to set cursor_sharing for the whole instance.
  • Outlines take precedence over SQL Profiles. You can create both on the same statement and if you do, the outline will be used and the SQL Profile will be ignored. This is true in 11g as well, by the way.
  • Outlines don’t appear to use the OPT_ESTIMATE hint. So I believe it is still a valid approach to accept a SQL Profile as offered by the SQL Tuning Advisor and then create an Outline on top of it. It seems to work pretty well most of the time. (be sure and check the hints and the plan that gets produced)
  • Manually created SQL Profiles also don’t appear to use the OPT_ESTIMATE hint. So I believe it is also a valid approach to accept a SQL Profile as offered by the SQL Tuning Advisor and then create a SQL Profile on top of it. Note that you’ll have to use a different category, then drop the original, then enable the new SQL Profile. Which means this approach is a bit more work than just creating an Outline in the DEFAULT category.

Have a look at the difference between the hints generated by the SQL Tuning Advisor and those created by a manual SQL Profile or an Outline (note that I have tried to change the object names to protect the innocent and in so doing may have made it slightly more difficult to follow):

SQL>
SQL> select * from table(dbms_xplan.display_cursor('&sql_id','&child_no',''));
Enter value for sql_id: fknfhx8wth51q
Enter value for child_no: 1

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  fknfhx8wth51q, child number 1
-------------------------------------
SELECT /* test4 */ col1, col2, col3 ...


Plan hash value: 3163842146

----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                          |       |       |  1778 (100)|          |
|   1 |  NESTED LOOPS                 |                          |  1039 | 96627 |  1778   (1)| 00:00:33 |
|   2 |   NESTED LOOPS                |                          |   916 | 57708 |  1778   (1)| 00:00:33 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| TABLE_XXXX_LOOKUP        |   446 | 17840 |   891   (1)| 00:00:17 |
|*  4 |     INDEX RANGE SCAN          | INDEX_XXXX_IS_CPCI       | 12028 |       |    18   (0)| 00:00:01 |
|   5 |    TABLE ACCESS BY INDEX ROWID| TABLE_XXXX_IDENT         |     2 |    46 |     2   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN          | INDEX_XXXXIP_17_FK       |     2 |       |     1   (0)| 00:00:01 |
|*  7 |   INDEX UNIQUE SCAN           | PK_TABLE_XXXX_ASSIGNMENT |     1 |    30 |     0   (0)|          |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(( 
...
   4 - access("L"."COL1"=:N1)
   6 - access("L"."COL2"="I"."COL1")

Note
-----
   - SQL profile SYS_SQLPROF_012061f471d50001 used for this statement


85 rows selected.

SQL> @sql_profile_hints
Enter value for name: SYS_SQLPROF_012061f471d50001

HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
OPT_ESTIMATE(@"SEL$1", TABLE, "L"@"SEL$1", SCALE_ROWS=0.0536172171)
OPT_ESTIMATE(@"SEL$1", INDEX_SKIP_SCAN, "A"@"SEL$1", PK_TABLE_XXXX_ASSIGNMENT, SCALE_ROWS=4)
COLUMN_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", "COL1", scale, length=6 distinct=1234 nulls=0 min=1000000014 max=1026369632)
COLUMN_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", "COL2", scale, length=12 distinct=2 nulls=0)
COLUMN_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", "COL3", scale, length=12 distinct=2 nulls=0)
TABLE_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", scale, blocks=5 rows=2400)
OPTIMIZER_FEATURES_ENABLE(default)

7 rows selected.

SQL> -- no direct hints - only stats and scaling on the profile created by the SQL Tuning Advisor
SQL> -- (i.e. the dreaded OPT_ESTIMATE hints and no directive type hints like INDEX or USE_NL)
SQL>
SQL> -- now let's try an outline on top of it
SQL> @create_outline

Session altered.

Enter value for sql_id: fknfhx8wth51q
Enter value for child_number: 1
Enter value for outline_name: KSOTEST1
Outline KSOTEST1 created.

PL/SQL procedure successfully completed.

SQL> @outline_hints
Enter value for name: KSOTEST1

HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
USE_NL(@"SEL$1" "A"@"SEL$1")
USE_NL(@"SEL$1" "I"@"SEL$1")
LEADING(@"SEL$1" "L"@"SEL$1" "I"@"SEL$1" "A"@"SEL$1")
INDEX(@"SEL$1" "A"@"SEL$1" ("TABLE_XXXX_ASSIGNMENT"."COL1" "TABLE_XXXX_ASSIGNMENT"."COL2" "TABLE_XXXX_ASSIGNMENT"."COL3"))
INDEX_RS_ASC(@"SEL$1" "I"@"SEL$1" ("TABLE_XXXX_IDENT"."COL1"))
INDEX_RS_ASC(@"SEL$1" "L"@"SEL$1" ("TABLE_XXXX_LOOKUP"."COL1" "TABLE_XXXX_LOOKUP"."COL2"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('11.1.0.7')
OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
IGNORE_OPTIM_EMBEDDED_HINTS

11 rows selected.

SQL> -- no OPT_ESTIMATE hints on the outline
SQL> -- directive type hints - INDEX, USE_NL, etc...
SQL> 
SQL> -- now let's try creating a manual profile
SQL> @create_sql_profile.sql
Enter value for sql_id: fknfhx8wth51q
Enter value for child_no: 1
Enter value for category: TEST
Enter value for force_matching:

PL/SQL procedure successfully completed.

SQL> @sql_profile_hints
Enter value for name: PROFILE_fknfhx8wth51q

HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
DB_VERSION('11.1.0.7')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_RS_ASC(@"SEL$1" "L"@"SEL$1" ("TABLE_XXXX_LOOKUP"."COL1" "TABLE_XXXX_LOOKUP"."COL2"))
INDEX_RS_ASC(@"SEL$1" "I"@"SEL$1" ("TABLE_XXXX_IDENT"."COL1"))
INDEX(@"SEL$1" "A"@"SEL$1" ("TABLE_XXXX_ASSIGNMENT"."COL1" "TABLE_XXXX_ASSIGNMENT"."COL2" "TABLE_XXXX_ASSIGNMENT"."COL3"))
LEADING(@"SEL$1" "L"@"SEL$1" "I"@"SEL$1" "A"@"SEL$1")
USE_NL(@"SEL$1" "I"@"SEL$1")
USE_NL(@"SEL$1" "A"@"SEL$1")

11 rows selected.

SQL> -- no OPT_ESTIMATE with the SQL Profile we created manually !
SQL> -- again it's directive - USE_NL, INDEX, LEADING, etc...
SQL>

So I apologize to all you SQL Profiles out there who have been lumped together by my prejudiced view, just because of the acts of a few of your brethren (i.e. the ones created by the SQL Tuning Advisor). SQL Profiles do indeed have all the capabilities of Outlines and probably are a better choice in most cases than Outlines.

Thanks again to Randolf Geist for his comments and his ideas on creating manual SQL Profiles.

Bind Variable Peeking – Drives Me Nuts!

In the constant battle to provide consistent performance, Oracle took a giant step backwards with the 9i version by introducing an “Enhancement” called Bind Variable Peeking. I’ll explain what I mean in a minute, but first a bit of history.

When Oracle introduced histograms in 8i, they provided a mechanism for the optimizer to recognize that the values in a column were not distributed evenly. That is, in a table with 100 rows and 10 distinct values, the default assumption the optimizer would make, in the absence of a histogram, would be that no matter which value you picked – you would always get 100/10 or 10 rows back. Histograms let the optimizer know if that was not the case. The classic example would be 100 records with 2 distinct values where one value, say “Y”, occurred 99 times and the other value, say “N”, occurred only 1 time. So without a histogram the optimizer would always assume that whether you requested records with a “Y” or an “N”, you would get half the records back (100/2 = 50). Therefore you would always want to do a full table scan as opposed to using an index on the column. A histogram, assuming it was accurate (we’ll come back to that later), would let the optimizer know that the distribution was not uniform (i.e. not spread out evenly – also commonly called skewed) and that a “Y” would get basically the whole table, while an “N” would get only 1%. This would allow the optimizer to pick an appropriate plan regardless of which value was specified in the Where Clause.
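
Just to make that concrete, here’s a little throwaway setup you could use to play with this (all the names are made up for illustration):

create table skew_demo as
select rownum pk_col,
       case when rownum = 1 then 'N' else 'Y' end col1
from dual
connect by level <= 100;

create index skew_demo_idx on skew_demo(col1);

-- size 2 asks for a histogram with (up to) two buckets on the skewed column
exec dbms_stats.gather_table_stats(user, 'SKEW_DEMO', method_opt => 'for columns col1 size 2', cascade => true)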

So let’s consider the implications of that. Would that improve the response time for the query where the value was “Y”? The answer is no. In this simple case, the default costing algorithm is close enough and produces the same plan that the histogram produces. The full table scan takes just as long whether the optimizer thought it was getting 50 rows or 99 rows. But what about the case where we specified the value of “N”? In this case, with a histogram we would pick up the index on that column and presumably get a much better response time than the full table scan. This is an important point. Generally speaking it is only the outliers, the exceptional cases if you will, where the histogram really makes a difference.

So at first glance, all appeared well with the world. But there was a fly in the ointment. You had to use literals in your SQL statements for the optimizer to be able to use the histograms. So you had to write your statements like this:

SELECT XYZ FROM TABLE1 WHERE COLUMN1 = ‘Y’;

SELECT XYZ FROM TABLE1 WHERE COLUMN1 = ‘N’;

Not a problem in our simple example, because you only have two possibilities. But consider a statement with 2 or 3 skewed columns, each with a couple of hundred distinct values. The possible combinations could quickly grow into the millions. Not a good thing for the shared pool.

Enter our star: Bind Variable Peeking, a new feature introduced in 9i that was added to allow the optimizer to peek at the value of bind variables and then use a histogram to pick an appropriate plan, just like it would do with literals. The problem with the new feature was that it only looked at the variables once, when the statement was parsed. So let’s make our simple example a little more realistic by assuming we have a 10 million row table where 99% have a value of “Y” and 1% have a value of “N”. So in our example, if the first time the statement was executed it was passed a “Y”, the full table scan plan would be locked in and it would be used until the statement had to be re-parsed, even if the value “N” was passed to it in subsequent executions.
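
Continuing with the made-up skew_demo table from above, the classic way to see this in action looks something like this (a sketch only – with a table this small you may need to scale the row counts up before the optimizer cares, but the mechanics are the same):

variable b1 varchar2(1)

-- first execution: the optimizer peeks at 'Y' during the hard parse and picks the full scan
exec :b1 := 'Y'
select count(pk_col) from skew_demo where col1 = :b1;

-- later execution with 'N' re-uses the same cursor (and the same full scan plan),
-- even though the index would be a much better choice for 'N'
exec :b1 := 'N'
select count(pk_col) from skew_demo where col1 = :b1;

Running dbms_xplan.display_cursor after each execution should show the same plan both times.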

So let’s consider the implication of that. When you get the full table scan plan (because you passed a “Y” the first time) it behaves the same way no matter which value you pass subsequently. Always a full table scan, always the same amount of work and the same basic elapsed time. From a user standpoint that seems reasonable. The performance is consistent. (This is the way it would work without a histogram, by the way.) On the other hand, if the index plan gets picked because the parse occurs with a value of “N”, the executions where the value is “N” will be even faster than they were before, but the executions with a value of “Y” will be incredibly slow. This is not at all what the user expects. They expect the response time to be about the same every time they execute a piece of code. And this is the problem with bind variable peeking. It’s basically just Russian Roulette. It just depends on what value you happen to pass the statement when it’s parsed (which could be any execution, by the way).

So is Bind Variable Peeking a feature or a bug? Well technically it’s not a bug because it works the way it’s designed. I just happen to believe that it was not a good decision to implement it that way. But what other choices did the optimizer development group have?

  • They could have evaluated the bind variables and re-parsed for every execution of every statement using bind variables. This would eliminate the advantage of having bind variables in the first place and would never work for high transaction systems. So it was basically not an option.
  • They could have just said no, and made us use literals in order to get the benefit of histograms (probably not a bad option in retrospect – the fact that they added _optim_peek_user_binds probably means that they decided later to give us that option via setting this hidden parameter; a quick example of flipping that switch follows this list).
  • They could have implemented a system where they could identify statements that might benefit from different plans based on the values of bind variables. Then peek at those variables for every execution of those “bind sensitive” statements (sound familiar? – that’s what they finally did in 11g with Adaptive Cursor Sharing).
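
For what it’s worth, flipping that switch looks like this. It’s a hidden parameter, so the usual caveats apply (test it carefully, and talk to Oracle Support before touching a production system):

-- for a single session
alter session set "_optim_peek_user_binds" = false;

-- or instance-wide, picked up at the next restart
alter system set "_optim_peek_user_binds" = false scope=spfile;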

So why is it such a pervasive problem? And I do believe it is a pervasive problem with 10g in particular. A couple of reasons come to mind:

  1. We’ve been taught to always use bind variables. It’s a best practice which allows SQL statements to be shared, thus eliminating a great deal of work/contention. Using bind variables is an absolute necessity when building scalable high transaction rate systems. (Of course that doesn’t mean that you can’t bend the rule occasionally.)
  2. 10g changed its default stats gathering method to automatically gather histograms. So in a typical 10g database there are a huge number of histograms, many of them inappropriate (i.e. on columns that don’t have significantly skewed distributions) and many of them created with very small sample sizes, causing the histograms to be less than accurate. Note that 11g appears to be better on both counts – that is to say, 11g seems to create fewer inappropriate histograms and seems to create much more accurate histograms with small sample sizes. But the jury is still out on 11g stats gathering as it has not been widely adopted at this point in time.
  3. In my humble opinion, Bind Variable Peeking is not that well understood. When I talk to people about the issue, they usually have heard of it and have a basic idea what the problem is, but their behavior (in terms of the code they write and how they manage their databases) indicates that they don’t really have a good handle on the issue.

So what’s the best way to deal with this issue? Well recognizing that you have a problem is the first step to recovery, so being able to identify that you have a problem with plan stability is an appropriate first step. Direct queries against the Statspack or AWR tables are probably the best way to identify the issue. I’ve posted a couple of scripts that I find useful for this purpose previously – (unstable_plans.sql, awr_plan_stats.sql, awr_plan_change.sql). What you’re looking for is statements that flip flop back and forth between 2 or more plans. Note that there are other reasons for statements to change plans, but Bind Variable Peeking is the number one suspect. Here’s an example of their usage:

SQL> @unstable_plans
SQL> break on plan_hash_value on startup_time skip 1
SQL> select * from (
  2  select sql_id, sum(execs), min(avg_etime) min_etime, max(avg_etime) max_etime, stddev_etime/min(avg_etime) norm_stddev
  3  from (
  4  select sql_id, plan_hash_value, execs, avg_etime,
  5  stddev(avg_etime) over (partition by sql_id) stddev_etime
  6  from (
  7  select sql_id, plan_hash_value,
  8  sum(nvl(executions_delta,0)) execs,
  9  (sum(elapsed_time_delta)/decode(sum(nvl(executions_delta,0)),0,1,sum(executions_delta))/1000000) avg_etime
 10  -- sum((buffer_gets_delta/decode(nvl(buffer_gets_delta,0),0,1,executions_delta))) avg_lio
 11  from DBA_HIST_SQLSTAT S, DBA_HIST_SNAPSHOT SS
 12  where ss.snap_id = S.snap_id
 13  and ss.instance_number = S.instance_number
 14  and executions_delta > 0
 15  and elapsed_time_delta > 0
 16  group by sql_id, plan_hash_value
 17  )
 18  )
 19  group by sql_id, stddev_etime
 20  )
 21  where norm_stddev > nvl(to_number('&min_stddev'),2)
 22  and max_etime > nvl(to_number('&min_etime'),.1)
 23  order by norm_stddev
 24  /
Enter value for min_stddev:
Enter value for min_etime:

SQL_ID        SUM(EXECS)   MIN_ETIME   MAX_ETIME   NORM_STDDEV
------------- ---------- ----------- ----------- -------------
1tn90bbpyjshq         20         .06         .24        2.2039
0qa98gcnnza7h         16       20.62      156.72        4.6669
7vgmvmy8vvb9s        170         .04         .39        6.3705
32whwm2babwpt        196         .02         .26        8.1444
5jjx6dhb68d5v         51         .03         .47        9.3888
71y370j6428cb        155         .01         .38       19.7416
66gs90fyynks7        163         .02         .55       21.1603
b0cxc52zmwaxs        197         .02         .68       23.6470
31a13pnjps7j3        196         .02        1.03       35.1301
7k6zct1sya530        197         .53       49.88       65.2909

10 rows selected.

SQL> @find_sql
SQL> select sql_id, child_number, plan_hash_value plan_hash, executions execs,
  2  (elapsed_time/1000000)/decode(nvl(executions,0),0,1,executions) avg_etime,
  3  buffer_gets/decode(nvl(executions,0),0,1,executions) avg_lio,
  4  sql_text
  5  from v$sql s
  6  where upper(sql_text) like upper(nvl('&sql_text',sql_text))
  7  and sql_text not like '%from v$sql where sql_text like nvl(%'
  8  and sql_id like nvl('&sql_id',sql_id)
  9  order by 1, 2, 3
 10  /
Enter value for sql_text:
Enter value for sql_id: 0qa98gcnnza7h

SQL_ID         CHILD  PLAN_HASH        EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ------------ ------------- ------------ ------------------------------------------------------------
0qa98gcnnza7h      0  568322376            3          9.02      173,807 select avg(pk_col) from kso.skew where col1 > 0

SQL> @awr_plan_stats
SQL> break on plan_hash_value on startup_time skip 1
SQL> select sql_id, plan_hash_value, sum(execs) execs, sum(etime) etime, sum(etime)/sum(execs) avg_etime, sum(lio)/sum(execs) avg_lio
  2  from (
  3  select ss.snap_id, ss.instance_number node, begin_interval_time, sql_id, plan_hash_value,
  4  nvl(executions_delta,0) execs,
  5  elapsed_time_delta/1000000 etime,
  6  (elapsed_time_delta/decode(nvl(executions_delta,0),0,1,executions_delta))/1000000 avg_etime,
  7  buffer_gets_delta lio,
  8  (buffer_gets_delta/decode(nvl(buffer_gets_delta,0),0,1,executions_delta)) avg_lio
  9  from DBA_HIST_SQLSTAT S, DBA_HIST_SNAPSHOT SS
 10  where sql_id = nvl('&sql_id','4dqs2k5tynk61')
 11  and ss.snap_id = S.snap_id
 12  and ss.instance_number = S.instance_number
 13  and executions_delta > 0
 14  )
 15  group by sql_id, plan_hash_value
 16  order by 5
 17  /
Enter value for sql_id: 0qa98gcnnza7h

SQL_ID        PLAN_HASH_VALUE        EXECS          ETIME    AVG_ETIME        AVG_LIO
------------- --------------- ------------ -------------- ------------ --------------
0qa98gcnnza7h       568322376           14          288.7       20.620      172,547.4
0qa98gcnnza7h      3723858078            2          313.4      156.715   28,901,466.0

SQL> @awr_plan_change
SQL> break on plan_hash_value on startup_time skip 1
SQL> select ss.snap_id, ss.instance_number node, begin_interval_time, sql_id, plan_hash_value,
  2  nvl(executions_delta,0) execs,
  3  (elapsed_time_delta/decode(nvl(executions_delta,0),0,1,executions_delta))/1000000 avg_etime,
  4  (buffer_gets_delta/decode(nvl(buffer_gets_delta,0),0,1,executions_delta)) avg_lio
  5  from DBA_HIST_SQLSTAT S, DBA_HIST_SNAPSHOT SS
  6  where sql_id = nvl('&sql_id','4dqs2k5tynk61')
  7  and ss.snap_id = S.snap_id
  8  and ss.instance_number = S.instance_number
  9  and executions_delta > 0
 10  order by 1, 2, 3
 11  /
Enter value for sql_id: 0qa98gcnnza7h

   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
     21857      1 20-MAR-09 04.00.08.872 PM      0qa98gcnnza7h       568322376            1       31.528      173,854.0
     22027      1 27-MAR-09 05.00.08.006 PM      0qa98gcnnza7h                            1      139.141      156,807.0
     22030      1 27-MAR-09 08.00.15.380 PM      0qa98gcnnza7h                            3       12.451      173,731.0
     22031      1 27-MAR-09 08.50.04.757 PM      0qa98gcnnza7h                            2        8.771      173,731.0
     22032      1 27-MAR-09 08.50.47.031 PM      0qa98gcnnza7h      3723858078            1      215.876   28,901,466.0
     22033      1 27-MAR-09 08.57.37.614 PM      0qa98gcnnza7h       568322376            2        9.804      173,731.0
     22034      1 27-MAR-09 08.59.12.432 PM      0qa98gcnnza7h      3723858078            1       97.554   28,901,466.0
     22034      1 27-MAR-09 08.59.12.432 PM      0qa98gcnnza7h       568322376            2        8.222      173,731.5
     22035      1 27-MAR-09 09.12.00.422 PM      0qa98gcnnza7h                            3        9.023      173,807.3

9 rows selected.

So back to the question: what’s the best way to deal with the issue? In general, the best ways to deal with Bind Variable Peeking are as follows:

  1. Only create histograms on skewed columns.
  2. Use literals in where clauses on columns where you have histograms and want to use them. Note that it’s not necessary to use literals for every possible value of a skewed column. There may be only a few outlier values that result in significantly different plans. With a little extra code you can use literals for those values and bind variables for the rest of the values that don’t matter (see the sketch after this list).
  3. If you can’t modify the code, consider turning off Bind Variable Peeking by setting the _OPTIM_PEEK_USER_BINDS parameter to false. You won’t get the absolute best performance for every possible statement, but you will get much more consistent performance, which is, in my opinion, more important than getting the absolute best performance. Keep in mind that this is a hidden parameter and so should be carefully tested and probably discussed with Oracle support prior to implementing it in any production system.
  4. You can also consider stronger methods of forcing the optimizer’s hand such as Outlines (see my previous posts on Unstable Plans and on Outlines). This option provides a quick method of locking in a single plan, but it’s not foolproof. Even with outlines, there is some possibility that the plan can change. Also note that this option is only palatable in situations where you have a relatively small number of problem SQL statements.
  5. Upgrade to 11g and let Adaptive Cursor Sharing take care of all your problems for you (don’t bet on it working without a little effort – I’ll try to do a post on that soon).
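
Here’s a trivial sketch of the literals-for-outliers idea from item 2, again using the made-up skew_demo table. The point is simply that only the handful of values that change the plan need literals; everything else can keep sharing one cursor:

declare
  l_val varchar2(1) := 'N';   -- imagine this arrives from the application
  l_cnt number;
begin
  if l_val = 'N' then
    -- rare (outlier) value: build the statement with a literal so the histogram can do its job
    execute immediate
      'select count(*) from skew_demo where col1 = ''' || l_val || '''' into l_cnt;
  else
    -- common values: share a single cursor via a bind variable
    execute immediate
      'select count(*) from skew_demo where col1 = :b1' into l_cnt using l_val;
  end if;
  dbms_output.put_line('rows: ' || l_cnt);
end;
/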

In summary, using literals with histograms on columns with skewed data distributions is really the only effective way to deal with the issue and still retain the ability for the optimizer to choose the absolute best execution plans. However, if circumstances prevent this approach, there are other techniques that can be applied. These should be considered temporary fixes, but they may work well while a longer term solution is contemplated. From a philosophical standpoint, I strongly believe that consistency is more important than absolute speed. So when a choice must be made, I would always favor slightly reduced but consistent performance over anything that didn’t provide that consistency.

Your comments are always welcome. Please let me know what you think.