Archive for the ‘Plan Stability’ Category.

Oracle 11g Adaptive Cursor Sharing (ACS)

Ah, the long awaited fix for the Bind Variable Peeking problem.

Adaptive Cursor Sharing (ACS) sounds great if you say it fast. Unfortunately it doesn’t work quite as well as I had hoped (at least as of 11.1.0.7). I was really hoping that we’d never have to worry about plan instability due to bind variable peeking again, but it seems that will remain an elusive goal for a while longer.

So the goal of this post is to provide some data about how ACS works based on observations of a couple of production systems. By the way, this post assumes you already have a basic understanding of how ACS works. There are some links at the bottom of this post to other references and in particular, this post is a good basic description of ACS.

So here’s some background info:
Observations were made on a version 10.2.0.4 RAC database that was being migrated to an 11.1.0.7 RAC database. Both are on Redhat Linux and similar Dell hardware. The version 10 database had several queries which suffered from bind variable peeking issues which were addressed with Outlines (while the developers considered using literals). Also, the statements were using real bind variables, not the fake kind that result from setting the cursor_sharing parameter. We were hoping that ACS would solve the bind variable peeking problem without manual intervention (i.e. without using Outlines or SQL Profiles or Baselines to lock in a specific plan).

I have used several scripts in this post:

unstable_plans.sql – this one shows statements that have multiple plans with large variance in execution time
awr_plan_change.sql – history of how the statement plan changes over time
awr_plan_stats.sql – aggregate statement stats (like elapsed_time, lio, etc) grouped by plan
find_sql_acs.sql – A queries v$sql and shows ACS related columns
mismatch3.sql – A variation of Dion Cho’s script to display data from v$sql_shared_cursor (see his post here)

The first couple of these scripts I discussed in a previous post on Plan Instability. I also did a previous post that is a basic overview of the Bind Variable Peeking issue, if you want a little more background info.

So here’s a look at what’s going on with that newly migrated system:

SQL*Plus: Release 11.1.0.7.0 - Production on Mon May 4 19:53:00 2009
 
Copyright (c) 1982, 2008, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
 
SQL> -- find statements with multiple plans with big differences in elapsed time
SQL>
SQL> @unstable_plans
Enter value for min_stddev:
Enter value for min_etime: 1
 
SQL_ID        SUM(EXECS)   MIN_ETIME   MAX_ETIME   NORM_STDDEV
------------- ---------- ----------- ----------- -------------
f2pz10qx59awc          2       11.75       45.31        2.0197
5mh2127hkzbtr         70        2.64       18.06        3.0272
0hur96bxr18jn         24        1.65        9.14        3.1981
76gduk3urk91d          6        9.75       57.82        3.4880
cqxkwk9hp8rpf         31        7.18       43.77        3.6015
3u2bxt4y0740a         17        0.49        4.19        4.1316
af6j2dyzawp7w         78        6.83       60.31        4.4492
2mzzy3u2rtgqx         93        4.55       34.13        4.6025
6vb3gxdfzbhuz         33        0.63        5.21        5.1502
gc69y3vv5ws31         72        0.87        7.64        5.5128
b6zcb86sub9gv          2        0.95        8.74        5.8349
6pdbja617d2g7          2        2.24       23.55        6.7132
cc7yv55yn2wzm          5        0.26        3.05        7.4533
ach69kkyac77x          5        0.23        2.65        7.5202
09xqs3yhmnnc2         82        0.31        4.47        7.5534
3zqwzanpnddt1          2        0.69        8.75        8.2026
70zr1cg7ssfkw          2        0.71        9.04        8.2487
664tcunn5nks9          3        0.36        4.83        8.7858
39f6bx40awrms          2        0.74       10.15        9.0125
5z7v0n6amf8yx        116        9.06      125.53        9.0942
3q67n6qh7tcqv          2        0.08        1.20        9.2843
30yrz0qst88mj         68        0.21        2.97        9.3903
a1mk6hw6s20an         19        0.26        3.78        9.6454
26070mzfxps8d         72        0.18        2.75       10.0027
byznf81kmwumv          2        0.75       11.45       10.0407
1fj1qcqtmgmd1          2        0.54        9.33       11.4092
fg96c4y95u3s8          2        0.49        9.18       12.6745
chxn6vwj02xj4          2        0.65       12.27       12.7001
2xg8psmr3k7vz          2        0.76       14.38       12.7429
grysxv4s2hhkb          2        0.63       12.10       12.8338
2bfs79h84xsch        148        5.65      108.87       12.9179
4qz01hjwat4u3         39        0.27        5.50       13.6528
3wj5ngbv1sa9x          2        0.40        8.70       14.7151
g68szg2ydq6dg         14        0.26        7.48       19.4587
9vt6aaf5xcmh2        381        0.67       24.90       19.9706
cs8ng5sv7jgj4          4        1.12       35.70       21.8969
12a2xbmwn5v6z          7        0.23        8.08       23.7235
d2x7g9wr30v33          4       12.45      638.60       25.4008
4ymph98r42nbj        192        0.11        4.43       26.9188
1uby0zcw55cnp          4        0.63       31.22       34.3088
4jhx8qrkzztqg         15        1.22       66.07       37.6215
7xu6y0cvs55gh         34        0.44       30.08       37.9640
6hstnb2fghg76         77        0.65       49.01       39.2329
gzwgbycgs0fx9        265        0.08        7.37       47.4965
f028rmkp3qjvq         47        0.03        1.99       48.5307
7s1nc9wa2pc4r          2        0.19       15.53       57.6592
f2tm6xrb26y51          2        0.17       14.80       61.7260
0xz1j5y313f3c        206        0.02        2.38       71.6444
3dhwvfmkjzwtv         72        1.24      154.56       87.6684
fb90nawgwx3mj         37        0.34       51.96      108.3873
b7d0d3gu5fvqp         22        0.13       33.85      137.5321
ghdy23pftj44q       1100        0.02        5.93      143.7765
7v2jtb3u02qx6         10        0.43      139.36      172.2951
fw9ntwzhygmcq         87        0.32      100.44      220.3149
gxv6umvct0xsv       4188        0.84      333.79      225.1338
6jdu11g4zzjkh         42        0.41      137.88      234.5440
2wrngntk7v0st        183        0.11       50.61      333.9597
27wha8b8s21xw       1112        0.01        7.08      408.3026
dhmvcrmgdk3sn        881        0.05       46.99      620.1920
6u4c54k36ngwf       4329        0.00        4.39      656.4506
8jt9yh7jf8tn8       4788        0.09      160.35    1,251.5684
4fy7uznh9zz27       4723        0.03       47.16    1,256.8971
2ddh1h012t8au         12        0.03       67.62    1,491.8099
fw9hrrv37hb3v          5        0.04      100.86    1,503.5361
68wg4gjb51dsh       4411        0.01       16.44    1,539.4726
7d407h5cqmv81       4202        0.00      180.63   28,768.6107
 
66 rows selected.

Please note that the unstable_plan.sql script is not guaranteed to find every statement that suffers from plan instability, because some of the good plans never make it into the AWR tables. This is due to the fact that AWR only grabs the “worst” statements in the various categories that AWR reports (i.e. elapsed time, cpu time, gets, reads, and executions). But you could argue that it will find the most important ones, because even when they run fast they still end up in the top of at least one category. Anyway, it’s obvious that there are many statements that are running with multiple plans with wide variances in their execution time.

So let’s look a little closer:

SQL> -- Let's check out the worst one on the list
SQL> --   first let's see how many plans have been used and how they stack up
SQL>
SQL> @awr_plan_stats
Enter value for sql_id: 7d407h5cqmv81
Enter value for snap_id:
 
SQL_ID        PLAN_HASH_VALUE        EXECS          ETIME    AVG_ETIME        AVG_LIO
------------- --------------- ------------ -------------- ------------ --------------
7d407h5cqmv81      4289789142          840            2.4        0.003          273.7
7d407h5cqmv81      1007536393          551          116.2        0.211       14,563.1
7d407h5cqmv81      1723168170        2,852       44,716.2       15.679      255,603.3
7d407h5cqmv81      2337118685            1          180.6      180.629    6,495,990.0
 
SQL> -- So the best plan was executed 840 times and took .003 seconds on average 
SQL> -- The worst one was only executed once but took 180.6 seconds and did about 6.5M lios
SQL>
SQL>  -- Now let's see if it looks like bind variable peeking - (i.e. the plan is flip-flopping)
SQL>
SQL> @awr_plan_change
Enter value for sql_id: 7d407h5cqmv81
 
   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
      1732      2 22-APR-09 01.00.06.582 PM      7d407h5cqmv81      1007536393            1        1.478        5,269.0
      1733      2 22-APR-09 02.00.06.425 PM      7d407h5cqmv81                            8        1.635       14,580.0
      1734      2 22-APR-09 03.00.01.366 PM      7d407h5cqmv81                            6        0.196       14,580.0
      1734      3 22-APR-09 03.00.00.710 PM      7d407h5cqmv81      1723168170            2        8.891      228,740.5
      1735      2 22-APR-09 04.00.19.942 PM      7d407h5cqmv81      1007536393            3        0.189       14,580.0
      1736      2 22-APR-09 05.00.06.982 PM      7d407h5cqmv81                           63        0.188       14,580.0
      1736      3 22-APR-09 05.00.06.350 PM      7d407h5cqmv81      1723168170           22        9.564      236,902.3
      1737      2 22-APR-09 06.00.03.582 PM      7d407h5cqmv81      1007536393          244        0.187       14,580.0
      1737      3 22-APR-09 06.00.04.267 PM      7d407h5cqmv81      1723168170           76        9.505      236,959.9
      1738      2 22-APR-09 07.00.28.278 PM      7d407h5cqmv81      1007536393          226        0.188       14,580.0
      1738      3 22-APR-09 07.00.28.926 PM      7d407h5cqmv81      1723168170           71        9.395      240,745.0
      1739      2 22-APR-09 08.00.09.681 PM      7d407h5cqmv81                           81       11.114      253,481.2
      1739      3 22-APR-09 08.00.10.422 PM      7d407h5cqmv81                           22        9.866      249,882.3
      1740      2 22-APR-09 09.00.43.176 PM      7d407h5cqmv81                          215       13.682      256,467.0
      1740      3 22-APR-09 09.00.43.910 PM      7d407h5cqmv81                           80       12.605      252,606.0
      1741      2 22-APR-09 10.00.16.298 PM      7d407h5cqmv81                          367       20.209      256,157.1
      1741      3 22-APR-09 10.00.17.072 PM      7d407h5cqmv81                          214       15.039      257,287.3
      1742      2 22-APR-09 11.00.28.426 PM      7d407h5cqmv81                          376       17.855      257,066.5
      1742      3 22-APR-09 11.00.27.666 PM      7d407h5cqmv81                          178       13.437      255,902.4
      1743      2 23-APR-09 12.00.16.137 AM      7d407h5cqmv81                          352       18.639      257,326.6
      1743      3 23-APR-09 12.00.15.386 AM      7d407h5cqmv81                          182       13.787      258,192.2
      1744      2 23-APR-09 01.00.03.520 AM      7d407h5cqmv81                          285       16.275      256,335.3
      1744      3 23-APR-09 01.00.02.838 AM      7d407h5cqmv81                           94       13.492      258,508.9
      1745      2 23-APR-09 02.00.21.107 AM      7d407h5cqmv81                          132       14.876      258,913.5
      1745      3 23-APR-09 02.00.20.482 AM      7d407h5cqmv81                          103       13.113      256,821.0
      1750      2 23-APR-09 07.00.33.920 AM      7d407h5cqmv81      4289789142           77        0.002          267.0
      1753      2 23-APR-09 09.34.00.440 AM      7d407h5cqmv81                          278        0.001          267.0
      1754      2 23-APR-09 11.00.06.058 AM      7d407h5cqmv81                           96        0.001          267.0
      1754      3 23-APR-09 11.00.05.422 AM      7d407h5cqmv81                           42        0.000          267.0
      1755      2 23-APR-09 11.17.45.649 AM      7d407h5cqmv81                          283        0.001          267.0
      1756      2 23-APR-09 12.00.35.977 PM      7d407h5cqmv81                           62        0.000          267.0
      1894      3 29-APR-09 06.00.09.823 AM      7d407h5cqmv81                            1        0.444        2,777.0
      1900      2 29-APR-09 12.00.36.427 PM      7d407h5cqmv81      2337118685            1      180.629    6,495,990.0
      1943      3 01-MAY-09 07.00.02.507 AM      7d407h5cqmv81      4289789142            1        1.199        3,426.0
 
34 rows selected.
 
SQL> -- It does look like a typical bind variable peeking flip-flop pattern
SQL>
SQL> -- Now let's see if it's still in the shared pool and if it's bind aware
SQL>
SQL> @find_sql_acs
Enter value for sql_text:
Enter value for sql_id: 7d407h5cqmv81
Enter value for is_bind_aware:
 
SQL_ID         CHILD PLAN_HASH_VALUE IBS IBA ISH      EXECS ROWS_PROCESSED  AVG_ETIME    AVG_CPU    AVG_PIO    AVG_LIO SQL_TEXT
------------- ------ --------------- --- --- --- ---------- -------------- ---------- ---------- ---------- ---------- ------------------------------
7d407h5cqmv81      0      4289789142 N   N   Y            2              2       0.62       0.07     315.50      1,758 SELECT COUNT(*) as total_recor
7d407h5cqmv81      1      4289789142 N   N   Y            2              2       0.03       0.01       1.00        264
 
SQL> -- It is in the shared pool, but it is not bind aware
SQL>

So the previous example showed a statement that had several plans, one of which definitely had better performance characteristics. But since there was only one plan in the shared pool at the time we examined it, there is is no reason to expect that Oracle would have recognized that the statement should be bind aware. This brings up an interesting point. It does not appear that (at least as of 11.1.0.7) Oracle keeps track of bind sensitivity of a statement persistently. That is to say, if the statement gets flushed for any reason, the optimizer appears to completely forget about any analysis it has done up to that point. Thus the painful (for the user) process of discovering which sets of bind variables should go with which plan starts all over.

So here’s another statement:

SQL> -- another example from the list of unstable_plans
SQL>
SQL> @awr_plan_stats
Enter value for sql_id: 8jt9yh7jf8tn8
 
SQL_ID        PLAN_HASH_VALUE        EXECS          ETIME    AVG_ETIME        AVG_LIO
------------- --------------- ------------ -------------- ------------ --------------
8jt9yh7jf8tn8      1093407144        4,818          481.5        0.100        2,818.6
8jt9yh7jf8tn8      4076066623           21        3,269.3      155.679    5,901,988.0
 
SQL> @awr_plan_change
Enter value for sql_id: 8jt9yh7jf8tn8
 
   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
      1785      3 24-APR-09 05.00.13.361 PM      8jt9yh7jf8tn8      1093407144            6        1.102        2,872.7
      1786      2 24-APR-09 06.00.02.510 PM      8jt9yh7jf8tn8                          158        0.024        2,873.0
      1786      3 24-APR-09 06.00.03.170 PM      8jt9yh7jf8tn8                          223        0.023        2,873.0
      1787      2 24-APR-09 07.00.30.171 PM      8jt9yh7jf8tn8                          749        0.020        2,873.0
      1787      3 24-APR-09 07.00.30.935 PM      8jt9yh7jf8tn8                          873        0.019        2,873.0
      1788      2 24-APR-09 08.00.03.359 PM      8jt9yh7jf8tn8                          726        0.020        2,873.9
      1788      3 24-APR-09 08.00.04.148 PM      8jt9yh7jf8tn8                          871        0.020        2,873.9
      1789      2 24-APR-09 09.00.28.203 PM      8jt9yh7jf8tn8                          373        0.016        2,874.0
      1789      3 24-APR-09 09.00.27.481 PM      8jt9yh7jf8tn8                          566        0.016        2,874.0
      1892      2 29-APR-09 04.00.02.385 AM      8jt9yh7jf8tn8                            1        2.613        3,811.0
      1894      2 29-APR-09 06.00.09.154 AM      8jt9yh7jf8tn8                            2        0.462          733.0
      1894      3 29-APR-09 06.00.09.823 AM      8jt9yh7jf8tn8                            2        1.056          847.0
      1895      2 29-APR-09 07.00.00.226 AM      8jt9yh7jf8tn8                            7        1.714        2,869.6
      1895      3 29-APR-09 07.00.00.921 AM      8jt9yh7jf8tn8                            2        0.856        1,208.0
      1896      2 29-APR-09 08.00.20.446 AM      8jt9yh7jf8tn8                            9        1.635        2,103.0
      1897      2 29-APR-09 09.00.09.367 AM      8jt9yh7jf8tn8                            2        8.179        8,529.0
      1897      3 29-APR-09 09.00.10.115 AM      8jt9yh7jf8tn8                           17        1.714        3,416.5
      1898      2 29-APR-09 10.00.43.551 AM      8jt9yh7jf8tn8                            5        2.553        2,733.2
      1898      3 29-APR-09 10.00.42.788 AM      8jt9yh7jf8tn8                            6        3.751        4,484.2
      1899      3 29-APR-09 11.00.10.447 AM      8jt9yh7jf8tn8                            7        1.742        3,284.4
      1900      3 29-APR-09 12.00.35.788 PM      8jt9yh7jf8tn8                            2        1.199          966.0
      1901      2 29-APR-09 01.00.18.515 PM      8jt9yh7jf8tn8                            8        2.345        2,409.6
      1902      3 29-APR-09 02.00.15.910 PM      8jt9yh7jf8tn8                            2        3.941        2,649.5
      1905      2 29-APR-09 05.00.02.254 PM      8jt9yh7jf8tn8                            1        0.887        1,230.0
      1918      2 30-APR-09 06.00.09.089 AM      8jt9yh7jf8tn8                            1        0.653        1,248.0
      1918      3 30-APR-09 06.00.08.403 AM      8jt9yh7jf8tn8                            2        0.421          485.5
      1919      2 30-APR-09 07.00.28.148 AM      8jt9yh7jf8tn8                            1        1.152        1,242.0
      1920      2 30-APR-09 08.00.03.733 AM      8jt9yh7jf8tn8                            4        3.273        3,200.3
      1920      3 30-APR-09 08.00.04.389 AM      8jt9yh7jf8tn8                           12        2.491        3,314.2
      1921      2 30-APR-09 09.00.10.125 AM      8jt9yh7jf8tn8                            5        3.947        3,333.4
      1921      3 30-APR-09 09.00.10.838 AM      8jt9yh7jf8tn8                            2        2.416        1,769.5
      1922      3 30-APR-09 10.00.25.754 AM      8jt9yh7jf8tn8      4076066623            2       54.237    2,291,432.5
      1923      2 30-APR-09 11.00.17.345 AM      8jt9yh7jf8tn8      1093407144            2        0.812          975.0
      1923      3 30-APR-09 11.00.18.032 AM      8jt9yh7jf8tn8      4076066623            3      134.031      933,124.3
      1924      3 30-APR-09 12.00.15.448 PM      8jt9yh7jf8tn8                            3      227.009    6,987,169.3
      1926      2 30-APR-09 02.00.11.921 PM      8jt9yh7jf8tn8      1093407144            8        0.818        1,574.5
      1926      3 30-APR-09 02.00.11.174 PM      8jt9yh7jf8tn8      4076066623            2      175.709    8,963,417.0
      1927      2 30-APR-09 03.00.24.923 PM      8jt9yh7jf8tn8      1093407144            4        1.344        1,068.8
      1927      3 30-APR-09 03.00.24.306 PM      8jt9yh7jf8tn8      4076066623            5      156.378   10,159,992.0
      1928      2 30-APR-09 04.00.30.061 PM      8jt9yh7jf8tn8      1093407144            6        0.923        1,225.8
      1928      3 30-APR-09 04.00.29.416 PM      8jt9yh7jf8tn8      4076066623            1      180.488    2,150,190.0
      1930      3 30-APR-09 06.00.37.119 PM      8jt9yh7jf8tn8                            2      180.371    8,255,881.5
      1934      3 30-APR-09 10.00.12.055 PM      8jt9yh7jf8tn8                            1      180.491    3,102,577.0
      1939      2 01-MAY-09 03.00.31.764 AM      8jt9yh7jf8tn8      1093407144           21        0.825        1,041.8
      1939      3 01-MAY-09 03.00.32.453 AM      8jt9yh7jf8tn8                            4        0.575        1,211.8
      1944      3 01-MAY-09 08.00.15.686 AM      8jt9yh7jf8tn8                            6        1.328        1,788.3
      1946      2 01-MAY-09 10.00.17.105 AM      8jt9yh7jf8tn8                            1        1.170        2,411.0
      1946      3 01-MAY-09 10.00.16.458 AM      8jt9yh7jf8tn8                            4        2.041        2,414.3
      1947      3 01-MAY-09 11.00.14.385 AM      8jt9yh7jf8tn8                           10        1.725        2,937.1
      1948      3 01-MAY-09 12.00.08.928 PM      8jt9yh7jf8tn8                            3        2.232        3,415.7
      1987      2 03-MAY-09 03.00.03.550 AM      8jt9yh7jf8tn8                            7        1.029          901.0
      1990      3 03-MAY-09 06.00.07.641 AM      8jt9yh7jf8tn8                            3        1.225        1,465.7
      1991      3 03-MAY-09 07.00.05.445 AM      8jt9yh7jf8tn8                           26        0.370          710.5
      1992      2 03-MAY-09 08.00.20.010 AM      8jt9yh7jf8tn8                            6        0.213          685.7
      1992      3 03-MAY-09 08.00.19.366 AM      8jt9yh7jf8tn8                            3        0.658          883.0
      1993      2 03-MAY-09 09.00.04.983 AM      8jt9yh7jf8tn8                            8        0.769          950.9
      1996      2 03-MAY-09 12.00.19.205 PM      8jt9yh7jf8tn8                            2        0.101          861.5
      2015      3 04-MAY-09 07.00.13.869 AM      8jt9yh7jf8tn8                            4        0.376          854.5
      2016      3 04-MAY-09 08.00.09.477 AM      8jt9yh7jf8tn8                            6        0.143          571.0
      2019      2 04-MAY-09 11.00.11.317 AM      8jt9yh7jf8tn8                           12        0.937        1,352.1
      2019      3 04-MAY-09 11.00.10.691 AM      8jt9yh7jf8tn8                           10        1.612        1,341.9
      2019      3 04-MAY-09 11.00.10.691 AM      8jt9yh7jf8tn8      4076066623            1       41.592    3,942,672.0
      2020      2 04-MAY-09 12.00.06.355 PM      8jt9yh7jf8tn8      1093407144           15        1.037        1,734.6
      2020      3 04-MAY-09 12.00.06.919 PM      8jt9yh7jf8tn8      4076066623            1      181.044    1,764,007.0
      2022      2 04-MAY-09 02.00.26.599 PM      8jt9yh7jf8tn8      1093407144            2        2.214        2,780.5
 
65 rows selected.
 
SQL> -- typical bind variable peeking pattern
SQL> -- let's look at just one node - it'll be a little more clear
SQL> l8
  8* and ss.instance_number = S.instance_number
SQL> i
  9i and s.instance_number = &inst
 10i
SQL>
SQL> /
Enter value for sql_id: 8jt9yh7jf8tn8
Enter value for inst: 3
 
   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
      1785      3 24-APR-09 05.00.13.361 PM      8jt9yh7jf8tn8      1093407144            6        1.102        2,872.7
      1786      3 24-APR-09 06.00.03.170 PM      8jt9yh7jf8tn8                          223        0.023        2,873.0
      1787      3 24-APR-09 07.00.30.935 PM      8jt9yh7jf8tn8                          873        0.019        2,873.0
      1788      3 24-APR-09 08.00.04.148 PM      8jt9yh7jf8tn8                          871        0.020        2,873.9
      1789      3 24-APR-09 09.00.27.481 PM      8jt9yh7jf8tn8                          566        0.016        2,874.0
      1894      3 29-APR-09 06.00.09.823 AM      8jt9yh7jf8tn8                            2        1.056          847.0
      1895      3 29-APR-09 07.00.00.921 AM      8jt9yh7jf8tn8                            2        0.856        1,208.0
      1897      3 29-APR-09 09.00.10.115 AM      8jt9yh7jf8tn8                           17        1.714        3,416.5
      1898      3 29-APR-09 10.00.42.788 AM      8jt9yh7jf8tn8                            6        3.751        4,484.2
      1899      3 29-APR-09 11.00.10.447 AM      8jt9yh7jf8tn8                            7        1.742        3,284.4
      1900      3 29-APR-09 12.00.35.788 PM      8jt9yh7jf8tn8                            2        1.199          966.0
      1902      3 29-APR-09 02.00.15.910 PM      8jt9yh7jf8tn8                            2        3.941        2,649.5
      1918      3 30-APR-09 06.00.08.403 AM      8jt9yh7jf8tn8                            2        0.421          485.5
      1920      3 30-APR-09 08.00.04.389 AM      8jt9yh7jf8tn8                           12        2.491        3,314.2
      1921      3 30-APR-09 09.00.10.838 AM      8jt9yh7jf8tn8                            2        2.416        1,769.5
      1922      3 30-APR-09 10.00.25.754 AM      8jt9yh7jf8tn8      4076066623            2       54.237    2,291,432.5
      1923      3 30-APR-09 11.00.18.032 AM      8jt9yh7jf8tn8                            3      134.031      933,124.3
      1924      3 30-APR-09 12.00.15.448 PM      8jt9yh7jf8tn8                            3      227.009    6,987,169.3
      1926      3 30-APR-09 02.00.11.174 PM      8jt9yh7jf8tn8                            2      175.709    8,963,417.0
      1927      3 30-APR-09 03.00.24.306 PM      8jt9yh7jf8tn8                            5      156.378   10,159,992.0
      1928      3 30-APR-09 04.00.29.416 PM      8jt9yh7jf8tn8                            1      180.488    2,150,190.0
      1930      3 30-APR-09 06.00.37.119 PM      8jt9yh7jf8tn8                            2      180.371    8,255,881.5
      1934      3 30-APR-09 10.00.12.055 PM      8jt9yh7jf8tn8                            1      180.491    3,102,577.0
      1939      3 01-MAY-09 03.00.32.453 AM      8jt9yh7jf8tn8      1093407144            4        0.575        1,211.8
      1944      3 01-MAY-09 08.00.15.686 AM      8jt9yh7jf8tn8                            6        1.328        1,788.3
      1946      3 01-MAY-09 10.00.16.458 AM      8jt9yh7jf8tn8                            4        2.041        2,414.3
      1947      3 01-MAY-09 11.00.14.385 AM      8jt9yh7jf8tn8                           10        1.725        2,937.1
      1948      3 01-MAY-09 12.00.08.928 PM      8jt9yh7jf8tn8                            3        2.232        3,415.7
      1990      3 03-MAY-09 06.00.07.641 AM      8jt9yh7jf8tn8                            3        1.225        1,465.7
      1991      3 03-MAY-09 07.00.05.445 AM      8jt9yh7jf8tn8                           26        0.370          710.5
      1992      3 03-MAY-09 08.00.19.366 AM      8jt9yh7jf8tn8                            3        0.658          883.0
      2015      3 04-MAY-09 07.00.13.869 AM      8jt9yh7jf8tn8                            4        0.376          854.5
      2016      3 04-MAY-09 08.00.09.477 AM      8jt9yh7jf8tn8                            6        0.143          571.0
      2019      3 04-MAY-09 11.00.10.691 AM      8jt9yh7jf8tn8      4076066623            1       41.592    3,942,672.0
      2019      3 04-MAY-09 11.00.10.691 AM      8jt9yh7jf8tn8      1093407144           10        1.612        1,341.9
      2020      3 04-MAY-09 12.00.06.919 PM      8jt9yh7jf8tn8      4076066623            1      181.044    1,764,007.0
      2031      3 04-MAY-09 11.00.03.519 PM      8jt9yh7jf8tn8      1093407144            1        0.737          482.0
      2039      3 05-MAY-09 07.00.27.610 AM      8jt9yh7jf8tn8      4076066623            5       42.900    4,295,251.8
      2041      3 05-MAY-09 09.00.09.829 AM      8jt9yh7jf8tn8                            2        3.282    1,968,698.5
      2044      3 05-MAY-09 12.00.16.920 PM      8jt9yh7jf8tn8      1093407144            2        1.535          844.5
      2045      3 05-MAY-09 01.00.08.637 PM      8jt9yh7jf8tn8                            2        2.384        1,683.5
 
41 rows selected.
 
SQL> -- the plans are clearly flip-flopping
SQL> -- and the performance of plan 1093407144 is clearly orders of magnitude better
SQL> 
SQL> -- Let's see what's in the shared pool
SQL>
SQL> @find_sql_acs
Enter value for sql_text:
Enter value for sql_id: 8jt9yh7jf8tn8
Enter value for is_bind_aware:
 
SQL_ID         CHILD PLAN_HASH_VALUE IBS IBA ISH      EXECS ROWS_PROCESSED  AVG_ETIME    AVG_CPU    AVG_PIO    AVG_LIO SQL_TEXT
------------- ------ --------------- --- --- --- ---------- -------------- ---------- ---------- ---------- ---------- -----------------------------------
8jt9yh7jf8tn8      0      4076066623 N   N   Y           19              6     160.35      29.28 ##########  6,222,898 SELECT row_order, cdr_id, site_id,
8jt9yh7jf8tn8      1      1093407144 N   N   Y           61              0       1.92       0.13     565.70      2,860 SELECT row_order, cdr_id, site_id,
8jt9yh7jf8tn8      2                 N   N   Y           55              0       1.01       0.06     252.09      1,555 SELECT row_order, cdr_id, site_id,
 
SQL> -- So there are children with both plans in the shared the pool (135 execs total)
SQL> -- But the cursors are not marked as bind aware
SQL> -- So why are there multiple child cursors?
SQL> -- (using a modified version of Dion Cho's script)
SQL>
SQL> @mismatch3
Enter value for sql_id: 8jt9yh7jf8tn8
SQL_ID                         = 8jt9yh7jf8tn8
CHILD_NUMBER                   = 1
--------------------------------------------------
SQL_ID                         = 8jt9yh7jf8tn8
CHILD_NUMBER                   = 0
ROLL_INVALID_MISMATCH          = Y
--------------------------------------------------
SQL_ID                         = 8jt9yh7jf8tn8
CHILD_NUMBER                   = 2
ROLL_INVALID_MISMATCH          = Y
--------------------------------------------------
 
PL/SQL procedure successfully completed.
 
SQL> -- none of the bind related flags show up, so it doesn't look promising for ACS
SQL> --

Brief Digression: The ROLL_INVALID_MISMATCH flag has to do with changing stats using AUTO_INVALIDATE which does not cause cursors to be immediately invalidated, but instead let’s them be invalidated over a rolling window (_optimizer_invalidation_period which defaults to 5 hours).

So then I got to wondering if any statements were being marked as bind aware.

SQL> @find_sql_acs
Enter value for sql_text:
Enter value for sql_id:
Enter value for is_bind_aware: Y
 
SQL_ID         CHILD PLAN_HASH_VALUE IBS IBA ISH      EXECS ROWS_PROCESSED  AVG_ETIME    AVG_CPU    AVG_PIO    AVG_LIO SQL_TEXT
------------- ------ --------------- --- --- --- ---------- -------------- ---------- ---------- ---------- ---------- ---------------
0qvgb3dyfg539      1       722236007 Y   Y   Y            2              0        .15        .02      23.00      3,487 SELECT row_orde
17uuqnvxmzxhj      1      3038781757 Y   Y   Y           31             31        .17        .01      43.74      1,126 SELECT COUNT(*)
                   3      3038781757 Y   Y   N           21             21        .02        .02       1.33      7,290
                   4      3038781757 Y   Y   Y           52             52        .23        .02     120.29      2,046
                   6      3038781757 Y   Y   Y           51             51        .00        .00        .10        284
34x6683rpwtxa      4       722236007 Y   Y   Y           18            164        .01        .00        .17        498 SELECT row_orde
3tfx8fzp64vkb      1      3038781757 Y   Y   Y            2              2        .01        .01        .00      1,178 SELECT COUNT(*)
4vb86f36xqc50      1      2983410489 Y   Y   Y           62           1683        .12        .01      53.76        960 SELECT row_orde
                   4      2983410489 Y   Y   Y            7            163        .69        .07     328.43        723
58p0j1q6rmv34      1      1144901783 Y   Y   Y            2              2        .02        .00        .00        335 SELECT COUNT(*)
5mxqphz5qfs4d      1      1144901783 Y   Y   Y            2              2        .02        .00        .00         54 SELECT COUNT(*)
dt1v1cmua9cnq      1      4076066623 Y   Y   Y            8             37       3.47       3.41      15.63  2,233,144 SELECT row_orde
ftxa99d89yzz0      1      4289789142 Y   Y   Y            2              2        .01        .00        .00        157 SELECT COUNT(*)
g375mcpc30dy5      2      1690109023 Y   Y   N            1             10        .03        .01        .00        767 SELECT row_orde
                   3      1690109023 Y   Y   Y            5             24        .02        .00       2.00        362
 
15 rows selected.
 
SQL> --

Sure enough, there are a few (10 statements to be exact with a total of 15 child cursors) that are marked as Bind Aware. However, there are only 3 statements that have multiple child cursors (sql_id: 17uuqnvxmzxhj, 4vb86f36xqc50, g375mcpc30dy5). And for each of those statements, the child cursors all have the same plan (sql_id: 17uuqnvxmzxhj, for example, has 3 children all using plan_hash_value: 3038781757). So even though they are marked Bind Aware, and have multiple child cursors, they are not producing distinct plans for those statements. So even the ones that the optimizer has noticed, do not appear to be providing us any benefit.

But back to the original question. We have many statements that are suffering from bind variable peeking issues that are apparently not being noticed. Why is that? Well I got to thinking about why ACS might be ignoring the statements that I know have this issue. One of the characteristics of these statements is that they have a relatively large number of bind variables. Some of them have as many as a couple of hundred. So I let my mind wander off into programmer land. How would the guys writing the optimizer code deal with a situation where they had a virtually unlimited number of bind variables. My first thought was to just put a hard limit on the number of variables to handle. Histograms have a relatively small number of potential buckets, and since ACS uses a histogram approach, it stands to reason the programmers might have put in a relatively small limit for their first pass. So I went off to test this idea (since this post is already so long I’ll cut right to the chase). There does appear to be a hard limit of 14 bind variables and statements that have more than that are apparently not evaluated for bind sensitivity. Here’s a test case:

> !sql
sqlplus "/ as sysdba"
 
SQL*Plus: Release 11.1.0.7.0 - Production on Mon May 11 11:16:06 2009
 
Copyright (c) 1982, 2008, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
 
SQL> !cat t14.sql
set echo off
variable v1 number
variable v2 varchar2(30)
variable v3 varchar2(30)
variable v4 varchar2(30)
variable v5 varchar2(30)
variable v6 varchar2(30)
variable v7 varchar2(30)
variable v8 varchar2(30)
variable v9 varchar2(30)
variable v9 varchar2(30)
variable v10 varchar2(30)
variable v11 varchar2(30)
variable v12 varchar2(30)
variable v13 varchar2(30)
variable v14 varchar2(30)
variable v15 varchar2(30)
variable v16 varchar2(30)
variable v17 varchar2(30)
variable v18 varchar2(30)
variable v19 varchar2(30)
exec :v1 := 999999;
exec :v2 := 'TESTING'
exec :v3 := '01-jan-08'
exec :v4 := '01-jan-09'
exec :v5 := 'Y'
exec :v6 := 'X'
exec :v7 := 'y'
exec :v8 := 'Z'
exec :v9 := 'a'
exec :v10 := 'b'
exec :v10 := 'c'
exec :v11 := 'd'
exec :v12 := 'e'
exec :v13 := 'f'
exec :v14 := 'g'
exec :v15 := 'h'
exec :v16 := 'i'
exec :v17 := 'j'
exec :v18 := 'k'
exec :v19 := 'l'
set echo on
select avg(pk_col) from kso.little_skew
where col1 = :v1
and col2 = :v2
and col3 between :v3 and :v4
and col4 in (:v5,:v6,:v7,:v8,:v9,:v10,:v11,:v12,:v13,:v14)
/
 
SQL> @flush_pool
SQL> @t14
SQL> @t14
SQL> @t14
SQL> @t14
SQL> 
SQL> -- 4 execs of the first bind variable set, now let's use a different set of bind variables
SQL> -- t14a.sql is the same except v1=1 - which (due to histogram) wants to use a different plan
SQL>
SQL> @t14a
SQL> @t14a
SQL> @t14a
SQL> @t14a
SQL> @t14a
SQL> @t14a
SQL> @t14a
SQL>
SQL> -- 7 execs with the new set of bind variables, should give it plenty of info to make it bind aware
SQL> -- now a few more with the original set of bind variables
SQL>
SQL> @t14
SQL> @t14
SQL> @t14
SQL> @t14
SQL>
SQL> -- now let's see what we've got
SQL>
SQL> @find_sql_acs 
Enter value for sql_text: %v14)%
Enter value for sql_id: 
Enter value for is_bind_aware: 
 
SQL_ID         CHILD PLAN_HASH_VALUE IBS IBA ISH      EXECS ROWS_PROCESSED  AVG_ETIME    AVG_CPU    AVG_PIO    AVG_LIO SQL_TEXT
------------- ------ --------------- --- --- --- ---------- -------------- ---------- ---------- ---------- ---------- -----------------------------------
2yk8hbkc4kbd5      0       376960484 N   N   N            8              8        .18        .18        .00      2,641 select avg(pk_col) from kso.little_
                                                                                                                       skew where col1 = :v1 and col2 = :v
                                                                                                                       2 and col3 between :v3 and :v4 and
                                                                                                                       col4 in (:v5,:v6,:v7,:v8,:v9,:v10,:
                                                                                                                       v11,:v12,:v13,:v14)
 
2yk8hbkc4kbd5      1      3746388338 N   Y   Y            3              3        .00        .00        .00         15 select avg(pk_col) from kso.little_
                                                                                                                       skew where col1 = :v1 and col2 = :v
                                                                                                                       2 and col3 between :v3 and :v4 and
                                                                                                                       col4 in (:v5,:v6,:v7,:v8,:v9,:v10,:
                                                                                                                       v11,:v12,:v13,:v14)
 
2yk8hbkc4kbd5      2       376960484 N   Y   Y            4              4        .00        .00        .00          3 select avg(pk_col) from kso.little_
                                                                                                                       skew where col1 = :v1 and col2 = :v
                                                                                                                       2 and col3 between :v3 and :v4 and
                                                                                                                       col4 in (:v5,:v6,:v7,:v8,:v9,:v10,:
                                                                                                                       v11,:v12,:v13,:v14)
 
 
3 rows selected.
 
SQL> -- As expected, the optimizer has noticed and marked it Bind Aware
SQL> -- Now let's try the same thing with 15 bind variables instead of 14
SQL> -- (note: t15 is the same with one more bind variable)
SQL>
SQL> @t15
SQL> @t15
SQL> @t15
SQL> @t15
SQL> @t15a
SQL> @t15a
SQL> @t15a
SQL> @t15a
SQL> @t15a
SQL> @t15a
SQL> @t15a
SQL> @t15
SQL> @t15
SQL> @t15
SQL> @t15
SQL> @find_sql_acs
Enter value for sql_text: %v15)%
Enter value for sql_id: 
Enter value for is_bind_aware: 
 
SQL_ID         CHILD PLAN_HASH_VALUE IOB IBA ISH      EXECS ROWS_PROCESSED  AVG_ETIME    AVG_CPU    AVG_PIO    AVG_LIO SQL_TEXT
------------- ------ --------------- --- --- --- ---------- -------------- ---------- ---------- ---------- ---------- -----------------------------------
4ndg8w1ga5dth      0       376960484 N   N   Y           15             15        .17        .16        .00      2,465 select avg(pk_col) from kso.little_
                                                                                                                       skew where col1 = :v1 and col2 = :v
                                                                                                                       2 and col3 between :v3 and :v4 and
                                                                                                                       col4 in (:v5,:v6,:v7,:v8,:v9,:v10,:
                                                                                                                       v11,:v12,:v13,:v14,:v15)
 
 
1 row selected.
 
SQL> -- no joy!
SQL>
SQL> -- so now I wondered if it really was just a hard limit or something more complicated
SQL> -- so I wrote a little script to count the number of bind variables in Bind Aware statements
SQL>
SQL> !cat acs_bind_count.sql
select distinct sql_id, bind_count from (
select sql_id, child_number, count(*) bind_count from v$sql_bind_capture
where sql_id in (
select sql_id from v$sqlarea where is_bind_aware = 'Y')
group by sql_id, child_number
)
order by bind_count
/
 
SQL> @acs_bind_count.sql
 
SQL_ID        BIND_COUNT
------------- ----------
2jmg3n6k2q82x          2
dxjvgtd3stafn          4
ck67mwdmdnaww          8
9pcw6qh83mj69          9
883uvnn2gz36u         10
4ysyd47nkap6d         11
7mygfqgs52au2         12
2744r65xx9h2h         13
2yk8hbkc4kbd5         14
 
9 rows selected.

So it looks like there probably is a hard limit. Let’s see if I can disprove that by looking at a couple of production systems. Here’s the output from the same script on four production instances (note: system 1 is that same one that I referenced earlier with the statements with known Bind Variable Peeking issues, where ACS was not noticing them):

-- System 1, Node 2
 
SQL> @acs_bind_count
 
SQL_ID        BIND_COUNT
------------- ----------
17uuqnvxmzxhj          8
58p0j1q6rmv34          8
5mxqphz5qfs4d          8
g375mcpc30dy5          8
3tfx8fzp64vkb          9
4vb86f36xqc50          9
34x6683rpwtxa         10
dt1v1cmua9cnq         11
ftxa99d89yzz0         11
0qvgb3dyfg539         12
 
10 rows selected.
 
 
-- System 1, Node 3
 
SQL> @acs_bind_count
 
SQL_ID        BIND_COUNT
------------- ----------
5bjg6w9btv3zn          7
17uuqnvxmzxhj          8
6n1790gk5m0hy          8
2pv7g3nutnyq5          9
4vb86f36xqc50          9
11at9nnhrw3w9         10
4p9zp29bcu927         10
4axts2hm73n98         11
ftxa99d89yzz0         11
490gh9uugyqqq         13
 
10 rows selected.
 
-- System 2, Node 1
 
SQL> @acs_bind_count
 
SQL_ID        BIND_COUNT
------------- ----------
4rqy85b9r413x          1
7h47mb48s19t3          1
51z6b0kbduf8m          2
djkchp3qwadu1          2
9g5ng8xjj4h05          3
amyvvzzjccvsg          3
cuv2pyzvz7b4z          3
d68jnfnqs6uh7          3
dkc2kbj7kdsu7          3
fxmfhkkhcrdj6          3
6196v5td8pk4n          4
9m63xcj6sbswu          4
0mn3xx02ww6pd          5
c7k3pwbjg4d1d          5
dy98ajwqc8s2x          5
3qh6yw37n8m1a          6
3xxgg248shg62          6
83v17f53rnzcv          6
a2271k4f7j211          6
06n7rk2jw3rsy          7
f4xr9x5wh46zt          7
48ryzfg5dxpqn          8
an9114vtxukz2          8
 
23 rows selected.
 
-- System 2, Node 2
 
SQL> @acs_bind_count
 
SQL_ID        BIND_COUNT
------------- ----------
4rqy85b9r413x          1
7h47mb48s19t3          1
gvj749s4654hm          1
51z6b0kbduf8m          2
6qwznvjfajh7t          2
djkchp3qwadu1          2
2mhfdtyrvhz3c          3
92a7wu0891bjk          3
9g5ng8xjj4h05          3
9p9gsakw5qyrg          3
amyvvzzjccvsg          3
bs63sfxynwtc6          3
cuv2pyzvz7b4z          3
d68jnfnqs6uh7          3
dvh3d8g2f844j          3
0341h5ytqbv7k          4
...
gj0rzh6xw2msp          4
grvh9mmkynm4u          4
0mn3xx02ww6pd          5
7ys2u732cj1ag          5
9kfj6kj32p7ku          5
dy98ajwqc8s2x          5
83v17f53rnzcv          6
d79xsp0utuvm2          6
gqkx5bsg1zfzc          6
4q3n2wwqtghma          7
f4xr9x5wh46zt          7
g7c90sr77bb78          7
7gb24h57r0spb          8
9zcy2yz537z41          8
an9114vtxukz2          8
7uwy0uxauuabz          9
80xmux45pvkrr          9
 
73 rows selected.

So it appears that there probably is a hard limit and that it probably is 14 (although there certainly could be a more complicated algorithm in play). None of the 4 production systems I checked had a single statement that was marked bind aware with more than 14 bind variables.

So here are a few closing thoughts:

To me, ACS doesn’t seem to be completely cooked yet. In fact, it seems to be somewhat of a futile attempt, since in the very best case, one execution with a bad plan would be necessary for the optimizer to recognize that a bind variable peeking issue existed. This wouldn’t be so bad if the information was persisted, so that the same “learning” process wouldn’t have to be endured if/when a statement gets flushed from the shared pool. This issue alone is enough to keep this feature from being relied upon in situations where performance is critical. In my opinion, the best approach remains unchanged from version 10, that being the judicious use of literals where necessary to prevent plan instability due to bind variable peeking.

My impression of it is not completely negative though. It’s certainly better than the old approach of pick a plan and stick with it. And for systems that use a limited number of binds per statement where performance swings are not as dramatic and/or performance is not as critical, it seems to work pretty well.

Here are some other references:

OptimizerMagic Blog – Good basic description of ACS
OptimizerMagic Blog – Update on ACS
Ric Van Dyke – Intelligent Cusor Sharing in 11.1.0.6
Ric Van Dyke – Intelligent Cusor Sharing in 11.1.0.7
Section on ACS in Troubleshooting Oracle Performance By Christian Antognini

Oracle 11g SQL Plan Management – SQL Plan Baselines

Note, this post applies to 11gR1 (specifically 11.1.0.7).

11g has a new method of dealing with plan instability (i.e. the tendency of a statement to flip flop between plans). The goal has always been to provide the best performance, but until 11g there has not been a mechanism that had as a design goal, the idea of eliminating backwards movement (“performance regressions” as the Oracle documentation calls them). That is, not allowing a statement to switch to a plan that is significantly slower than the one it has already been executing. This new mechanism depends on a construct called a Baseline. So for this post I am not going to focus on the process of evolving plans to prevent “performance regressions”, but rather on the Baselines themselves. (hopefully I’ll get to the plan management part soon)

Here’s a little excerpt from the 11gR1 documentation:

Each time a SQL statement is compiled, the optimizer first uses a cost-based search method to build a best-cost plan, then tries to find a matching plan in the SQL plan baseline. If a match is found, the optimizer will proceed using this plan. Otherwise, it evaluates the cost of each accepted plan in the SQL plan baseline and selects the plan with the lowest cost. The best-cost plan found by the optimizer that does not match any plans in the plan history for the SQL statement represents a new plan, and is added as a non-accepted plan to the plan history. The new plan is not used until it is verified to not cause a performance regression. However, if a change in the system (such as a dropped index) causes all accepted plans to become non-reproducible, the optimizer will select the best-cost plan. Thus, the presence of a SQL plan baseline causes the optimizer to use conservative plan selection strategy for the SQL statement.

Clear as mud?

Now my description:

Baselines are the latest evolution in Oracle’s efforts to allow a plan to be locked. The previous efforts were Outlines and SQL Profiles. Both of those efforts were based on the idea that hints could be applied to limit the optimizers choices down to one. That approach seems a little flawed. Why not just save the plan and be done with it? I believe that’s what Baselines are intended to do. Unfortunately, they don’t appear to do it yet. But they do have an advantage over Outlines and SQL Profiles in that they at least save the plan_hash_value, so they know if they are reproducing the correct plan or not. (see my previous post, Do Baselines Use Hints? – Take 2) And hopefully, they will evolve to the point where the actual plan is saved along with the hash value – we’ll see.

So anyway, here’s some basic housekeeping stuff on Baselines:

  • Baselines will be used by default in 11gR1, if they exist. There is a parameter to control whether they are used or not (OPTIMIZER_USE_SQL_PLAN_BASELINE). It is set to TRUE by default.
  • Baselines will not be created by default in 11gR1. So, much like with the older Outlines or SQL Profiles, you must do something to create them.
  • There is a view called DBA_SQL_PLAN_BASELINES that exposes the Baselines that have been created.
  • Just like Outlines and SQL Profiles, Baselines apply to all instances in a RAC environment (they are not localized to a specific instance).

Baselines can be created a couple of different ways. They can be created automatically by the database as it’s running by setting the OPTIMIZER_CAPTURE_SQL_PLAN_BASELINES parameter. They can also be created for statements in a SQL Tuning Set. But I’m not going to talk about either one of those approaches in this post, as I don’t think they really make much sense in the case where you want to use a Baseline to lock a plan for a specific statement. Instead, I am going to discuss creating a baseline for a statement that is in the shared pool via the DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE function.

All the function needs is a SQL_ID and a PLAN_HASH_VALUE. Optionally a parameter can be used to define the baseline as FIXED. If it’s FIXED then it gets priority over any other Baselines for that statement, except other FIXED Baselines. Confused? Well it’s not exactly the most straight forward setup. I’m kind of a simple guy, so at this point I’m thinking one FIXED Baseline is enough.

Before you get too bored, here’s a quick example:

SQL> select avg(pk_col) from kso.little_skew where col1 = 99320;
 
AVG(PK_COL)
-----------
     628922
 
SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
4dd2qjxbjav6u      0 2709260180          1           .02           89 select avg(pk_col) from kso.little_skew where col1 = 99320
 
SQL> @dplan
Enter value for sql_id: 4dd2qjxbjav6u
Enter value for child_no: 0
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4dd2qjxbjav6u, child number 0
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 99320
 
Plan hash value: 2709260180
 
-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE              |                  |     1 |     9 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| LITTLE_SKEW      |     1 |     9 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | LITTLE_SKEW_COL1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=99320)
 
20 rows selected.
 
SQL> !cat create_baseline.sql
var ret number
exec :ret := dbms_spm.load_plans_from_cursor_cache(-
    sql_id=>'&sql_id', -
    plan_hashvalue=>&plan_hash_value,-
    fixed=>'&fixed');
 
SQL> @create_baseline
Enter value for sql_id: 4dd2qjxbjav6u
Enter value for plan_hash_value: 2709260180
Enter value for fixed: NO
 
PL/SQL procedure successfully completed.
 
SQL> !cat baselines.sql
set lines 155
col sql_text for a50 trunc
col last_executed for a28
col enabled for a7
col plan_hash_value for a16
col last_executed for a16
select spb.sql_handle, spb.plan_name, spb.sql_text,
spb.enabled, spb.accepted, spb.fixed,
to_char(spb.last_executed,'dd-mon-yy HH24:MI') last_executed
from
dba_sql_plan_baselines spb
where spb.sql_text like nvl('%'||'&sql_text'||'%',spb.sql_text)
and spb.sql_handle like nvl('&name',spb.sql_handle)
and spb.plan_name like nvl('&plan_name',spb.plan_name)
/
 
SQL> @baselines
Enter value for sql_text: %little%
Enter value for name:
Enter value for plan_name: 
 
SQL_HANDLE                     PLAN_NAME                      SQL_TEXT                                           ENABLED ACC FIX LAST_EXECUTED
------------------------------ ------------------------------ -------------------------------------------------- ------- --- --- ----------------
SYS_SQL_b77256b604589dec       SYS_SQL_PLAN_04589dec2fdfd157  select avg(pk_col) from kso.little_skew where col1 YES     YES NO
 
SQL> select avg(pk_col) from kso.little_skew where col1 = 99320;
 
AVG(PK_COL)
-----------
     628922
 
SQL> select avg(pk_col) from kso.little_skew where col1 = 99320;
 
AVG(PK_COL)
-----------
     628922
 
SQL> @find_sql
Enter value for sql_text:
Enter value for sql_id: 4dd2qjxbjav6u
 
SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
4dd2qjxbjav6u      0 2709260180          1           .02           89 select avg(pk_col) from kso.little_skew where col1 = 99320
4dd2qjxbjav6u      1 2709260180          1           .00            5 select avg(pk_col) from kso.little_skew where col1 = 99320
 
SQL> @dplan
Enter value for sql_id: 4dd2qjxbjav6u
Enter value for child_no: 1
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4dd2qjxbjav6u, child number 1
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 99320
 
Plan hash value: 2709260180
 
-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE              |                  |     1 |     9 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| LITTLE_SKEW      |     1 |     9 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | LITTLE_SKEW_COL1 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=99320)
 
Note
-----
   - SQL plan baseline SYS_SQL_PLAN_04589dec2fdfd157 used for this statement
 
24 rows selected.

So that’s a quick example of how easy it is to create a Baseline. Once a Baseline is created it behaves much like an Outline or SQL Profile, in that it will be applied to any SQL statement where the normalized text matches. It’s interesting to note that even though Outlines are way outdated in 11g, they still take precedence over Baselines. So if you have an Outline and a Baseline on the same statement, the Outline will be used. Here’s an example of that behavior:
Continue reading ‘Oracle 11g SQL Plan Management – SQL Plan Baselines’ »

Do SQL Plan Baselines Use Hints? – Take 2

Warning – This post is almost pure speculation!

You may have read a previous post where I said I thought Baselines didn’t use hints. We’ll I’ve changed my mind. I no longer think that Baselines store an actual plan for a SQL statement. (See my previous post – Do SQL Plan Baselines Use Hints?) I never could find a place where the plan data was stored (only the plan_hash_value). So I had a closer look at some more 10053 trace files. Here’s a quick run down of what I found. First the set up:

SQL> @flush_pool
 
System altered.
 
SQL> select avg(pk_col) from kso.little_skew where col1 = 1;
 
AVG(PK_COL)
-----------
 310999.357
 
SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
9df852xw86wxt      0 3498336203          1           .18        3,437 select avg(pk_col) from kso.little_skew where col1 = 1
 
SQL> @dplan
Enter value for sql_id: 9df852xw86wxt
Enter value for child_no: 0
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9df852xw86wxt, child number 0
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 1
 
Plan hash value: 3498336203
 
----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |       |       |   907 (100)|          |
|   1 |  SORT AGGREGATE    |             |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS FULL| LITTLE_SKEW |   620K|  5449K|   907   (2)| 00:00:11 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("COL1"=1)
 
 
19 rows selected.
 
SQL> select /*+ index (a little_skew_col1) */ avg(pk_col) from kso.little_skew a where col1 = 1;
 
AVG(PK_COL)
-----------
 310999.357
 
SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
9df852xw86wxt      0 3498336203          1           .18        3,437 select avg(pk_col) from kso.little_skew where col1 = 1
fbnx1bws6n4xu      0 2709260180          1           .86        5,278 select /*+ index (a little_skew_col1) */ avg(pk_col) from ks
                                                                      o.little_skew a where col1 = 1
 
 
SQL> @dplan
Enter value for sql_id: fbnx1bws6n4xu
Enter value for child_no: 0
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  fbnx1bws6n4xu, child number 0
-------------------------------------
select /*+ index (a little_skew_col1) */ avg(pk_col) from
kso.little_skew a where col1 = 1
 
Plan hash value: 2709260180
 
-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |       |       | 20835 (100)|          |
|   1 |  SORT AGGREGATE              |                  |     1 |     9 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| LITTLE_SKEW      |   620K|  5449K| 20835   (1)| 00:04:11 |
|*  3 |    INDEX RANGE SCAN          | LITTLE_SKEW_COL1 |   620K|       |  3213   (1)| 00:00:39 |
-------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=1)
 
 
21 rows selected.
 
SQL> -- need to create baseline for original statement to get SQL_HANDLE
SQL>
SQL> !cat create_baseline.sql
var ret number
exec :ret := dbms_spm.load_plans_from_cursor_cache(-
    sql_id=>'&sql_id', -
    plan_hash_value=>&plan_hash_value,-
    fixed=>'&fixed');
 
 
 
 
SQL> @create_baseline
Enter value for sql_id: 9df852xw86wxt
Enter value for plan_hash_value: 3498336203
Enter value for fixed: NO
 
PL/SQL procedure successfully completed.
 
SQL> -- now add the other plan as a fixed Baseline
SQL>
SQL> !cat create_baseline2.sql
var ret number
exec :ret := dbms_spm.load_plans_from_cursor_cache(-
    sql_id=>'&sql_id', -
    plan_hash_value=>&plan_hash_value,-
    sql_handle=>'&sql_handle',-
    fixed=>'&fixed');
 
 
SQL> @create_baseline2
Enter value for sql_id: fbnx1bws6n4xu
Enter value for plan_hash_value: 2709260180
Enter value for sql_handle: SYS_SQL_19fcc3effcce06c0
Enter value for fixed: YES
 
PL/SQL procedure successfully completed.
 
SQL> @baselines
Enter value for sql_text: 
Enter value for name: 
Enter value for plan_name: 
 
SQL_HANDLE                     PLAN_NAME                      SQL_TEXT                                           ENABLED ACC FIX LAST_EXECUTED
------------------------------ ------------------------------ -------------------------------------------------- ------- --- --- ----------------
SYS_SQL_19fcc3effcce06c0       SYS_SQL_PLAN_fcce06c08b1420d2  select avg(pk_col) from kso.little_skew where col1 YES     YES NO
SYS_SQL_19fcc3effcce06c0       SYS_SQL_PLAN_fcce06c02fdfd157  select avg(pk_col) from kso.little_skew where col1 YES     YES YES
 
SQL> select avg(pk_col) from kso.little_skew where col1 = 1;
 
AVG(PK_COL)
-----------
 310999.357
 
SQL> /        
 
AVG(PK_COL)
-----------
 310999.357
 
SQL> -- check to see if it's using the fixed Baseline
SQL> @find_sql
Enter value for sql_text: %little%
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
9df852xw86wxt      0 3498336203          2           .19        3,279 select avg(pk_col) from kso.little_skew where col1 = 1
fbnx1bws6n4xu      0 2709260180          1           .86        5,278 select /*+ index (a little_skew_col1) */ avg(pk_col) from ks
                                                                      o.little_skew a where col1 = 1
 
 
SQL> 
SQL> @dplan
Enter value for sql_id: 9df852xw86wxt
Enter value for child_no: 0
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9df852xw86wxt, child number 0
-------------------------------------
select avg(pk_col) from kso.little_skew where col1 = 1
 
Plan hash value: 3498336203
 
----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |       |       |   907 (100)|          |
|   1 |  SORT AGGREGATE    |             |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS FULL| LITTLE_SKEW |   620K|  5449K|   907   (2)| 00:00:11 |
----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("COL1"=1)
 
 
19 rows selected.
 
SQL> -- no it's not using the Baseline, why not?

So we created a Fixed Baseline on a statement that used a hint to force an index based access path, attached it to another statement with no hints, and then re-ran the un-hinted statement. It did not pick up the Fixed Baseline. The next step was to have a look at a 10053 trace file. Here are some excerpts from the 10053 trace file.

 
...
 
SPM: statement found in SMB
 
...
 
******************************************
----- Current SQL Statement for this session (sql_id=9df852xw86wxt) -----
select avg(pk_col) from kso.little_skew where col1 = 1
*******************************************
 
...
 
SPM: cost-based plan found in the plan baseline, planId = 2333352146
SPM: fixed planId's of plan baseline are: 803197271
SPM: using qksan to reproduce, cost and select accepted plan, sig = 1872586980585244352
SPM: reproduce plan round 1, using session OFE=11010007
SPM: using qksan to reproduce accepted plan, planId = 803197271
Registered qb: SEL$1 0xf6d872ec (PARSER)
---------------------
QUERY BLOCK SIGNATURE
---------------------
  signature (): qb_name=SEL$1 nbfros=1 flg=0
    fro(0): flg=4 objn=90376 hint_alias="LITTLE_SKEW"@"SEL$1"
 
DOP: Automatic degree of parallelism is disabled: Parameter.
PM: Considering predicate move-around in query block SEL$1 (#0)
**************************
Predicate Move-Around (PM)
**************************
OPTIMIZER INFORMATION
 
******************************************
----- Current SQL Statement for this session (sql_id=7039h2744jjrv) -----
/* SQL Analyze(110,0) */ select avg(pk_col) from kso.little_skew where col1 = 1
*******************************************
 
...
 
SPM: planId in plan baseline = 803197271, planId of reproduced plan = 2333352146
------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : SYS
  plan_baseline signature  : 1872586980585244352
  plan_baseline plan_id    : 803197271
  plan_baseline hintset    :
    hint num  1 len 27 text: IGNORE_OPTIM_EMBEDDED_HINTS
    hint num  2 len 37 text: OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
    hint num  3 len 22 text: DB_VERSION('11.1.0.7')
    hint num  4 len  8 text: ALL_ROWS
    hint num  5 len 22 text: OUTLINE_LEAF(@"SEL$1")
    hint num  6 len 57 text: INDEX_RS_ASC(@"SEL$1" "A"@"SEL$1" ("LITTLE_SKEW"."COL1"))
SPM: generated non-matching plan:
----- Explain Plan Dump -----
----- Plan Table -----
 
============
Plan Table
============
-----------------------------------------+-----------------------------------+
| Id  | Operation           | Name       | Rows  | Bytes | Cost  | Time      |
-----------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT    |            |       |       |   907 |           |
| 1   |  SORT AGGREGATE     |            |     1 |     9 |       |           |
| 2   |   TABLE ACCESS FULL | LITTLE_SKEW|  605K | 5449K |   907 |  00:00:11 |
-----------------------------------------+-----------------------------------+
Predicate Information:
----------------------
2 - filter("COL1"=1)
 
Content of other_xml column
===========================
  db_version     : 11.1.0.7
  parse_schema   : SYS
  plan_hash      : 3498336203
  plan_hash_2    : 2333352146
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
      DB_VERSION('11.1.0.7')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "LITTLE_SKEW"@"SEL$1")
    END_OUTLINE_DATA
  */
 
------- END SPM Plan Dump -------

So it appears that the Baseline is found in the SMB (SQL Management Base). It then appears to try to reproduce the plan in a separate session using the hints. It then announces that it came up with a different plan and spits out the plan data that it came up with. So it appears that the plan_hash_value is just used for comparison to see if it came up with the same plan as the one that was in place when the Baseline was generated. If not, then it falls back to the regular (no Baseline) behavior. Not at all what I had hoped for!

It may be that 11gR1 is an intermediate step and that a future version (11gR2 maybe, I hope) will actually store the plan details instead of just the plan hash value. But as I said in the intro, this is complete speculation on my part.

SQL Profiles

Well I was wrong! SQL Profiles are better than Outlines. For a while now I have been saying that I didn’t like SQL Profiles because they were less stable than Outlines. Turns out that the OPT_ESTIMATE hint used by SQL Profiles which are created by the SQL Tuning Advisor is what I really didn’t like. I just didn’t know it.

Let me back up for a minute. I posted about Oracle’s SQL Tuning Advisor a while back. It’s a feature that was added to Oracle in version 10g. It basically looks a SQL statement and tries to come up with a better execution plan than the one the optimizer has picked. Since it is allowed as much time as it wants to do it’s analysis, the advisor can sometimes find better approaches. That’s because it can actually validate the optimizer’s original estimates by running various steps in a given plan and comparing the actual results to the estimates. When it’s all done, if it has found a better plan, it offers to implement that new plan via a SQL Profile. Those offered Profiles often have a lightly documented hint (OPT_ESTIMATE) that allows it to scale the optimizer estimates for various operations – essentially it’s a fudge factor. The problem with this hint is that, far from locking a plan in place, it is locking an empirically derived fudge factor in place. This still leaves the optimizer with a lot of flexibility when it comes to choosing a plan. It also sets up a commonly occurring situation where the fudge factors stop making sense as the statistics change. Thus the observation that SQL Profiles tend to sour over time.

I have to give credit to Randolf Geist for making me take a second look at SQL Profiles. He commented on my Outlines post last week and recommended I give his post on SQL Profiles a look. I did and it really got me thinking. One of the things I liked the best about the post was that he created a couple of scripts to pull the existing hints from a statement in the shared pool or the AWR tables , and create a SQL Profile from those hints using the DBMS_SQLTUNE.IMPORT_SQL_PROFILE procedure. This makes perfect sense because the hints are stored with every plan (that’s what DBMS_XPLAN uses to spit them out if you ask for them). Unfortunately this procedure is only lightly documented. Also he had a nice script for pulling the hints from V$SQL_PLAN table which I have made use of as well.

So as always I have created a few scripts (borrowing from Randolf mostly).

create_sql_profile.sql – uses cursor from the shared pool
create_sql_profile_awr.sql – uses AWR tables
sql_profile_hints.sql – shows the hints in a SQL Profile for 10g

So here’s little example:
Note: unstable_plans.sql and awr_plan_stats.sql are discussed here: Unstable Plans (Plan Instability)

SQL> @unstable_plans
Enter value for min_stddev:
Enter value for min_etime:
 
SQL_ID        SUM(EXECS)   MIN_ETIME   MAX_ETIME   NORM_STDDEV
------------- ---------- ----------- ----------- -------------
0qa98gcnnza7h          4       42.08      208.80        2.8016
 
SSQL> @awr_plan_stats
Enter value for sql_id: 0qa98gcnnza7h
 
SQL_ID        PLAN_HASH_VALUE        EXECS          ETIME    AVG_ETIME        AVG_LIO
------------- --------------- ------------ -------------- ------------ --------------
0qa98gcnnza7h       568322376            3          126.2       42.079      124,329.7
0qa98gcnnza7h      3723858078            1          208.8      208.796   28,901,466.0
 
SQL> @create_sql_profile_awr
Enter value for sql_id: 0qa98gcnnza7h
Enter value for plan_hash_value: 568322376
Enter value for category:
Enter value for force_matching:
 
PL/SQL procedure successfully completed.
 
SQL> @sql_profiles
Enter value for sql_text:
Enter value for name: PROFIL%
 
NAME                           CATEGORY        STATUS   SQL_TEXT                                                               FOR
------------------------------ --------------- -------- ---------------------------------------------------------------------- ---
PROFILE_0qa98gcnnza7h          DEFAULT         ENABLED  select avg(pk_col) from kso.skew                                       NO
 
SQL> set echo on
SQL> @sql_profile_hints
SQL> set lines 155
SQL> col hint for a150
SQL> select attr_val hint
  2  from dba_sql_profiles p, sqlprof$attr h
  3  where p.signature = h.signature
  4  and name like ('&profile_name')
  5  order by attr#
  6  /
Enter value for profile_name: PROFILE_0qa98gcnnza7h
 
HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "SKEW"@"SEL$1")
 
SQL> @sql_hints_awr
SQL> select
  2  extractvalue(value(d), '/hint') as outline_hints
  3  from
  4  xmltable('/*/outline_data/hint'
  5  passing (
  6  select
  7  xmltype(other_xml) as xmlval
  8  from
  9  dba_hist_sql_plan
 10  where
 11  sql_id = '&sql_id'
 12  and plan_hash_value = &plan_hash_value
 13  and other_xml is not null
 14  )
 15  ) d;
Enter value for sql_id: 0qa98gcnnza7h
Enter value for plan_hash_value: 568322376
 
OUTLINE_HINTS
-----------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "SKEW"@"SEL$1")

A couple of additional points:

  • Outlines and SQL Profiles both take the same approach to controlling execution plans. They both attempt to force the optimizer down a certain path by applying hints behind the scenes. This is in my opinion an almost impossible task. The more complex the statement, the more difficult the task becomes. The newest kid on the block in this area (in 11g) is called a baseline and while it doesn’t abandon the hinting approach altogether, it does at least store the plan_hash_value – so it can tell if it regenerated the correct plan or not.
  • It does not appear that Outlines are being actively pursued by Oracle development anymore. So while they still work in 11g, they are becoming a little less reliable (and they were a bit quirky to begin with).
  • SQL Profiles have the ability to replace literals with bind variables similar to the cursor_sharing parameter. This means you can have a SQL Profile that will match multiple statements which use literals without having to set cursor_sharing for the whole instance.
  • Outlines take precedence over SQL Profiles. You can create both on the same statement and if you do, the outline will be used and the SQL Profile will be ignored. This is true in 11g as well, by the way.
  • Outlines don’t appear to use the OPT_ESTIMATE hint. So I believe it is still a valid approach to accept a SQL Profile as offered by the SQL Tuning Advisor and then create an Outline on top of it. It seems to work pretty well most of the time. (be sure and check the hints and the plan that gets produced)
  • Manually created SQL Profiles also don’t appear to use the OPT_ESTIMATE hint. So I believe it is also a valid approach to accept a SQL Profile as offered by the SQL Tuning Advisor and then create a SQL Profile on top of it. Note that you’ll have to use a different category, then drop the original, then enable the new SQL Profile. Which means this approach is a bit more work than just creating an Outline in the DEFAULT category.

Have a look at the difference between SQL Tuning Set generated hints and those created by a manual SQL Profile or an Outline (note that I have tried to change the object names to protect the innocent and in so doing may have made it slightly more difficult to follow) :

SQL>
SQL> select * from table(dbms_xplan.display_cursor('&sql_id','&child_no',''));
Enter value for sql_id: fknfhx8wth51q
Enter value for child_no: 1
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  fknfhx8wth51q, child number 1
-------------------------------------
SELECT /* test4 */ col1, col2, col3 ...
 
 
Plan hash value: 3163842146
 
----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                          |       |       |  1778 (100)|          |
|   1 |  NESTED LOOPS                 |                          |  1039 | 96627 |  1778   (1)| 00:00:33 |
|   2 |   NESTED LOOPS                |                          |   916 | 57708 |  1778   (1)| 00:00:33 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| TABLE_XXXX_LOOKUP        |   446 | 17840 |   891   (1)| 00:00:17 |
|*  4 |     INDEX RANGE SCAN          | INDEX_XXXX_IS_CPCI       | 12028 |       |    18   (0)| 00:00:01 |
|   5 |    TABLE ACCESS BY INDEX ROWID| TABLE_XXXX_IDENT         |     2 |    46 |     2   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN          | INDEX_XXXXIP_17_FK       |     2 |       |     1   (0)| 00:00:01 |
|*  7 |   INDEX UNIQUE SCAN           | PK_TABLE_XXXX_ASSIGNMENT |     1 |    30 |     0   (0)|          |
----------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter(( 
...
   4 - access("L"."COL1"=:N1)
   6 - access("L"."COL2"="I"."COL1")
 
Note
-----
   - SQL profile SYS_SQLPROF_012061f471d50001 used for this statement
 
 
85 rows selected.
 
SQL> @sql_profile_hints
Enter value for name: SYS_SQLPROF_012061f471d50001
 
HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
OPT_ESTIMATE(@"SEL$1", TABLE, "L"@"SEL$1", SCALE_ROWS=0.0536172171)
OPT_ESTIMATE(@"SEL$1", INDEX_SKIP_SCAN, "A"@"SEL$1", PK_TABLE_XXXX_ASSIGNMENT, SCALE_ROWS=4)
COLUMN_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", "COL1", scale, length=6 distinct=1234 nulls=0 min=1000000014 max=1026369632)
COLUMN_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", "COL2", scale, length=12 distinct=2 nulls=0)
COLUMN_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", "COL3", scale, length=12 distinct=2 nulls=0)
TABLE_STATS("APP_OWNER"."TABLE_XXXX_ASSIGNMENT", scale, blocks=5 rows=2400)
OPTIMIZER_FEATURES_ENABLE(default)
 
7 rows selected.
 
SQL> -- no direct hints - only stats and scaling on the profile created by the SQL Tuning Advisor
SQL> -- (i.e. the dreaded OPT_ESTIMATE hints and no directive type hints like INDEX or USE_NL)
SQL>
SQL> -- now let's try an outline on top of it
SQL> @create_outline
 
Session altered.
 
Enter value for sql_id: fknfhx8wth51q
Enter value for child_number: 1
Enter value for outline_name: KSOTEST1
Outline KSOTEST1 created.
 
PL/SQL procedure successfully completed.
 
SQL> @outline_hints
Enter value for name: KSOTEST1
 
HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
USE_NL(@"SEL$1" "A"@"SEL$1")
USE_NL(@"SEL$1" "I"@"SEL$1")
LEADING(@"SEL$1" "L"@"SEL$1" "I"@"SEL$1" "A"@"SEL$1")
INDEX(@"SEL$1" "A"@"SEL$1" ("TABLE_XXXX_ASSIGNMENT"."COL1" "TABLE_XXXX_ASSIGNMENT"."COL2" "TABLE_XXXX_ASSIGNMENT"."COL3"))
INDEX_RS_ASC(@"SEL$1" "I"@"SEL$1" ("TABLE_XXXX_IDENT"."COL1"))
INDEX_RS_ASC(@"SEL$1" "L"@"SEL$1" ("TABLE_XXXX_LOOKUP"."COL1" "TABLE_XXXX_LOOKUP"."COL2"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('11.1.0.7')
OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
IGNORE_OPTIM_EMBEDDED_HINTS
 
11 rows selected.
 
SQL> -- no OPT_ESTIMATE hints on the outline
SQL> -- directive type hints - INDEX, USE_NL, etc...
SQL> 
SQL> -- now let's try creating a manual profile
SQL> @create_sql_profile.sql
Enter value for sql_id: fknfhx8wth51q
Enter value for child_no: 1
Enter value for category: TEST
Enter value for force_matching:
 
PL/SQL procedure successfully completed.
 
SQL> @sql_profile_hints
Enter value for name: PROFILE_fknfhx8wth51q
 
HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.1.0.7')
DB_VERSION('11.1.0.7')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_RS_ASC(@"SEL$1" "L"@"SEL$1" ("TABLE_XXXX_LOOKUP"."COL1" "TABLE_XXXX_LOOKUP"."COL2"))
INDEX_RS_ASC(@"SEL$1" "I"@"SEL$1" ("TABLE_XXXX_IDENT"."COL1"))
INDEX(@"SEL$1" "A"@"SEL$1" ("TABLE_XXXX_ASSIGNMENT"."COL1" "TABLE_XXXX_ASSIGNMENT"."COL2" "TABLE_XXXX_ASSIGNMENT"."COL3"))
LEADING(@"SEL$1" "L"@"SEL$1" "I"@"SEL$1" "A"@"SEL$1")
USE_NL(@"SEL$1" "I"@"SEL$1")
USE_NL(@"SEL$1" "A"@"SEL$1")
 
11 rows selected.
 
SQL> -- no OPT_ESTIMATE with the SQL Profile we created manually !
SQL> -- again it's directive - USE_NL, INDEX, LEADING, etc...
SQL>

So I apologize to all you SQL Profiles out there who have been lumped together by my prejudiced view, just because of the acts of a few of your brethren (i.e. the ones created by the SQL Tuning Advisor). SQL Profiles do indeed have all the capabilities of Outlines and probably are a better choice in most cases than Outlines.

Thanks again to Randolf Geist for his comments and his ideas on creating manual SQL Profiles.

Bind Variable Peeking – Drives Me Nuts!

In the constant battle to provide consistent performance, Oracle took a giant step backwards with the 9i version by introducing an “Enhancement” called Bind Variable Peeking. I’ll explain what I mean in a minute, but first a bit of history.

When Oracle introduced histograms in 8i, they provided a mechanism for the optimizer to recognize that the values in a column were not distributed evenly. That is, in a table with 100 rows and 10 distinct values, the default assumption the optimizer would make, in the absence of a histogram, would be that no matter which value you picked – you would always get 100/10 or 10 rows back. Histograms let the optimizer know if that was not the case. The classic example would be 100 records with 2 distinct values where one value, say “Y”, occurred 99 times and the other value, say “N”, occurred only 1 time.  So without a histogram the optimizer would always assume that whether you requested records with a “Y” or an “N”, you would get half the records back (100/2 = 50). Therefore you always want to do a full table scan as opposed to using an index on the column. A histogram, assuming it was accurate (we’ll come back to that later), would let the optimizer know that the distribution was not normal (i.e. not spread out evenly – also commonly called skewed) and that a “Y” would get basically the whole table, while an “N” would get only 1%. This would allow the optimizer to pick an appropriate plan regardless of which value was specified in the Where Clause.

So let’s consider the implications of that. Would that improve the response time for the query where the value was “Y”. The answer is no. In this simple case, the default costing algorithm is close enough and produces the same plan that the histogram produces. The full table scan takes just as long whether the optimizer thought it was getting 50 rows or 99 rows. But what about the case where we specified the value of “N”. In this case, with a histogram we would pick up the index on that column and presumably get a much better response time than the full table scan. This is an important point. Generally speaking it is only the outliers, the exceptional cases if you will, where the histogram really makes a difference.

So at first glance, all appeared well with the world. But there was a fly in the ointment. You had to use literals in your SQL statements for the optimizer to be able use the histograms. So you had to write your statements like this:

SELECT XYZ FROM TABLE1 WHERE COLUMN1 = ‘Y’;

SELECT XYZ FROM TABLE1 WHERE COLUMN1 = ‘N’;

Not a problem in our simple example, because you only have two possibilities. But consider a statement with 2 or 3 skewed columns, each with a couple of hundred distinct values. The possible combinations could quickly grow into the millions. Not a good thing for the shared pool.

Enter our star: Bind Variable Peeking, a new feature introduced in 9i that was added to allow the optimizer to peek at the value of bind variables and then use a histogram to pick an appropriate plan, just like it would do with literals. The problem with the new feature was that it only looked at the variables once, when the statement was parsed. So let’s make our simple example a little more realistic by assuming we have a 10 million row table where 99% have a value of “Y” and 1% have a value of “N”. So in our example, if the first time the statement was executed it was passed a “Y”, the full table scan plan would be locked in and it would be used until the statement had to be re-parsed, even if the value “N” was passed to it in subsequent executions.

So let’s consider the implication of that. When you get the full table scan plan (because you passed a “Y” the first time) it behaves the same way no matter what which value you pass subsequently. Always a full table scan, always the same amount of work and the same basic elapsed time. From a user standpoint that seems reasonable. The performance is consistent. (this is the way it would work without a histogram by the way) On the other hand, if the index plan gets picked because the parse occurs with a value of “N”, the executions where the value is “N” will be even faster than they were before, but the execution with a value of “Y” will be incredibly slow. This is not at all what the user expects. They expect the response time to be about the same every time they execute a piece of code. And this is the problem with bind variable peeking. It’s basically just Russian Roulette. It just depends on what value you happen to pass the statement when it’s parsed (which could be any execution by the way).

So is Bind Variable Peeking a feature or a bug? Well technically it’s not a bug because it works the way it’s designed. I just happen to believe that it was not a good decision to implement it that way. But what other choices did the optimizer development group have?

  • They could have evaluated the bind variables and re-parsed  for every execution of every statement using bind variables. This would eliminate the advantage of having bind variables in the first place and would never work for high transaction systems. So it was basically not an option.
  • They could have just said no, and made us use literals in order to get the benefit of histograms (probably not a bad option in retrospect – the fact that they added _optim_peek_user_binds probably means that they decided later to give us that option via setting this hidden parameter).
  • They could have implemented a system where they could identify statements that might benefit from different plans based on the values of bind variables. Then peek at those variables for every execution of those “bind sensitive” statements (sound familiar? – that’s what they finally did in 11g with Adaptive Cursor Sharing).

So why is it such a pervasive problem? And I do believe it is a pervasive problem with 10g in particular. A couple of reasons come to mind:

  1. We’ve been taught to always use bind variables. It’s a best practice which allows SQL statements to be shared, thus eliminating a great deal of work/contention. Using bind variable is an absolute necessity when building scalable high transaction rate systems. (of course that doesn’t mean that you can’t bend the rule occasionally)
  2. 10g changed it’s default stats gathering method to automatically gather histograms. So in a typical 10g database there are a huge number of histograms, many of them inappropriate (i.e. on columns that don’t have significantly skewed distributions) and many of them created with very small sample sizes causing the histograms to be less than accurate. Note that 11g appears to be better on both counts – that is to say, 11g seems to create fewer inappropriate histograms and seems to create much more accurate histograms with small sample sizes. But the jury is still out on 11g stats gathering as it has not been widely adopted at this point in time.
  3. In my humble opinion, Bind Variable Peeking is not that well understood. When I talk to people about the issue, they usually have heard of it and have a basic idea what the problem is, but their behavior (in terms of the code they write and how they manage their databases) indicates that they don’t really have a good handle on the issue.

So what’s the best way to deal with this issue? Well recognizing that you have a problem is the first step to recovery, so being able to identify that you have a problem with plan stability is an appropriate first step. Direct queries against the Statspack or AWR tables are probably the best way to identify the issue. I’ve posted a couple of scripts that I find useful for this purpose previously – (unstable_plans.sql, awr_plan_stats.sql, awr_plan_change.sql). What you’re looking for is statements that flip flop back and forth between 2 or more plans. Note that there are other reasons for statements to change plans, but Bind Variable Peeking is the number one suspect. Here’s an example of their usage:

SQL> @unstable_plans
SQL> break on plan_hash_value on startup_time skip 1
SQL> select * from (
  2  select sql_id, sum(execs), min(avg_etime) min_etime, max(avg_etime) max_etime, stddev_etime/min(avg_etime) norm_stddev
  3  from (
  4  select sql_id, plan_hash_value, execs, avg_etime,
  5  stddev(avg_etime) over (partition by sql_id) stddev_etime
  6  from (
  7  select sql_id, plan_hash_value,
  8  sum(nvl(executions_delta,0)) execs,
  9  (sum(elapsed_time_delta)/decode(sum(nvl(executions_delta,0)),0,1,sum(executions_delta))/1000000) avg_etime
 10  -- sum((buffer_gets_delta/decode(nvl(buffer_gets_delta,0),0,1,executions_delta))) avg_lio
 11  from DBA_HIST_SQLSTAT S, DBA_HIST_SNAPSHOT SS
 12  where ss.snap_id = S.snap_id
 13  and ss.instance_number = S.instance_number
 14  and executions_delta > 0
 15  and elapsed_time_delta > 0
 16  group by sql_id, plan_hash_value
 17  )
 18  )
 19  group by sql_id, stddev_etime
 20  )
 21  where norm_stddev > nvl(to_number('&min_stddev'),2)
 22  and max_etime > nvl(to_number('&min_etime'),.1)
 23  order by norm_stddev
 24  /
Enter value for min_stddev:
Enter value for min_etime:
 
SQL_ID        SUM(EXECS)   MIN_ETIME   MAX_ETIME   NORM_STDDEV
------------- ---------- ----------- ----------- -------------
1tn90bbpyjshq         20         .06         .24        2.2039
0qa98gcnnza7h         16       20.62      156.72        4.6669
7vgmvmy8vvb9s        170         .04         .39        6.3705
32whwm2babwpt        196         .02         .26        8.1444
5jjx6dhb68d5v         51         .03         .47        9.3888
71y370j6428cb        155         .01         .38       19.7416
66gs90fyynks7        163         .02         .55       21.1603
b0cxc52zmwaxs        197         .02         .68       23.6470
31a13pnjps7j3        196         .02        1.03       35.1301
7k6zct1sya530        197         .53       49.88       65.2909
 
10 rows selected.
 
SQL> @find_sql
SQL> select sql_id, child_number, plan_hash_value plan_hash, executions execs,
  2  (elapsed_time/1000000)/decode(nvl(executions,0),0,1,executions) avg_etime,
  3  buffer_gets/decode(nvl(executions,0),0,1,executions) avg_lio,
  4  sql_text
  5  from v$sql s
  6  where upper(sql_text) like upper(nvl('&sql_text',sql_text))
  7  and sql_text not like '%from v$sql where sql_text like nvl(%'
  8  and sql_id like nvl('&sql_id',sql_id)
  9  order by 1, 2, 3
 10  /
Enter value for sql_text:
Enter value for sql_id: 0qa98gcnnza7h
 
SQL_ID         CHILD  PLAN_HASH        EXECS     AVG_ETIME      AVG_LIO SQL_TEXT
------------- ------ ---------- ------------ ------------- ------------ ------------------------------------------------------------
0qa98gcnnza7h      0  568322376            3          9.02      173,807 select avg(pk_col) from kso.skew where col1 > 0
 
SQL> @awr_plan_stats
SQL> break on plan_hash_value on startup_time skip 1
SQL> select sql_id, plan_hash_value, sum(execs) execs, sum(etime) etime, sum(etime)/sum(execs) avg_etime, sum(lio)/sum(execs) avg_lio
  2  from (
  3  select ss.snap_id, ss.instance_number node, begin_interval_time, sql_id, plan_hash_value,
  4  nvl(executions_delta,0) execs,
  5  elapsed_time_delta/1000000 etime,
  6  (elapsed_time_delta/decode(nvl(executions_delta,0),0,1,executions_delta))/1000000 avg_etime,
  7  buffer_gets_delta lio,
  8  (buffer_gets_delta/decode(nvl(buffer_gets_delta,0),0,1,executions_delta)) avg_lio
  9  from DBA_HIST_SQLSTAT S, DBA_HIST_SNAPSHOT SS
 10  where sql_id = nvl('&sql_id','4dqs2k5tynk61')
 11  and ss.snap_id = S.snap_id
 12  and ss.instance_number = S.instance_number
 13  and executions_delta > 0
 14  )
 15  group by sql_id, plan_hash_value
 16  order by 5
 17  /
Enter value for sql_id: 0qa98gcnnza7h
 
SQL_ID        PLAN_HASH_VALUE        EXECS          ETIME    AVG_ETIME        AVG_LIO
------------- --------------- ------------ -------------- ------------ --------------
0qa98gcnnza7h       568322376           14          288.7       20.620      172,547.4
0qa98gcnnza7h      3723858078            2          313.4      156.715   28,901,466.0
 
SQL> @awr_plan_change
SQL> break on plan_hash_value on startup_time skip 1
SQL> select ss.snap_id, ss.instance_number node, begin_interval_time, sql_id, plan_hash_value,
  2  nvl(executions_delta,0) execs,
  3  (elapsed_time_delta/decode(nvl(executions_delta,0),0,1,executions_delta))/1000000 avg_etime,
  4  (buffer_gets_delta/decode(nvl(buffer_gets_delta,0),0,1,executions_delta)) avg_lio
  5  from DBA_HIST_SQLSTAT S, DBA_HIST_SNAPSHOT SS
  6  where sql_id = nvl('&sql_id','4dqs2k5tynk61')
  7  and ss.snap_id = S.snap_id
  8  and ss.instance_number = S.instance_number
  9  and executions_delta > 0
 10  order by 1, 2, 3
 11  /
Enter value for sql_id: 0qa98gcnnza7h
 
   SNAP_ID   NODE BEGIN_INTERVAL_TIME            SQL_ID        PLAN_HASH_VALUE        EXECS    AVG_ETIME        AVG_LIO
---------- ------ ------------------------------ ------------- --------------- ------------ ------------ --------------
     21857      1 20-MAR-09 04.00.08.872 PM      0qa98gcnnza7h       568322376            1       31.528      173,854.0
     22027      1 27-MAR-09 05.00.08.006 PM      0qa98gcnnza7h                            1      139.141      156,807.0
     22030      1 27-MAR-09 08.00.15.380 PM      0qa98gcnnza7h                            3       12.451      173,731.0
     22031      1 27-MAR-09 08.50.04.757 PM      0qa98gcnnza7h                            2        8.771      173,731.0
     22032      1 27-MAR-09 08.50.47.031 PM      0qa98gcnnza7h      3723858078            1      215.876   28,901,466.0
     22033      1 27-MAR-09 08.57.37.614 PM      0qa98gcnnza7h       568322376            2        9.804      173,731.0
     22034      1 27-MAR-09 08.59.12.432 PM      0qa98gcnnza7h      3723858078            1       97.554   28,901,466.0
     22034      1 27-MAR-09 08.59.12.432 PM      0qa98gcnnza7h       568322376            2        8.222      173,731.5
     22035      1 27-MAR-09 09.12.00.422 PM      0qa98gcnnza7h                            3        9.023      173,807.3
 
9 rows selected.

So back to the question, what’s the best way to deal with the issue. In general, the best way to eliminate Bind Variable Peeking is as follows:

  1. Only create histograms on skewed columns.
  2. Use literals in where clauses on columns where you have histograms and want to use them. Note that it’s not necessary to use literals for every possible value of a skewed column. There may be only a few outlier values that result in significantly different plans. With a little extra code you can use literals for those values and bind variables for the rest of the values that don’t matter.
  3. If you can’t modify the code, consider turning off Bind Variable Peeking by setting the _OPTIM_PEEK_USER_BINDS parameter to false. You won’t get the absolute best performance for every possible statement, but you will get much more consistent performance, which is, in my opinion, more important than getting the absolute best performance. Keep in mind that this is a hidden parameter and so should be carefully tested and probably discussed with Oracle support prior to implementing it in any production system.
  4. You can also consider stronger methods of forcing the optimizer’s hand such as Outlines (see my previous posts on Unstable Plans and on Outlines). This option provides a quick method of locking in a single plan, but it’s not fool proof. Even with outlines, there is some possibility that the plan can change. Also note that this option is only palatable in situations where you have a relatively small number of problem SQL statements.
  5. Upgrade to 11g and let Adaptive Cursor Sharing take care of all your problems for you (don’t bet on it working without a little effort – I’ll try to do a post on that soon).

In summary, using literals with histograms on columns with skewed data distributions are really the only effective way to deal with the issue and still retain the ability for the optimizer to choose the absolute best execution plans. However, if circumstances prevent this approach, there are other techniques that can be applied. These should be considered temporary fixes, but may work well while a longer term solution is contemplated. From a philosophical stand point, I strongly believe that consistency is more important than absolute speed. So when a choice must be made, I would always favor slightly reduced but consistent performance over anything that didn’t provide that consistency.

Your comments are always welcome. Please let me know what you think.