Unstable Plans (Oracle Plan Stability/Instability)
One of the most frustrating things about the Cost Based Optimizer in Oracle is the sometimes erratic performance. It can seem almost random at times. Resolving these issues can also be challenging. It’s a little like taking your car to the mechanic when it makes a funny noise. It never seems to make the noise when the mechanic is there. Fortunately we have ASH/AWR which captures a ton of information about what our databases are doing.
Here’s a couple of scripts that I find useful.
The first one can be used to show statements that have experienced significant variances in execution time (it can be modified to look for variances in the amount of logical i/o, but I’ll leave it as an exercise for the reader). I called the script unstable_plans.sql. It uses an analytic function to calculate a standard deviation on the average elapsed time by plan. So the statements that have multiple plans with wild variations in the response time between plans will be returned by the script. The script prompts for a couple of values. The first is minimum number of standard deviations. The second is the minimum elapsed time (I usually don’t care if a statement executes sometimes in .005 seconds and sometimes in .02 seconds, even though this is a large swing statistically). Both these inputs are defaulted by the way.
The second one is called awr_plan_change.sql. It’s shows how the plans for a given statement have changed over time, along with some statistics such as the average elapsed time and the average lio’s.
Anyway, here’s an example of the usage of these two scripts (by the way, the example is on an 11gR1 database, but the scrips work fine on 10g as well).
> sqlplus / as sysdba SQL*Plus: Release 11.1.0.6.0 - Production on Tue Oct 7 15:44:20 2008 Copyright (c) 1982, 2007, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> alter system set optimizer_mode=rule; System altered. SQL> @awr_snap PL/SQL procedure successfully completed. SQL> @unstable_plans Enter value for min_stddev: Enter value for min_etime: SQL_ID SUM(EXECS) MIN_ETIME MAX_ETIME NORM_STDDEV ------------- ---------- ----------- ----------- ------------- c5by9gdw3814r 14 .03 .12 2.1274 848dyu9288c3h 16 .02 .16 2.1309 2am60vd2kw8ux 11 .05 .22 2.4976 frxg53fac2n8j 5 .03 .30 4.2479 0qa98gcnnza7h 62 25.58 314.34 7.9833 g0jvz8csyrtcf 2 .09 1.19 8.2304 2cn0kc8u4b81w 545 .02 .42 12.8022 9wt62290ah0f7 6 .01 .47 38.5857 d8mayxqw0wnpv 1373 .01 .85 48.3874 9 rows selected. SQL> / Enter value for min_stddev: Enter value for min_etime: 2 SQL_ID SUM(EXECS) MIN_ETIME MAX_ETIME NORM_STDDEV ------------- ---------- ----------- ----------- ------------- 0qa98gcnnza7h 62 25.58 314.34 7.9833 SQL> @find_sql Enter value for sql_text: Enter value for address: Enter value for sql_id: 0qa98gcnnza7h SQL_ID CHILD PLAN_HASH EXECS ETIME AVG_ETIME USERNAME SQL_TEXT ------------- ------ ---------- ------------ ------------- ------------- ------------- ----------------------------------------- 0qa98gcnnza7h 0 3723858078 5 356.53 71.31 SYS select avg(pk_col) from kso.skew where co l1 > 0 0qa98gcnnza7h 1 568322376 1 7.92 7.92 SYS select avg(pk_col) from kso.skew where co l1 > 0 0qa98gcnnza7h 2 568322376 10 52.14 5.21 SYS select avg(pk_col) from kso.skew where co l1 > 0 0qa98gcnnza7h 3 568322376 30 1,064.19 35.47 KSO select avg(pk_col) from kso.skew where co l1 > 0 0qa98gcnnza7h 4 3723858078 10 4,558.62 455.86 KSO select avg(pk_col) from kso.skew where co l1 > 0 SQL> @awr_plan_change Enter value for sql_id: 0qa98gcnnza7h SNAP_ID NODE BEGIN_INTERVAL_TIME SQL_ID PLAN_HASH_VALUE EXECS AVG_ETIME AVG_LIO ---------- ------ ------------------------------ ------------- --------------- ------------ ------------ -------------- 3206 1 02-OCT-08 08.00.38.743 AM 0qa98gcnnza7h 568322376 4 10.359 121,722.8 3235 1 03-OCT-08 01.00.44.932 PM 0qa98gcnnza7h 1 10.865 162,375.0 3235 1 03-OCT-08 01.00.44.932 PM 0qa98gcnnza7h 3723858078 1 127.664 28,913,271.0 3236 1 03-OCT-08 01.28.09.000 PM 0qa98gcnnza7h 568322376 1 7.924 162,585.0 3236 1 03-OCT-08 01.28.09.000 PM 0qa98gcnnza7h 3723858078 1 86.682 27,751,123.0 3305 1 06-OCT-08 10.00.11.988 AM 0qa98gcnnza7h 4 64.138 22,616,931.5 3305 1 06-OCT-08 10.00.11.988 AM 0qa98gcnnza7h 568322376 2 5.710 81,149.0 3306 1 06-OCT-08 11.00.16.490 AM 0qa98gcnnza7h 6 5.512 108,198.5 3307 1 06-OCT-08 12.00.20.716 PM 0qa98gcnnza7h 2 3.824 81,149.0 3328 1 07-OCT-08 08.39.20.525 AM 0qa98gcnnza7h 30 35.473 156,904.7 3335 1 07-OCT-08 03.00.20.950 PM 0qa98gcnnza7h 3723858078 10 455.862 28,902,128.6 11 rows selected. SQL> @dplan_awr Enter value for sql_id: 0qa98gcnnza7h Enter value for plan_hash_value: PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID 0qa98gcnnza7h -------------------- select avg(pk_col) from kso.skew where col1 > 0 Plan hash value: 568322376 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 44497 (100)| | | 1 | SORT AGGREGATE | | 1 | 11 | | | | 2 | TABLE ACCESS FULL| SKEW | 32M| 335M| 44497 (2)| 00:08:54 | --------------------------------------------------------------------------- SQL_ID 0qa98gcnnza7h -------------------- select avg(pk_col) from kso.skew where col1 > 0 Plan hash value: 3723858078 -------------------------------------------------- | Id | Operation | Name | -------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | SORT AGGREGATE | | | 2 | TABLE ACCESS BY INDEX ROWID| SKEW | | 3 | INDEX RANGE SCAN | SKEW_COL1 | -------------------------------------------------- Note ----- - rule based optimizer used (consider using cbo) 33 rows selected. |
A couple of quick comments. The unstable_plan.sql script shows several statements that are switching plans. Most of them are sub-second response times though. The one that stands out is sql_id 0qa98gcnnza7h. It has been executed 62 times and one plan averages about 25 seconds while another averages about 314 seconds. The awr_plan_changes.sql script shows that this statement has been switching between two plans (3723858078 and 568322376). 3723858078 is the less efficient plan (doing 25M or so logical i/o’s per execution) while plan 568322376 is much better (doing only about 120K logical i/o’s per execution).
This is a typical pattern we see when bind variable peeking is an issue. Plans that flip flop back and forth between two or three options. This post is not specifically about bind variable peeking and so I won’t go into all the details (if you want more info just search on this site). But I will say that it is quite often the case that a single plan is acceptable regardless of the bind variables that are used (in this case the 568322376 plan). It is pretty easy to get Oracle to use a specific plan using an Outline, SQL Profile, or Baseline (depending on the version of Oracle you are using). And that’s exactly what we did in this case to resolve the issue.

Kerry Osborne's Oracle Blog - Bind Variable Peeking - Drives Me Nuts! - Kerry Osborne’s Oracle Blog:
[...] stronger methods of forcing the optimizer’s hand such as Outlines (see my previous posts on Unstable Plans and on Outlines). This option provides a quick method of locking in a single plan [...]
March 27, 2009, 11:04 pmKerry Osborne’s Oracle Blog » Blog Archive SQL Profiles - Kerry Osborne’s Oracle Blog:
[...] here’s little example: Note: unstable_plans.sql and awr_plan_stats.sql are discussed here: Unstable Plans (Plan Instability) ?View Code SQLSQL> @unstable_plans Enter value for min_stddev: Enter value for min_etime: [...]
April 5, 2009, 12:06 pmKerry Osborne’s Oracle Blog » Blog Archive Oracle 11g Adaptive Cursor Sharing (ACS) - Kerry Osborne’s Oracle Blog:
[...] First take on observations of ACS on 11.1.0.7 [...]
June 1, 2009, 9:57 amJulio Cesar:
Hi Kerry, we are facing this trouble now, the same query can run in 1 minute or in a hour, we disabled the AMM and set all the memory parameters manually, but the problem persists in a dangerous way, do you have some idea in solving it?
Best regards and I wish you luck if you’re still with this trouble.
Júlio César
June 10, 2009, 3:36 pmosborne:
Julio,
Not much to go on in you’re question, but I’ll hazard a couple of suggestions any way. Plan instability is often the result of bind variable peeking. The worst cases often happen when you have a histogram on a skewed column in combination with a bind variable in a where clause that references the column. The best way to fix this issue is to use literals instead of bind variables for such columns. Creating a SQL Profile or Outline to lock in one of the plans can often be a pretty quick short term fix as well. Also, you should be aware that 10g’s standard method of gathering stats does not do a very good job (in my opinion) of picking which columns to gather histograms on (it usually creates too many). The auto sample size is often too small to create an accurate histogram as well. So you may want to re-evaluate the stats gathering. I’ve written several posts about bind variable peeking and methods for dealing with the issue. Here are a couple you might want to look at:
Bind Variable Peeking Drives Me Nuts!
Outlines - aka Plan Stability
SQl Profiles
Or just search the blog for “Plan Instability” or “Bind Variable Peeking”.
Hope that helps.
Kerry
June 12, 2009, 8:27 amPaul:
Hi Kerry,
July 9, 2009, 4:20 amNice piece of work. Could you please post the ‘find_sql.sql’ script??
thx
Paul
osborne:
Hi Paul,
The find_sql.sql script is here:
http://www.oracle-guy.com/scripts/find_sql.sql
Sometimes I don’t post the text for scripts I use regularly (if I’ve already posted them). A quick search for the script name in the blog search box will usually turn them up. I probably should do a post that lists my most used scripts with links to them. I’ll see if I can get that done soon.
Kerry
July 9, 2009, 8:05 amYuvaraj:
Could you paste the “@dplan_awr” script?
July 4, 2010, 2:41 pmosborne:
SELECT * FROM table(dbms_xplan.display_awr(nvl(’&sql_id’,'a96b61z6vp3un’),nvl(’&plan_hash_value’,null),null,’typical +peeked_binds’));
July 4, 2010, 2:47 pmYuvaraj:
Thanks. I am trying to find a procedure to use old hash value instead of a new hash value. There are two hash values for a same SQL_ID. Could you pls provide a link if you have written a doc on this?
July 4, 2010, 7:13 pmosborne:
I’m sorry Yuvaraj, but I have not done anything with old_hash_value. Nor do I know of anyone who has written about it. Tanel Poder has written about calculating SQL_ID from hash_value though.
Kerry
July 4, 2010, 7:22 pmYuvaraj:
I thought of checking with you whether if it is possible to force oracle to use the plan from old hash value then the newer one. There are two hash values for a same SQL_ID. I would like to force the old hash value.
July 4, 2010, 7:27 pmosborne:
I think you are confusing plan_hash_value with hash_value (and old_hash_value). A statement can have only one sql_id and only one hash_value (its based on the text of the statement). A statement can have many plan_hash_values. And yes you can influence a statement to use a particular plan. I’ve done a number of posts on using SQL Profiles to do just that. See SQL Profiles for example.
Kerry
July 18, 2010, 7:17 pm