Archive for the ‘Plan Stability’ Category.

SQL Gone Bad – But Plan Not Changed? – Part 2

In Part 1 of this series I talked about the basic problem, which is that plan_hash_values are not based on the whole plan. One of the main missing components is the predicates associated with a plan, but there are other missing parts as was pointed out in Part 1 of Randolf Geist’s post on the topic. At any rate, predicates seem to be the most critical of the missing parts.

The purpose of this second post on the topic is to talk about diagnosis. Basically how do you identify when some other part of a plan has changed that doesn’t affect the plan_hash_value, specifically a predicate.

So first I thought I would show a few examples of statements with the same sql_id and plan_hash_value that have other plan differences (in the predicate section). To do this I used a method proposed by Randolf Geist a few years back in his 2nd post on the topic which covered Alternative Ways to Calculate a PLAN_HASH_VALUE In that post, Randolf shows several ways to compute a hash value on any or all of the columns in the v$sql_plan table. I wrote a simple script around one of the those methods (find_pred_mismatch.sql), and as you might guess from the name, I limited this version to not include all the columns in v$sql_plan, but to only identify statements with mismatched predicates. To be more explicit, the script will locate statements in the shared_pool that have multiple child cursors, where there are more than one set of predicates to go with a single plan_hash_value. Here’s an example:

SYS@DEMO1> @find_pred_mismatch
 
Type created.
 
 
SQL_ID        PLAN_HASH_VALUE CHILD_NUMBER   THE_HASH ARE_H
------------- --------------- ------------ ---------- -----
063m5s0cvrr19      1502175119            0 2709091620 DIFF!
093fgfvygm51m      3114044936            0 3689661040 DIFF!
0cn2wm9d7zq8d      1540383338            0 3746559344 DIFF!
0pt4jfmq9f1q0      3078496799            0 1309675335 DIFF!
155cwuv2pfp1d       768389713            0 2982291916 DIFF!
18c2yb5aj919t      1032442798            0 1714458756 DIFF!
1n9crga6mbw2x      4174841998            0 2752042257 DIFF!
1ytxrt5qp9qdd      2707146560            0 3757837347 DIFF!
23buxzfxyp1vy      3143617369            0 2089881285 DIFF!
23nad9x295gkf       891847141            0 4056778555 DIFF!
24zvjzuyrxh3w      1877711096            0 1680905434 DIFF!
28n17ru48jnh5      1665111388            0 3584687734 DIFF!
2j0fw19fph49j      1337823903            0 2431841938 DIFF!
2kd6nusgzc3uw      3151266570            0 3024843785 DIFF!
2rpwgryn7pxz5      3329544515            0  452505826 DIFF!
35nhk48nxwc0v      2553960494            0  117262982 DIFF!
3bc73t2h9mwxc      1420457580            0 1226225583 DIFF!
3gputsqv4u1j3      3161304745            0 2252819340 DIFF!
3zauy2zqryrsx      1420457580            0 1128973296 DIFF!
42q1qby3huf2c      3069437562            0 4008632079 DIFF!
47mm81hm9sggy      1836210496            0 1554180227 DIFF!
4g46czps3t55u      2714703948            0 4063401817 DIFF!
4n2gca427719q      1360319091            0 4013571180 DIFF!
4tpsnbkt1dm5j      2960949352            0 3341004783 DIFF!
5dyhfnkzta2zm      3767331201            0 4238766232 DIFF!
5h91zx386wbht       293199272            0  949799344 DIFF!
5s34t44u10q4g      2693539438            0  839739072 DIFF!
5uw1u291s3m0k       219265157            0  642280427 DIFF!
61tn3mam0vq0b      2012170170            0 2048362545 DIFF!
63t3ufgq37m0c      1155609947            0  844291465 DIFF!
69k5bhm12sz98      3091659676            0  356777601 DIFF!
6cp74g22fzahf        76968983            0 1617454724 DIFF!
6wm3n4d7bnddg      1772758172            0 1148123313 DIFF!
78kp0fcyxavzb      2960949352            0 1085639264 DIFF!
7ah4afrggrw5c      4213028598            0 4285032606 DIFF!
7g4rxwbvhdh3q      3170022080            0 2083442940 DIFF!
7hspvruktu52b      4016032974            0 2538340188 DIFF!
84p3g5b5bsfvn       681044650            0 3826083810 DIFF!
86521pa77y28j      3760090177            0 3887843475 DIFF!
8ak9gkw2mjhvr      1526940012            0 2946674232 DIFF!
8p9z2ztb272bm       408663731            0 3293625021 DIFF!
aca4xvmz0rzup       427901911            0 4215668999 DIFF!
akh9zqqkx3wj7      2306922995            0 2084689096 DIFF!
akx4284f2vjnv      3948068913            0 2662025793 DIFF!
amycufzt6uq5f      3283312188            0 1896511712 DIFF!
atnkqhrp3t7xa      2196914545            0   26873046 DIFF!
aw2x7hh2a9ag0      1148557212            0  719001678 DIFF!
b41wak2bb7atw       108532975            0 1699960507 DIFF!
bhvyz9bgyrhb2      1134671139            0 2402404248 DIFF!
c8gnrhxma4tas      4024720576            0 2473084105 DIFF!
cc7vvmrsxzyq1      1849127868            0 1912933403 DIFF!
cjtaqp92v10bn       922118807            0 2313573387 DIFF!
ckfgcyv05xptf      2869192503            0 3932622422 DIFF!
cw860p03hy5ff      1502175119            0 2915988156 DIFF!
cyw0c6qyrvsdd       192117504            0 2551186960 DIFF!
d53nc7j6n1057      1356236608            0  582788179 DIFF!
dyj1ssw8jw54f      1836210496            0   66902761 DIFF!
fkjkrv5ycz96u      2247257083            0 1809299677 DIFF!
gdn3ysuyssf82      4024720576            0 2473084105 DIFF!
gwbdd5m45ugpm      3180770434            0  235716193 DIFF!
 
60 rows selected.
 
SYS@DEMO1> select * from table(dbms_xplan.display_cursor('&sql_id','&child_no','typical'));
Enter value for sql_id: 24zvjzuyrxh3w
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  24zvjzuyrxh3w, child number 0
-------------------------------------
SELECT script FROM sys.metaxsl$ WHERE xmltag=:1 AND transform=:2  AND
model=:3
 
Plan hash value: 1877711096
 
--------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |       |       |     3 (100)|          |
|*  1 |  TABLE ACCESS STORAGE FULL| METAXSL$ |     3 |    99 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - storage(("XMLTAG"=:1 AND "TRANSFORM"=:2 AND "MODEL"=:3))
       filter(("XMLTAG"=:1 AND "TRANSFORM"=:2 AND "MODEL"=:3))
 
SQL_ID  24zvjzuyrxh3w, child number 1
-------------------------------------
SELECT script FROM sys.metaxsl$ WHERE xmltag=:1 AND transform=:2  AND
model=:3
 
Plan hash value: 1877711096
 
----------------------------------------------
| Id  | Operation                 | Name     |
----------------------------------------------
|   0 | SELECT STATEMENT          |          |
|*  1 |  TABLE ACCESS STORAGE FULL| METAXSL$ |
----------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - storage(("MODEL"=:3 AND "TRANSFORM"=:2 AND "XMLTAG"=:1))
       filter(("MODEL"=:3 AND "TRANSFORM"=:2 AND "XMLTAG"=:1))
 
Note
-----
   - rule based optimizer used (consider using cbo)
 
 
44 rows selected.

Continue reading ‘SQL Gone Bad – But Plan Not Changed? – Part 2’ »

SQL Gone Bad – But Plan Not Changed? – Part 1

Last week an interesting issue popped up on a mission critical production app (MCPA). A statement that was run as part of a nightly batch process ran long. In fact, the statement never finished and the job had to be killed and restarted. This particular system is prone to plan stability issues due to various factors outside the scope of this post, so the first thing that the guys checked was if there had been a plan change. Surprisingly the plan_hash_value was the same as it had been for the past several months. The statement was very simple and a quick look at the xplan output showed that the plan was indeed the same with one exception. The predicate section was slightly different.

As a quick diversion, you probably already know that the plan_hash_value is calculated based on partial information about the plan. Arguably it’s the most important parts, but there are some important parts of the plan that are not included (namely the stuff that shows up in the predicate section of the plan). Randolf Geist explained which parts of the plan are used in calculating the plan_hash_value well in a post on How PLAN_HASH_VALUES Are Calculated several years ago. His summary was this:

So in summary the following conclusions can be made:

- The same PLAN_HASH_VALUE is merely an indicator that the same operations on objects of the same name are performed in the same order.

- It tells nothing about the similarity of the expected runtime performance of the execution plan, due to various reasons as demonstrated. The most significant information that is not covered by the PLAN_HASH_VALUE are the filter and access predicates, but there are other attributes, too, that are not part of the hash value calculation.

The production issue got me thinking about several things:

    Could I come up with a simple test case to show a severe performance degradation between executions of a statement with the same plan_hash_value because of a change in the predicates? (the answer is it’s pretty easy actually)

    What forensics could be used to determine when this situation has occurred?

    How do you fix the problem?

Continue reading ‘SQL Gone Bad – But Plan Not Changed? – Part 1’ »

Displaying SQL Baseline Plans

Since I’m on vacation and not “really” working, I thought I might have time to write up a quick blog post. The idea for this one was triggered by one of Maria Colgan’s presentations at Hotsos last week. Maria was talking about SQL Plan Management and Baselines and somehow got me thinking about the DBMS_XPLAN option to display plans for Baselines. This is a pretty neat feature that allows you to the see the plan associated with a Baseline (well sort of).

The 11.2 documentation (Oracle® Database PL/SQL Packages and Types Reference) says this about the DISPLAY_SQL_PLAN_BASELINE function:

This procedure uses plan information stored in the plan baseline to explain and display the plans.It is possible that the plan_id stored in the SQL management base may not match with the plan_id of the generated plan. A mismatch between stored plan_id and generated plan_id means that it is a non-reproducible plan. Such a plan is deemed invalid and is bypassed by the optimizer during SQL compilation.

But what does that mean? Well in short it means that Baselines don’t store plans, they store hints that when fed to the optimizer will hopefully cause it to come up with the desired plan. Baselines also store a plan_hash_value so it’s possible to tell whether the hints worked or not. Baselines do not actually store all the steps of a plan. So if that’s the case, then it’s obviously not possible for the display_sql_plan_baseline function to show the plan if the optimizer can’t reproduce it for some reason. When the doc’s say “it is possible that the plan_id stored in the SQL management base may not match with the plan_id of the generated plan”, that’s what they are talking about. I decided to create a test case to see what happens when the generated plan can’t match the original. Here’s the basic idea:

  1. run a statement that uses an index and check the plan
  2. create a Baseline on the statement using the index (using my create_baseline.sql script)
  3. check the hints stored with the baseline (using my baselines_hints.sql script)
  4. run the statement again and check the real plan to see that the Baseline was used
  5. use the display_sql_plan_baseline function to show the Baseline plan
  6. make the index invisible (thus rendering the Baseline plan non-reproducible)
  7. execute the statement again and check the real plan
  8. use the display_sql_plan_baseline function to show the Baseline plan

So here’s the test:

Continue reading ‘Displaying SQL Baseline Plans’ »

New create_1_hint_sql_profile.sql

I modified my create_1_hint_sql_profile.sql script (which I blogged about here: Single Hint Profiles) to allow any arbitrary text sting including quote characters. This is a script that I use fairly often to apply a hint to a single SQL statement that is executing in a production system where we can’t touch the code for some reason. For example, it’s sometimes useful to add a MONITOR hint or a GATHER_PLAN_STATISTICS hint to a statement that’s behaving badly so we can get more information about what the optimizer is thinking. I recently updated the script to allow special characters in the hint syntax. This feature is useful when you want to add something like an OPT_PARAM hint that takes quoted arguments. The change makes use of the q-Quote feature which I blogged about here: q-Quote. (the original version just barfed on quotes being input as part of the hint)

Here’s an example of how to use it:

SYS@SANDBOX1> alter session set cell_offload_processing=false;
 
Session altered.
 
Elapsed: 00:00:00.00
SYS@SANDBOX1> select avg(pk_col) from kso.skew3 where col1 < 0;
 
AVG(PK_COL)
-----------
  1849142.5
 
1 row selected.
 
Elapsed: 00:00:28.08
SYS@SANDBOX1> @fsx
Enter value for sql_text: select avg(pk_col) from kso.skew3 where col1 < 0
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH  EXECS  AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT
------------- ------ ---------- ------ ---------- ------ ------- ---------- ----------------------------------------------------------------------
a6j7wgqf84jvg      0 2684249835      1      28.07      0 No             .00 select avg(pk_col) from kso.skew3 where col1 < 0
 
1 row selected.
 
Elapsed: 00:00:00.02
SYS@SANDBOX1> @create_1_hint_sql_profile.sql
Enter value for sql_id: a6j7wgqf84jvg
Enter value for profile_name (PROFILE_sqlid_MANUAL): 
Enter value for category (DEFAULT): 
Enter value for force_matching (false): 
Enter value for hint_text: opt_param('cell_offload_processing' 'true')
 
Profile PROFILE_a6j7wgqf84jvg_MANUAL created.
 
Elapsed: 00:00:00.07
SYS@SANDBOX1> @sql_profile_hints
Enter value for profile_name: PROFILE_a6j7wgqf84jvg_MANUAL
 
HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
opt_param('cell_offload_processing' 'true')
 
1 rows selected.
 
Elapsed: 00:00:00.04
SYS@SANDBOX1> select avg(pk_col) from kso.skew3 where col1 < 0;
 
AVG(PK_COL)
-----------
  1849142.5
 
1 row selected.
 
Elapsed: 00:00:05.11
SYS@SANDBOX1> @fsx
Enter value for sql_text: select avg(pk_col) from kso.skew3 where col1 < 0
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH  EXECS  AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT
------------- ------ ---------- ------ ---------- ------ ------- ---------- ----------------------------------------------------------------------
a6j7wgqf84jvg      0 2684249835      1      28.07      0 No             .00 select avg(pk_col) from kso.skew3 where col1 < 0
a6j7wgqf84jvg      1 2684249835      1       5.10      0 Yes          99.99 select avg(pk_col) from kso.skew3 where col1 < 0

In the example I turned off cell offload processing with the ALTER SESSION and ran a SQL statement that took 28 seconds. Then I used my fsx.sql script to verify that the statement was not offloaded and to find the SQL_ID. Next I created a 1 hint Profile with an OPT_PARAM hint that set the cell_offload_processing parameter back to TRUE using the new version of my create_1_hint_sql_profile.sql script. Next I used my sql_profile_hints.sql script to verify the text of the hint that was added to the Profile. It looked good including the quotes. When I executed the statement a second time it ran in 5 seconds. I then used fsx.sql again to see that the statement was offloaded for the second execution (child 1).

Cardinality Feedback

I ran into an interesting issue last week having to do with plan stability. The problem description went something like this:

“I have a statement that runs relatively quickly the first time I run it (around 12 seconds). Subsequent executions always run much slower, usually around 20 or 30 minutes. If I flush the shared pool and run it again elapsed time is back to 12 seconds or so.”

The query looked a little like this:

Continue reading ‘Cardinality Feedback’ »