GATHER_PLAN_STATISTICS
Hey, I just thought of something. One of the things I do fairly regularly when struggling with a SQL statement that is not behaving goes like this:
- create a copy of the statement in a test script
- add the GATHER_PLAN_STATISTICS hint
- duplicate all the bind variables (with values used at parse if possible)
- run it
- run xplan on it
This provides the ability to see how the optimizer’s estimates line up with reality using the DBMS_XPLAN.DISPLAY_CURSOR function with the ALLSTATS format option. It’s very easy to see where the optimizer is having problems by comparing the A-Rows column (actual rows) with the E-Rows column (estimated rows). Anywhere there is a big difference (think orders of magnitude), there is a potential problem. Well actually it’s a little more complicated than that, because the A-Rows are cumulative while the E-Rows are not. You have to multiply the E-Row by Starts (or divide A-Rows by the number of executions) in order to compare apples to apples. If you need a little more info see this post by Jonathan Lewis or this one by Greg Rahn. Here’s how the output looks in case you haven’t seen it before:
SYS@LAB1024> @fs
Enter value for sql_text: %gather_plan%
Enter value for sql_id:
SQL_ID CHILD PLAN_HASH EXECS AVG_ETIME AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------ ------------------------------------------------------------
dvfumy503zfnw 0 1660344652 3 .00 25 select /*+ gather_plan_statistics */ t1.x, t2.y from kso.t1
t1, kso.t2 t2 where t1.y = t2.x and rownum < 10
SYS@LAB1024> !cat dplan_allstats.sql
set lines 180
select * from table(dbms_xplan.display_cursor('&sql_id','&child_no','allstats +peeked_binds'))
/
SYS@LAB1024> @dplan_allstats
Enter value for sql_id: dvfumy503zfnw
Enter value for child_no:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dvfumy503zfnw, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ t1.x, t2.y from kso.t1 t1, kso.t2 t2 where t1.y = t2.x and
rownum < 10
Plan hash value: 1660344652
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------
|* 1 | COUNT STOPKEY | | 3 | | 27 |00:00:00.01 | 75 |
| 2 | NESTED LOOPS | | 3 | 9 | 27 |00:00:00.01 | 75 |
| 3 | TABLE ACCESS FULL | T1 | 3 | 53629 | 27 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS BY INDEX ROWID| T2 | 27 | 1 | 27 |00:00:00.01 | 60 |
|* 5 | INDEX UNIQUE SCAN | SYS_C0014104 | 27 | 1 | 27 |00:00:00.01 | 33 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<10)
5 - access("T1"."Y"="T2"."X")
23 rows selected.
Just a quick note on the correlation between STARTS, E-ROWS, and A-ROWS. I picked a bit of a bad example due to the "rownum < 10" predicate. It causes processing to stop on the full table scan after 9 rows are retrieved. But it does show a mismatch in the number of estimated rows and the number of actual rows. Keep in mind that the actual rows are cumulative, so it's the total rows this step has handled regardless of the number of executions. This statement has been executed 3 times (see the 3 under Starts on line 1). And the full table scan was executed (Started) 3 times (once per execution). So the optimizer was expecting 53629 rows for each execution, but got on average 9 rows per execution. Note also that the Starts column is the actual number of times the operation was executed, not the number of times that the optimizer thinks it will be executed. Here's another little bit of output to show that Starts is actual, not estimated:
SYS@LAB1024> -- here's a query without the "rownum<10" predicate
SYS@LAB1024> @fs
Enter value for sql_text: %t1_nl_join_agg.sql%
Enter value for sql_id:
SQL_ID CHILD PLAN_HASH EXECS AVG_ETIME AVG_LIO SQL_TEXT
------------- ------ ------------ ------------ ------------- ------------ ------------------------------------------------------------
a36w6pfkgp2sy 0 174109304 1 3.77 53,745 select /*+ gather_plan_statistics */ avg(x*2) from ( select
/* t1_nl_join_agg.sql */ t1.x, t2.y from kso.t1 t1, kso.t2
t2 where t1.y = t2.x )
1 row selected.
SYS@LAB1024> @dplan_allstats
Enter value for sql_id: a36w6pfkgp2sy
Enter value for child_no:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID a36w6pfkgp2sy, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ avg(x*2) from ( select /* t1_nl_join_agg.sql */ t1.x,
t2.y from kso.t1 t1, kso.t2 t2 where t1.y = t2.x )
Plan hash value: 174109304
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:03.77 | 53745 | 169 |
| 2 | NESTED LOOPS | | 1 | 53629 | 25994 |00:00:02.94 | 53745 | 169 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 53629 | 53629 |00:00:00.80 | 114 | 105 |
|* 4 | INDEX UNIQUE SCAN| SYS_C0014104 | 53629 | 1 | 25994 |00:00:02.41 | 53631 | 64 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."Y"="T2"."X")
21 rows selected.
SYS@LAB1024> -- the stats are 100% correct on the full table scan (line 3),
SYS@LAB1024> -- but off by 2X on the NL join (25994 vs. 53629 on line 2)
SYS@LAB1024> -- so let's mess up the stats on T1 to see if Starts is optimizer guess or actual
SYS@LAB1024> -- set rows in T1 to 1234
SYS@LAB1024>
SYS@LAB1024> exec dbms_stats.set_table_stats(ownname => 'KSO', tabname => 'T1', numrows => 1234);
PL/SQL procedure successfully completed.
SYS@LAB1024> @flush_sql
Enter value for sql_id: a36w6pfkgp2sy
PL/SQL procedure successfully completed.
SYS@LAB1024> @t1_nl_join_agg.sql
AVG(X*2)
------------
108794.25383
1 row selected.
SYS@LAB1024> @dplan_allstats
Enter value for sql_id: a36w6pfkgp2sy
Enter value for child_no:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID a36w6pfkgp2sy, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ avg(x*2) from ( select /* t1_nl_join_agg.sql */
t1.x, t2.y from kso.t1 t1, kso.t2 t2 where t1.y = t2.x )
Plan hash value: 174109304
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:03.63 | 53745 |
| 2 | NESTED LOOPS | | 1 | 1234 | 25994 |00:00:03.20 | 53745 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1234 | 53629 |00:00:00.86 | 114 |
|* 4 | INDEX UNIQUE SCAN| SYS_C0014104 | 53629 | 1 | 25994 |00:00:02.41 | 53631 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."Y"="T2"."X")
21 rows selected.
SYS@LAB1024> -- note that Starts on the Index Lookup inside the NL Join is 53629 (line 4)
SYS@LAB1024> -- this is the number of times that operation was actually executed,
SYS@LAB1024> -- not 1234, which is the number of times the optimizer thought is would be executed
SYS@LAB1024> -- so Starts and A-rows are actual values, E-Rows is the optimizer estimate
SYS@LAB1024> -- (at least for NL Join)
Back to the topic at hand, my normal steps (i.e. generate test script, add GATHER_PLAN_STATISTICS hint, etc…)
I have a script that helps generate the test script (see this post: Creating Test Scripts with Bind Variables). But there are definite drawbacks to this approach. It can be difficult to duplicate the exact environment that the optimizer sees when running the production code, even if you are testing on the production instance. The bottom line is that you have to be pretty careful to make sure that your test script is really doing the same thing that the production code is doing.
Which brings me to my idea (finally)! Here’s a quick way to get extended plan statistics for any statement without modifying the code or creating a test version. Drum roll please ….
Just create a one line SQL Profile (using create_1_hint_sql_profile.sql) with the GATHER_PLAN_STATISTICS hint. You can use dplan_allstats.sql to see the extended statistics. Here’s an example:
> !sql
sqlplus "/ as sysdba"
SQL*Plus: Release 11.2.0.1.0 Production on Thu Jan 28 17:49:56 2010
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SYS@LAB112> -- note: SQL_ID 84q0zxfzn5u6s is the statement I want to look at
SYS@LAB112>
SYS@LAB112> @create_1_hint_sql_profile
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for profile_name (PROFILE_sqlid_MANUAL): PROFILE_84q0zxfzn5u6s_GPS
Enter value for category (DEFAULT):
Enter value for force_matching (false):
Enter value for hint: GATHER_PLAN_STATISTICS
Profile PROFILE_84q0zxfzn5u6s_MANUAL created.
PL/SQL procedure successfully completed.
SYS@LAB112> @sql_profiles
Enter value for sql_text:
Enter value for name:
NAME CATEGORY STATUS SQL_TEXT FOR
------------------------------ --------------- -------- ---------------------------------------------------------------------- ---
PROFILE_fgn6qzrvrjgnz DEFAULT DISABLED select /*+ index(a SKEW_COL1) */ avg(pk_col) from kso.skew a NO
PROFILE_69k5bhm12sz98 DEFAULT DISABLED SELECT dbin.instance_number, dbin.db_name, dbin.instance_name, NO
PROFILE_8js5bhfc668rp DEFAULT DISABLED select /*+ index(a SKEW_COL2_COL1) */ avg(pk_col) from kso.skew a wher NO
PROFILE_bxd77v75nynd8 DEFAULT DISABLED select /*+ parallel (a 4) */ avg(pk_col) from kso.skew a where col1 > NO
PROFILE_8hjn3vxrykmpf DEFAULT DISABLED select /*+ invalid_hint (doda) */ avg(pk_col) from kso.skew where col1 NO
PROFILE_7ng34ruy5awxq DEFAULT DISABLED select i.obj#,i.ts#,i.file#,i.block#,i.intcols,i.type#,i.flags,i.prope NO
PROFILE_84q0zxfzn5u6s_GPS DEFAULT ENABLED select avg(pk_col) from kso.skew NO
7 rows selected.
SYS@LAB112> @sql_profile_hints
Enter value for profile_name: PROFILE_84q0zxfzn5u6s_GPS
HINT
------------------------------------------------------------------------------------------------------------------------------------------------------
GATHER_PLAN_STATISTICS
1 rows selected.
SYS@LAB112> @avgskewi
AVG(PK_COL)
-----------
15636133
1 row selected.
SYS@LAB112> @dplan_allstats
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no:
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 84q0zxfzn5u6s, child number 0
-------------------------------------
select avg(pk_col) from kso.skew where col1 = 136133
Plan hash value: 3723858078
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 35 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 35 |
| 2 | TABLE ACCESS BY INDEX ROWID| SKEW | 1 | 35 | 32 |00:00:00.01 | 35 |
|* 3 | INDEX RANGE SCAN | SKEW_COL1 | 1 | 35 | 32 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("COL1"=136133)
Note
-----
- SQL profile PROFILE_84q0zxfzn5u6s_GPS used for this statement
- SQL plan baseline SQL_84Q0ZXFZN5U6S_3723858078 used for this statement
25 rows selected.
SYS@LAB112> @sql_hints
Enter value for sql_id: 84q0zxfzn5u6s
Enter value for child_no: 0
OUTLINE_HINTS
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
DB_VERSION('11.2.0.1')
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_RS_ASC(@"SEL$1" "SKEW"@"SEL$1" ("SKEW"."COL1"))
6 rows selected.
Works great! Now you don’t have to do all that work just to see the rowsource execution stats.
Note that the Outline Hints stored in the OTHER_XML column of V$sql_plan have no record of the GATHER_PLAN_STATISTICS hint, even though it was clearly executed (otherwise we’d have gotten a warning message in the XPLAN output and no A-Rows column).
Note also that this example was done on an 11gR2 database and that the statement was found in the SQL Management Base (i.e. there was a Baseline in place for this statement already). The Notes section of the plan output shows that both the Profile and the Baseline were used. This got me wondering if it really merged the hints from the Profile and the Baseline. The short answer is YES, it appears that it does. I’ve got an example, but since that’s a different topic and this post is already pretty long, I’ll just save it for another post.
As always, your comments are welcomed.
UPDATE: This has proved pretty useful, so I wrote a little script that just prompts for a sql_id and creates a SQL Profile with the GATHER_PLAN_STATISTICS hint. I have just uploaded it to the site here: gps.sql
UPDATE 2 (4/5/12): I have recently re-written the gps.sql script to use the SQL Patch facility which allows it to be used with statements that already have a SQL Profile attached to them.
Hi,
Nice post, very informative.Where can i get dplan_allstats script?
Anand
Hi Kerry,
When i am generating the explian plan using DBMS_XPLAN.DISPLAY_CURSOR with format “ALLSTATS LAST” i am not getting the A-rows,A-time etc columns?Why is it so?What all i am getting is “OMem | 1Mem | Used-Mem | Used-Tm”
Regards,
Anand
Anad,
My dplan_allstats.sql script is just DBMS_XPLAN.DISPLAY_CURSOR with the ALLSTATS format option prompting for a sql_id and child_number (like so):
It won’t show the A-Rows column unless the statement was instructed to collect the extra information. This can be done by running with the GATHER_PLAN_STATISTICS hint or the session having statistics_level = all, or _rowsource_execution_statistics = true. You should get a note at bottom of the xplan output like so if you have not enabled the collection of the extended statistics.
Hope that helps.
Kerry
Incorporating this as a hint is a great idea, never thought of that but it makes sense.
I also like the idea of temporarily messing up stats to make sure they are affecting the XPLAN as expected.
A couple of more tools for my toolbox, thanks!
David,
Thanks for the comments. I just updated the post with another script (gps.sql) that just prompts for the SQL_ID and adds the GATHER_PLAN_STATISTICS hint to it (via a Profile). A little more straight forward than the generic create_1_hint_sql_profile.sql script.
Kerry
[…] Kerry Osborne-gather_plan_statistics […]
[…] Click Here « Simulating Oracle ASM Disks […]
[…] a previous post (GATHER_PLAN_STATISTICS) I mentioned that SQL Profiles and Baselines can both be applied to a single statement. In this […]
Very clever trick! Thanks a lot for sharing. We are seeing quite a lot of overhead with tracing in 11g, especially when we have slow-by-slow (i.e., row-by-row) processing. So, this trick would a lot. With this trick and Real-Time SQL Monitoring we would not miss a expensive (expensive is very subjective) SQL.
Thanks!
[…] GATHER_PLAN_STATISTICS February 9, 2010, 12:02 pm […]
I’ve updated the gps.sql script to use the SQL Patch facility which allows it to be used with a statement that already has a SQL Profile attached to it. See NOTE 2 at the end of the post.
Hello Kerry,
Can you help me to understand below plan. It is on 11gR2 RAC database.
Why is the nested loop A-Time is 35.85 ( i believe it is in seconds) where as the SELECT STATEMENT A-Time is only 6.19.
As a matter of fact the query took just 9 seconds to return the value. It was select count(1)
I haven’t noticed this type of anomaly before. I want to say it’s a bug. Do you have an easily reproducible test case that you can share?
Kerry,
I am not sure how I can reproduce this as a test case. I can share with you the SQL used. This SQL takes less than 1 second in DEV database which is running on a small VM server. Where are in a
relatively BIGGER 16 CPU quad core server ( has other database as well) , query takes 9 to 12 seconds. Difference I saw in 2 execution plans was that in DEV it does a hash join where as in production it does a Nested loop 15358 times.
I tried to force the usage of hash tables by setting hints , but it did not help.. It always took around 10 seconds. See one more plan where it shows more than 1 minute in hash join step where as the query completed in 10 seconds
Hi Mohammed,
I traded an email with Tanel on this topic. His comments are included below:
======
it’s due to the sampling of A-time in 10g+ … the timestamps aren’t taken (extra syscall and/or just cpu overhead) every time execution transitions between row-sources, but only every 128th time its sampled and extrapolated from there. So sometimes you get such weird results.
if you set _rowsource_statistics_sampfreq = 1 for the test, the query should run longer due to more instrumentation but then the cumulative accounting should be correct …
======
Kerry,
I know this is an old post but why not just:
ALTER SESSION SET STATISTICS_LEVEL=ALL
if you don’t want to modify your original query? Isn’t this easier than creating a SQL profile?
Hi Tony,
ALTER SESSION has to be run in the session(s) that’s executing the statement, before it’s executed (as far as I know). So you can’t do that to statements that are executing in a production environment without changing the app or adding a login trigger or something. Even that would require that new sessions be created to pick up the trigger and connection pools often reuse sessions for a long time. Hope that helps.
Kerry
[…] http://kerryosborne.oracle-guy.com/2010/02/gather_plan_statistics/ […]
[…] http://kerryosborne.oracle-guy.com/2010/02/gather_plan_statistics/ […]
hi all,
if we have multiple execution plan,how could we decide best execution plan among them.
regards
Girish
Usually it’s a trade off between best elapsed time and lowest resource usage. They don’t always line up.
Dear Kerry,
Needless to say,this is again a very useful post from you.
can we drop this hint which we have added with gps.sql script, once we are done with reading the execution plan
Thanks,
satish