What Did My New Index Mess Up?

This week I got to look at a system where a new version of an application had been rolled out. I came across a SQL statement that had taken a turn for the worse in terms of its average elapsed time. It was using an index I hadn’t noticed before, so I took a look to see if the index was new (it was – apparently it was added as part of the code roll out). So I got to wondering if there were any other statements that had suffered a backward movement in performance. So I wrote this little script called whats_changed.sql. It’s the same basic idea as one I wrote about earlier here called unstable_plans.sql. Both of these scripts are based on a calculated variance. The original unstable_plans script uses variance in average elapsed times for statements with multiple plans (i.e. it shows statements that have multiple plans which exhibit a significant difference in average elapsed time between plans). The whats_changed.sql script shows statements that exhibit significant differences in average elapsed time before and after a point in time (i.e. when you rolled out new code, or added an index, etc…) A short example may explain it better than I can:

$ !sql
sqlplus / as sysdba
 
SQL*Plus: Release 11.1.0.7.0 - Production on Thu Jun 25 22:24:35 2009
 
Copyright (c) 1982, 2008, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
 
SQL> @whats_changed
Enter Days ago: 3
Enter value for min_stddev:
Enter value for min_etime:
Enter value for faster_slower:
 
SQL_ID               EXECS AVG_ETIME_BEFORE AVG_ETIME_AFTER   NORM_STDDEV RESULT
------------- ------------ ---------------- --------------- ------------- ------
an9114vtxukz2           17             1.25            5.00        2.1148 Slower
803u6vwz0ah1p           29             0.04            0.18        2.1552 Slower
bvf3fxv3hatw7        2,390             0.03            0.14        2.4147 Slower
5sgs7q9pjnzg5            2             2.21            0.42        3.0130 Faster
b0zkuv9qtcvnn           48             0.16            0.85        3.1500 Slower
9ws5c6p77d384        1,852             0.57            3.30        3.3609 Slower
dy98ajwqc8s2x           15             1.32            8.34        3.7558 Slower
bkka22vb5y7t6          596             0.19            1.26        3.8804 Slower
6qfwkqgvjkn2q           12             0.51            3.60        4.2641 Slower
05xcf43d9psvm        1,197             0.02            0.14        5.4263 Slower
8wgmc9w5ubtsy          183             0.49            4.32        5.4946 Slower
fwfpuf1vfwfu2            6             0.02            0.22        5.6314 Slower
50c18dwvxq4sb          222             0.54            5.29        6.1586 Slower
aukanfjd3d8fa            3             0.88           10.00        7.3496 Slower
44bq4r5z2xrsm           54             3.13           43.39        9.0946 Slower
0az7czjdw8z7j          110             0.62            0.02       17.5933 Faster
f41agfq9qdhk6            3             0.24            8.06       22.6765 Slower
 
17 rows selected.
 
SQL> -- or to see just the slower ones
SQL>
SQL> @whats_changed
Enter Days ago: 3
Enter value for min_stddev:
Enter value for min_etime:
Enter value for faster_slower: Slower
 
SQL_ID               EXECS AVG_ETIME_BEFORE AVG_ETIME_AFTER   NORM_STDDEV RESULT
------------- ------------ ---------------- --------------- ------------- ------
an9114vtxukz2           17             1.25            5.00        2.1148 Slower
803u6vwz0ah1p           29             0.04            0.18        2.1552 Slower
bvf3fxv3hatw7        2,390             0.03            0.14        2.4147 Slower
b0zkuv9qtcvnn           48             0.16            0.85        3.1500 Slower
9ws5c6p77d384        1,852             0.57            3.30        3.3609 Slower
dy98ajwqc8s2x           15             1.32            8.34        3.7558 Slower
bkka22vb5y7t6          596             0.19            1.26        3.8804 Slower
6qfwkqgvjkn2q           12             0.51            3.60        4.2641 Slower
05xcf43d9psvm        1,197             0.02            0.14        5.4263 Slower
8wgmc9w5ubtsy          183             0.49            4.32        5.4946 Slower
fwfpuf1vfwfu2            6             0.02            0.22        5.6314 Slower
50c18dwvxq4sb          222             0.54            5.29        6.1586 Slower
aukanfjd3d8fa            3             0.88           10.00        7.3496 Slower
44bq4r5z2xrsm           54             3.13           43.39        9.0946 Slower
f41agfq9qdhk6            3             0.24            8.06       22.6765 Slower
 
15 rows selected.

A couple of comments on the script:

  • The default minimum normalized standard deviation is 2. So only statements that are 2X slower were listed.
  • The default minimum average elapsed time per execution is 0.1. So only statements that ran longer than 0.1 seconds were reported.
  • The DAYS_AGO parameter sets a pivot point, we’ll call it the REFERENCE_TIME. The calculations are based on averages from before and after the REFERENCE_TIME.

And a brief explanation of what’s in the output:

  • SQL_ID – the statement id
  • EXECS – the total number of executions in the AWR tables
  • AVG_ETIME_BEFORE – the average elapsed time per execution before the REFERENCE_TIME
  • AVG_ETIME_AFTER – the average elapsed time per execution after the REFERENCE_TIME
  • NORM_STDDEV – this is a normalized standard deviation (i.e. how many times slower/faster it is now)

So the above output indicates that only a few statements are significantly worse (avg. elapsed time 2X or greater) since our new code rollout, which occurred about 3 days ago. Statement 44bq4r5z2xrsm obviously sticks out with a new average elapsed time of 43+ seconds and about 9 times slower than before. This was the one that had picked up the new index. The rest were primarily due to increased physical i/o caused by some newly introduced statements which were doing a few million lio’s per execution and thus causing all the other statements to suffer slightly.

Let me know what you think. Your comments are always welcome.

7 Comments

  1. alex says:

    Hi Kerry,

    I think the script is very good and can help a lot in those so called “post release – it should be faster the dev team told me ” situations. Excellent work !

    Regards,
    Alex

  2. B. Polarski says:

    Nice piece of work. Couple it with the Z. Breitling published in 2006 ‘http://www.doggarts.com/sql/xplan.sql’ and you get a powerful monitoring and diagnostic tool:
    Your query pinpoint candidates to investigate plan variance.

  3. Mark Davidson says:

    Kerry,

    Excellent script and thank you! This type of script will be extremely helpful in finding those “little” queries that seem to go bad after a release when the development team forgets to let the DBA’s know something else was changed and also when the DBA’s may have missed something.
    This might be a good presentation for Hotsos 2010… something like to “Tools and Methods for Identifying Post-Release Issues” :-)

    Thanks again,
    Mark

  4. Anand says:

    Great script…Nice work..Waiting for more

    Regards,
    Anand

  5. Great Script !

    It is worthy of sharing at a Conference !

  6. osborne says:

    Thanks for the comments guys. I have been using this one quite a bit lately. I find I often change it to pivot around a specific snap_id. Like add an index at 11:05, find the 11:00 snap_id, and use that as my pivot point. I also find myself reducing the default minimum etime and minimum stddev parameters. It does turn up some interesting information. Thanks again for all the positive comments.

    Kerry

Leave a Reply