Archive for the ‘Variance’ Category.

What Did My New Index Mess Up?

This week I got to look at a system where a new version of an application had been rolled out. I came across a SQL statement that had taken a turn for the worse in terms of its average elapsed time. It was using an index I hadn’t noticed before, so I took a look to see if the index was new (it was – apparently it was added as part of the code roll out). So I got to wondering if there were any other statements that had suffered a backward movement in performance. So I wrote this little script called whats_changed.sql. It’s the same basic idea as one I wrote about earlier here called unstable_plans.sql. Both of these scripts are based on a calculated variance. The original unstable_plans script uses variance in average elapsed times for statements with multiple plans (i.e. it shows statements that have multiple plans which exhibit a significant difference in average elapsed time between plans). The whats_changed.sql script shows statements that exhibit significant differences in average elapsed time before and after a point in time (i.e. when you rolled out new code, or added an index, etc…) A short example may explain it better than I can:

$ !sql
sqlplus / as sysdba
 
SQL*Plus: Release 11.1.0.7.0 - Production on Thu Jun 25 22:24:35 2009
 
Copyright (c) 1982, 2008, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
 
SQL> @whats_changed
Enter Days ago: 3
Enter value for min_stddev:
Enter value for min_etime:
Enter value for faster_slower:
 
SQL_ID               EXECS AVG_ETIME_BEFORE AVG_ETIME_AFTER   NORM_STDDEV RESULT
------------- ------------ ---------------- --------------- ------------- ------
an9114vtxukz2           17             1.25            5.00        2.1148 Slower
803u6vwz0ah1p           29             0.04            0.18        2.1552 Slower
bvf3fxv3hatw7        2,390             0.03            0.14        2.4147 Slower
5sgs7q9pjnzg5            2             2.21            0.42        3.0130 Faster
b0zkuv9qtcvnn           48             0.16            0.85        3.1500 Slower
9ws5c6p77d384        1,852             0.57            3.30        3.3609 Slower
dy98ajwqc8s2x           15             1.32            8.34        3.7558 Slower
bkka22vb5y7t6          596             0.19            1.26        3.8804 Slower
6qfwkqgvjkn2q           12             0.51            3.60        4.2641 Slower
05xcf43d9psvm        1,197             0.02            0.14        5.4263 Slower
8wgmc9w5ubtsy          183             0.49            4.32        5.4946 Slower
fwfpuf1vfwfu2            6             0.02            0.22        5.6314 Slower
50c18dwvxq4sb          222             0.54            5.29        6.1586 Slower
aukanfjd3d8fa            3             0.88           10.00        7.3496 Slower
44bq4r5z2xrsm           54             3.13           43.39        9.0946 Slower
0az7czjdw8z7j          110             0.62            0.02       17.5933 Faster
f41agfq9qdhk6            3             0.24            8.06       22.6765 Slower
 
17 rows selected.
 
SQL> -- or to see just the slower ones
SQL>
SQL> @whats_changed
Enter Days ago: 3
Enter value for min_stddev:
Enter value for min_etime:
Enter value for faster_slower: Slower
 
SQL_ID               EXECS AVG_ETIME_BEFORE AVG_ETIME_AFTER   NORM_STDDEV RESULT
------------- ------------ ---------------- --------------- ------------- ------
an9114vtxukz2           17             1.25            5.00        2.1148 Slower
803u6vwz0ah1p           29             0.04            0.18        2.1552 Slower
bvf3fxv3hatw7        2,390             0.03            0.14        2.4147 Slower
b0zkuv9qtcvnn           48             0.16            0.85        3.1500 Slower
9ws5c6p77d384        1,852             0.57            3.30        3.3609 Slower
dy98ajwqc8s2x           15             1.32            8.34        3.7558 Slower
bkka22vb5y7t6          596             0.19            1.26        3.8804 Slower
6qfwkqgvjkn2q           12             0.51            3.60        4.2641 Slower
05xcf43d9psvm        1,197             0.02            0.14        5.4263 Slower
8wgmc9w5ubtsy          183             0.49            4.32        5.4946 Slower
fwfpuf1vfwfu2            6             0.02            0.22        5.6314 Slower
50c18dwvxq4sb          222             0.54            5.29        6.1586 Slower
aukanfjd3d8fa            3             0.88           10.00        7.3496 Slower
44bq4r5z2xrsm           54             3.13           43.39        9.0946 Slower
f41agfq9qdhk6            3             0.24            8.06       22.6765 Slower
 
15 rows selected.

A couple of comments on the script:

  • The default minimum normalized standard deviation is 2. So only statements that are 2X slower were listed.
  • The default minimum average elapsed time per execution is 0.1. So only statements that ran longer than 0.1 seconds were reported.
  • The DAYS_AGO parameter sets a pivot point, we’ll call it the REFERENCE_TIME. The calculations are based on averages from before and after the REFERENCE_TIME.

And a brief explanation of what’s in the output:

  • SQL_ID – the statement id
  • EXECS – the total number of executions in the AWR tables
  • AVG_ETIME_BEFORE – the average elapsed time per execution before the REFERENCE_TIME
  • AVG_ETIME_AFTER – the average elapsed time per execution after the REFERENCE_TIME
  • NORM_STDDEV – this is a normalized standard deviation (i.e. how many times slower/faster it is now)

So the above output indicates that only a few statements are significantly worse (avg. elapsed time 2X or greater) since our new code rollout, which occurred about 3 days ago. Statement 44bq4r5z2xrsm obviously sticks out with a new average elapsed time of 43+ seconds and about 9 times slower than before. This was the one that had picked up the new index. The rest were primarily due to increased physical i/o caused by some newly introduced statements which were doing a few million lio’s per execution and thus causing all the other statements to suffer slightly.

Let me know what you think. Your comments are always welcome.

Unstable Plans (Oracle Plan Stability/Instability)

One of the most frustrating things about the Cost Based Optimizer in Oracle is the sometimes erratic performance. It can seem almost random at times. Resolving these issues can also be challenging. It’s a little like taking your car to the mechanic when it makes a funny noise. It never seems to make the noise when the mechanic is there. Fortunately we have ASH/AWR which captures a ton of information about what our databases are doing.

Here’s a couple of scripts that I find useful.

The first one can be used to show statements that have experienced significant variances in execution time (it can be modified to look for variances in the amount of logical i/o, but I’ll leave it as an exercise for the reader).  I called the script unstable_plans.sql. It uses an analytic function to calculate a standard deviation on the average elapsed time by plan. So the statements that have multiple plans with wild variations in the response time between plans will be returned by the script. The script prompts for a couple of values. The first is minimum number of standard deviations. The second is the minimum elapsed time (I usually don’t care if a statement executes sometimes in .005 seconds and sometimes in .02 seconds, even though this is a large swing statistically). Both these inputs are defaulted by the way.

Continue reading ‘Unstable Plans (Oracle Plan Stability/Instability)’ »