Funny Developer Tricks – Decode

I ran into a really ugly SQL statement last week. It was the most expensive statement running on the system for the last several weeks. On top of the fact that the statement ran for hours, it also had a number of “issues”. The statement had the following characteristics:

  • Several DISTINCT Operators – these tend to show up in statements where developers have left off a join condition
  • ANSI Join Syntax – generally Oracle converts this syntax back to Oracle join syntax under the covers. I realize this is standard SQL syntax, but it still makes me nervous due to previous experience with problems caused by this conversion process.
  • Functions Used in WHERE Clauses (DECODE and NVL in particular) – can disable indexes
  • UNION – forces a sort of the union’ed result sets so that it can eliminate duplicates (which is often not necessary). Depending on how the statement is coded, UNIONs can also sometimes result in multiple accesses of the same objects (i.e. if the developer has used the UNION in the place of a more robust where clause).
  • Numerous != and <> Comparison Operators – Not necessarily a problem but often can be coded in a more efficient manner
  • Several OR’s – OR’s are tricky. I get suspicious when there are numerous other problems in a statement as well.

I’ll refrain from publishing the actual statement to protect the guilty, but one of the most interesting bits of the statement looked something like this:

...
WHERE a.col1 =  DECODE ('XYZ', '*', a.col1 ,'XYZ')
...

The processing of the where clause can be roughly translated to the following psuedo code:

 
if 'XYZ' = '*' then 
  x = a.col1
else
  x = 'XYZ'
end if
 
...
WHERE a.col1 = x
...

Since the literal ‘XYZ’ will never be equal to the literal ‘*’, the value returned by this DECODE will always be ‘XYZ’. Therefore the statement could have been written much more simply as:

...
WHERE a.col1 = 'XYZ'
...

So this is an extremely strange way to write the statement and clouds the intention considerably, but does it hurt performance? Well actually no. The optimizer is smart enough to know that this DECODE will always result in the same literal value and so it appears that Oracle does not run the DECODE statement for each row. Here’s a quick example from a 10.2.0.4 database. (note that I used dplan.sql and fss.sql in this code snippet)

 
SYS@LAB1024> -- first a DECODE that actually does something
SYS@LAB1024> select count(pk_col) from kso.skew where col2 = DECODE(col1,1,col2,'asd');
 
COUNT(PK_COL)
-------------
      3199971
 
Elapsed: 00:00:06.62
SYS@LAB1024> @fss
Enter value for sql_text: select count(pk_col) from kso.skew where col2 = DECODE(col1,1,col2,'asd')
Enter value for sql_id: 
 
SQL_ID         CHILD      EXECS   AVG_ROWS     AVG_ETIME       AVG_CPU       AVG_PIO      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------- ------------- ------------ ----------------------------------------
45ynyjvg6hyyh      0          1          1          6.46          6.41    111,029.00      162,298 select count(pk_col) from kso.skew where
 
Elapsed: 00:00:00.07
SYS@LAB1024> @dplan
Enter value for sql_id: 45ynyjvg6hyyh
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  45ynyjvg6hyyh, child number 0
-------------------------------------
select count(pk_col) from kso.skew where col2 =
DECODE(col1,1,col2,'asd')
 
Plan hash value: 568322376
 
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    16 |            |          |
|*  2 |   TABLE ACCESS FULL| SKEW |  1234 | 19744 |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("COL2"=DECODE("COL1",1,"COL2",'asd'))
 
 
20 rows selected.
 
Elapsed: 00:00:00.06
SYS@LAB1024> -- Now one that is equivalent to WHERE col2 = 'asddsadasd'
SYS@LAB1024> select count(pk_col) from kso.skew where col2 = DECODE('asddsadasd','*', col2, 'asddsadasd');
 
COUNT(PK_COL)
-------------
     32000000
 
Elapsed: 00:00:03.88
SYS@LAB1024> @fss
Enter value for sql_text: select count(pk_col) from kso.skew where col2 = DECODE('asddsadasd','*', col2, 'asddsadasd')
Enter value for sql_id: 
 
SQL_ID         CHILD      EXECS   AVG_ROWS     AVG_ETIME       AVG_CPU       AVG_PIO      AVG_LIO SQL_TEXT
------------- ------ ---------- ---------- ------------- ------------- ------------- ------------ ----------------------------------------
4u3nbf55z1ztd      0          1          1          3.88          3.88    111,054.50      162,298 select count(pk_col) from kso.skew where
 
Elapsed: 00:00:00.07
SYS@LAB1024> @dplan 
Enter value for sql_id: 4u3nbf55z1ztd
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  4u3nbf55z1ztd, child number 0
-------------------------------------
select count(pk_col) from kso.skew where col2 =
DECODE('asddsadasd','*', col2, 'asddsadasd')
 
Plan hash value: 568322376
 
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     4 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    11 |            |          |
|*  2 |   TABLE ACCESS FULL| SKEW |  1234 | 13574 |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("COL2"='asddsadasd')
 
 
20 rows selected.
 
Elapsed: 00:00:00.01

As you can see, the unnecessary DECODE ran much faster and if you look closely at the Predicate Information sections of the plans you’ll see that the unnecessary DECODE filter was converted to “COL2″=’asddsadasd’. That’s pretty cool.

Unfortunately, whether the DECODE slows down the processing is not the biggest issue. It’s highly probable that the developer thought this DECODE was doing something other than what it’s actually doing and thus that the result produced by this statement is incorrect. But that’s a story for another day.

Leave a Reply