storage(:Z>=:Z AND :Z<=:Z)

Ever wonder why that weird line shows up in the Predicate Section of an execution plan on Exadata? Me too! The STORAGE bit tells us it’s a filter applied at the storage cell layer, but the rest is nonsensical. Well I recently ran across a very brief mention of it in a Metalink note. (I know it’s not called Metalink any more, but I’m kind of set in my ways). The note said it was related to distribution of rows to PX slaves. Ah ha! Let’s test it out. Here’s a plan with the predicate just so you can see what it looks like.

KSO@arcp> @dplan
Enter value for sql_id: a9axwj6ym3b29
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  a9axwj6ym3b29, child number 0
-------------------------------------
select /*+  parallel_index(t, "ISD_SI_I03",8)  dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0)
no_monitoring no_substrb_pad  no_expand index_ffs(t,"ISD_SI_I03") */
count(*) as nrw,count(distinct sys_op_lbid(725425,'L',t.rowid)) as
nlb,count(distinct hextoraw(sys_op_descend("SUPPORT_LEVEL")||sys_op_desc
end("SUPPORT_LEVEL_KEY")||sys_op_descend("NAME")||sys_op_descend("VALUE"
))) as ndk,sys_op_countchg(substrb(t.rowid,1,15),1) as clf from
"ISD"."SUPPORT_INFO" sample block (   .0503530742,1)  t where
"SUPPORT_LEVEL" is not null or "SUPPORT_LEVEL_KEY" is not null or
"NAME" is not null or "VALUE" is not null
 
Plan hash value: 1766555783
 
---------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |            |       |       |   672 (100)|          |        |      |            |
|   1 |  SORT GROUP BY                              |            |     1 |    56 |            |          |        |      |            |
|   2 |   PX COORDINATOR                            |            |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)                      | :TQ10001   |     1 |    56 |            |          |  Q1,01 | P->S | QC (RAND)  |
|   4 |     SORT GROUP BY                           |            |     1 |    56 |            |          |  Q1,01 | PCWP |            |
|   5 |      PX RECEIVE                             |            |     1 |    56 |            |          |  Q1,01 | PCWP |            |
|   6 |       PX SEND HASH                          | :TQ10000   |     1 |    56 |            |          |  Q1,00 | P->P | HASH       |
|   7 |        SORT GROUP BY                        |            |     1 |    56 |            |          |  Q1,00 | PCWP |            |
|   8 |         PX BLOCK ITERATOR                   |            |   397K|    21M|   672   (1)| 00:00:01 |  Q1,00 | PCWC |            |
|*  9 |          INDEX STORAGE SAMPLE FAST FULL SCAN| ISD_SI_I03 |   397K|    21M|   672   (1)| 00:00:01 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   9 - storage(:Z>=:Z AND :Z<=:Z AND ("SUPPORT_LEVEL" IS NOT NULL OR "SUPPORT_LEVEL_KEY" IS NOT NULL OR "NAME" IS NOT NULL OR
              "VALUE" IS NOT NULL))
       filter(("SUPPORT_LEVEL" IS NOT NULL OR "SUPPORT_LEVEL_KEY" IS NOT NULL OR "NAME" IS NOT NULL OR "VALUE" IS NOT NULL))
 
 
37 rows selected.

Notice that the plan is for a PX statement. So how can we convince ourselves that it is a PX related predicate. We can try two tests.

  1. Let’s see if we can find any SQL statements that have the predicate that aren’t PX statements.
  2. Let’s see if we can find any PX statements that don’t have the predicate.

So here we go.

KSO@arcp>  -- How many :Z>=:Z's are there?
KSO@arcp> select count(*) from v$sql_plan
  2  where ACCESS_PREDICATES like '%:Z>=:Z%'
  3  /
 
  COUNT(*)
----------
        73
 
1 row selected.
 
KSO@arcp>  -- How many :Z>=:Z's are there that are not PX?
KSO@arcp> select count(*) from v$sql_plan
  2  where ACCESS_PREDICATES like '%:Z>=:Z%'
  3  and sql_id not in (select sql_id from v$sql_plan where operation like 'PX%')
  4  /
 
  COUNT(*)
----------
         0
 
1 row selected.
 
KSO@arcp>  -- How many PX's don't have :Z>=:Z?
KSO@arcp> select count(distinct sql_id) from v$sql_plan
  2  where operation like 'PX%'
  3  and sql_id not in (select sql_id from v$sql_plan
  4  where ACCESS_PREDICATES like '%:Z>=:Z%')
  5  /
 
  COUNT(*)
----------
       154
 
1 row selected.
 
KSO@arcp>  -- Whoa, that's a little unexpected!

So there are none of the :Z>=:Z predicates on non-PX queries but there are a bunch of PX queries that don’t have the predicate. Let’s look at a couple of those and see why those might not have the predicate.

 
KSO@arcp> -- Get SQL_ID's for PX's that don't have :Z>=:Z
KSO@arcp> select distinct sql_id from v$sql_plan
  2  where operation like 'PX%'
  3  and sql_id not in (select sql_id from v$sql_plan
  4  where ACCESS_PREDICATES like '%:Z>=:Z%')
  5  /
 
SQL_ID
-------------
7xa3zbpgkbta7
ftkmqqq3ga0nf
6wgmq24t9xy6f
7tsf3h3qjth77
fg1aphaqvcmb3
...
bt9n0qsg8k4sb
fffrvvnrnmztg
97x3zj2fb0y5z
5dudhrch3sv8r
bbw31mhra7ryu
 
154 rows selected.
 
KSO@arcp> @dplan
KSO@arcp> set lines 150
KSO@arcp> select * from table(dbms_xplan.display_cursor('&sql_id','&child_no','typical'))
  2  /
Enter value for sql_id: bbw31mhra7ryu
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  bbw31mhra7ryu, child number 0
-------------------------------------
select MUTEX_TYPE, LOCATION, SLEEPS, WAIT_TIME from GV$MUTEX_SLEEP
where INST_ID = USERENV('INSTANCE')
 
Plan hash value: 365768877
 
----------------------------------------------------------------------------
| Id  | Operation            | Name           |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                |        |      |            |
|*  1 |  PX COORDINATOR      |                |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000       |  Q1,00 | P->S | QC (RAND)  |
|*  3 |    VIEW              | GV$MUTEX_SLEEP |  Q1,00 | PCWP |            |
|   4 |     FIXED TABLE FULL | X$MUTEX_SLEEP  |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("INST_ID"=USERENV('INSTANCE'))
   3 - filter("INST_ID"=USERENV('INSTANCE'))
 
Note
-----
   - rule based optimizer used (consider using cbo)
 
 
27 rows selected.
 
KSO@arcp> 
KSO@arcp> /
Enter value for sql_id: 5dudhrch3sv8r
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  5dudhrch3sv8r, child number 0
-------------------------------------
select  NAMESPACE , GETS , GETHITS , GETHITRATIO , PINS , PINHITS ,
PINHITRATIO , RELOADS , INVALIDATIONS , DLM_LOCK_REQUESTS ,
DLM_PIN_REQUESTS , DLM_PIN_RELEASES , DLM_INVALIDATION_REQUESTS ,
DLM_INVALIDATIONS from GV$LIBRARYCACHE where inst_id =
USERENV('Instance')
 
Plan hash value: 3363616119
 
-----------------------------------------------------------------------------
| Id  | Operation            | Name            |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |        |      |            |
|*  1 |  PX COORDINATOR      |                 |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000        |  Q1,00 | P->S | QC (RAND)  |
|*  3 |    VIEW              | GV$LIBRARYCACHE |  Q1,00 | PCWP |            |
|*  4 |     FIXED TABLE FULL | X$KGLST         |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("INST_ID"=USERENV('INSTANCE'))
   3 - filter("INST_ID"=USERENV('INSTANCE'))
   4 - filter((LENGTH("KGLSTDSC")<=15 AND "KGLSTGET"<>0 AND
              "KGLSTTYP"='NAMESPACE'))
 
Note
-----
   - rule based optimizer used (consider using cbo)
 
 
32 rows selected.

So both of those queries are against Fixed Tables (memory structures). Makes sense that a filter at the storage layer would not be necessary. As it turns out, all 154 of the PX queries that don’t have the :Z>=:Z predicate are queries against Fixed Tables. Here’s a quick check in case you want to try it on your own Exadata.

 
KSO@arcp> select count(*) from v$sql_plan
  2  where operation like 'PX%'
  3  and sql_id not in (select sql_id from v$sql_plan
  4  where ACCESS_PREDICATES like '%:Z>=:Z%')
  5  and sql_id not in (select sql_id from v$sql_plan
  6  where operation not like 'FIXED%')
  7  /
 
  COUNT(*)
----------
         0
 
1 row selected.

So while not a definitive proof, it does appear that the :Z>=:Z predicate is related to PX row distribution. Now if I can just figure out why the storage filters are always repeated in the XPLAN output as a standard filter for offloaded queries like in this example:

Enter value for sql_id: 0qa98gcnnza7h
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0qa98gcnnza7h, child number 0
-------------------------------------
select avg(pk_col) from kso.skew where col1 > 0
 
Plan hash value: 568322376
 
-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |       |       | 44486 (100)|          |
|   1 |  SORT AGGREGATE            |      |     1 |    11 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| SKEW |    32M|   335M| 44486   (1)| 00:08:54 |
-----------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - storage("COL1">0)
       filter("COL1">0)
 
 
20 rows selected.

Let me know if you have the answer to that one!

8 Comments

  1. Uwe Hesse says:

    Hi Kerry,
    nice bit of research about the weird storage filter. Thanks for pointing me at this with your comment on my Blog! My conclusion:
    The storage filter :Z>=:Z AND :Z<=:Z appears if
    a) We do parallel query AND
    b) We have done a Smart Scan

    (The repeated storage filter always seems to indicate a Smart Scan)

  2. Kerry,

    I believe the (:Z >= :Z and :Z <= :Z) predicates are the result of a PX query decomposing a tablescan or index fast full scan into rowid ranges. A slave is given a granule where “rowid between :rowid1 and :rowid2″. Transitive closure turns this into two predicates testing that :rowid1 <= :rowid2 and :rowid2 >= :rowid1 that act as a “conditional filter” that may short-cut the next child operation.

    You can see a similar thing happening in queries with predicates like date_col between :b1 and :b2 – the optimizer generates :b1 <= :b2 as a filter predicate on a FILTER operation.

    You might be able to construct a parallel query without this type of predicate if you set up a parallel partition-wise join where the degree of parallelism matches the number of partitions.

    I’d guess that the duplication of the storage()/filter() predicates is a refinement thing. The storage() predicate tells Oracle to look at the allocation unit only if there may be data in the space – the filter() tells Oracle what to do if it decides to look at the allocation unit. It’s a bit like the repetition in access and filter predicates when you don’t make complete use of an index, e.g.
    "where col1 = 'X' and col3 = 'Y'" (missing col2 in the middle of the index)
    turns into
    Access col1 = 'X' and col3 = 'Y'
    Filter col3 = 'Y'

    Regards
    Jonathan Lewis

  3. osborne says:

    Thanks for the comments guys. I agree it definitely looks like row distribution to PX slaves for smart scans.

    I still find it odd that they used the same variable name in the XPLAN output. “:Z>=:Z and :Z<=:Z” means “:Z=:Z” which is always true except for nulls, right?

    I think Jonathan is probably on the right track on the duplicate storage/filter predicates. I am guessing that there are some circumstances where filtering at the storage layer is not complete and so the DB reapplies the filter. But that seems pretty expensive, so maybe it just does it in specific situations. Maybe XPLAN always puts the predicate info there though. Looking at the functions being called might show this happening – maybe we can get Tanel to have a look at that later. ;)

    Kerry

    • Well, I read the code and still couldn’t figure it out. So, I asked the engineer that more or less owns that chunk of the code and his replay was, “That’s an artifact of how PQ granules show up in the plan ( rowid ranges ). Don’t worry about it.”

      So, I don’t worry about it…sometimes when we presume sophistication….

    • Flado says:

      > “:Z>=:Z and :Z<=:Z” means “:Z=:Z” which is always true except for nulls, right?

      Right, and except that Oracle can bind by position, meaning the four :Zs may all have different values.

  4. Tanel Poder says:

    Also, remember how the DBMS_XPLAN.DISPLAY_CURSOR / V$SQL_PLAN* views get the plan – they go to the binary compiled execution plan and UNparse it into human readable stuff. Some things aren’t unparsed properly, that’s why you can have the INTERNAL_FUNCTION in execution plan output – all it means that the UNparser wasn’t able to find a corresponding human readable function name for the opcode found in the plan. Also the :Z<=:Z thing may not mean that Oracle is checking some (internal) bind variable :Z with itself – there may be something much more sophisticated going on internally, but the unparser was able to just express :Z <= :Z…

    • Yasser says:

      I dont think these predicate applies only in Exadata environment…

      Below is the plan extracted from Oracle 10.2.0.4 database using dbms_xplan.display_cursor

      ------------------------------------------------------------------------------------------------------------------------------------------------
      | Id  | Operation                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
      ------------------------------------------------------------------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT              |                   |       |       | 29487 (100)|          |       |       |        |      |            |
      |   1 |  PX COORDINATOR               |                   |       |       |            |          |       |       |        |      |            |
      |   2 |   PX SEND QC (RANDOM)         | :TQ10002          |   240 | 11280 | 29487   (4)| 00:05:54 |       |       |  Q1,02 | P->S | QC (RAND)  |
      |   3 |    VIEW                       | index$_join$_001  |   240 | 11280 | 29487   (4)| 00:05:54 |       |       |  Q1,02 | PCWP |            |
      |*  4 |     HASH JOIN BUFFERED        |                   |       |       |            |          |       |       |  Q1,02 | PCWP |            |
      |   5 |      BUFFER SORT              |                   |       |       |            |          |       |       |  Q1,02 | PCWC |            |
      |   6 |       PX RECEIVE              |                   |       |       |            |          |       |       |  Q1,02 | PCWP |            |
      |   7 |        PX SEND HASH           | :TQ10000          |       |       |            |          |       |       |        | S->P | HASH       |
      |*  8 |         HASH JOIN             |                   |       |       |            |          |       |       |        |      |            |
      |   9 |          PARTITION RANGE ALL  |                   |   240 | 11280 |  3027   (4)| 00:00:37 |     1 |     3 |        |      |            |
      |  10 |           INDEX FAST FULL SCAN| IDX_ACTIVE        |   240 | 11280 |  3027   (4)| 00:00:37 |     1 |     3 |        |      |            |
      |  11 |          PARTITION RANGE ALL  |                   |   240 | 11280 |  8915   (2)| 00:01:47 |     1 |     3 |        |      |            |
      |* 12 |           INDEX FAST FULL SCAN| IDX_COD           |   240 | 11280 |  8915   (2)| 00:01:47 |     1 |     3 |        |      |            |
      |  13 |      PX RECEIVE               |                   |   240 | 11280 |  4482   (3)| 00:00:54 |       |       |  Q1,02 | PCWP |            |
      |  14 |       PX SEND HASH            | :TQ10001          |   240 | 11280 |  4482   (3)| 00:00:54 |       |       |  Q1,01 | P->P | HASH       |
      |  15 |        PX BLOCK ITERATOR      |                   |   240 | 11280 |  4482   (3)| 00:00:54 |       |       |  Q1,01 | PCWC |            |
      |* 16 |         INDEX FAST FULL SCAN  | ONS               |   240 | 11280 |  4482   (3)| 00:00:54 |       |       |  Q1,01 | PCWP |            |
      ------------------------------------------------------------------------------------------------------------------------------------------------
      
      Predicate Information (identified by operation id):
      ---------------------------------------------------
      
         4 - access(ROWID=ROWID)
         8 - access(ROWID=ROWID)
        12 - filter((NLSSORT(SYS_OP_C2C("NAME"),'nls_sort=''BINARY_CI''')=NLSSORT(:B1,'nls_sort=''BINARY_CI''') AND
                    NLSSORT(SYS_OP_C2C("CILE"),'nls_sort=''BINARY_CI''')=NLSSORT(:B2,'nls_sort=''BINARY_CI''')))
        16 - access(:Z>=:Z AND :Z<=:Z)
      

      So its purely related to Parallel Distribution.

      Thanks,
      Yasser

  5. osborne says:

    Yep – it has nothing to do with anything Exadata specific.

    Kerry

Leave a Reply