Funny Developer Tricks – first_rows(999999999)
I ran across a funny SQL statement recently (funny strange, not funny ha ha – well actually funny ha ha too I guess). It had a first_rows hint like so:
select /*+ FIRST_ROWS (999999999) */
"MANDT" , "OPBEL" , "OPUPW" , "OPUPK" , "OPUPZ" , "BUKRS" , "GSBER" , "BUPLA" , "SEGMENT" ,
"AUGST" , "GPART" , "VTREF" , "VTRE2" , "VKONT" , "ABWBL" , "ABWTP" , "ABWKT" , "APPLK" ,
"HVORG" , "TVORG" , "KOFIZ" , "SPART" , "HKONT", "MWSKZ" , "MWSZKZ" , "XANZA" , "STAKZ" ,
"BLDAT" , "BUDAT" , "OPTXT" , "WAERS" , "FAEDN", "FAEDS" , "VERKZ" , "STUDT" , "SKTPZ" ,
"XMANL" , "KURSF" , "BETRH" , "BETRW" , "BETR2" , "BETR3" , "SKFBT" , "SBETH" , "SBETW" ,
"SBET2" , "SBET3" , "MWSKO" , "MWVKO" , "TXRUL" , "SPZAH" , "PYMET" , "PYBUK" , "PERNR" ,
"GRKEY" , "PERSL" , "XAESP" , "AUGDT" , "AUGBL" , "AUGBD" , "AUGRD" , "AUGWA" , "AUGBT" ,
"AUGBS" , "AUGSK" , "AUGVD" , "AUGOB" , "WHANG" , "WHGRP" , "XEIPH" , "MAHNV" , "MANSP" ,
"XAUGP" , "ABRZU" , "ABRZO" , "FDGRP" , "FDLEV" , "FDZTG", "FDWBT" , "XTAUS" , "AUGRS" ,
"PYGRP" , "PDTYP" , "SPERZ" , "INFOZ" , "TXJCD" , "TXDAT" ,"VBUND" , "KONTT" , "KONTL" ,
"OPSTA" , "BLART" , "EMGPA" , "EMBVT" , "EMADR" , "IKEY" , "EUROU" , "XRAGL" , "ASTKZ" ,
"ASBLG" , "XBLNR" , "INKPS" , "RNDPS" , "QSSKZ" , "QSSEW" , "QSPTP" , "QSSHB" , "QBSHB" ,
"QSZNR" , "RFUPK" , "STRKZ" , "FITPR" , "XPYOR" , "LANDL" , "INTBU", "EMCRD" , "C4EYE" ,
"C4EYP" , "SCTAX" , "STTAX" , "STZAL" , "ORUPZ" , "NEGBU" , "SUBAP" , "PSWSL" , "PSWBT" ,
"PSWTX" , "PSGRP" , "FINRE" , "RDSTA" , "RDSTB" , "DEAKTIV" , "SGRKEY", "SOLLDAT" , "RECPT" ,
"TOCOLLECT" , "EINMALANF" , "VORAUSZAHL" , "APERIODIC" , "ABRABS" , "GRBBP" , "ASMETH" ,
"INT_CROSSREFNO" , "ETHPPM" , "PAYFREQID" , "INVOICING_PARTY" , "PPMST" , "LOGNO" , "APERIODICT" ,
"ADD_REFOBJ" , "ADD_REFOBJID" , "ADD_SERVICE" , "ZZAGENCY" , "ZZ_EXT_REF" , "ZZ_PAY_AGENT" ,
"ZZFUNDSOURCE" , "ZZINSTALLMENT" , "Z_PROD_ID" , "ZZUSERNAME" , "ZZWF_STAT" , "ZZPAYCHANNEL"
FROM "DFKKOP"
WHERE "MANDT" = :A0 -- NDV=1
AND "BUKRS" = :A1 -- NDV=1
AND "AUGST" = :A2 -- NDV=2
AND "FAEDN" < :A3 -- less than today probably all records
AND ( "PYMET" = :A4 OR "PYMET" = :A5 ) -- NDV=8
AND ROWNUM <= :A6; -- less than 1B
Yes - that's a first rows hint with about a billion as the number of rows to optimizer for.
The reason I noticed it is that it runs for 15 hours before getting a Snapshot Too Old error. The attempted solution was to restart it the next day (thinking maybe it will run better the second time I guess). The table has roughly 100M rows. There was no index on PYMET which is unfortunate as the two values requested account for only about 0.15% (not 15%, 0.15%). The optimizer chooses an index on MANDT, BURKRS, AUGST, FAEDN and as you might expect, it doesn't work very well (see the NDV comments I added to the statement).
Funny things:
The First_Rows hint is requesting the Oracle optimizer to return the first billion records as fast as possible (even though there are only 100M rows).
The documentation for the First_Rows hint in 11g looks like this:
The FIRST_ROWS hint instructs Oracle to optimize an individual SQL statement for fast response, choosing the plan that returns the first n rows most efficiently. For integer, specify the number of rows to return.
For example, the optimizer uses the query optimization approach to optimize the following statement for best response time:
SELECT /*+ FIRST_ROWS(10) */ employee_id, last_name, salary, job_id
FROM employees
WHERE department_id = 20;In this example each department contains many employees. The user wants the first 10 employees of department 20 to be displayed as quickly as possible.
So I can see where the developers might have interpreted this as the ever elusive "Go Fast" hint.
The developers also added "and rownum < 999999999" to the where clause which limits the amount of rows that can be returned. I'm not sure whether they knew it or not, but this clause also has the same affect as the hint. That is to say that the clause causes the optimizer to modify it's calculations as if the first_rows_N hint had been applied. Maybe the developers weren't getting the "go fast" behavior they expected from the hint and after doing some research found that the "rownum <" syntax would basically do the same thing. I'm guessing that's the case because I can't see why they would really want to limit the number of rows coming back, but I'm not sure. It's a very odd statement because the First_Rows hint tends to push the optimizer towards index usage, and this statement was behaving badly precisely because it was using an index (a full table scan only took about 1 hour). Regardless of what the developers were trying to do, the fact that they used such a big number caused the optimizer to ignore the hint anyway. Since the table only had 100M rows and the parameter was 1B, the hint was ignored (well at least the "First K Rows" modifications to the optimizer calculations were not used). This happens to the "rownum <" induced behavior as well by the way. Here's a bit of a couple of 10053 trace file showing some details:
First a trace file from a select using a first_rows(10).
blah blah blah
...
First K Rows: Setup begin
...
First K Rows: Setup end
...
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
***********************
Join order[1]: SKEW[A]#0
***********************
Best so far: Table#: 0 cost: 10258.1471 card: 32000004.0000 bytes: 352000044
First K Rows: K = 10.00, N = 32000004.00
First K Rows: old pf = -1.0000000, new pf = 0.0000003
***************************************
SINGLE TABLE ACCESS PATH (First K Rows)
-----------------------------------------
BEGIN Single Table Cardinality Estimation
-----------------------------------------
Table: SKEW Alias: A
Card: Original: 10 Rounded: 10 Computed: 10.00 Non Adjusted: 10.00
-----------------------------------------
END Single Table Cardinality Estimation
-----------------------------------------
Access Path: TableScan
Cost: 2.00 Resp: 2.00 Degree: 0
Cost_io: 2.00 Cost_cpu: 9321
Resp_io: 2.00 Resp_cpu: 9321
kkofmx: index filter:"A"."COL1">0
Access Path: index (RangeScan)
Index: SKEW_COL1
resc_io: 13.00 resc_cpu: 96279
ix_sel: 1 ix_sel_with_filters: 1
Cost: 13.02 Resp: 13.02 Degree: 1
Access Path: index (skip-scan)
SS sel: 1 ANDV (#skips): 10
SS io: 10.00 vs. table scan io: 2.00
Skip Scan rejected
Access Path: index (FullScan)
Index: SKEW_COL2_COL1
resc_io: 14.00 resc_cpu: 103400
ix_sel: 1 ix_sel_with_filters: 1
Cost: 14.02 Resp: 14.02 Degree: 1
****** trying bitmap/domain indexes ******
****** finished trying bitmap/domain indexes ******
Best:: AccessPath: IndexRange Index: SKEW_COL1
Cost: 13.02 Degree: 1 Resp: 13.02 Card: 10.00 Bytes: 11
First K Rows: unchanged join prefix len = 1
***********************
Join order[1]: SKEW[A]#0
***********************
Best so far: Table#: 0 cost: 13.0227 card: 10.0000 bytes: 110
*********************************
Number of join permutations tried: 1
*********************************
Final - First K Rows Plan: Best join order: 1
Cost: 13.0227 Degree: 1 Card: 10.0000 Bytes: 110
Resc: 13.0227 Resc_io: 13.0000 Resc_cpu: 96279
Resp: 13.0227 Resp_io: 13.0000 Resc_cpu: 96279
kkoipt: Query block SEL$1 (#0)
******* UNPARSED QUERY IS *******
SELECT /*+ FIRST_ROWS (10) */ "A"."PK_COL" "PK_COL" FROM "KSO"."SKEW" "A" WHERE "A"."COL1">0
kkoqbc-subheap (delete addr=0xf62bdf2c, in-use=23752, alloc=24592)
kkoqbc-end
: call(in-use=21112, alloc=49112), compile(in-use=39416, alloc=40744)
apadrv-end: call(in-use=21112, alloc=49112), compile(in-use=39936, alloc=40744)
sql_id=3n4vu47jvx7qg.
Current SQL statement for this session:
select /*+ first_rows(10) */ pk_col from kso.skew a where col1 > 0
============
Plan Table
============
------------------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
------------------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 13 | |
| 1 | TABLE ACCESS BY INDEX ROWID | SKEW | 10 | 110 | 13 | 00:00:01 |
| 2 | INDEX RANGE SCAN | SKEW_COL1| | | 3 | 00:00:01 |
------------------------------------------------+-----------------------------------+
And now an example with a number larger than the number of rows in the table. (it recognizes the hint but ignores it after it determines that the parameter is larger than the expected number of rows)
blah blah blah
...
First K Rows: Setup begin
...
First K Rows: Setup end
...
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
***********************
Join order[1]: SKEW[A]#0
***********************
Best so far: Table#: 0 cost: 10258.1471 card: 32000004.0000 bytes: 352000044
*********************************
Number of join permutations tried: 1
*********************************
Final - All Rows Plan: Best join order: 1
Cost: 10258.1471 Degree: 1 Card: 32000004.0000 Bytes: 352000044
Resc: 10258.1471 Resc_io: 8323.0000 Resc_cpu: 8195767863
Resp: 10258.1471 Resp_io: 8323.0000 Resc_cpu: 8195767863
kkoipt: Query block SEL$1 (#0)
******* UNPARSED QUERY IS *******
SELECT /*+ FIRST_ROWS (33000000) */ "A"."PK_COL" "PK_COL" FROM "KSO"."SKEW" "A" WHERE "A"."COL1">0
kkoqbc-subheap (delete addr=0xf62bdf2c, in-use=13620, alloc=16344)
kkoqbc-end
: call(in-use=17088, alloc=49112), compile(in-use=37632, alloc=40744)
apadrv-end: call(in-use=17088, alloc=49112), compile(in-use=38152, alloc=40744)
sql_id=bfzsrf3z0nbr9.
Current SQL statement for this session:
select /*+ first_rows(33000000) */ pk_col from kso.skew a where col1 > 0
============
Plan Table
============
-------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-------------------------------------+-----------------------------------+
| 0 | SELECT STATEMENT | | | | 10K | |
| 1 | TABLE ACCESS FULL | SKEW | 31M | 336M | 10K | 00:01:23 |
-------------------------------------+-----------------------------------+
So the optimizer knows about the hint but doesn't do anything with it.
Well that's my story for today. The First K Rows modifications to optimizer calculations are interesting in there own right, by the way. There are a couple of good posts on the subject here:
Dion Cho on FIRST_ROWS vs. ALL_ROWS and ROWNUM predicate
Jonathan Lewis on first_rows_N
And the most thorough discussion of the First_Rows hint I have seen comes from Randolf Geist. Here's a link to a presentation he did on the topic:
"Everything You Wanted To Ask About FIRST_ROWS_N But Were Afraid To Ask"
Kerry,
my first thought was when reading the beginning of your post that the hint should be ignored due to the fact that the number of rows specified is very likely larger than the estimated cardinality of the ALL_ROWS mode.
You basically confirmed that, so my question is now: Is the bad execution plan using that index caused by the FIRST_ROWS hint / rownum predicate or not?
I assume that you ought to get the same bad plan when using the ALL_ROWS mode since the optimizer should effectively “ignore” the hint after determining the ALL_ROWS cardinality, but I was just looking for a confirmation – it would be interesting if you get a different execution plan although the hint ought to be “ignored”.
Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
Co-author of the “OakTable Expert Oracle Practices” book:
http://www.apress.com/book/view/1430226684
http://www.amazon.com/Expert-Oracle-Practices-Database-Administration/dp/1430226684
Hi Randolf,
Yes, you are correct. The same plan was generated with or without the hint because the number was much greater than the cardinality (100M row table, first_rows_999999999). I should have saved the 10053 trace file, but no use crying over spilt milk. The quick fix was a profile to force the full table scan (ran in about 1 hour) and a recommendation for the new index on PYMET (which has now worked it’s way through change control).
Kerry
My thanks to you, Kerry, and Randolf and Dion Cho for putting these things out there.
I fear I may have a stupid question, but I’ll ask anyway. In the 10053 trace output for first_rows(10) (above), we see:
—————————————–
BEGIN Single Table Cardinality Estimation
—————————————–
Table: SKEW Alias: A
Card: Original: 10 Rounded: 10 Computed: 10.00 Non Adjusted: 10.00
Is it expected that the estimated cardinality of a table is equal to the optimizer_mode in this case? From what I understand, first_rows is kind of ‘fooling’ the CBO into thinking that a set number of rows may (or may not) be returned, but if first_rows artificially effects the cardinality of a rowsource, it would seem that more query plans would not be optimal. For example, choosing a table with an estimated cardinality of 10 when it really has 10000000 rows to be the inner table of a nested loop join.
I apologize if this was already explain and I missed it.
As a side note, what is “pf”. It looks like it is K/N.
Thanks for your time,
Hi Charles,
I’m not sure I understand your first question. Could you restate it using different words? 🙂
I think pf probably stands for predicate filter and I think you are correct that it is calculated as K/N.
Kerry
Kerry,
No problem. =)
The 10053 show that the table cardinality is 10. If you use FIRST_ROWS_N, the 10053 will show that the cardinality changes to match N. This seems to be a problem to me. In fact, I have an open case with Oracle about this issue and found out this morning that it is rooted in unpublished bug 4887636.
My question is, would you expect the 10053 trace to show a cardinality of 10 if you know for a fact that the table does not have 10 rows and the predicates (whether ACCESS or FILTER, or even JOINs for that matter) do not return 10 rows?
Hey Charles,
Sorry for the delay – very busy these days. I don’t have a lot of time to do any detailed investigation on this issue right now but my initial thought is that any operation that can be cut short (such as a nested loop join or a table access by index rowid) would have it’s cardinality modified to N. Operations that are not able to be stopped would not.
Kerry
Looks like this was turned into bug 11858963.