Archive for February 2009

Statistics Gathering

Karen Morton just posted a great paper on her blog about statistics gathering. The paper is titled “Managing Statistics for Optimal Query Performance“. I was excited to see it because I think gathering stats is one of the most important and least well understood aspects of managing an Oracle environment. I must admit that I was expecting a recommended method or framework for gathering stats, but actually the paper is really more about how the statistics are used along with general guidelines for gathering them, rather than a direct recommendation on how to gather them. Nevertheless, it is one of the best papers I’ve seen on the subject. She’s going to present the paper at the Hotsos Symposium to be held in Dallas the week of March 9th. I’m going to be there and am really looking forward to hearing what she has to say on the subject.

By the way, I can’t recommend this conference highly enough. If you really want to understand how Oracle works, this is the place to be. You should know that I am not generally a fan of formal training classes. I have often been disappointed because I felt like my time would have been better spent researching the subject matter myself. On the other hand, I have found a lot of value in working closely on a project with someone who knows the subject matter well, kind of like a mentor. But generally speaking, the formal classes have been less satisfying, except in the rare case where you get the great instructor that wrote the class. This symposium format on the other hand allows you to listen to a collection of really knowledgeable Oracle people packed into a short period of time. I have been to the Hotsos Symposiums for several years in a row and I always come away with pages of notes on things I want to investigate further. And the participants are, generally speaking, a collection of very bright Oracle people. So even the conversations between the presentations are often very interesting. Finally, they run two presentations at a time which allows you to pick the presentation that is most interesting. I have often found it hard to choose (don’t tell anyone, but I have on more than one occasion listened to the first half of one and then the second half of the other). So like I said, I find it to be a very productive few days.

But I digress, Karen’s paper is pretty long (24 pages) but it covers a vast amount of ground. There are a number of one liners that could be expanded into full papers. In the paper she discusses a number of topics including dealing with short comings of the optimizer in 10g. One of those issues is bind variable peeking (probably my least favorite optimizer feature, quirk, bug, … what ever you want to call it). I must say that I think it has caused far more problems than it solved, and I frankly don’t know what they were thinking when they put that feature in. I wrote a little about a way to get around it using outlines here. By the way, this reminds me of a cartoon I drew 20 years ago that looked very similar to this one (that I lifted off of Steve Karam’s blog)

Of course as Karen points out, the right way to deal with bind variable peeking issues it is to understand your data and use literals where they are appropriate, keeping in mind the number of additional statements that will need to be parsed and dealt with in the shared pool. She also points out that code could be written to selectively use literals for specific values, giving you a mix of literals and bind variables for the same statement. This approach should allow you to minimize the impact on the shared pool while still providing the optimizer with the data it needs to make good decisions (this is a great idea but I’ve never seen anyone actually do it). And of course she points out that 11g has a much better mechanism for dealing with this whole issue.

Another idea that really got me thinking was the use of dynamic sampling . Karen clearly shows one of the advantages of dynamic sampling in the case of correlated predicates (i.e. the optimizer assumes a query where car_model = ‘Mustang’ and car_make = ‘Ford’ are independent, when clearly they are not). She shows how dynamic sampling can be very useful in conjunction with normal statistics in this situation. (rats, now I’m going to have to go play around with that a bit – so much to do, so little time)

Finally, she discusses some of the statistics gathering options and differences in 9i, 10g, and 11g. The automatic creation of histograms is one of the main differences between 9i and 10g and she discusses this issue, but doesn’t go into to much detail on it. I must admit that I think 10g’s default setup does a very poor job when it comes to histograms. This is the one area I would have liked to see address a little more fully, but at 24 pages already I can understand why she had to draw the line somewhere. Anyway, by default 10g creates histograms on columns based on several factors including their use in where clauses. Unfortunately, histograms often get created on columns where their usefulness is questionable at best and they regularly get created with a very small sample sizes. The small sample sizes often result in significant inaccuracies. I personally think that allowing the gather stats job to automatically create histograms in 10g is really bad idea.

Anyway, this is a paper that is well worth reading in my opinion. Typical Hotsos Symposium fare!

Saving Rows from Corrupt Blocks

Recently we ran into a database with a (or some) incomplete transactions that had not been able to rollback due to a file in the undo tablespace that had been deleted. And as luck would have it, the log files were some how lost as well (so we couldn’t just recover the undo file). So the database was up and running, but the undo file was missing and any time you hit one of the records that needed info from the missing undo file to rebuild a consistent version, it would fail with a ORA-00376  error (file not found). Technically that is not a corrupt block, just an uncommitted record that has been written, with missing undo. Make sense? Anyway, we narrowed it down to a couple of sub-partitions and were able to export the data from all the other sub-partitions. To get the data out of the affected sub-partitions we decided to use an approached based on the old rowid, which contained file_id,block_id, and row_num.

Toon Koppelaars did an excellent write up on this basic approach here. But it was written a while back and unfortunately the rowid was changed somewhere around the Oracle 8i time frame making it a little more difficult. However, in their infinite wisdom, the developers at Oracle added the DBMS_ROWID package which allows us to work around the issue.

So here’s the psuedo code for what we did:

Get Object Name (object_name)
Get Max Rows Per Block (max_rows_per_block)
Get List of Blocks by Extent for Object

For Each Extent
  For Each Block 
    For row in 1 to max_rows_per_block
      insert into saved_table select * from object_name 
        where rowid = dbms_rowid.rowid_create(file,block,row);
    End Row Loop
  End Block Loop
End Extent Loop

The actual script we used is a little more complicated. It actually created a Bad Blocks table as well so we’d know how many blocks were skipped and it had some error checking. I later embellished it a bit to make the object name dynamic (which was considerably more work than I thought it would be). Here’s the actual script I ended up with: save_u.sql. NOTE: I am not the worlds greatest pl/sql guy, so if you have any suggestions, let me know! But it seems to do the job. You may also find this script (obj_blocks.sql) useful for getting a list of all the blocks mapped to a specific object. And here’s a couple of scripts to create functions that might come in handy: create_new_rowid.sql (creates function new_rowid which returns the new format rowid if you give it the obj#, file#, block#, and row#) and create_old_rowid.sql (which returns the old format rowid if given the new format id). Here’s a quick example of the scripts in use:

> sqlplus / as sysdba

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Feb 12 11:35:14 2009

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> @obj
Enter value for owner: MF
Enter value for name: BROKE%
Enter value for type: 

OWNER           OBJECT_NAME                    OBJECT_TYPE         STATUS  T
--------------- ------------------------------ ------------------- ------- -
MF              BROKE_NO_DEPENDENCIES          TABLE               VALID   N
MF              BROKE_YES_DEPENDENCIES         TABLE               VALID   N

SQL> set echo on
SQL> @save_u
Enter value for owner_name: mf
Enter value for table_name: BROKE_NO_DEPENDENCIES

WARNING: This script may issue a DROP TABLE command. Do not execute it unless you have read through it
and are comfortable you know what it does.

Ready? (hit ctl-C to quit)  
Enter value for owner_name: mf
Enter value for table_name: BROKE_NO_DEPENDENCIES    


Saved 18800 records in BROKE_NO_DEPENDENCIES_SAVED.
35 bad records in BROKE_NO_DEPENDENCIES_BAD.

PL/SQL procedure successfully completed.

SQL> set echo on
SQL> @create_old_rowid
SQL> create or replace function old_rowid (p_rowid rowid)
  2  return varchar as
  3  
  4    rowid_type NUMBER;
  5    object_id NUMBER;
  6    fileno NUMBER;
  7    blockno   NUMBER;
  8    rowno  NUMBER;
  9  
 10  BEGIN
 11  
 12     dbms_rowid.rowid_info(p_rowid, rowid_type, object_id, fileno, blockno, rowno);
 13  /*
 14     dbms_output.put_line('Row Typ-' || TO_CHAR(rowid_type));
 15     dbms_output.put_line('Obj No-' || TO_CHAR(object_id));
 16     dbms_output.put_line('RFNO-' || TO_CHAR(fileno));
 17     dbms_output.put_line('Block No-' || TO_CHAR(blockno));
 18     dbms_output.put_line('Row No-' || TO_CHAR(rowno));
 19  */
 20  return(to_char(fileno)||'.'||to_char(blockno)||'.'||to_char(rowno));
 21  
 22  END;
 23  /

Function created.

SQL> @create_new_rowid
SQL> create or replace function new_rowid (p_object_id number, p_old_rowid varchar)
  2  return varchar as
  3  
  4    new_rowid varchar2(30);
  5    fileno NUMBER;
  6    blockno   NUMBER;
  7    rowno  NUMBER;
  8  
  9  BEGIN
 10  
 11    fileno := substr(p_old_rowid,1,instr(p_old_rowid,'.')-1);
 12    blockno := substr(p_old_rowid,instr(p_old_rowid,'.')+1,instr(p_old_rowid,'.',1,2)-instr(p_old_rowid,'.'));
 13    rowno := substr(p_old_rowid,instr(p_old_rowid,'.',1,2)+1,100);
 14    new_rowid := DBMS_ROWID.ROWID_CREATE(1, p_object_id, fileno , blockno , rowno);
 15  
 16    return(new_rowid);
 17  
 18  END;
 19  /

Function created.

SQL> select rowid,old_rowid(rowid) old_rowid from mf.BROKE_NO_DEPENDENCIES where rownum < 5
SQL> /

ROWID              OLD_ROWID
------------------ ------------------------------
AAACYmAAEAAAAAMAAt 4.12.45
AAACYmAAEAAAACIAAt 4.136.45
AAACYmAAEAAAAAMAAF 4.12.5
AAACYmAAEAAAACIAAF 4.136.5

One other thing to keep in mind, as Toon mentioned, it may be possible to retrieve data from an index, even if the underlying data block is messed up. Selecting only the values of the indexed columns allows Oracle to completely bypass the data blocks. So, if for example, you found a set of blocks were inaccessible, you may be able to construct statements that would retrieve at least some of the data from the indexes like so:

select /*+ index (messed_up_object  mess_up_object_pk) */ indexed_column1, indexed_column2 
from messed_up_object 
where rowid in (select dbms_rowid.rowid_create(1,object,file,block,row) from bad_rows_table);

Your comments are always welcomed!