Saving Rows from Corrupt Blocks

Recently we ran into a database with a (or some) incomplete transactions that had not been able to rollback due to a file in the undo tablespace that had been deleted. And as luck would have it, the log files were some how lost as well (so we couldn’t just recover the undo file). So the database was up and running, but the undo file was missing and any time you hit one of the records that needed info from the missing undo file to rebuild a consistent version, it would fail with a ORA-00376  error (file not found). Technically that is not a corrupt block, just an uncommitted record that has been written, with missing undo. Make sense? Anyway, we narrowed it down to a couple of sub-partitions and were able to export the data from all the other sub-partitions. To get the data out of the affected sub-partitions we decided to use an approached based on the old rowid, which contained file_id,block_id, and row_num.

Toon Koppelaars did an excellent write up on this basic approach here. But it was written a while back and unfortunately the rowid was changed somewhere around the Oracle 8i time frame making it a little more difficult. However, in their infinite wisdom, the developers at Oracle added the DBMS_ROWID package which allows us to work around the issue.

So here’s the psuedo code for what we did:

Get Object Name (object_name)
Get Max Rows Per Block (max_rows_per_block)
Get List of Blocks by Extent for Object
 
For Each Extent
  For Each Block 
    For row in 1 to max_rows_per_block
      insert into saved_table select * from object_name 
        where rowid = dbms_rowid.rowid_create(file,block,row);
    End Row Loop
  End Block Loop
End Extent Loop

The actual script we used is a little more complicated. It actually created a Bad Blocks table as well so we’d know how many blocks were skipped and it had some error checking. I later embellished it a bit to make the object name dynamic (which was considerably more work than I thought it would be). Here’s the actual script I ended up with: save_u.sql. NOTE: I am not the worlds greatest pl/sql guy, so if you have any suggestions, let me know! But it seems to do the job. You may also find this script (obj_blocks.sql) useful for getting a list of all the blocks mapped to a specific object. And here’s a couple of scripts to create functions that might come in handy: create_new_rowid.sql (creates function new_rowid which returns the new format rowid if you give it the obj#, file#, block#, and row#) and create_old_rowid.sql (which returns the old format rowid if given the new format id). Here’s a quick example of the scripts in use:

> sqlplus / as sysdba
 
SQL*Plus: Release 10.2.0.3.0 - Production on Thu Feb 12 11:35:14 2009
 
Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
 
 
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
 
SQL> @obj
Enter value for owner: MF
Enter value for name: BROKE%
Enter value for type: 
 
OWNER           OBJECT_NAME                    OBJECT_TYPE         STATUS  T
--------------- ------------------------------ ------------------- ------- -
MF              BROKE_NO_DEPENDENCIES          TABLE               VALID   N
MF              BROKE_YES_DEPENDENCIES         TABLE               VALID   N
 
SQL> set echo on
SQL> @save_u
Enter value for owner_name: mf
Enter value for table_name: BROKE_NO_DEPENDENCIES
 
WARNING: This script may issue a DROP TABLE command. Do not execute it unless you have read through it
and are comfortable you know what it does.
 
Ready? (hit ctl-C to quit)  
Enter value for owner_name: mf
Enter value for table_name: BROKE_NO_DEPENDENCIES    
 
 
Saved 18800 records in BROKE_NO_DEPENDENCIES_SAVED.
35 bad records in BROKE_NO_DEPENDENCIES_BAD.
 
PL/SQL procedure successfully completed.
 
SQL> set echo on
SQL> @create_old_rowid
SQL> create or replace function old_rowid (p_rowid rowid)
  2  return varchar as
  3  
  4    rowid_type NUMBER;
  5    object_id NUMBER;
  6    fileno NUMBER;
  7    blockno   NUMBER;
  8    rowno  NUMBER;
  9  
 10  BEGIN
 11  
 12     dbms_rowid.rowid_info(p_rowid, rowid_type, object_id, fileno, blockno, rowno);
 13  /*
 14     dbms_output.put_line('Row Typ-' || TO_CHAR(rowid_type));
 15     dbms_output.put_line('Obj No-' || TO_CHAR(object_id));
 16     dbms_output.put_line('RFNO-' || TO_CHAR(fileno));
 17     dbms_output.put_line('Block No-' || TO_CHAR(blockno));
 18     dbms_output.put_line('Row No-' || TO_CHAR(rowno));
 19  */
 20  return(to_char(fileno)||'.'||to_char(blockno)||'.'||to_char(rowno));
 21  
 22  END;
 23  /
 
Function created.
 
SQL> @create_new_rowid
SQL> create or replace function new_rowid (p_object_id number, p_old_rowid varchar)
  2  return varchar as
  3  
  4    new_rowid varchar2(30);
  5    fileno NUMBER;
  6    blockno   NUMBER;
  7    rowno  NUMBER;
  8  
  9  BEGIN
 10  
 11    fileno := substr(p_old_rowid,1,instr(p_old_rowid,'.')-1);
 12    blockno := substr(p_old_rowid,instr(p_old_rowid,'.')+1,instr(p_old_rowid,'.',1,2)-instr(p_old_rowid,'.'));
 13    rowno := substr(p_old_rowid,instr(p_old_rowid,'.',1,2)+1,100);
 14    new_rowid := DBMS_ROWID.ROWID_CREATE(1, p_object_id, fileno , blockno , rowno);
 15  
 16    return(new_rowid);
 17  
 18  END;
 19  /
 
Function created.
 
SQL> select rowid,old_rowid(rowid) old_rowid from mf.BROKE_NO_DEPENDENCIES where rownum < 5
SQL> /
 
ROWID              OLD_ROWID
------------------ ------------------------------
AAACYmAAEAAAAAMAAt 4.12.45
AAACYmAAEAAAACIAAt 4.136.45
AAACYmAAEAAAAAMAAF 4.12.5
AAACYmAAEAAAACIAAF 4.136.5

One other thing to keep in mind, as Toon mentioned, it may be possible to retrieve data from an index, even if the underlying data block is messed up. Selecting only the values of the indexed columns allows Oracle to completely bypass the data blocks. So, if for example, you found a set of blocks were inaccessible, you may be able to construct statements that would retrieve at least some of the data from the indexes like so:

select /*+ index (messed_up_object  mess_up_object_pk) */ indexed_column1, indexed_column2 
from messed_up_object 
where rowid in (select dbms_rowid.rowid_create(1,object,file,block,row) from bad_rows_table);

Your comments are always welcomed!

4 Comments

  1. Anonymous says:

    Kerry, I was pleasantly suprised to hear that a paper written over 15 years ago, still helped out someone. Nice meeting you at Hotsos.

    Toon

  2. osborne says:

    Toon,

    Nice to meet you too. I enjoyed the first half of your talk this year (I skipped the second half to listen to someone else – that I can’t remember now). Anyway, you had me nodding my head up and down on the whole talk. I think moving the majority of code towards the database and away from the application is absolutely the way to go.

    Your paper on working around block corruption was very helpful on this particular situation. While it wasn’t actually block corruption (which we have other mechanism to solve now), the approach worked great. We had a billion+ row table. Fortunately it was partitioned and we were able to export by partition – only a couple of sub-partitions had the problem – so those we ran through block by block, row by row. Got it all back but 2 blocks – about 100 records I think (out of 1.3 billion or so). Not bad!

    Thanks for writing that paper!

    Kerry

  3. […] 同样我们可以采用Kerry Osborne的脚本来实现.这些稍微修改了下 增加了选择tablespace 的功能: […]

  4. […] 同样我们可以采用Kerry Osborne的脚本来实现.这些稍微修改了下 增加了选择tablespace 的功能: […]

Leave a Reply