EHCC and the GET_COMPRESSION_TYPE function

Well I turned in the HCC chapter on the Exadata book last week and of course as is usually the case, I immediately ran across something kind of cool on the topic I just finished writing about. (we still have several editing passes though, so I can add it later). Anyway, although I don’t have time to post much these days, I thought this one would be a quick little snippet. So here it is.

The Compression Advisor is part of the DBMS_COMPRESSION package. Specifically it is the GET_COMPRESSION_RATIO procedure. This procedure is worthy of a separate post but I won’t discuss it here except to say that as of 11.2.0.2 you can use it to test HCC compression ratios on non-Exadata platforms. That’s pretty cool, but what I wanted to tell you about is a handy little function in the same package called GET_COMPRESSION_TYPE. This function can tell you exactly what level of compression has been applied to a single row. This can come in handy for investigating the inner workings of  HCC (or OLTP or BASIC compression for that matter).

As you probably already know, HCC is only applied to records loaded via direct path writes. Any updates cause rows to be migrated out of that storage format into blocks flagged for OLTP compression. Of course OLTP compression on a block only kicks in when a block is “full”. On top of this, altering a table to change it’s compression does not actually change the storage format of any existing records (unless you use the MOVE keyword). So you could load some data and then change the designation (say from QUERY LOW to QUERY HIGH). Rows that are inserted after the change will be stored in the new format (assuming the records are loaded via direct path writes of course). So why am I telling you all this. Well, because I ran across a statement in some Oracle documentation that said you can check to see what compression method a table is stored with by looking at the COMPRESS_FOR column in the DBA_TABLES view. This column does reveal what the table designation is. However, the setting actually only tells you how rows inserted in the future will be compressed. It tells you absolutely nothing about the way current rows are stored.

As for the mechanics, it appears that each row has a bitmask associated with it showing what compression format is being used. So I wrote a little script to give me what I want to see (check_row_comp.sql) using the DBMS_COMPRESSION.GET_COMPRESSION_TYPE function. Here’s an example of its use.

== Note this listing has been updated to fix the bitmask as suggested by Greg in the comments (I had it wrong initially)
 
SYS@SANDBOX1> !cat check_row_comp.sql
col old_rowid for a20
/*
There is a bit mask that indicates level of compression
10000000 (1) = no compression
01000000 (2) = BASIC/OLTP
00100000 (4) = QUERY HIGH
00010000 (8) = QUERY LOW
00001000 (16) = ARCHIVE HIGH
00000100 (32) = ARCHIVE LOW
*/
select 
old_rowid(rowid) old_rowid,
decode(
DBMS_COMPRESSION.GET_COMPRESSION_TYPE ( '&owner', '&table_name', '&rowid'), 
1, 'No Compression',
2, 'Basic/OLTP Compression', 
4, 'HCC Query High',
8, 'HCC Query Low',
16, 'HCC Archive High',
32, 'HCC Archive Low',
'Unknown Compression Level') compression_type
from dual;
 
SYS@SANDBOX1> select rowid from kso.skew_hcc3 where pk_col = 444444;
 
ROWID
------------------
AAAWbXAAIAAGAfiBdz
 
1 row selected.
 
SYS@SANDBOX1> @check_row_comp
Enter value for owner: KSO
Enter value for table_name: SKEW_HCC3
Enter value for rowid: AAAWbXAAIAAGAfiBdz
 
OLD_ROWID            COMPRESSION_TYPE
-------------------- -------------------------
1.929.0              HCC Archive Low
 
1 row selected.
 
SYS@SANDBOX1> update kso.skew_hcc3 set col1 = col1*2 where pk_col=444444;
 
1 row updated.
 
SYS@SANDBOX1> commit;
 
Commit complete.
 
SYS@SANDBOX1> select rowid, old_rowid(rowid) old_rowid from kso.skew_hcc3 where pk_col = 44444;
 
ROWID              OLD_ROWID
------------------ --------------------
AAAWbXAAIAAF7aUAAA 8.1554068.0
 
1 row selected.
 
SYS@SANDBOX1> -- row migrated to file 8
SYS@SANDBOX1> 
SYS@SANDBOX1> @check_row_comp
Enter value for owner: KSO
Enter value for table_name: SKEW_HCC3
Enter value for rowid: AAAWbXAAIAAF7aUAAA
 
OLD_ROWID            COMPRESSION_TYPE
-------------------- -------------------------
1.929.0              No Compression
 
1 row selected.

I did a previous post (Proof That Whole CU’s Are Not Decompressed) where I showed row migration to a block flagged for OLTP compression, but of course the OLTP compression would not be applied until the block was sufficiently full. As you can see in the example, the function showed that the record was not actually compressed, even though it had been migrated to a block flagged for OLTP compression.

So the GET_COMPRESSION_TYPE function is handy for showing us how an individual record is actually stored. Of course you can see this by dumping blocks as well, but I think this function is much easier than looking at block dumps. You could also write a script to sample some percentage of the rows in a table to get a feel for how well compressed the rows are, but I will leave that as an exercise for the reader. (please let me know if you undertake that task as it is something I will eventually want to do as well) ;)

13 Comments

  1. Greg Rahn says:

    Your bit mask/decode is incorrect (there is no BASIC as BASIC/OLTP have the same block structure & compression algo).
    Oracle Docs: http://download.oracle.com/docs/cd/E14072_01/appdev.112/e10577/d_compress.htm

    From rdbms/admin/dbmscomp.sql

    [code]
    COMP_NOCOMPRESS CONSTANT NUMBER := 1;
    COMP_FOR_OLTP CONSTANT NUMBER := 2;
    COMP_FOR_QUERY_HIGH CONSTANT NUMBER := 4;
    COMP_FOR_QUERY_LOW CONSTANT NUMBER := 8;
    COMP_FOR_ARCHIVE_HIGH CONSTANT NUMBER := 16;
    COMP_FOR_ARCHIVE_LOW CONSTANT NUMBER := 32;
    [/code]

  2. osborne says:

    Thanks Greg,

    I have fixed the script and the listing of it in the post.

    By the way, I noticed that the script also uses my old_rowid function which can be created using this script: create_old_rowid.sql I failed to mention that in the post as well.

    By the way, I always find it amusing (probably not the best word – maybe interesting is better) when something works correctly for a seemingly random set of test data and then turns out to be completely wrong for another set of seemly random test data. That was the case here – no compression, basic, query low and archive low all worked, but the other compression levels were incorrect. It just turns out that I used no compression and archive low for my example. Doh!

    Might help if I read the docs as well. ;)

    Kerry

  3. Santhosh Nair says:

    – Getting some rowids for checking the compression

    select rowid from TEST_ORA600_MV where rownum FALSE (to get direct path insert after a truncate of the mview )

    exec dbms_mview.refresh(‘TEST_ORA600_MV’,method => ‘C’,ATOMIC_REFRESH => FALSE);

    – Get some rows again

    select rowid from TEST_ORA600_MV where rownum TRUE , same results , CFQH is completely disabled

    • Santhosh Nair says:

      Good post, Kerry

      Have you seen this behavior before.

      – Create a simple table based on DBA_objects

      CREATE TABLE test_ora600
      (
      CREATED,
      DATA_OBJECT_ID,
      EDITION_NAME,
      GENERATED,
      LAST_DDL_TIME,
      NAMESPACE,
      OBJECT_ID,
      OBJECT_NAME,
      OBJECT_TYPE,
      OWNER,
      SECONDARY,
      STATUS,
      SUBOBJECT_NAME,
      TEMPORARY,
      TIMESTAMP
      )
      TABLESPACE DATA
      COMPRESS FOR QUERY HIGH
      NOLOGGING
      NOCACHE
      PARTITION BY RANGE (CREATED)
      INTERVAL(1)
      STORE IN (DATA)
      (PARTITION P20000101 VALUES LESS THAN (20000101))
      PCTFREE 0
      PCTUSED 0
      STORAGE(INITIAL 64K)
      ENABLE ROW MOVEMENT
      AS
      SELECT
      to_number(to_char(CREATED, ‘YYYYMMDD’)) AS CREATED ,
      DATA_OBJECT_ID,
      EDITION_NAME,
      GENERATED,
      LAST_DDL_TIME,
      NAMESPACE,
      OBJECT_ID,
      OBJECT_NAME,
      OBJECT_TYPE,
      OWNER,
      SECONDARY,
      STATUS,
      SUBOBJECT_NAME,
      TEMPORARY,
      TIMESTAMP
      FROM DBA_OBJECTS
      /

      – Create an mview on it with Compress for Query High

      CREATE Materialized VIEW TEST_ORA600_MV
      compress for query high
      nologging
      partition by range (CREATED)
      interval (1)
      ( PARTITION P1 VALUES LESS THAN (20000101))
      REFRESH FAST
      ON DEMAND AS
      select created, object_type, count(*) CNT from test_ora600 group by created,object_type
      /

      If I look at the rowids it shows compress for Query High (Type 4) but another complete refresh of the mview with atomic_refresh = FALSE will revert the compression back to 1 (NOCOMPRESS)

      Thanks

      Santhosh

  4. osborne says:

    Santhosh,

    I had not played with HCC on MV’s prior to seeing your question. I tried a couple of tests though and it appears to me that compression (even OLTP and BASIC) doesn’t work with partitioned materialized views. Also you need to do the create statement in parallel to get direct path inserts which are required to get the compression to kick in. I’m not sure about the quirky reporting of compression type by the get_compression_type procedure. But in general you need to use syntax something like this:

    KSO@SANDBOX1> CREATE Materialized VIEW TEST_ORA600_MV_HCC1_PX_NOPART
    2 parallel 2
    3 compress for query low
    4 nologging
    5 REFRESH FAST
    6 ON DEMAND AS
    7 select created, object_type, count(*) CNT from test_ora600 group by created,object_type
    8 /

    The refresh will need to be done parallel as well to ensure direct path (or compression will drop back to OLTP).

  5. Santhosh Nair says:

    Thanks Kerry

    I filed a bug with oracle , I have another bug with a similar issue also filed (Partitioned mview refresh with parallel will crash with ORA-600) .

    So it looks like I had to use NOPART for the time being .

    Santhosh

  6. Santhosh Nair says:

    Hi Kerry

    Just an update , Even if use a non partition option for mview , The first refresh will disable the compression . This is true even if we use PDML or Parallelism in mview refresh statement or even use ATOMIC_REFRESH => FALSE (Which should do direct path insert)

    Looks like mview refresh and EHCC doesnt work together .

    Thanks for the help

    Santhosh

  7. Christian Antognini says:

    Hi Kerry

    > … as of 11.2.0.2 you can use it to test HCC compression ratios on non-Exadata platforms

    Mhmm… I puzzled?!?! If I correctly understands to what function you are referring it was already working in 11.2.0.1 (also to this post: http://antognini.ch/2010/05/how-good-are-the-values-returned-by-dbms_compression-get_compression_ratio/ ).

    Or I’m missing something?

    Cheers,
    Chris

  8. osborne says:

    Hi Chris,

    That’s interesting. It doesn’t work for me on my 11.2.0.1 test database. My non-Exadata test machine is Redhat Linux. I wonder if there may be porting differences that account for this behavior. Interestingly, we have seen consistent behavior on 11.2.0.1 on a couple of different Redhat boxes with regard to HCC support. On a completely separate system we tested Dataguard of an HCC table to non-Exadata platform. Dataguard had no trouble replicating on 11.2.0.1 or 11.2.0.2. Neither would allow querying the table though. 11.2.0.1 would not allow the ALTER TABLE MOVE command to decompress the table while 11.2.0.2 allowed us to decompress with the ALTER TABLE MOVE. So at least on the tests we’ve done it looks like 11.2.0.1 was very strict on not allowing any HCC features on non-Exadata platforms, but 11.2.0.2 has relaxed the restrictions on commands that make sense (like ALTER TABLE MOVE and estimating compression ratios). Here’s a quick test case on the GET_COMPRESSION_RATIO bit:

    SYS@LAB112> select version from v$instance;
    
    VERSION
    -----------------
    11.2.0.1.0
    
    1 row selected.
    
    SYS@LAB112> set echo on
    SYS@LAB112> @a
    SYS@LAB112> DECLARE
      2    l_blkcnt_cmp       BINARY_INTEGER;
      3    l_blkcnt_uncmp     BINARY_INTEGER;
      4    l_row_cmp          BINARY_INTEGER;
      5    l_row_uncmp        BINARY_INTEGER;
      6    l_cmp_ratio        NUMBER;
      7    l_comptype_str     VARCHAR2(100);
      8  BEGIN
      9        dbms_compression.get_compression_ratio(
     10          -- input parameters
     11          scratchtbsname   => 'USERS',       -- scratch tablespace
     12          ownname          => 'KSO',           -- owner of the table
     13          tabname          => 'SKEW',          -- table name
     14          partname         => NULL,            -- partition name
     15          comptype         => power(2,3),      -- compression algorithm
     16          -- output parameters
     17          blkcnt_cmp       => l_blkcnt_cmp,    -- number of compressed blocks
     18          blkcnt_uncmp     => l_blkcnt_uncmp,  -- number of uncompressed blocks
     19          row_cmp          => l_row_cmp,       -- number of rows in a compressed block
     20          row_uncmp        => l_row_uncmp,     -- number of rows in an uncompressed block
     21          cmp_ratio        => l_cmp_ratio,     -- compression ratio
     22          comptype_str     => l_comptype_str   -- compression type
     23        );
     24        dbms_output.put_line('SKEW'||' - '||'type: '||l_comptype_str||' ratio: '||to_char(l_cmp_ratio,'99.999'));
     25  END;
     26  /
    DECLARE
    *
    ERROR at line 1:
    ORA-00942: table or view does not exist
    ORA-06512: at "SYS.PRVT_COMPRESSION", line 461
    ORA-64307: hybrid columnar compression is only supported in tablespaces residing on Exadata storage
    ORA-06512: at "SYS.DBMS_COMPRESSION", line 214
    ORA-06512: at line 9
    
    ======= Log on to 11.2.0.2 database
    
    SYS@LAB11202> select version from v$instance;
    
    VERSION
    -----------------
    11.2.0.2.0
    
    SYS@LAB11202> set echo on
    SYS@LAB11202> @a
    SYS@LAB11202> DECLARE
      2    l_blkcnt_cmp       BINARY_INTEGER;
      3    l_blkcnt_uncmp     BINARY_INTEGER;
      4    l_row_cmp          BINARY_INTEGER;
      5    l_row_uncmp        BINARY_INTEGER;
      6    l_cmp_ratio        NUMBER;
      7    l_comptype_str     VARCHAR2(100);
      8  BEGIN
      9        dbms_compression.get_compression_ratio(
     10          -- input parameters
     11          scratchtbsname   => 'USERS',       -- scratch tablespace
     12          ownname          => 'KSO',           -- owner of the table
     13          tabname          => 'SKEW',          -- table name
     14          partname         => NULL,            -- partition name
     15          comptype         => power(2,3),      -- compression algorithm
     16          -- output parameters
     17          blkcnt_cmp       => l_blkcnt_cmp,    -- number of compressed blocks
     18          blkcnt_uncmp     => l_blkcnt_uncmp,  -- number of uncompressed blocks
     19          row_cmp          => l_row_cmp,       -- number of rows in a compressed block
     20          row_uncmp        => l_row_uncmp,     -- number of rows in an uncompressed block
     21          cmp_ratio        => l_cmp_ratio,     -- compression ratio
     22          comptype_str     => l_comptype_str   -- compression type
     23        );
     24        dbms_output.put_line('SKEW'||' - '||'type: '||l_comptype_str||' ratio: '||to_char(l_cmp_ratio,'99.999'));
     25  END;
     26  /
    Compression Advisor self-check validation successful. select count(*) on both Uncompressed and EHCC Compressed format = 1000001 rows
    SKEW - type: "Compress For Query Low" ratio:   4.100
    
    PL/SQL procedure successfully completed.
    
  9. Andy Black says:

    Hi Kerry,

    To get this to work on RH with 11.2.0.1, you want patch# 8896202. The functionality is included in 11.2.0.2.

    Patch 8896202: ENABLE COMPRESSION ADVISOR TO ESTIMATE EXADATA HCC COMPRESSION RATIOS

    I used it a lot on 11.2.0.1 about a year ago…it was the key justification for our Exadata purchases. When people see the list cost of Exadata, they get intimidated…but when you consider the cost savings from compression, its very inexpensive because you save soooo much $$ on storage.

  10. [...] Lets insert some data and check that the actual row is compressed (thanks to Kerry Osborne) [...]

  11. [...] Lets insert some data and check that the actual row is compressed (thanks to Kerry Osborne) [...]

  12. Joe Cornell says:

    Kerry,

    In 11.2.0.3, we get a lot of rows that are type 64. Oracle will not directly explain what that means but MOS note 1477918.1 has a script that labels it a “compressed row”. Do you know what type of compression is implied by a return value of 64. One particular table in our OBI suite has massive “updates” and shows almost all the rows as type 64. The table is supposed to be query high. I am trying to convince our ETL team they need to change.

    BTW – loved your book and found it a good quick read. Too bad the e-book version formatting does not work. e-book vendors seem to think they get to charge full price for formatting but they don’t need to do any work to make the material fit the e-reader. [End of Soapbox]

    Joe

Leave a Reply