Archive for the ‘12c’ Category.

12c In-Memory on RAC

I started looking into In-Memory on RAC this week. Data can be distributed across RAC nodes in a couple of different ways. The default is to spread it across the available nodes in the cluster. So if you had a 2 node cluster, roughly 50% of the data in your table or partition would be loaded into the column store in each of the 2 instances.

SYS@dw1> alter table kso.skew inmemory;
 
Table altered.
 
SYS@dw1> @gen_ddl
Enter value for object_type: 
Enter value for owner: KSO
Enter value for object_name: SKEW
 
DDL
--------------------------------------------------------------------------------
 
  CREATE TABLE "KSO"."SKEW"
   (    "PK_COL" NUMBER,
        "COL1" NUMBER,
        "COL2" VARCHAR2(30),
        "COL3" DATE,
        "COL4" VARCHAR2(1),
         PRIMARY KEY ("PK_COL")
  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 INVISIBLE COMPUTE STATISTICS
  STORAGE(INITIAL 865075200 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"  ENABLE
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 1480589312 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"
  INMEMORY PRIORITY NONE MEMCOMPRESS FOR QUERY LOW
  DISTRIBUTE AUTO NO DUPLICATE                             <--- here's the RAC bit
   CACHE
 
SYS@dw1> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
                                                                             ----------------
sum
 
no rows selected
 
SYS@dw1> select count(*) from kso.skew;
 
  COUNT(*)
----------
  32000004
 
SYS@dw1> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            SKEW                                  1,413.0            391.4        1.7              749.4
                                                                             ----------------
sum                                                                                     391.4
SYS@dw1> -- so about half the data is loaded in the local instance column store
SYS@dw1> -- let's see what's in the other instance's cache
SYS@dw1> l
  1  SELECT v.owner, v.segment_name,
  2  v.bytes/(1024*1024) orig_size_megs,
  3  v.inmemory_size/(1024*1024) in_mem_size_megs,
  4  (v.bytes - v.bytes_not_populated) / v.inmemory_size comp_ratio,
  5  v.bytes_not_populated/(1024*1024) megs_not_populated
  6  FROM v$im_segments v
  7  where owner like nvl('&owner',owner)
  8* and segment_name like nvl('&segment_name',segment_name)
SYS@dw1> l6
  6* FROM v$im_segments v
SYS@dw1> c/v$/gv$/
  6* FROM gv$im_segments v
SYS@dw1> /
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            SKEW                                  1,413.0            569.1        1.6              526.6
KSO                            SKEW                                  1,413.0            391.4        1.7              749.4
                                                                             ----------------
sum                                                                                     960.5

So in this example with a non-partitioned table, the data is automatically spread (roughly 50/50) across the 2 instances. That’s pretty cool. It allows the column store to be scaled via adding RAC nodes. For partitioned tables there are a couple of options as to how the data can be distributed. You can distribute by partition or by rowid range. The default for a partitioned table appears to be to distribute partitions to the nodes in a round robin fashion not ordered by number of rows or size (as I would have expected), but then again I haven’t done much testing on it. Anyway, here’s a simple example using a partitioned table.

SYS@dw1> create table kso.skewp partition by range (col1) (partition "P1" values less than (2), partition p2 values less than (100000), partition p3 values less than (maxvalue)) as select * from kso.skew;
 
Table created.
 
SYS@dw1> select count(*) from kso.skewp partition (p1);
 
  COUNT(*)
----------
   3199971
 
SYS@dw1> select count(*) from kso.skewp partition (p2);
 
  COUNT(*)
----------
      8512
 
SYS@dw1> select count(*) from kso.skewp partition (p3);
 
  COUNT(*)
----------
  28791521
 
SYS@dw1> alter table kso.skewp inmemory;
 
Table altered.
 
SYS@dw1> select count(*) from kso.skewp;
 
  COUNT(*)
----------
  32000004
 
SYS@dw1> @inmem_part_segs
Enter value for owner: 
Enter value for segment_name: SKEWP
 
OWNER                          SEGMENT_NAME                   PARTITION_NAME                 ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            SKEWP                          P1                                      128.0             55.4        2.3                 .0
KSO                            SKEWP                          P3                                    1,220.0            349.6        1.5              689.3
                                                                                                            ----------------
sum                                                                                                                    405.1
 
SYS@dw1> @inmem_part_segs_g
Enter value for owner: 
Enter value for segment_name: SKEWP
 
   INST_ID OWNER                          SEGMENT_NAME                   PARTITION_NAME                 ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
---------- ------------------------------ ------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
         1 KSO                            SKEWP                          P1                                      128.0             55.4        2.3                 .0
         1 KSO                            SKEWP                          P3                                    1,220.0            878.1        1.4                 .0
         2 KSO                            SKEWP                          P2                                        8.0              1.1        7.1                 .0
                                                                                                                       ----------------
sum                                                                                                                               934.7
 
SYS@dw1> -- very skewed distribution, P1 and P3 on node 1, and P2 on node 2
SYS@dw1> -- let's try again
SYS@dw1> alter table kso.skewp inmemory distribute by rowid range;
 
Table altered.
 
SYS@dw1> select count(*) from kso.skewp;
 
  COUNT(*)
----------
  32000004
 
SYS@dw1> @inmem_part_segs
Enter value for owner: 
Enter value for segment_name: SKEWP
 
set lines 200
OWNER                          SEGMENT_NAME                   PARTITION_NAME                 ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            SKEWP                          P1                                      128.0             55.4        2.3                 .0
KSO                            SKEWP                          P3                                    1,220.0            605.2        1.4              358.9
                                                                                                            ----------------
sum                                                                                                                    660.6
 
 
SYS@dw1> @inmem_part_segs_g.sql
Enter value for owner: 
Enter value for segment_name: SKEWP
 
   INST_ID OWNER                          SEGMENT_NAME                   PARTITION_NAME                 ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
---------- ------------------------------ ------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
         1 KSO                            SKEWP                          P1                                      128.0             55.4        2.3                 .0
         1 KSO                            SKEWP                          P3                                    1,220.0            605.2        1.4              358.9
         2 KSO                            SKEWP                          P2                                        8.0              1.1        7.1                 .0
         2 KSO                            SKEWP                          P3                                    1,220.0            263.0        1.6              793.5
                                                                                                                       ----------------
sum                                                                                                                               924.8

So the default distribution mode for partitioned tables appears to be round robin by partition. This can result in very skewed data distributions as in this example. Using the DISTRIBUTE BY ROWID RANGE syntax we can get a more even distribution of data across the nodes. In this case it put the 2 smaller partitions on separate nodes and split the largest partition between the two (asymmetrically by the way, so that the data was actually spread pretty much equally).

There is another option which allows all the data to be duplicated across the instances. This is a high availability option. If a node needs to be restarted, query times could be impacted until the data is reloaded. Note that queries would still execute without error, but the data would have to be processed through the normal row oriented process. The DUPLICATE option populates the column store of both nodes with all the data for the object, in order to minimize performance impact if a node goes down. The documentation says this feature is only available on engineered systems by the way. Let’s have a look.

SYS@dw1> alter table kso.skew inmemory duplicate;
 
Table altered.
 
SYS@dw1> @gen_ddl
Enter value for object_type: 
Enter value for owner: KSO
Enter value for object_name: SKEW
 
DDL
--------------------------------------------------------------------------------
 
  CREATE TABLE "KSO"."SKEW"
   (    "PK_COL" NUMBER,
        "COL1" NUMBER,
        "COL2" VARCHAR2(30),
        "COL3" DATE,
        "COL4" VARCHAR2(1),
         PRIMARY KEY ("PK_COL")
  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 INVISIBLE COMPUTE STATISTICS
  STORAGE(INITIAL 865075200 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"  ENABLE
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 1480589312 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "USERS"
  INMEMORY PRIORITY NONE MEMCOMPRESS FOR QUERY LOW
  DISTRIBUTE AUTO DUPLICATE
   CACHE
 
SYS@dw1> select count(*) from kso.skew;
 
  COUNT(*)
----------
  32000004
 
SYS@dw1> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            SKEW                                  1,413.0            956.4        1.5                 .0
                                                                             ----------------
sum                                                                                     956.4
 
SYS@dw1> @inmem_segs_g.sql
Enter value for owner: 
Enter value for segment_name: 
 
   INST_ID OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED POPULATE_
---------- ------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------ ---------
         1 KSO                            SKEW                                  1,413.0            956.4        1.5                 .0 COMPLETED
         2 KSO                            SKEW                                  1,413.0            965.4        1.5                 .0 COMPLETED
                                                                                        ----------------
sum                                                                                              1,921.9

So the DUPLICATE keyword caused the entire table to be loaded into both instances (i.e. DUPLICATEd). Note that this test was done on an Exadata (an engineered system) so it should work.

How the data is accessed across the nodes is also interesting. When accessing data spread across two nodes with a serial plan, all the work appears to be done by the server process handling the connection on the driving node. (i.e. there don’t appear to be any processes on the remote that are burning CPU) When accessing data with a parallel plan on the other hand, slave processes are used on both nodes. This makes sense because that capability already existed. In this case though, the slaves appear to automatically be spread across the nodes at least when the data is distributed (i.e. not duplicated). This is in contrast to the normal spreading of slaves across a cluster which tends to want to keep them on the same node if possible. Anyway, here is an example. Note that this example has a very unusual SQL Monitor output.

SYS@dw1> select /*+ parallel */ avg(pk_col+col1) from kso.skewp a;
 
AVG(PK_COL+COL1)
----------------
      16588749.5
 
SYS@dw1> @rsm
Enter value for sql_id: 
Enter value for sql_exec_id: 
 
REPORT
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL Monitoring Report
 
SQL Text
------------------------------
select /*+ parallel */ avg(pk_col+col1) from kso.skewp a
 
Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  SYS (141:54624)
 SQL ID              :  0qmdbnszhb2nh
 SQL Execution ID    :  16777218
 Execution Started   :  09/16/2014 22:01:27
 First Refresh Time  :  09/16/2014 22:01:26
 Last Refresh Time   :  09/16/2014 22:01:28
 Duration            :  1s
 Module/Action       :  sqlplus@enkdb03.enkitec.com (TNS V1-V3)/-
 Service             :  SYS$USERS
 Program             :  sqlplus@enkdb03.enkitec.com (TNS V1-V3)
 Fetch Calls         :  1
 
Global Stats
===================================================================================================
| Elapsed |   Cpu   |    IO    | Application |  Other   | Fetch | Buffer | Read | Read  |  Cell   |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs | Bytes | Offload |
===================================================================================================
|    3.67 |    3.66 |     0.00 |        0.00 |     0.00 |     1 |  81893 |    1 | 352KB |  59.84% |
===================================================================================================
 
Parallel Execution Details (DOP=3 , Servers Allocated=3)
 Instances  : 2
 
=======================================================================================================================================================
| Instance |      Name      | Type  | Server# | Elapsed |   Cpu   |    IO    | Application |  Other   | Buffer | Read | Read  |  Cell   | Wait Events |
|          |                |       |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes | Offload | (sample #)  |
=======================================================================================================================================================
| 1        | PX Coordinator | QC    |         |    0.01 |    0.01 |          |        0.00 |     0.00 |     13 |      |     . |    NaN% |             |
| 1        | p000           | Set 1 |       1 |    1.05 |    1.05 |          |             |          |     21 |      |     . |    NaN% |             |
| 1        | p001           | Set 1 |       2 |    1.01 |    1.01 |     0.00 |             |          |     69 |    1 | 352KB |  59.84% |             |
| 2        | p000           | Set 1 |       3 |    1.60 |    1.60 |          |             |          |  81790 |      |     . |    NaN% |             |
=======================================================================================================================================================
 
Instance Drill-Down
====================================================================================================================================
| Instance | Process Names | Elapsed |   Cpu   |    IO    | Application |  Other   | Buffer | Read | Read  |  Cell   | Wait Events |
|          |               | Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes | Offload |             |
====================================================================================================================================
|    1     | QC p000 p001  |    2.07 |    2.06 |     0.00 |        0.00 |     0.00 |    103 |    1 | 352KB |  59.84% |             |
|    2     | p000          |    1.60 |    1.60 |          |             |          |  81790 |      |       |    NaN% |             |
====================================================================================================================================
 
SQL Plan Monitoring Details (Plan Hash Value=2545210427)
============================================================================================================================================================================
| Id |             Operation             |   Name   |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Cell   |  Mem  | Activity | Activity Detail |
|    |                                   |          | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | Offload | (Max) |   (%)    |   (# samples)   |
============================================================================================================================================================================
|  0 | SELECT STATEMENT                  |          |         |      |         1 |     +1 |     1 |        1 |      |       |         |       |          |                 |
|  1 |   SORT AGGREGATE                  |          |       1 |      |         1 |     +1 |     1 |        1 |      |       |         |       |          |                 |
|  2 |    PX COORDINATOR                 |          |         |      |         1 |     +1 |     4 |        3 |      |       |         |       |          |                 |
|  3 |     PX SEND QC (RANDOM)           | :TQ10000 |       1 |      |         1 |     +1 |     3 |        3 |      |       |         |       |          |                 |
|  4 |      SORT AGGREGATE               |          |       1 |      |         1 |     +1 |     3 |        3 |      |       |         |       |    33.33 | Cpu (1)         |
|  5 |       PX BLOCK ITERATOR           |          |     32M |  783 |         1 |     +1 |     3 |      32M |      |       |         |       |          |                 |
|  6 |        TABLE ACCESS INMEMORY FULL | SKEWP    |     32M |  783 |         2 |     +0 |    31 |      32M |    1 | 352KB |  60.00% |    1M |    66.67 | in memory (2)   |
============================================================================================================================================================================

Can you see why this one is so unusual? This plan showed a TABLE ACCESS INMEMORY FULL, which had both inmemory access and offloading via smart scan (probably on different partitions). I’m not at all sure why that happened though. If you know please enlighten me.

12c In-Memory in PDB

In preparation for our upcoming 12c In-Memory Webcast @CaryMillsap, @TanelPoder, and I solicited questions from members of the universe at large on the interweb. We got a question about how In-Memory works with the 12c multi-tentant option and it got me thinking so I gave it a quick try. As it turns out, it works about as you would expect. The basic idea is to turn it on for the container DB (which is where the memory is actually allocated (ala the other main shared memory regions) and then decide which PDBs to allow to use it (and if so how much of it to use) or not. First, here are the steps necessary to allocate the memory in the container DB.

-bash-3.2$ rlwrap sqlplus / as sysdba
 
SQL*Plus: Release 12.1.0.2.0 Production on Fri Sep 12 16:07:31 2014
 
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
 
 
INSTANCE_NAME    STARTUP_TIME               CURRENT_TIME                  DAYS    SECONDS
---------------- -------------------------- -------------------------- ------- ----------
democ1           12-SEP-2014 09:49          12-SEP-2014 16:07              .26      22675
 
SYS@democ1> select cdb from v$database;
 
CDB
---
YES
 
SYS@democ1> SELECT PDB_ID, PDB_NAME, STATUS FROM DBA_PDBS ORDER BY PDB_ID;
 
    PDB_ID PDB_NAME                       STATUS
---------- ------------------------------ ---------
         2 PDB$SEED                       NORMAL
         3 DEMOPDB                        NORMAL
 
SYS@democ1> @parms
Enter value for parameter: inmem
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
inmemory_clause_default                                                                                                   TRUE     FALSE      FALSE
inmemory_force                                     DEFAULT                                                                TRUE     FALSE      FALSE
inmemory_max_populate_servers                      12                                                                     TRUE     FALSE      FALSE
inmemory_query                                     ENABLE                                                                 TRUE     FALSE      FALSE
inmemory_size                                      0                                                                      FALSE    FALSE      TRUE
inmemory_trickle_repopulate_servers_percent        1                                                                      TRUE     FALSE      FALSE
optimizer_inmemory_aware                           TRUE                                                                   TRUE     FALSE      FALSE
 
7 rows selected.
 
SYS@democ1> alter system set inmemory_size=500M;
alter system set inmemory_size=499M
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified
 
SYS@democ1> alter system set inmemory_size=500M scope=spfile;
 
System altered.
 
SYS@democ1> startup force
ORACLE instance started.
 
Total System Global Area 8589934592 bytes
Fixed Size                  6877112 bytes
Variable Size            1644167240 bytes
Database Buffers         6257901568 bytes
Redo Buffers              144117760 bytes
In-Memory Area            536870912 bytes
Database mounted.
Database opened.
SYS@democ1> @parms
Enter value for parameter: inmem
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
inmemory_clause_default                                                                                                   TRUE     FALSE      FALSE
inmemory_force                                     DEFAULT                                                                TRUE     FALSE      FALSE
inmemory_max_populate_servers                      12                                                                     TRUE     FALSE      FALSE
inmemory_query                                     ENABLE                                                                 TRUE     FALSE      FALSE
inmemory_size                                      536870912                                                              FALSE    FALSE      TRUE
inmemory_trickle_repopulate_servers_percent        1                                                                      TRUE     FALSE      FALSE
optimizer_inmemory_aware                           TRUE                                                                   TRUE     FALSE      FALSE
 
7 rows selected.

So I logged into a container database that has one PDB (DEMOPDB). I checked to see if there was memory assigned to the column store, and there wasn’t. (inmemory_size was set to the default value of 0). Then I attempted to change the value, which as you can see is not a dynamic component. I had to make the change in the spfile and then restart the entire database including any PDBs.

So we’ve enabled inmemory at the container level. Now let’s check the PDB.

Note: The CDB/PDB is still a little weird for me. I use this little script to set SQL*Plus prompt to keep me straight. connect_pdb.sql

Note 2: Someone else asked about the very simple inmem_segs.sql script so here’s a link to it too: inmem_segs.sql

 
SYS@democ1> @connect_pdb
Enter value for pdb_name: DEMOPDB
 
Session altered.
 
 
 
 
 
SYS@democ1:DEMOPDB> alter system set inmemory_size=0;
alter system set inmemory_size=0
*
ERROR at line 1:
ORA-01219: database or pluggable database not open: queries allowed on fixed tables or views only
 
SYS@democ1> select pdb_id con_id, pdb_name, status, open_mode, restricted from dba_pdbs a, v$pdbs b where a.con_id = b.con_id order by pdb_id;
 
    CON_ID PDB_NAME                       STATUS    OPEN_MODE  RES
---------- ------------------------------ --------- ---------- ---
         2 PDB$SEED                       NORMAL    READ ONLY  NO
         3 DEMOPDB                        NORMAL    MOUNTED
 
SYS@democ1:DEMOPDB> startup
Pluggable Database opened.
SYS@democ1:DEMOPDB> @parms
Enter value for parameter: inmem
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
inmemory_clause_default                                                                                                   TRUE     FALSE      FALSE
inmemory_force                                     DEFAULT                                                                TRUE     FALSE      FALSE
inmemory_max_populate_servers                      12                                                                     TRUE     FALSE      FALSE
inmemory_query                                     ENABLE                                                                 TRUE     FALSE      FALSE
inmemory_size                                      536870912                                                              FALSE    FALSE      TRUE
inmemory_trickle_repopulate_servers_percent        1                                                                      TRUE     FALSE      FALSE
optimizer_inmemory_aware                           TRUE                                                                   TRUE     FALSE      FALSE
 
7 rows selected.
 
SYS@democ1:DEMOPDB> -- inmemory_size value is inherited from CDB as documented. 
SYS@democ1:DEMOPDB> -- let's check which inmem parameters can be set at PDB level
SYS@democ1:DEMOPDB> 
SYS@democ1:DEMOPDB> select name from v$system_parameter where ispdb_modifiable = 'TRUE' and name like nvl('&name',name);
Enter value for name: inmem%
 
NAME
--------------------------------------------------------------------------------
inmemory_size
inmemory_clause_default
inmemory_force
inmemory_query
 
SYS@democ1:DEMOPDB> -- So inmemory_size can be set
SYS@democ1:DEMOPDB> alter system set inmemory_size = 0;
 
System altered.
 
SYS@democ1:DEMOPDB> alter system set inmemory_size = 10M;
 
System altered.
 
SYS@democ1:DEMOPDB> -- no bounce required, that's cool
SYS@democ1:DEMOPDB> -- let's make sure it works
SYS@democ1:DEMOPDB>
SYS@democ1:DEMOPDB> create table kso.junk as select * from dba_objects;
 
Table created.
 
SYS@democ1:DEMOPDB> alter table kso.junk inmemory;
 
Table altered.
 
SYS@democ1:DEMOPDB> select count(*) from kso.junk;
 
  COUNT(*)
----------
     91025
 
SYS@democ1:DEMOPDB> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            JUNK                                     13.0              4.1        3.2                 .0
                                                                             ----------------
sum                                                                                       4.1
 
SYS@democ1:DEMOPDB> -- so it works
SYS@democ1:DEMOPDB> alter system set inmemory_size = 0;
 
System altered.
 
SYS@democ1:DEMOPDB> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            JUNK                                     13.0              4.1        3.2                 .0
                                                                             ----------------
sum                                                                                       4.1
 
SYS@democ1:DEMOPDB> -- hmmm - doesn't seem to release memory already allocated, not too surprising
SYS@democ1:DEMOPDB> -- let's make sure nothing can be loaded after setting to 0 though
SYS@democ1:DEMOPDB>
SYS@democ1:DEMOPDB> alter table kso.junk inmemory;
 
Table altered.
 
SYS@democ1:DEMOPDB> -- another quirk worth mentioning - "alter table X inmemory" flushes object from column store if it happens to already be there
SYS@democ1:DEMOPDB>
SYS@democ1:DEMOPDB> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
                                                                             ----------------
sum
 
no rows selected
 
SYS@democ1:DEMOPDB> select count(*) from kso.junk;
 
  COUNT(*)
----------
     91025
 
SYS@democ1:DEMOPDB> -- should repopulate is memory is available
SYS@democ1:DEMOPDB> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
                                                                             ----------------
sum
 
no rows selected
 
SYS@democ1:DEMOPDB> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
                                                                             ----------------
sum
 
no rows selected
 
SYS@democ1:DEMOPDB> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
                                                                             ----------------
sum
 
no rows selected
 
SYS@democ1:DEMOPDB> alter system set inmemory_size = 10M;
 
System altered.
 
SYS@democ1:DEMOPDB> select count(*) from kso.junk;
 
  COUNT(*)
----------
     91025
 
SYS@democ1:DEMOPDB> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            JUNK                                     13.0              4.1        3.2                 .0
                                                                             ----------------
sum                                                                                       4.1
 
SYS@democ1:DEMOPDB> -- let's try to make it bigger than the 500M we allocated at the container level
SYS@democ1:DEMOPDB> alter system set inmemory_size = 1G;
alter system set inmemory_size = 1G
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified
 
SYS@democ1:DEMOPDB> -- No can do

So as documented, the PDB specific value of inmemory_size is inherited from the CDB if it is not explicitly set at the PDB level. But we can also modify it (without restarting anything) as long as the value doesn’t exceed the overall value allocated at the container level. Note that this can be used to control how much of the column store memory any one PDB can use and that you can under or over allocate the space as you see fit. However, a single PDB can not specify a value that exceeds the actual value set (and allocated) for the entire container DB as was shown by the last attempted change to set it to 1G when only 500M was allocated by the container DB.

Th-th-th-that’s all folks!

The Next Big Thing

Oracle’s 12.1.0.2 was released a few weeks ago (You can download it from OTN here: Oracle 12.1.0.2 Download). While technically a minor point release, it contains a couple of major features that would normally be rolled out in a more substantial version change like 12cR2 or perhaps V13. Of course the most highly anticipated feature is a new option (Oracle In-Memory Option) that provides a column oriented, in-memory store. Enkitec was in the Beta program, so we’ve been testing it out for quite a while now and we are impressed. Here’s a link to a video of a conversation between myself, Tanel Poder and Cary Millsap about the In-memory Option published prior to the general release. Note: the three of us are also scheduled to do a webcast on the topic on Sep. 17th at 9:00AM CDT. You can sign up here if you are interested: In-Memory Webcast

But back to the topic: What this new option provides is a radical departure from the way Oracle has traditionally managed data access. In the past, all data access was done using row-major format, which is a foundation of the Oracle RDBMS architecture (I’m of course leaving out some esoteric formats such as the hybrid columnar compressed (HCC) format that is available on Exadata). At any rate, this columnar format is a major change in the way data is accessed for Oracle, and while the name of the option indicates that the secret sauce is the fact that the data is accessed from memory, I’m going to argue that the “memory” part is not the most important factor. In my opinion, the column-oriented format is why it’s “The Next Big Thing”.

While accessing data from RAM is definitely faster than reading it off disk, it’s important to note that Oracle has been serving data from memory for decades via the standard buffer cache. In fact, you could describe the Oracle RDBMS as a very sophisticated disk caching mechanism. That’s certainly a vast over simplification, but it’s really not too far from reality. Many Oracle systems spend most of their time accessing data from the buffer cache. Back in the day, DBA’s even invented a metric to describe the effectiveness of the caching. The much maligned “buffer cache hit ratio” was used for that purpose and is still present in the modern day AWR reports. While tuning artificial ratios like this one has long since gone out of fashion, it’s important to note that it is not uncommon to see this ratio in the upper 90′s. (i.e. 99% of blocks being accessed from RAM is common) And in fact, we can pin tables in the buffer cache so that all rows are accessed from memory. So if that’s the case, then we should be able to compare speeds of queries executed on data in memory using both the standard row-major format and the new columnar format. Let’s give it a quick try.

SYS@INMEM> -- note that to enable the In-Memory Option we had to set the INMEMORY_SIZE parameter and bounce the database
SYS@INMEM>
SYS@INMEM> @parms
Enter value for parameter: inmemory_size
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
inmemory_size                                      107374182400                                                           FALSE    FALSE      TRUE
 
Elapsed: 00:00:00.08
SYS@INMEM> -- then we had to tell Oracle that we wanted our table to take advantage of the column store
SYS@INMEM>
SYS@INMEM> alter table kso.skew3 inmemory;
 
Table altered.
 
Elapsed: 00:00:00.57
SYS@INMEM> -- we also told Oracle that we wanted our table to be cached in the normal buffer cache
SYS@INMEM>
SYS@INMEM> alter table kso.skew3 cache;
 
Table altered.
 
Elapsed: 00:00:00.01
SYS@INMEM> @inmem_tables
Enter value for owner: KSO
Enter value for table_name: 
 
OWNER                          TABLE_NAME                     CACHE PRIORITY INMEMORY_DISTRI INMEMORY_COMPRESS
------------------------------ ------------------------------ ----- -------- --------------- -----------------
KSO                            SKEW3                              Y NONE     AUTO            FOR QUERY LOW
 
Elapsed: 00:00:00.09
 
. . .
 
SYS@INMEM> -- note that the table was accessed in both modes to get it loaded in both the standard buffer cache and the column store
SYS@INMEM> -- (I cut the output to keep from clouding the issue)
SYS@INMEM> -- but you can see the blocks in the buffer cache below
SYS@INMEM> 
SYS@INMEM> @buffers_in_use
Enter value for owner: KSO
Enter value for object_name: SKEW3
 
OWNER                          OBJECT_NAME                    OBJECT_TYPE             BUFFERS
------------------------------ ------------------------------ -------------------- ----------
KSO                            SKEW3                          TABLE                   1305951
 
Elapsed: 00:00:02.37
SYS@INMEM> -- and you can see the table is in the column store as well
SYS@INMEM> 
SYS@INMEM> @inmem_segs
Enter value for owner: 
Enter value for segment_name: 
 
OWNER                          SEGMENT_NAME                   ORIG_SIZE_MEGS IN_MEM_SIZE_MEGS COMP_RATIO MEGS_NOT_POPULATED
------------------------------ ------------------------------ -------------- ---------------- ---------- ------------------
KSO                            SKEW3                                10,210.7          7,278.8        1.4                 .0
                                                                             ----------------
sum                                                                                   7,278.8
 
Elapsed: 00:00:00.00
SYS@INMEM> -- now let's test performance
SYS@INMEM> -- First let's try it from the standard buffer cache
SYS@INMEM>
SYS@INMEM> ALTER SESSION set inmemory_query = disable;
 
Session altered.
 
Elapsed: 00:00:00.00
SYS@INMEM> @avgskew3x
 
AVG(PK_COL)
-----------
 16487500.8
 
Elapsed: 00:00:12.45
SYS@INMEM> /
 
AVG(PK_COL)
-----------
 16487500.8
 
Elapsed: 00:00:12.41
SYS@INMEM>
SYS@INMEM> -- So about 12.5 seconds from the buffer cache
SYS@INMEM> @x
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  6vy9zstcnq02k, child number 5
-------------------------------------
select avg(pk_col) from kso.skew3 where col1 between 0 and 1000
 
Plan hash value: 2684249835                           <--- Same as TABLE ACCESS INMEMORY FULL version (see below)
 
-----------------------------------------------------
| Id  | Operation                  | Name  | E-Rows |
-----------------------------------------------------
|   0 | SELECT STATEMENT           |       |        |
|   1 |  SORT AGGREGATE            |       |      1 |
|*  2 |   TABLE ACCESS STORAGE FULL| SKEW3 |  64007 | <--- Exadata version of full scan (not necessarily off disk)
-----------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - storage(("COL1"<=1000 AND "COL1">=0))
       filter(("COL1"<=1000 AND "COL1">=0))
 
 
21 rows selected.
 
Elapsed: 00:00:00.10
SYS@INMEM> -- Now let's try it from the column store
SYS@INMEM>
SYS@INMEM> ALTER SESSION set inmemory_query = enable;
 
Session altered.
 
Elapsed: 00:00:00.00
SYS@INMEM> @avgskew3x
 
AVG(PK_COL)
-----------
 16487500.8
 
Elapsed: 00:00:03.03
SYS@INMEM> /
 
AVG(PK_COL)
-----------
 16487500.8
 
Elapsed: 00:00:03.11
SYS@INMEM>
SYS@INMEM> -- So a little over 3 seconds from the column store
SYS@INMEM> @x
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  6vy9zstcnq02k, child number 4
-------------------------------------
select avg(pk_col) from kso.skew3 where col1 between 0 and 1000
 
Plan hash value: 2684249835                           <--- Same as the TABLE ACCESS STORAGE FULL version
 
------------------------------------------------------
| Id  | Operation                   | Name  | E-Rows |
------------------------------------------------------
|   0 | SELECT STATEMENT            |       |        |
|   1 |  SORT AGGREGATE             |       |      1 |
|*  2 |   TABLE ACCESS INMEMORY FULL| SKEW3 |  64007 | <--- new access option
------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - inmemory(("COL1"<=1000 AND "COL1">=0))
       filter(("COL1"<=1000 AND "COL1">=0))
 
21 rows selected.
 
Elapsed: 00:00:00.02

So even though the data was accessed from memory on both tests, the column store was about 4X faster. So the speed up in this admittedly very simple example was not due to the fact that the data was accessed from RAM (because both tests accessed data directly from memory).

A couple of points about this example.

1. I hope you’ll take my word for it that no trickery was performed. The data was stored in standard 8K blocks and I didn’t do anything other than what is in the terminal output.

2. Oracle generally tries to prevent full scans of objects from wiping it out large portions of the buffer cache. This is a wise choice in most systems since the sizes of many databases far exceed the amount of RAM available for caching. However, it is possible to override that behavior by declaring that a table or partition should be cached. That’s what the CACHE table attribute does.

3. Note that the plan was actually different, even though the plan_hash_value didn’t reflect the difference. The first statement has TABLE ACCESS STORAGE FULL but the second version has TABLE ACCESS INMEMORY FULL. The reason the plan_hash_value doesn’t change is because only specific parts of the plan are included in the hash value calculation (see this post for more info: SQL Gone Bad – But Plan Not Changed? In this case, the operation (TABLE ACCESS) and not the options (INMEMORY FULL or STORAGE FULL) are used. This could be problematic as it will make it more difficult to identify a rather major decision by the optimizer (i.e. whether to use the In-Memory column store or not). This is analogous to the situation where a plan on Exadata doesn’t change, but on one execution a full table scan results in a smart scan while on a another execution it doesn’t. Performance can be vastly different even though the plan hash value hasn’t changed. I personally think this is a flaw and the behavior should be changed.

4. Of course the main point was that there is a significant difference in the execution time of the query when using the column store vs. using the buffer cache. In this simple case, the main difference was the fact that the query was very selective and the in-memory column store has a built in form of storage indexes (similar to Exadata storage indexes or the new zone maps) which allow Oracle to skip processing on regions of data via a min/max pruning.

Note that I did generate 10046 traces and the output clearly shows that neither test did any disk i/o.

So clearly the column orientation (and the optimizations enabled by it) makes a huge difference. So what is the secret sauce then? It’s the new format (and the improvement in processing that is possible with data arranged in columnar format) that provides the dramatic speed up. It’s by fundamentally changing the way the data is accessed, as opposed to the fact that the data is accessed from memory, that we are able to get the vast improvement in query execution times.

For those that are not bored out of their skulls at this point, here’s a bit of the tkprof output from the buffer cache test.

 
********************************************************************************
 
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
 
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          6          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     12.79      12.82          0    1304305          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     12.80      12.83          0    1304311          0           1
 
Misses in library cache during parse: 1
 
Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       3        0.00          0.00
  SQL*Net message from client                     3       11.12         14.78
  Disk file operations I/O                        2        0.00          0.00
 
 
OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
 
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          6          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        6      0.00       0.00          0          6          0           0
 
Misses in library cache during parse: 1
Misses in library cache during execute: 1
 
    1  user  SQL statements in session.
    2  internal SQL statements in session.
    3  SQL statements in session.
********************************************************************************

So as you can see, there was no time spent waiting on disk I/O.

A couple of final comments and then I’ll shut up. I expect this new feature to drive adoption of 12c. Generally we don’t see mass adoption of new Oracle versions until the second major point release (i.e. 11gR2 or 12cR2). But in this case, I believe the new In-Memory Option provides so much benefit that it will drive some customers to early adoption and will provide enough potential payback to make it worth being one of the pioneers. We already have Exadata customers planning to move to 12.1.0.2 at their next patch cycle. So I expect we’ll have several systems in production by the end of the year. I’m looking forward to seeing how it behaves in the real world.

Well that’s about it for now. I hope to do several posts on the In-Memory Option over the next few weeks. So let me know if you have anything specific you’d like to hear about.

Exadata Zone Maps

Just a quick post on a new Exadata feature called Zone Maps. They’re similar to storage indexes on Exadata, but with more control (you can define the columns and how the data is refreshed for example). People have complained for years that storage indexes provided no control mechanisms, but now we have a way to exert our God given rights as DBA’s to control yet another aspect of the database. Here’s a link to the 12.1.0.2 documentation which resides in the Data Warehousing Guide: Zone Map Documentation

Zone Maps are restricted to Exadata storage by the way (well probably they work on ZFS and Pillar too). Have a look at the Oracle error messages file:

 
>grep -i "storage type" $ORACLE_HOME/rdbms/mesg/oraus.msg | grep -i "not supported"
 
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:31969, 00000, "ZONEMAP not supported for table stored in tablespace of this storage type"
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:64307, 00000, " Exadata Hybrid Columnar Compression is not supported for tablespaces on this storage type" 
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:64309, 00000, " Hybrid Columnar Compression with row-level locking is not supported for tablespaces on this storage type."
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:65425, 00000, "CLUSTERING clause not supported for table stored in tablespace of this storage type"
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:65451, 00000, "Advanced index compression is not supported for tablespaces on this storage type."

So according to the messages file, there are a handful of features that are restricted in this fashion (Zone Maps, HCC, Attribute Clustering and Advanced Index Compression).

As a bit of totally irrelevant history, zone maps were actually included in the 12.1.0.1 release, but the documentation on them was removed. So they worked, but they were undocumented.

Here’s an example on a 12.1.0.1 DB on a non-Exadata platform.

 
SQL*Plus: Release 12.1.0.1.0 Production on Wed Aug 13 15:41:46 2014
 
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
 
 
INSTANCE_NAME    STARTUP_TIME      CURRENT_TIME         DAYS    SECONDS
---------------- ----------------- ----------------- ------- ----------
LAB1211          13-AUG-2014 09:54 13-AUG-2014 15:41     .24      20820
 
SYS@LAB1211> create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2);
create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2)
*
ERROR at line 1:
ORA-65425: CLUSTERING clause not supported for table stored in tablespace of this storage type
 
 
SYS@LAB1211> create table kso.junk1 (col1 number, col2 number);
 
Table created.
 
SYS@LAB1211> create materialized zonemap skew_zonemap on kso.junk1(col1);
create materialized zonemap skew_zonemap on kso.junk1(col1)
                                                          *
ERROR at line 1:
ORA-31969: ZONEMAP not supported for table stored in tablespace of this storage type

Note that both zone maps and attribute clustering were disallowed with the “not supported for table stored in tablespace of this storage type” error message.

By the way, attribute clustering is another interesting new feature of 12g that allows you to declaratively instruct Oracle to store data on disk in a sorted order. This physical ordering can have big benefit for storage indexes or zone maps (or any btree index where clustering factor is important for that matter). Oracle’s new In-Memory column store also has a min/max pruning feature (storage indexes) which means physical ordering on disk is important with that feature as well.

Anyway, here’s a link to the 12.1.0.2 documentation on attribute clustering which also resides in the Data Warehousing Guide: Attribute Clustering Documentation

And here’s another example using 12.1.0.2 on an Exadata.

 
SQL*Plus: Release 12.1.0.2.0 Production on Wed Aug 13 15:42:18 2014
 
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
 
 
INSTANCE_NAME    STARTUP_TIME      CURRENT_TIME         DAYS    SECONDS
---------------- ----------------- ----------------- ------- ----------
INMEM            24-JUL-2014 18:35 13-AUG-2014 15:42   19.88    1717600
 
Elapsed: 00:00:00.00
SYS@INMEM> @test_zonemap
SYS@INMEM> create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2);
 
Table created.
 
Elapsed: 00:00:00.01
SYS@INMEM> create materialized zonemap skew_zonemap on kso.junk1(col1);
 
Materialized zonemap created.
 
Elapsed: 00:00:00.15
SYS@INMEM> 
SYS@INMEM> -- so as expected, we're able to create an attribute clustered table and a zone map on Exadata
SYS@INMEM> 
SYS@INMEM> -- Let's try creating a tablespace that is not on Exa storage (even though the DB is on EXA platform)
SYS@INMEM> 
SYS@INMEM> create tablespace KSO_NON_EXA datafile '/home/oracle/KSO_NON_EXA.dbf' size 100M;
 
Tablespace created.
 
Elapsed: 00:00:00.38
SYS@INMEM> @tablespaces
 
TABLESPACE_NAME STATUS    CONTENTS  LOGGING   EXTENT_MGT ALLOC_TYP SPACE_MGT BLOCK_SIZE PREDICA
--------------- --------- --------- --------- ---------- --------- --------- ---------- -------
CLASS_DATA      ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 STORAGE
EXAMPLE         ONLINE    PERMANENT NOLOGGING LOCAL      SYSTEM    AUTO            8192 STORAGE
KSO_NON_EXA     ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 HOST      <=== 
SYSAUX          ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 STORAGE
SYSTEM          ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    MANUAL          8192 STORAGE
TEMP            ONLINE    TEMPORARY NOLOGGING LOCAL      UNIFORM   MANUAL          8192 STORAGE
UNDOTBS1        ONLINE    UNDO      LOGGING   LOCAL      SYSTEM    MANUAL          8192 STORAGE
USERS           ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 STORAGE
 
8 rows selected.
 
Elapsed: 00:00:00.02
SYS@INMEM> 
SYS@INMEM> -- note that tablespace KSO_NON_EXA is on local disk, not Exadata storage servers, so PREDICATE_EVALUATION is set to HOST.
SYS@INMEM> 
SYS@INMEM> drop table kso.junk1;
 
Table dropped.
 
Elapsed: 00:00:00.01
SYS@INMEM> create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2) tablespace kso_non_exa;
 
Table created.
 
Elapsed: 00:00:00.01
SYS@INMEM> select owner, table_name, tablespace_name from dba_tables where table_name like 'JUNK1';
 
OWNER                TABLE_NAME                     TABLESPACE_NAME
-------------------- ------------------------------ ---------------
KSO                  JUNK1                          KSO_NON_EXA
 
Elapsed: 00:00:00.01
SYS@INMEM> 
SYS@INMEM> -- wow - that's a bit of a surprise, clustered table create worked on non-Exa storage
SYS@INMEM> -- maybe the check is done on some other level than the tablespace
SYS@INMEM> 
SYS@INMEM> create materialized zonemap skew_zonemap on kso.junk1(col1);
create materialized zonemap skew_zonemap on kso.junk1(col1)
                                                          *
ERROR at line 1:
ORA-31969: ZONEMAP not supported for table stored in tablespace of this storage type
 
 
Elapsed: 00:00:00.00

So as you can see, attempting to create the zone map on non-Exa storage failed as expected. But I was able to create a clustered table on non-Exa storage, which is a little weird. So while the error message for attribute clustering exists in the messages file, it doesn’t appear that there is a check in the code, at least at the tablespace level. I don’t have a 12.1.0.2 install on a non-Exadata platform at the moment to test it out, but if you do, please let me know.

That’s it for now. I hope to do some more detailed posts on In-Memory, Zone Maps, Attribute Clustering in the near future. As always, your comments are welcomed.

12c Adaptive Optimization – Part 3

This is the third and final post on follow up questions from the Redgate webinar I did on 12c Adaptive Optimization (the link goes to a recording of the webcast by the way).

Also, here are links to the 2 earlier posts:

    12c Adaptive Optimization – Part 1.
    12c Adaptive Optimization – Part 2 (Hints).

So here are the last set of questions along with my responses:

Q: Is this feature on by default or you have to set a parameter to make sure of it?
A: It’s on by default but can be turned off by the methods listed in the presentation.

Q: Is there any drawback of adaptive execution plan?
A: New features (especially auto-magic ones) always make people nervous, but I don’t see too many potential pitfalls with this one. The fact that it is enabled by default out of the box is also a good indicator that the developers themselves have a lot of confidence in it. There is certainly more work going on to collect statistics and buffer rows, but it seems quite minimal and only happens on the first execution. So my basic answer is no, I don’t foresee any major drawbacks.

Q: For adaptive plans, usually queries are more complex, with multiple combinations of hash joins and nested loops. But adaptive plans only switches to one “sub plan”, correct? How does it account for all the various combinations?
A: A sub-plan is limited to a single join. There can obviously be many joins in a single plan and thus many sub-plans. But each sub-plan will result in either a HJ or a NLJ. At the end there will be only one final plan. See my previous post (Part 1) for an example of a more complex plan with multiple sub-plans.

Q: parallel distribution methods: why not use broadcast all the time? :)
A: :)

Q: Would adaptive optim switch to a better index if it finds itself sitting on a wrong index?
A: I presume the question is with regard to Adaptive Plans kicking in on the first execution, if so, the answer is No. At this point only join methods and px distribution methods can be changed. I expect this will be expanded over time though.

Q: Does same plan_hash_value’s means same final plans?
A: Yes – plan hash value is computed based on the final plan with no regard to the fact that the plan was adaptive.

Q: How correlated plan_hash_values with final plans? How we can find same final plans?
A: Plan hash value is computed based on final plan, so the correlation is very high. :)

Q: Dynamic sampling would not put an excessive pressure on the CPU?
A: I guess it could, but it’s been around for some time and I haven’t been involved in any situations where the time spent on dynamic sampling was an issue. Setting it to 11 may give us some chances to see such a thing though. More often the issues arise when dynamic sampling does not come up with a good picture of the data due to the limited size of the sample.

Q: Is dynamic sampling = 11 actually a good blanket setting, or do you not trust the optimizer that much? What do you use and why?
A: The optimizer_dynamic_sampling parameter still defaults to 2 in 12c. That alone makes me cautious about setting it to the new totally auto-magic value of 11. If the developers have enough confidence in a new feature to make it the default, then I will be more trusting. I prefer to stick with default values unless I have to make a change to address a specific issue. I have worked on a few systems that change the default setting, but 11 has not been one of those values (yet). I need to do more testing with it.

Q: Gotta love Spinal Tap… crank it up to 11 !
A: Rock and Roll!

Q: Is there any effect on cpu utilisation becoz of adaptive optimisation??
A: There is definitely some extra overhead in collecting statistics and buffering rows but it should be minimal and it should only affect the initial execution.

Q: Can HJ be change to NL in 1-st execution? What is threshold for such change?
A: Yes – Adaptive Plans kick in the first execution. The threshold depends on the specific case. See the example earlier in part 2 of this series for an example of calculating the inflection point (from a 10053 trace).

Q: This means that if it is abandoned once it will also be abondoned if ran again?
A: Yes, assuming no other changes occur. But there are many things that can change such as Adaptive Cursor Sharing, Cardinality Feedback, etc… and of course the data itself and/or the statistics about the data can change over time as well. Just to be clear, the choice between the the two join methods is only made during the first execution after a hard parse, so once a statement is loaded into the cache, the plan will be static until something changes that causes a new child cursor to be created.

Q: At what data volumes does Adaptive Optimization become likely to be helpful.
A: Any volume that causes a NLJ to result in significantly different elapsed time than HJ.

Q: Does AWR show these updated adaptive plans with minus ?
A: That’s a good question. Yes, you can use the dbms_xplan.display_awr with the ‘adaptive’ format option (see the example below).

 
SYS@db12c1> select * from table(dbms_xplan.display_awr('&sql_id',nvl('&plan_hash_value',null),null,'adaptive'));
Enter value for sql_id: 6qg99cfg26kwb
Enter value for plan_hash_value: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 6qg99cfg26kwb
--------------------
SELECT COUNT(UNQ) UNQ, COUNT(PFX) PFX FROM (SELECT /*+ first_rows(1)
leading(cc) */ CD.TYPE# UNQ, NULL PFX FROM SYS.CCOL$ CC, SYS.CDEF$ CD
WHERE CC.OBJ# = :B2 AND CC.INTCOL# = :B1 AND CD.CON# = CC.CON# AND
CD.OBJ# = CC.OBJ# AND CD.ENABLED IS NOT NULL AND CD.INTCOLS = 1 AND
CD.TYPE# IN (2,3) AND BITAND(CD.DEFER, 2+4) = 4 AND ROWNUM < 2 UNION
ALL SELECT /*+ first_rows(1) leading(i) */ CASE WHEN I.INTCOLS = 1 AND
BITAND(I.PROPERTY,1) = 1 THEN 3 ELSE NULL END UNQ, CASE WHEN IC.POS# =
1 THEN 1 ELSE NULL END PFX FROM SYS.IND$ I, SYS.ICOL$ IC WHERE I.BO# =
:B2 AND I.BO# = IC.BO# AND IC.INTCOL# = :B1 AND I.OBJ# = IC.OBJ# AND
BITAND(I.FLAGS,1025) = 0 AND ROWNUM < 2 )
 
Plan hash value: 1065215175
 
----------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |         |       |       |     6 (100)|          |
|   1 |  SORT AGGREGATE                          |         |     1 |    16 |            |          |
|   2 |   VIEW                                   |         |     2 |    32 |     6   (0)| 00:00:01 |
|   3 |    UNION-ALL                             |         |       |       |            |          |
|   4 |     COUNT STOPKEY                        |         |       |       |            |          |
|-  5 |      HASH JOIN                           |         |     1 |    35 |     3   (0)| 00:00:01 |
|   6 |       NESTED LOOPS                       |         |     1 |    35 |     3   (0)| 00:00:01 |
|-  7 |        STATISTICS COLLECTOR              |         |       |       |            |          |
|   8 |         TABLE ACCESS CLUSTER             | CCOL$   |     1 |    13 |     2   (0)| 00:00:01 |
|   9 |          INDEX UNIQUE SCAN               | I_COBJ# |     1 |       |     1   (0)| 00:00:01 |
|  10 |        TABLE ACCESS CLUSTER              | CDEF$   |     1 |    22 |     1   (0)| 00:00:01 |
|- 11 |       TABLE ACCESS BY INDEX ROWID BATCHED| CDEF$   |     1 |    22 |     1   (0)| 00:00:01 |
|- 12 |        INDEX RANGE SCAN                  | I_CDEF2 |     1 |       |     1   (0)| 00:00:01 |
|  13 |     COUNT STOPKEY                        |         |       |       |            |          |
|- 14 |      HASH JOIN                           |         |     1 |    38 |     3   (0)| 00:00:01 |
|  15 |       NESTED LOOPS                       |         |     1 |    38 |     3   (0)| 00:00:01 |
|- 16 |        STATISTICS COLLECTOR              |         |       |       |            |          |
|  17 |         TABLE ACCESS CLUSTER             | IND$    |     1 |    21 |     2   (0)| 00:00:01 |
|  18 |          INDEX UNIQUE SCAN               | I_OBJ#  |     1 |       |     1   (0)| 00:00:01 |
|  19 |        TABLE ACCESS CLUSTER              | ICOL$   |     1 |    17 |     1   (0)| 00:00:01 |
|- 20 |       TABLE ACCESS CLUSTER               | ICOL$   |     1 |    17 |     1   (0)| 00:00:01 |
|- 21 |        INDEX UNIQUE SCAN                 | I_OBJ#  |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
 
Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
 
 
46 rows selected.

Q: Is there any way to encourage the optimizer to collect the information but not act on it?
A: Yes, set optimizer_adaptive_reporting_only = true.

Q: Does adaptive distribution for parallel processing work as expected on a Virtual server where resources can be spread over several other servers?
A: No idea (in fact I’m not even sure I understand the question). Give it a test and let us know what you find out. :)

Q: Does Adaptive Optimization help oracle optimize somewhat complex nested views? I know nested views are not recommended but we sometimes have to live with what we inherited.
A: I don’t think this particular feature is going to help nested views specifically. But who knows. The optimizer seems to get lost occasionally with deeply nested views. By the way, there is an interesting new procedure in 12c called dbms_utility.expand_sql_text which spits out the fully expanded version of a SQL statement that accesses data through views. Tom Kyte has blogged about it here: 12c – SQL Text Expansion

Q: We regularly have hash join problems tracable to temp space limits. Shifting to nested loops has proven necessary in 10 and 11. Early detection and shifting to nested loops would be important for us.
A: I’m not sure this feature is really going to help you much in that regard unless the optimizer is erroneously picking the HJ based on incorrect estimates. If you’re just forcing the NLJ to avoid poor i/o performance on the temp stuff though it probably won’t help. In that case you need to figure out how to sort less or use more memory (increase pga, or use manual workarea size, or use more slaves in px, etc…).

Q: So if sort/merge join is used then this feature would not go to nested loop/hash join if sort/merge join is a bad plan ?
A: No it applies only to HJ and NLJ as of 12.1.0.1.

Q: What happens with the rows that were read up to inflection point? Does Oracle start reading from the scratch again?
A: The rows are buffered so they don’t need to be re-read.

Q: Will the SQL scripts that were demonstrated for reviewing the SQL plan information be made available?
A: Most are on this blog already (use the search box to locate them) but let me know if you can’t find any of the ones I used.

Q: It’s is a contraction for it is or it has. Its is a possessive pronoun meaning, more or less, of it or belonging to it.
A: Duly noted (and fixed in the presentation). :)

Q: Can we *force* plan change in mid-execution?
A: No. You can enable or disable the feature, but the optimizer decides whether to switch or not.

Q: How long statistics collector runs if it does not switch?
A: It should only run until the inflection point (the point at which it makes the decision), but I have not actually tested this.

Q: Is there a way adaptive can be disabled for PDB and enabled for others?
A: Yes, the optimizer_adaptive_features parameter can be set separately for each PDB (see the example below).

> rlwrap sqlplus / as sysdba
 
SQL*Plus: Release 12.1.0.1.0 Production on Mon Dec 9 19:53:03 2013
 
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
 
 
INSTANCE_NAME    STARTUP_TIME      CURRENT_TIME         DAYS    SECONDS
---------------- ----------------- ----------------- ------- ----------
CONTAIN1         02-DEC-2013 03:22 09-DEC-2013 19:53    7.69     664225
 
SYS@CONTAIN1> @whoami_pdb
 
    CON_ID CON_NAME   USERNAME             USER#        SID    SERIAL# PREV_HASH_VALUE SCHEMANAME                     OS_PID
---------- ---------- --------------- ---------- ---------- ---------- --------------- ------------------------------ -------
         1 CDB$ROOT   SYS                      0         24        295      3265981639 SYS                            4481
 
SYS@CONTAIN1> @connect_pdb
Enter value for pdb_name: plug1
 
Session altered.
 
SYS@CONTAIN1:PLUG1> @parms
Enter value for parameter: optimizer_adaptive_features
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
optimizer_adaptive_features                        TRUE                                                                   TRUE     TRUE       TRUE
 
SYS@CONTAIN1:PLUG1> alter system set optimizer_adaptive_features=false;
 
System altered.
 
SYS@CONTAIN1:PLUG1> @parms
Enter value for parameter: optimizer_adaptive_features
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
optimizer_adaptive_features                        FALSE                                                                  TRUE     TRUE       TRUE
 
SYS@CONTAIN1:PLUG1> @connect_pdb
Enter value for pdb_name: plug2
 
Session altered.
 
SYS@CONTAIN1:PLUG2> @whoami_pdb
 
    CON_ID CON_NAME   USERNAME             USER#        SID    SERIAL# PREV_HASH_VALUE SCHEMANAME                     OS_PID
---------- ---------- --------------- ---------- ---------- ---------- --------------- ------------------------------ -------
         4 PLUG2      SYS                      0         24        295      2710464132 SYS                            4481
 
SYS@CONTAIN1:PLUG2> @parms
Enter value for parameter: optimizer_adaptive_features
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
optimizer_adaptive_features                        TRUE                                                                   TRUE     TRUE       TRUE
 
SYS@CONTAIN1:PLUG2> connect / as sysdba
Connected.
 
INSTANCE_NAME    STARTUP_TIME      CURRENT_TIME         DAYS    SECONDS
---------------- ----------------- ----------------- ------- ----------
CONTAIN1         02-DEC-2013 03:22 09-DEC-2013 19:54    7.69     664324
 
SYS@CONTAIN1> @parms
Enter value for parameter: optimizer_adaptive_features
Enter value for isset: 
Enter value for show_hidden: 
 
NAME                                               VALUE                                                                  ISDEFAUL ISMODIFIED ISSET
-------------------------------------------------- ---------------------------------------------------------------------- -------- ---------- ----------
optimizer_adaptive_features                        TRUE                                                                   TRUE     TRUE       TRUE

So you can set the optimizer_adaptive_features parameter separately for each PDB. Note: here are links to the couple of scripts I used in this post:

    connect_pdb.sql
    whoami_pdb.sql

There was another good question that I don’t have time to look into at the moment.

Q: In the Pro*C sequence PREPARE, OPEN, FETCH, at what point(s) might Oracle switch plans? If during FETCH, how does Oracle return the next row/array?

Maybe I’ll get around to that later but if anyone wants to give it a shot and post the results in the comments section that would be great. :)