Direct path insert and IOTs


02.10.2017
by Kamil Stawiarski

(Please tell me that I’m not the only one who thinks "Index Organized Table" instead of "Internet Of Things" when hearing IOT…)

This post is inspired by Connor McDonald and his blog post from a year ago about direct mode operations and IOTs.
You can read it here: https://connor-mcdonald.com/2016/07/04/direct-mode-operations-on-iots/amp/

While writing a redo parser for V00D00 I had to investigate this subject very closely from a redo log perspective. And this will be the subject of my 10-minute lightning talk at Oak Table World 2017 at Oracle Open World!

Before we start – in the last blog post (rollback internals) I introduced a new tool for parsing redo log dumps. It was called logfile_dump_parser but since it was a boring name, I decided to call it apud (archivelog parser using dumpfile). If you want to know why it is less boring name, reverse name apud and find the meaning in Polish-English dictionary 😉

OK. But the name is not the only thing that has changed – APUD has now support also for 11g logfile dumps. (The default parsing is for 12cR1 and 12cR2)

How it works? First of all, you have to download a script: http://ora-600.pl/oinstall/apud.py, you have to have python 2.7 and if you want to have possibility to use Oracle dictionary to decode DATA_OBJECT_ID to OBJECT_NAME, you have to also install cx_Oracle python library.

After setting up the environment, you can use a script like this:

[oracle@rokoko ~]$ python apud.py
parse file /path/to/a/logfile/dump [data_object_id | owner.object_name]   if you want to parse file based on DATA_OBJECT_ID or OWNER.OBJECT_NAME
get dict user/password@ip:port/service                                    if you want to use object_name instead of data_object_id

default supported version is 12c, to change to 11g type: set version 11g
exit to quit
help to print this
V00D00 > get dict system/oracle@localhost:1521/dupa12c

If you will be parsing a 11g dumpfile you can prepare the script like this:

[oracle@rokoko ~]$ python apud.py
parse file /path/to/a/logfile/dump [data_object_id | owner.object_name]   if you want to parse file based on DATA_OBJECT_ID or OWNER.OBJECT_NAME
get dict user/password@ip:port/service                                    if you want to use object_name instead of data_object_id

default supported version is 12c, to change to 11g type: set version 11g
exit to quit
help to print this
V00D00 > get dict system/oracle@localhost:1521/dupa11g
V00D00 > set version 11g

To set up our test I will use the following IOT definition on both database versions (12.2.0.1 and 11.2.0.4):

create table hr.employees_iot
(
  employee_id number,
  first_name varchar2(30),
  last_name varchar2(30),
  constraint pk_emp_iot primary key (employee_id)
)
organization index
/

If you will be using apud, remember to refresh dictionary in a tool after creating a new object or doing anything that could change its DATA_OBJECT_ID (like MOVE or TRUNCATE). You can do this by repeating "get dict" command.

To load data in "direct" way to those tables I will use the following SQL:

insert /*+ APPEND */ into hr.employees_iot 
select employee_id, first_name, last_name 
from hr.employees;

This is execution plan for 11.2.0.4:

SQL_ID	903y3h170f2ty, child number 0
-------------------------------------
insert /*+ APPEND */ into hr.employees_iot select employee_id,
first_name, last_name from hr.employees

Plan hash value: 1737183166

---------------------------------------------------------------------------------------------
| Id  | Operation		 | Name 	    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT	 |		    |	    |	    |	  2 (100)|	    |
|   1 |  LOAD TABLE CONVENTIONAL |		    |	    |	    |		 |	    |
|   2 |   VIEW			 | index$_join$_002 |	107 |  2033 |	  2   (0)| 00:00:01 |
|*  3 |    HASH JOIN		 |		    |	    |	    |		 |	    |
|   4 |     INDEX FAST FULL SCAN | EMP_NAME_IX	    |	107 |  2033 |	  1   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN | EMP_EMP_ID_PK    |	107 |  2033 |	  1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(ROWID=ROWID)

And here you can see execution plan for 12.2.0.1:

SQL_ID	903y3h170f2ty, child number 0
-------------------------------------
insert /*+ APPEND */ into hr.employees_iot select employee_id,
first_name, last_name from hr.employees

Plan hash value: 2876790504

--------------------------------------------------------------------------------------------
| Id  | Operation		| Name		   | Rows  | Bytes | Cost (%CPU)| Time	   |
--------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT	|		   |	   |	   |	 2 (100)|	   |
|   1 |  LOAD AS SELECT 	| EMPLOYEES_IOT    |	   |	   |		|	   |
|   2 |   VIEW			| index$_join$_002 |   107 |  2033 |	 2   (0)| 00:00:01 |
|*  3 |    HASH JOIN		|		   |	   |	   |		|	   |
|   4 |     INDEX FAST FULL SCAN| EMP_NAME_IX	   |   107 |  2033 |	 1   (0)| 00:00:01 |
|   5 |     INDEX FAST FULL SCAN| EMP_EMP_ID_PK    |   107 |  2033 |	 1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access(ROWID=ROWID)

You can see the actual difference in the execution plan and you can’t execute a select statement on this table in 12c before commiting:

SQL> select count(1) from hr.employees_iot;
select count(1) from hr.employees_iot
                        *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

Let’s see the logfile dump contents for 11g:

V00D00 > set version 11g
V00D00 > parse file /u01/app/oracle/diag/rdbms/dupa11g/dupa11/trace/dupa11_ora_1855.trc HR.PK_EMP_IOT

Redo record sequence: 12 block: 21 offset 16 [rs_id: 0x00000c.00000015.0010]
	 change 1 operation INSERT Leaf Row xid = 0x0006.018.000002fa
	 change 3 operation UNDO xid = 0x0006.018.000002fa
COMMIT for transaction 0x0006.018.000002fa [sequence: 12 block: 81 offset 16 rs_id: 0x00000c.00000051.0010 ]

And for the 12c:

V00D00 > parse file /u01/app/oracle/diag/rdbms/dupa12c/dupa12/trace/dupa12_ora_7942.trc HR.PK_EMP_IOT

Redo record sequence: 22 block: 11 offset 288 [rs_id: 0x000016.0000000b.0120]
	 change 1 operation Block cleanout record xid = 0x0003.018.00000743

Redo record sequence: 22 block: 11 offset 376 [rs_id: 0x000016.0000000b.0178]
	 change 2 operation UNDO xid = 0x0003.018.00000743
	 change 3 operation INSERT Leaf Row xid = 0x0003.018.00000743
COMMIT for transaction 0x0003.018.00000743 [sequence: 22 block: 20 offset 16 rs_id: 0x000016.00000014.0010 ]

It looks almost the same! Only one regular change vector for "INSERT Leaf Row" to the primary key index! (And one associated UNDO change vector – the real DIRECT PATH INSERT doesn’t have UNDO change vectors!!!)

We can now use rs_id value for search the dumpfile and locate our redo record contents – for 11g it will be 0x000016.0000000b.0120 and for 12c: 0x000016.0000000b.0178 (NOTE: RS_ID is RBA – you can see the RS_ID column also in LogMiner and it’s also an RBA).

This is the begnining of the change vector for "INSERT Leaf Row" from RBA: 0x00000c.00000015.0010 (11.2.0.4)

CHANGE #1 TYP:0 CLS:1 AFN:4 DBA:0x0100023b OBJ:83911 SCN:0x0000.000d1fb8 SEQ:1 OP:10.2 ENC:0 RBL:0
index redo (kdxlin):  insert leaf row
KTB Redo
op: 0x01  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: F  xid:  0x0006.018.000002fa    uba: 0x00c08ad8.006e.27
REDO: ARRAY / NONKEY / --
itl: 2, sno: 106, row size 2657
number of keys: 107
slots:
 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106
insert key: (426):
 02 c2 02 03 c2 02 02 03 c2 02 03 03 c2 02 04 03 c2 02 05 03 c2 02 06 03 c2
 02 07 03 c2 02 08 03 c2 02 09 03 c2 02 0a 03 c2 02 0b 03 c2 02 0c 03 c2 02
 0d 03 c2 02 0e 03 c2 02 0f 03 c2 02 10 03 c2 02 11 03 c2 02 12 03 c2 02 13
 03 c2 02 14 03 c2 02 15 03 c2 02 16 03 c2 02 17 03 c2 02 18 03 c2 02 19 03
 c2 02 1a 03 c2 02 1b 03 c2 02 1c 03 c2 02 1d 03 c2 02 1e 03 c2 02 1f 03 c2
 02 20 03 c2 02 21 03 c2 02 22 03 c2 02 23 03 c2 02 24 03 c2 02 25 03 c2 02
 26 03 c2 02 27 03 c2 02 28 03 c2 02 29 03 c2 02 2a 03 c2 02 2b 03 c2 02 2c
 03 c2 02 2d 03 c2 02 2e 03 c2 02 2f 03 c2 02 30 03 c2 02 31 03 c2 02 32 03
 c2 02 33 03 c2 02 34 03 c2 02 35 03 c2 02 36 03 c2 02 37 03 c2 02 38 03 c2
 02 39 03 c2 02 3a 03 c2 02 3b 03 c2 02 3c 03 c2 02 3d 03 c2 02 3e 03 c2 02
 3f 03 c2 02 40 03 c2 02 41 03 c2 02 42 03 c2 02 43 03 c2 02 44 03 c2 02 45
 03 c2 02 46 03 c2 02 47 03 c2 02 48 03 c2 02 49 03 c2 02 4a 03 c2 02 4b 03
 c2 02 4c 03 c2 02 4d 03 c2 02 4e 03 c2 02 4f 03 c2 02 50 03 c2 02 51 03 c2
 02 52 03 c2 02 53 03 c2 02 54 03 c2 02 55 03 c2 02 56 03 c2 02 57 03 c2 02
 58 03 c2 02 59 03 c2 02 5a 03 c2 02 5b 03 c2 02 5c 03 c2 02 5d 03 c2 02 5e
 03 c2 02 5f 03 c2 02 60 03 c2 02 61 03 c2 02 62 03 c2 02 63 03 c2 02 64 02
 c2 03 03 c2 03 02 03 c2 03 03 03 c2 03 04 03 c2 03 05 03 c2 03 06 03 c2 03
 07
each key size is:
 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 4 4 4 4 4 4
(...)

And here is the begnining of the change vector for "INSERT Leaf Row" from RBA: 0x000016.0000000b.0178 (12.2.0.1)

CHANGE #3 CON_ID:0 TYP:0 CLS:1 AFN:4 DBA:0x010001f3 OBJ:113756 SCN:0x00000000002359e4 SEQ:1 OP:10.2 ENC:0 RBL:0 FLG:0x0000
index redo (kdxlin):  insert leaf row
KTB Redo
op: 0x01  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: F  xid:  0x0003.018.00000743    uba: 0x00c186ea.0151.24
REDO: ARRAY / NONKEY / --
itl: 2, sno: 0, row size 2657
number of keys: 107
slots:
 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106
insert key: (426):
 02 c2 02 03 c2 02 02 03 c2 02 03 03 c2 02 04 03 c2 02 05 03 c2 02 06 03 c2
 02 07 03 c2 02 08 03 c2 02 09 03 c2 02 0a 03 c2 02 0b 03 c2 02 0c 03 c2 02
 0d 03 c2 02 0e 03 c2 02 0f 03 c2 02 10 03 c2 02 11 03 c2 02 12 03 c2 02 13
 03 c2 02 14 03 c2 02 15 03 c2 02 16 03 c2 02 17 03 c2 02 18 03 c2 02 19 03
 c2 02 1a 03 c2 02 1b 03 c2 02 1c 03 c2 02 1d 03 c2 02 1e 03 c2 02 1f 03 c2
 02 20 03 c2 02 21 03 c2 02 22 03 c2 02 23 03 c2 02 24 03 c2 02 25 03 c2 02
 26 03 c2 02 27 03 c2 02 28 03 c2 02 29 03 c2 02 2a 03 c2 02 2b 03 c2 02 2c
 03 c2 02 2d 03 c2 02 2e 03 c2 02 2f 03 c2 02 30 03 c2 02 31 03 c2 02 32 03
 c2 02 33 03 c2 02 34 03 c2 02 35 03 c2 02 36 03 c2 02 37 03 c2 02 38 03 c2
 02 39 03 c2 02 3a 03 c2 02 3b 03 c2 02 3c 03 c2 02 3d 03 c2 02 3e 03 c2 02
 3f 03 c2 02 40 03 c2 02 41 03 c2 02 42 03 c2 02 43 03 c2 02 44 03 c2 02 45
 03 c2 02 46 03 c2 02 47 03 c2 02 48 03 c2 02 49 03 c2 02 4a 03 c2 02 4b 03
 c2 02 4c 03 c2 02 4d 03 c2 02 4e 03 c2 02 4f 03 c2 02 50 03 c2 02 51 03 c2
 02 52 03 c2 02 53 03 c2 02 54 03 c2 02 55 03 c2 02 56 03 c2 02 57 03 c2 02
 58 03 c2 02 59 03 c2 02 5a 03 c2 02 5b 03 c2 02 5c 03 c2 02 5d 03 c2 02 5e
 03 c2 02 5f 03 c2 02 60 03 c2 02 61 03 c2 02 62 03 c2 02 63 03 c2 02 64 02
 c2 03 03 c2 03 02 03 c2 03 03 03 c2 03 04 03 c2 03 05 03 c2 03 06 03 c2 03
 07
each key size is:
 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 4 4 4 4 4 4

And as you can see – those two are almost the same. So the conclusion is: this is a fake feature in 12c! And to quote Frits Hoogland: "giving it some thought, it’s obvious, isn’t it?" 🙂

P.S.
While discussing this subject, Frits asked me about kcbl functions (kernel cache buffer direct path loader).
I did compare the execution of ^kcbl functions in GDB for those two scenarios but I haven’t found anything interesting yet. It seems to work almost the same in 11g and 12c from the ^kcbl functions perspective so the difference might be at a different level… or maybe I’m not looking close enough and I should dig deeper! That’s always fun!

Either way – the only thing this feature does correctly is locking a table 🙂
Don’t use APPEND hint to accelerate your INSERT statements on IOTs in 12c. It doesn’t work.

Cheers!

Search:


Kontakt

Database Whisperers sp. z o. o. sp. k.
Al. Jerozolimskie 200
Wejście B, III piętro/ pokój 342
02-486 Warszawa
NIP: 5272744987
REGON:362524978
+48 508 943 051
+48 661 966 009
info@ora-600.pl

Newsletter zapisz się żeby być na bieżąco