rollback internals


20.09.2017
by Kamil Stawiarski

While researching redo log internals for V00D00 we had to face the fact, that we know shit about real transactional behavior. When I say "real", I mean – under the hood.
Even with a very simple stuff like COMMIT and ROLLBACK we were constantly amazed by the internal mechanisms.

Today let’s take ROLLBACK under the investigation. According to documentation:

The ROLLBACK statement ends the current transaction and undoes any changes made during that transaction.

Cool. But what it means? First of all, you have to realize that all changes in redo logs are in a form of REDO RECORD which has its own address, known as RBA or RS_ID.

Sample RS_ID (RBA) looks like this: 0x00000a.00008c0f.006c

  • 00000a = 10 : this is sequence#
  • 00008c0f = 35855 : this is a block number in a redo log
  • 006c = 108 : this is offset in a redo block, where this redo record begins

Each redo record consists of one or more change vectors and each change vector has it’s own op code. Check out this article by Jonathan Lewis: https://jonathanlewis.wordpress.com/2017/07/25/redo-op-codes/

If you update EMPLOYEES table and change 2 rows, you will get usually one redo record with at least 4 change vectors – 2 for actual updates (so REDO) and 2 for reversing the operation (undo)

You can learn it from redolog or archivelog dumps:

Let’s modify a few rows:

1SQL> update hr.employees
2  set salary=salary*2
3  where department_id=100;
4 
56 rows updated.
6 
7SQL> commit;
8 
9Commit complete.

Now I’ll make a dump of an archivelog which contains those changes:

1SQL> alter system switch logfile;
2 
3System altered.
4 
5SQL> get get_arch
6  with v_arch as
7  2  (
8  3   select name, rank() over (order by sequence# desc) as rn
9  4   from v$archived_log
10  5  )
11  select name
12  from v_arch
13  8* where rn=1
14SQL> /
15 
16NAME
17--------------------------------------------------------------------------------
18+DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_13.303.955209887
19 
20SQL> alter system dump logfile '+DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_13.303.955209887';
21 
22System altered.
23 
24SQL> get tracefile
25  1* select value from v$diag_info where name='Default Trace File'
26SQL> /
27 
28VALUE
29--------------------------------------------------------------------------------
30/u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_24204.trc

The dump file is quite big and hard to analyze at the beginning, so I wrote a small and simple (primitive even) tool in Python which will help you to analyze the dump file. The tool can be downloaded from here.

It’s tested against Python 7.5. You can use it like this:

1[oracle@skiper ~]$ python logfile_dump_parser.py
2parse file /path/to/a/redo/dump object_id_or_name        if you want to parse file (duh!)
3connect to oracle user/password@ip:port/service          if you want to connect to db to get table names instead ids
4exit to quit
5VOODOO >

At the beginning you will get a prompt – you can tell the tool to parse a dumpfile, providing DATA_OBJECT_ID of a table, which you’d like to search in a file or you can connect to Oracle (cx_Oracle module for Python required) with a simple connect string and parse a dump file, using name of the table.

1VOODOO > connect to oracle system/oracle@localhost:1521/skiper
2VOODOO > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_24204.trc EMPLOYEES
3 
4Redo record block: 42 offset: 16 [rs_id: 0x00000d.0000002a.0010]
5     change 1 operation UPDATE xid = 0x0004.019.000004f2
6     change 2 operation UNDO HEADER xid = 0x0004.019.000004f2
7     change 3 operation UPDATE xid = 0x0004.019.000004f2
8     change 4 operation UPDATE xid = 0x0004.019.000004f2
9     change 5 operation UPDATE xid = 0x0004.019.000004f2
10     change 6 operation UPDATE xid = 0x0004.019.000004f2
11     change 7 operation UPDATE xid = 0x0004.019.000004f2
12     change 8 operation COMMIT xid = 0x0004.019.000004f2
13     change 9 operation UNDO xid = 0x0004.019.000004f2
14     change 10 operation UNDO xid = 0x0004.019.000004f2
15     change 11 operation UNDO xid = 0x0004.019.000004f2
16     change 12 operation UNDO xid = 0x0004.019.000004f2
17     change 13 operation UNDO xid = 0x0004.019.000004f2
18     change 14 operation UNDO xid = 0x0004.019.000004f2

As you can see, this simple update changed 6 rows and generated one redo record with 14 change vectors – 6X UPDATE (redo) 6X UNDO, one UNDO HEADER modification and one COMMIT. This whole redo record is a single, atomic operation for the database.

Let’s see what will happen when we will ROLLBACK this kind of transaction instead.

First we have to repeat the operation and generate archivelog dump for our parser:

1SQL> get upd
2  update hr.employees
3  set salary=salary*2
4  3* where department_id=100
5SQL> /
6 
76 rows updated.
8 
9SQL> rollback;
10 
11Rollback complete.
12 
13SQL> alter system switch logfile;
14 
15System altered.
16 
17SQL> @get_arch
18 
19NAME
20--------------------------------------------------------------------------------
21+DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_29.319.955212079
22 
23SQL> alter system dump logfile '+DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_29.319.955212079';
24 
25System altered.
26 
27SQL> @tracefile
28 
29VALUE
30--------------------------------------------------------------------------------
31/u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_3601.trc

And now we can parse logfile with my little tool (note, that you have to make connection to the database only once per parsing session)

1VOODOO > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_3601.trc EMPLOYEES
2 
3Redo record block: 47 offset: 16 [rs_id: 0x00001d.0000002f.0010]
4     change 1 operation UPDATE xid = 0x0002.001.00000615
5     change 2 operation UNDO HEADER xid = 0x0002.001.00000615
6     change 3 operation UPDATE xid = 0x0002.001.00000615
7     change 4 operation UPDATE xid = 0x0002.001.00000615
8     change 5 operation UPDATE xid = 0x0002.001.00000615
9     change 6 operation UPDATE xid = 0x0002.001.00000615
10     change 7 operation UPDATE xid = 0x0002.001.00000615
11     change 8 operation UNDO xid = 0x0002.001.00000615
12     change 9 operation UNDO xid = 0x0002.001.00000615
13     change 10 operation UNDO xid = 0x0002.001.00000615
14     change 11 operation UNDO xid = 0x0002.001.00000615
15     change 12 operation UNDO xid = 0x0002.001.00000615
16     change 13 operation UNDO xid = 0x0002.001.00000615
17 
18Redo record block: 50 offset: 228 [rs_id: 0x00001d.00000032.00e4]
19     change 1 operation UPDATE xid = 0x0002.001.00000615
20     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615
21 
22Redo record block: 50 offset: 424 [rs_id: 0x00001d.00000032.01a8]
23     change 1 operation UPDATE xid = 0x0002.001.00000615
24     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615
25 
26Redo record block: 51 offset: 124 [rs_id: 0x00001d.00000033.007c]
27     change 1 operation UPDATE xid = 0x0002.001.00000615
28     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615
29 
30Redo record block: 51 offset: 320 [rs_id: 0x00001d.00000033.0140]
31     change 1 operation UPDATE xid = 0x0002.001.00000615
32     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615
33 
34Redo record block: 52 offset: 20 [rs_id: 0x00001d.00000034.0014]
35     change 1 operation UPDATE xid = 0x0002.001.00000615
36     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615
37 
38Redo record block: 52 offset: 216 [rs_id: 0x00001d.00000034.00d8]
39     change 1 operation UPDATE xid = 0x0002.001.00000615
40     change 2 operation ROLLBACK SINGLE STATEMENT (final) xid = 0x0002.001.00000615
41COMMIT [ending rollback]  for transaction 0x0002.001.00000615 [block: 52 offset: 432 rs_id: 0x00001d.00000034.01b0]

Nice! So for each and every modified row, Oracle had to generate separate redo record with 2 change vectors – one vector is an actual UPDATE built from UNDO information, which changes a row to its previous value and one rollback code (5.6 – ROLLBACK SINGLE STATEMENT (not final) or 5.11 – ROLLBACK SINGLE STATEMENT (final))

 

  • 5.6 – ROLLBACK SINGLE STATEMENT (not final) – this means that the rollback is in progress and there can be more operations like this
  • 5.11 – ROLLBACK SINGLE STATEMENT (final) – this means that the rollback is finished and there should be no more rollback records

Each complete ROLLBACK is actually finished by COMMIT (OP:5.4). This is logical because ROLLBACK is generating commands to reverse previous operations and then it can COMMIT the whole transaction.

The interesting thing is with partial rollback. Let’s UPDATE again 6 records but then let’s try to insert a row which will violate a primary key:

1SQL> get upd
2  update hr.employees
3  set salary=salary*2
4  3* where department_id=100
5SQL> /
6 
76 rows updated.
8 
9SQL> insert into hr.employees
10  values (100,'Dupa','Blada','DBLADA','666.66.666',sysdate,'ST_CLERK',5600,null,100,100);
11insert into hr.employees
12*
13ERROR at line 1:
14ORA-00001: unique constraint (HR.EMP_EMP_ID_PK) violated

Now let’s check the parser:

1V00D00 > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_6603.trc EMPLOYEES
2 
3Redo record sequence: 32 block: 38 offset 16 [rs_id: 0x000020.00000026.0010]
4     change 1 operation UPDATE xid = 0x0005.004.0000061e
5     change 2 operation UNDO HEADER xid = 0x0005.004.0000061e
6     change 3 operation UPDATE xid = 0x0005.004.0000061e
7     change 4 operation UPDATE xid = 0x0005.004.0000061e
8     change 5 operation UPDATE xid = 0x0005.004.0000061e
9     change 6 operation UPDATE xid = 0x0005.004.0000061e
10     change 7 operation UPDATE xid = 0x0005.004.0000061e
11     change 8 operation UNDO xid = 0x0005.004.0000061e
12     change 9 operation UNDO xid = 0x0005.004.0000061e
13     change 10 operation UNDO xid = 0x0005.004.0000061e
14     change 11 operation UNDO xid = 0x0005.004.0000061e
15     change 12 operation UNDO xid = 0x0005.004.0000061e
16     change 13 operation UNDO xid = 0x0005.004.0000061e
17 
18Redo record sequence: 32 block: 41 offset 188 [rs_id: 0x000020.00000029.00bc]
19     change 1 operation OP:14.11 xid = 0x0005.004.0000061e
20 
21Redo record sequence: 32 block: 41 offset 252 [rs_id: 0x000020.00000029.00fc]
22     change 1 operation OP:13.22 xid = 0x0005.004.0000061e
23 
24Redo record sequence: 32 block: 41 offset 328 [rs_id: 0x000020.00000029.0148]
25     change 1 operation UNDO xid = 0x0005.004.0000061e
26     change 2 operation INSERT xid = 0x0005.004.0000061e
27 
28Redo record sequence: 32 block: 42 offset 172 [rs_id: 0x000020.0000002a.00ac]
29     change 1 operation DELETE xid = 0x0005.004.0000061e
30     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0005.004.0000061e
31COMMIT  for transaction 0x0005.004.0000061e [sequence: 33 block: 3 offset 16 [rs_id: 0x000021.00000003.0010]

Great! So we can see in here one familiar redo record, some change vectors from layer 13 (Segment management) and 14 (Extent management) and then our redo record for INSERT, followed by a redo record with a DELETE (ROLLBACK information)!

So each time you are doing something to a database, redo is being generated – even if you violate constraints! After constraint violations, Oracle is doing a partial rollback of this transaction.

But the funniest thing can be seen in this situation:

1SQL> update hr.employees
2  set employee_id=1;
3update hr.employees
4*
5ERROR at line 1:
6ORA-00001: unique constraint (HR.EMP_EMP_ID_PK) violated

We could expect, that modifying the second row to the same value as the previous one will throw this kind of exception. But redo log analysis shows something different:

1V00D00 > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_9695.trc EMPLOYEES
2 
3Redo record sequence: 34 block: 8771 offset 16 [rs_id: 0x000022.00002243.0010]
4     change 1 operation UNDO HEADER xid = 0x000a.002.00000508
5     change 2 operation UNDO xid = 0x000a.002.00000508
6     change 3 operation UPDATE xid = 0x000a.002.00000508
7 
8Redo record sequence: 34 block: 8773 offset 96 [rs_id: 0x000022.00002245.0060]
9     change 1 operation UNDO xid = 0x000a.002.00000508
10     change 2 operation UPDATE xid = 0x000a.002.00000508
11 
12Redo record sequence: 34 block: 8774 offset 100 [rs_id: 0x000022.00002246.0064]
13     change 1 operation UNDO xid = 0x000a.002.00000508
14     change 2 operation UPDATE xid = 0x000a.002.00000508
15 
16Redo record sequence: 34 block: 8775 offset 268 [rs_id: 0x000022.00002247.010c]
17     change 1 operation UPDATE xid = 0x000a.002.00000508
18     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x000a.002.00000508
19 
20Redo record sequence: 34 block: 8776 offset 136 [rs_id: 0x000022.00002248.0088]
21     change 1 operation UPDATE xid = 0x000a.002.00000508
22     change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x000a.002.00000508
23 
24Redo record sequence: 34 block: 8777 offset 172 [rs_id: 0x000022.00002249.00ac]
25     change 1 operation UPDATE xid = 0x000a.002.00000508
26     change 2 operation ROLLBACK SINGLE STATEMENT (final) xid = 0x000a.002.00000508
27COMMIT [ending rollback]  for transaction 0x000a.002.00000508 [sequence: 34 block: 8777 offset 384 [rs_id: 0x000022.00002249.0180]

We can see that Oracle did 3 modifications and 3 rollbacks after getting a signal that constraint was violated. So maybe we can assume that constraint checking is an asynchronous operation?

Stay tuned for more! Hope you will have fun with analyzing redo dumps with my simple Python tool 🙂

Cheers!


Contact us

Database Whisperers sp. z o. o. sp. k.
al. Jerozolimskie 200, 3rd floor, room 342
02-486 Warszawa
NIP: 5272744987
REGON:362524978
+48 508 943 051
+48 661 966 009
info@ora-600.pl

Newsletter Sign up to be updated