While researching redo log internals for V00D00 we had to face the fact, that we know shit about real transactional behavior. When I say "real", I mean – under the hood.
Even with a very simple stuff like COMMIT and ROLLBACK we were constantly amazed by the internal mechanisms.
Today let’s take ROLLBACK under the investigation. According to documentation:
The ROLLBACK statement ends the current transaction and undoes any changes made during that transaction.
Cool. But what it means? First of all, you have to realize that all changes in redo logs are in a form of REDO RECORD which has its own address, known as RBA or RS_ID.
Sample RS_ID (RBA) looks like this: 0x00000a.00008c0f.006c
- 00000a = 10 : this is sequence#
- 00008c0f = 35855 : this is a block number in a redo log
- 006c = 108 : this is offset in a redo block, where this redo record begins
Each redo record consists of one or more change vectors and each change vector has it’s own op code. Check out this article by Jonathan Lewis: https://jonathanlewis.wordpress.com/2017/07/25/redo-op-codes/
If you update EMPLOYEES table and change 2 rows, you will get usually one redo record with at least 4 change vectors – 2 for actual updates (so REDO) and 2 for reversing the operation (undo)
You can learn it from redolog or archivelog dumps:
Let’s modify a few rows:
1 | SQL> update hr.employees |
3 | 3 where department_id=100; |
Now I’ll make a dump of an archivelog which contains those changes:
1 | SQL> alter system switch logfile; |
8 | 3 select name , rank() over ( order by sequence # desc ) as rn |
18 | +DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_13.303.955209887 |
20 | SQL> alter system dump logfile '+DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_13.303.955209887' ; |
25 | 1* select value from v$diag_info where name = 'Default Trace File' |
30 | /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_24204.trc |
The dump file is quite big and hard to analyze at the beginning, so I wrote a small and simple (primitive even) tool in Python which will help you to analyze the dump file. The tool can be downloaded from here.
It’s tested against Python 7.5. You can use it like this:
1 | [oracle@skiper ~]$ python logfile_dump_parser.py |
2 | parse file /path/to/a/redo/dump object_id_or_name if you want to parse file (duh!) |
3 | connect to oracle user/password@ip:port/service if you want to connect to db to get table names instead ids |
At the beginning you will get a prompt – you can tell the tool to parse a dumpfile, providing DATA_OBJECT_ID of a table, which you’d like to search in a file or you can connect to Oracle (cx_Oracle module for Python required) with a simple connect string and parse a dump file, using name of the table.
1 | VOODOO > connect to oracle system/oracle@localhost:1521/skiper |
2 | VOODOO > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_24204.trc EMPLOYEES |
4 | Redo record block: 42 offset: 16 [rs_id: 0x00000d.0000002a.0010] |
5 | change 1 operation UPDATE xid = 0x0004.019.000004f2 |
6 | change 2 operation UNDO HEADER xid = 0x0004.019.000004f2 |
7 | change 3 operation UPDATE xid = 0x0004.019.000004f2 |
8 | change 4 operation UPDATE xid = 0x0004.019.000004f2 |
9 | change 5 operation UPDATE xid = 0x0004.019.000004f2 |
10 | change 6 operation UPDATE xid = 0x0004.019.000004f2 |
11 | change 7 operation UPDATE xid = 0x0004.019.000004f2 |
12 | change 8 operation COMMIT xid = 0x0004.019.000004f2 |
13 | change 9 operation UNDO xid = 0x0004.019.000004f2 |
14 | change 10 operation UNDO xid = 0x0004.019.000004f2 |
15 | change 11 operation UNDO xid = 0x0004.019.000004f2 |
16 | change 12 operation UNDO xid = 0x0004.019.000004f2 |
17 | change 13 operation UNDO xid = 0x0004.019.000004f2 |
18 | change 14 operation UNDO xid = 0x0004.019.000004f2 |
As you can see, this simple update changed 6 rows and generated one redo record with 14 change vectors – 6X UPDATE (redo) 6X UNDO, one UNDO HEADER modification and one COMMIT. This whole redo record is a single, atomic operation for the database.
Let’s see what will happen when we will ROLLBACK this kind of transaction instead.
First we have to repeat the operation and generate archivelog dump for our parser:
4 | 3* where department_id=100 |
13 | SQL> alter system switch logfile; |
21 | +DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_29.319.955212079 |
23 | SQL> alter system dump logfile '+DATA/SKIPER/ARCHIVELOG/2017_09_20/thread_1_seq_29.319.955212079' ; |
31 | /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_3601.trc |
And now we can parse logfile with my little tool (note, that you have to make connection to the database only once per parsing session)
1 | VOODOO > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_3601.trc EMPLOYEES |
3 | Redo record block: 47 offset: 16 [rs_id: 0x00001d.0000002f.0010] |
4 | change 1 operation UPDATE xid = 0x0002.001.00000615 |
5 | change 2 operation UNDO HEADER xid = 0x0002.001.00000615 |
6 | change 3 operation UPDATE xid = 0x0002.001.00000615 |
7 | change 4 operation UPDATE xid = 0x0002.001.00000615 |
8 | change 5 operation UPDATE xid = 0x0002.001.00000615 |
9 | change 6 operation UPDATE xid = 0x0002.001.00000615 |
10 | change 7 operation UPDATE xid = 0x0002.001.00000615 |
11 | change 8 operation UNDO xid = 0x0002.001.00000615 |
12 | change 9 operation UNDO xid = 0x0002.001.00000615 |
13 | change 10 operation UNDO xid = 0x0002.001.00000615 |
14 | change 11 operation UNDO xid = 0x0002.001.00000615 |
15 | change 12 operation UNDO xid = 0x0002.001.00000615 |
16 | change 13 operation UNDO xid = 0x0002.001.00000615 |
18 | Redo record block: 50 offset: 228 [rs_id: 0x00001d.00000032.00e4] |
19 | change 1 operation UPDATE xid = 0x0002.001.00000615 |
20 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615 |
22 | Redo record block: 50 offset: 424 [rs_id: 0x00001d.00000032.01a8] |
23 | change 1 operation UPDATE xid = 0x0002.001.00000615 |
24 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615 |
26 | Redo record block: 51 offset: 124 [rs_id: 0x00001d.00000033.007c] |
27 | change 1 operation UPDATE xid = 0x0002.001.00000615 |
28 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615 |
30 | Redo record block: 51 offset: 320 [rs_id: 0x00001d.00000033.0140] |
31 | change 1 operation UPDATE xid = 0x0002.001.00000615 |
32 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615 |
34 | Redo record block: 52 offset: 20 [rs_id: 0x00001d.00000034.0014] |
35 | change 1 operation UPDATE xid = 0x0002.001.00000615 |
36 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0002.001.00000615 |
38 | Redo record block: 52 offset: 216 [rs_id: 0x00001d.00000034.00d8] |
39 | change 1 operation UPDATE xid = 0x0002.001.00000615 |
40 | change 2 operation ROLLBACK SINGLE STATEMENT (final) xid = 0x0002.001.00000615 |
41 | COMMIT [ending rollback] for transaction 0x0002.001.00000615 [block: 52 offset: 432 rs_id: 0x00001d.00000034.01b0] |
Nice! So for each and every modified row, Oracle had to generate separate redo record with 2 change vectors – one vector is an actual UPDATE built from UNDO information, which changes a row to its previous value and one rollback code (5.6 – ROLLBACK SINGLE STATEMENT (not final) or 5.11 – ROLLBACK SINGLE STATEMENT (final))
- 5.6 – ROLLBACK SINGLE STATEMENT (not final) – this means that the rollback is in progress and there can be more operations like this
- 5.11 – ROLLBACK SINGLE STATEMENT (final) – this means that the rollback is finished and there should be no more rollback records
Each complete ROLLBACK is actually finished by COMMIT (OP:5.4). This is logical because ROLLBACK is generating commands to reverse previous operations and then it can COMMIT the whole transaction.
The interesting thing is with partial rollback. Let’s UPDATE again 6 records but then let’s try to insert a row which will violate a primary key:
4 | 3* where department_id=100 |
9 | SQL> insert into hr.employees |
10 | 2 values (100, 'Dupa' , 'Blada' , 'DBLADA' , '666.66.666' ,sysdate, 'ST_CLERK' ,5600, null ,100,100); |
11 | insert into hr.employees |
14 | ORA-00001: unique constraint (HR.EMP_EMP_ID_PK) violated |
Now let’s check the parser:
1 | V00D00 > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_6603.trc EMPLOYEES |
3 | Redo record sequence: 32 block: 38 offset 16 [rs_id: 0x000020.00000026.0010] |
4 | change 1 operation UPDATE xid = 0x0005.004.0000061e |
5 | change 2 operation UNDO HEADER xid = 0x0005.004.0000061e |
6 | change 3 operation UPDATE xid = 0x0005.004.0000061e |
7 | change 4 operation UPDATE xid = 0x0005.004.0000061e |
8 | change 5 operation UPDATE xid = 0x0005.004.0000061e |
9 | change 6 operation UPDATE xid = 0x0005.004.0000061e |
10 | change 7 operation UPDATE xid = 0x0005.004.0000061e |
11 | change 8 operation UNDO xid = 0x0005.004.0000061e |
12 | change 9 operation UNDO xid = 0x0005.004.0000061e |
13 | change 10 operation UNDO xid = 0x0005.004.0000061e |
14 | change 11 operation UNDO xid = 0x0005.004.0000061e |
15 | change 12 operation UNDO xid = 0x0005.004.0000061e |
16 | change 13 operation UNDO xid = 0x0005.004.0000061e |
18 | Redo record sequence: 32 block: 41 offset 188 [rs_id: 0x000020.00000029.00bc] |
19 | change 1 operation OP:14.11 xid = 0x0005.004.0000061e |
21 | Redo record sequence: 32 block: 41 offset 252 [rs_id: 0x000020.00000029.00fc] |
22 | change 1 operation OP:13.22 xid = 0x0005.004.0000061e |
24 | Redo record sequence: 32 block: 41 offset 328 [rs_id: 0x000020.00000029.0148] |
25 | change 1 operation UNDO xid = 0x0005.004.0000061e |
26 | change 2 operation INSERT xid = 0x0005.004.0000061e |
28 | Redo record sequence: 32 block: 42 offset 172 [rs_id: 0x000020.0000002a.00ac] |
29 | change 1 operation DELETE xid = 0x0005.004.0000061e |
30 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x0005.004.0000061e |
31 | COMMIT for transaction 0x0005.004.0000061e [sequence: 33 block: 3 offset 16 [rs_id: 0x000021.00000003.0010] |
Great! So we can see in here one familiar redo record, some change vectors from layer 13 (Segment management) and 14 (Extent management) and then our redo record for INSERT, followed by a redo record with a DELETE (ROLLBACK information)!
So each time you are doing something to a database, redo is being generated – even if you violate constraints! After constraint violations, Oracle is doing a partial rollback of this transaction.
But the funniest thing can be seen in this situation:
1 | SQL> update hr.employees |
6 | ORA-00001: unique constraint (HR.EMP_EMP_ID_PK) violated |
We could expect, that modifying the second row to the same value as the previous one will throw this kind of exception. But redo log analysis shows something different:
1 | V00D00 > parse file /u01/app/oracle/diag/rdbms/skiper/skiper/trace/skiper_ora_9695.trc EMPLOYEES |
3 | Redo record sequence: 34 block: 8771 offset 16 [rs_id: 0x000022.00002243.0010] |
4 | change 1 operation UNDO HEADER xid = 0x000a.002.00000508 |
5 | change 2 operation UNDO xid = 0x000a.002.00000508 |
6 | change 3 operation UPDATE xid = 0x000a.002.00000508 |
8 | Redo record sequence: 34 block: 8773 offset 96 [rs_id: 0x000022.00002245.0060] |
9 | change 1 operation UNDO xid = 0x000a.002.00000508 |
10 | change 2 operation UPDATE xid = 0x000a.002.00000508 |
12 | Redo record sequence: 34 block: 8774 offset 100 [rs_id: 0x000022.00002246.0064] |
13 | change 1 operation UNDO xid = 0x000a.002.00000508 |
14 | change 2 operation UPDATE xid = 0x000a.002.00000508 |
16 | Redo record sequence: 34 block: 8775 offset 268 [rs_id: 0x000022.00002247.010c] |
17 | change 1 operation UPDATE xid = 0x000a.002.00000508 |
18 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x000a.002.00000508 |
20 | Redo record sequence: 34 block: 8776 offset 136 [rs_id: 0x000022.00002248.0088] |
21 | change 1 operation UPDATE xid = 0x000a.002.00000508 |
22 | change 2 operation ROLLBACK SINGLE STATEMENT (not final) xid = 0x000a.002.00000508 |
24 | Redo record sequence: 34 block: 8777 offset 172 [rs_id: 0x000022.00002249.00ac] |
25 | change 1 operation UPDATE xid = 0x000a.002.00000508 |
26 | change 2 operation ROLLBACK SINGLE STATEMENT (final) xid = 0x000a.002.00000508 |
27 | COMMIT [ending rollback] for transaction 0x000a.002.00000508 [sequence: 34 block: 8777 offset 384 [rs_id: 0x000022.00002249.0180] |
We can see that Oracle did 3 modifications and 3 rollbacks after getting a signal that constraint was violated. So maybe we can assume that constraint checking is an asynchronous operation?
Stay tuned for more! Hope you will have fun with analyzing redo dumps with my simple Python tool 🙂
Cheers!