Noisy neighbors – Oracle latches and multitenant


13.10.2025
by Kamil Stawiarski

A year ago I described the basics of checkpoint queue latches.

After writing that article, I attended an excellent lecture by Mike Dietrich and Daniel Overby Hansen about noisy neighbors in a multitenant environment and how to make PDBs quieter and easier to live with in a larger society. During that lecture I thought about shared memory structures and semaphores — for example, how latches are shared across different PDBs…

Let’s investigate that in the context of checkpoint queue latches.

We already know that the function responsible for manipulating the checkpoint queue is kcbkgcql.

It is a small function and can be easily reverse-engineered. The part that’s relevant to us is:

ldrb    w22, [x19, #200]
mov     x2, #0x0                
mov     w1, #0x0                
add     x24, x19, #0x10
ldr     x25, [x19, #264]
and     w21, w22, #0x1
and     x3, x21, #0xff
lsl     x20, x3, #5
add     x4, x20, x3
mov     w3, w23
add     x20, x25, x4, lsl #3
mov     x0, x20
bl      0xb5f5360 <kslgetl>

Fortunately, we live in the wonderful era of LLMs, and we can use them to make our lives easier. This code is, of course, trivial, and we don’t really need ChatGPT to analyze it; but I want to show you how easy it is when you know how to ask the right question:

The prompt would be:

Below you will find ARM assembler code:
ldrb    w22, [x19, #200]
mov     x2, #0x0                
mov     w1, #0x0                
add     x24, x19, #0x10
ldr     x25, [x19, #264]
and     w21, w22, #0x1
and     x3, x21, #0xff
lsl     x20, x3, #5
add     x4, x20, x3
mov     w3, w23
add     x20, x25, x4, lsl #3
mov     x0, x20
bl      0xb5f5360 <kslgetl>

Assuming that x19 contains the address of an Oracle block header in the buffer cache and that 200 bytes into this header is the column set_ds from x$bh, translate this code to Python. The premise is that I have the value of SET_DS from X$BH and I want to calculate the latch address that is passed to kslgetl for acquiring the proper latch.

And here is a nice answer:

Here’s the straight translation of what that AArch64 snippet does, distilled to the one thing that matters for the latch argument to kslgetl:

  • Read set_ds (byte) from the block header.
  • Take its least-significant bit.
  • If that bit is 1, choose latch “slot” 33; if 0, choose slot 0.
  • Latch slots are 8 bytes apart, so the byte offset is slot * 8.
  • Add that offset to a base pointer read from [block_header + 264].
  • Pass the resulting address to kslgetl.

Python translation

def latch_address_for_set_ds(set_ds: int, base_ptr_from_hdr_264: int) -> int:
    """
    Compute the latch address that is passed to kslgetl.

    Args:
        set_ds: integer value of X$BH.SET_DS (byte, but int is fine).
        base_ptr_from_hdr_264: 64-bit value loaded from [block_header + 264].
            This is the base of the latch array/structure.

    Returns:
        Absolute address (int) of the latch to pass to kslgetl.
    """
    # w22 <- ldrb [x19,#200]  ==> set_ds
    lsb = set_ds & 0x1            # and w21, w22, #0x1
    slot = lsb * 33               # x4 = (x3<<5) + x3  -> 33 * lsb (0 or 33)
    offset = slot << 3            # ... , lsl #3       -> slot * 8 bytes
    return base_ptr_from_hdr_264 + offset

If you only have SET_DS

If you don’t have the 64-bit value from [block_header + 264], you can still compute which slot it will use and the byte offset within that array:

def latch_slot_and_offset(set_ds: int) -> tuple[int, int]:
    lsb = set_ds & 0x1
    slot = lsb * 33          # 0 or 33
    offset = slot * 8        # 0 or 264 bytes
    return slot, offset

  • slot will be 0 when SET_DS is even, 33 when it’s odd.
  • The effective address passed to kslgetl is: addr = *(block_header + 264) + (slot * 8).

Everything else in your snippet (mov x2,#0, mov w1,#0, add x24,…, mov w3,w23) doesn’t affect the latch pointer; they’re other call arguments or dead/irrelevant for this specific calculation.

Actually, there’s a small twist — each block is protected by two checkpoint queue latches: if the first one is taken, the second one is used.
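To make the pairing concrete, here is a small, hypothetical extension of the Python above that returns both latch addresses for a given block. It is only a sketch based on the assumptions used by the query that follows (SET_DS from X$BH is treated directly as the base address, the second latch of the pair sits 33 slots, i.e. 264 bytes, further on, and the parity of the block number decides which latch is tried first), not something re-derived from the disassembly:

def checkpoint_q_latch_pair(set_ds: int, dbablk: int) -> tuple[int, int]:
    # Sketch only: mirrors the logic of the SQL query below.
    # set_ds - value of X$BH.SET_DS, treated here as the base latch address
    # dbablk - data block number; its parity picks the primary latch
    first = set_ds                # slot 0
    second = set_ds + 33 * 8      # slot 33 -> 264 bytes further
    if dbablk % 2 == 0:
        return first, second      # (primary, secondary)
    return second, first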

You can write a simple query that computes both checkpoint queue latches for each block of a table.

In my test environment I have two PDBs — RICK1 and RICK2. Each has the table HR.EMPLOYEES, a small table with only two data blocks. Let’s inspect the checkpoint Q latches after selecting blocks from that table.

We use the following query:

set linesize 200
set pagesize 100
column name format a10

select p.name, b.dbablk,
      to_char(case when mod(dbablk, 2) = 0 then to_number(set_ds,'XXXXXXXXXXXXXXXX')
           else to_number(set_ds,'XXXXXXXXXXXXXXXX')+264
      end,'XXXXXXXXXXXXXXXX') as primary_checkpoint_q_latch,
      to_char(case when mod(dbablk, 2) = 0 then to_number(set_ds,'XXXXXXXXXXXXXXXX')+264
           else to_number(set_ds,'XXXXXXXXXXXXXXXX')
      end,'XXXXXXXXXXXXXXXX') as secondary_checkpoint_q_latch
from x$bh b, cdb_objects o, v$pdbs p
where o.object_name='EMPLOYEES'
and state!=0
and  b.obj=o.data_object_id
and  o.con_id=b.con_id
and  o.con_id=p.con_id
and  b.dbablk in (38452, 38453)
order by primary_checkpoint_q_latch, dbablk, o.con_id
/
SQL> @calculate_latch.sql

NAME           DBABLK PRIMARY_CHECKPOIN SECONDARY_CHECKPO
---------- ---------- ----------------- -----------------
RICK1           38452         1486E8168         1486E8270
RICK2           38453         1486E8270         1486E8168
RICK1           38453         1486E8B30         1486E8A28
RICK2           38452         1486E9BA8         1486E9CB0

As we can see, block 38452 from RICK1 and block 38453 from RICK2 are being protected by the same set of latches:

  • 1486E8168
  • 1486E8270

We can verify that those are proper latch addresses:

SQL> get latch_names.sql
  1* select name from v$latch_children where addr in ('00000001486E8168','00000001486E8270')
SQL> /

NAME
--------------------------------------------------
checkpoint queue latch
checkpoint queue latch

So we have demonstrated that two different pluggable databases can use the same set of latches to protect their blocks. What would happen if one database held those latches constantly?

I created a larger table in RICK1: HR.EMPLOYEES_SKEW

SQL> select count(*)
  2  from hr.employees_skew;

  COUNT(*)
----------
   7012352

I also prepared scripts to force RICK2 to acquire a latch at a specific address:

[oracle@vrick19 fuck_latches]$ cat latch2.sql 
alter session set container=rick2;
select userenv('sid') from dual;
oradebug setmypid
oradebug call kslgetl 0x&1 0x0 0x0 0xc1e
pause
[oracle@vrick19 fuck_latches]$ cat set_latches.sh
#!/bin/bash 

while IFS=  read -r line
do
  tmux new -d sqlplus "/ as sysdba" @latch2.sql ${line}
done < <(sqlplus -S "/ as sysdba" @list_latches.sql)
read
tmux kill-server
[oracle@vrick19 fuck_latches]$ cat list_latches.sql 
set heading off
set pagesize 0
set feedback off
with v_l as
(
select b.con_id, b.dbablk,
      to_char(case when mod(dbablk, 2) = 0 then to_number(set_ds,'XXXXXXXXXXXXXXXX')
           else to_number(set_ds,'XXXXXXXXXXXXXXXX')+264
      end,'XXXXXXXXXXXXXXXX') as primary_checkpoint_q_latch,
      to_char(case when mod(dbablk, 2) = 0 then to_number(set_ds,'XXXXXXXXXXXXXXXX')+264
           else to_number(set_ds,'XXXXXXXXXXXXXXXX')
      end,'XXXXXXXXXXXXXXXX') as secondary_checkpoint_q_latch
from x$bh b, cdb_objects o
where o.object_name='EMPLOYEES_SKEW'
and state!=0
and  b.obj=o.data_object_id
and  o.con_id=b.con_id
)
select primary_checkpoint_q_latch 
from v_l
union 
select secondary_checkpoint_q_latch
from v_l
/

exit

The set_latches.sh script launches multiple tmux sessions with sqlplus, each acquiring and holding one of the latches that protect EMPLOYEES_SKEW blocks.

Now let’s start a massive update from RICK1:

SQL> update hr.employees_skew 
  2  set salary=salary;

From another session of RICK1 I will run a simple SELECT * FROM HR.EMPLOYEES_SKEW;

Let’s see what happens to the SELECT:

As you can see, the SELECT statement is waiting on buffer busy waits, which is a consequence of latch: checkpoint queue latch. Session 26 is not present in my view because it is not visible from RICK1.

It is visible only from RICK2 or from the CDB level:


       SID    SERIAL# BLOCKING_SESSION EVENT                              CON_ID
---------- ---------- ---------------- ------------------------------ ----------
       254       9032               26 latch: checkpoint queue latch           3
        26      48485                  SQL*Net message from client             4
        26      48485                  SQL*Net message from client             4
       524      10410              254 buffer busy waits                       3
       254       9032               26 latch: checkpoint queue latch           3
        26      48485                  SQL*Net message from client             4
        26      48485                  SQL*Net message from client             4
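The article does not show the query behind that listing; a sketch along these lines, run as SYSDBA from the root container, gives the same kind of picture (the column list and the filter on container IDs are assumptions):

-- sketch only: run from CDB$ROOT; the con_id filter is an assumption
select sid, serial#, blocking_session, event, con_id
from   v$session
where  type = 'USER'
and    con_id in (3, 4)   -- RICK1 and RICK2 in this environment
order  by con_id, sid;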

Usually the checkpoint queue latch is taken during commit, thanks to private redo strands and IMU.

So it would be possible to degrade the performance of the whole CDB by doing many commits (or rollbacks) — and this is only from the checkpoint queue / redo perspective. There are more latches and more shared structures that can cause nightmares.

Let’s prepare a commit overkill:

declare
cursor c_sql is
  select rowid as rid from hr.employees_skew;

type t_rowid is table of c_sql%ROWTYPE index by pls_integer;
v_rowid t_rowid;

begin
        open c_sql;
          fetch c_sql bulk collect into v_rowid;
        close c_sql;

        for i in v_rowid.first..v_rowid.last loop
                update hr.employees_skew set salary=salary+1 where rowid=v_rowid(i).rid;
                commit;
        end loop;
end;
/

On RICK1, from one session I created an active transaction, which forces any other session to create CR blocks in the buffer cache:

SQL> update hr.employees_skew  set salary=salary;

7012352 rows updated.

SQL> show con_name

CON_NAME
------------------------------
RICK1

From another session of RICK1 I will perform a simple SELECT * FROM HR.EMPLOYEES_SKEW;
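The wait events below are read from the raw SQL trace of the selecting session. The article does not show how tracing was switched on; one common way is a level 8 10046 event (a sketch, not necessarily what was used here):

-- sketch: enable wait tracing in the session running the SELECT
alter session set events '10046 trace name context forever, level 8';
select * from hr.employees_skew;
alter session set events '10046 trace name context off';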

This is the list of unique wait events when RICK2 is quiet:

[oracle@vrick19 fuck_latches]$ cat /u01/app/oracle/diag/rdbms/rick/rick/trace/rick_ora_8946.trc | grep WAIT | grep -v Net | awk -F\' '{print $2}' | sort -u
Disk file operations I/O
latch: object queue header operation

But when RICK2 becomes noisy and performs thousands of transactions:

[oracle@vrick19 fuck_latches]$ cat /u01/app/oracle/diag/rdbms/rick/rick/trace/rick_ora_8763.trc | grep WAIT | grep -v Net | awk -F\' '{print $2}' | sort -u
Disk file operations I/O
latch: cache buffers chains
latch: redo allocation
log buffer space
log file switch (checkpoint incomplete)
log file switch completion

A big difference!
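The two summaries below are tkprof-formatted versions of those trace files; a command along these lines produces them (the output file names are made up for illustration):

tkprof rick_ora_8946.trc quiet_neighbor.txt sys=no
tkprof rick_ora_8763.trc noisy_neighbor.txt sys=no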

This was when the neighbor was quiet:

SQL ID: dgd9achf2jp7y Plan Hash: 4089363840

select *
from
 hr.employees_skew


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0          0           0
Fetch   600877      1.97      13.36          0   10260251          0     9013113
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   600881      1.97      13.36          0   10260251          0     9013113

This happened when the neighbor was noisy:

SQL ID: dgd9achf2jp7y Plan Hash: 4089363840

select *
from
 hr.employees_skew


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   467492      1.73     147.71          0    7982614          0     7012352
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   467494      1.74     147.71          0    7982614          0     7012352

Conclusion? Be careful what you consolidate together! And always analyze database performance from the CDB perspective. If you have trouble understanding overall performance, JAS-MIN can help you 😉

https://github.com/ora600pl/jas-min

