Oracle 12c – internals of threaded execution


17.12.2015
by Kamil Stawiarski

At UKOUG Tech’15 (Super Sunday) there was a session called "12c: A Closer Look at the Multi-Process Multi-Threaded Model for Oracle on Linux/Unix" by Markus Flechtner. Since then I can’t stop thinking about the performance of this feature. Why Oracle decided to provide it?

I did some tests regarding sort operations and threaded execution was always a bit slower. I decided to dig a little dipper into this.

Oracle instance with threaded_execution=false:

1SQL> sho parameters threaded_execution
2 
3NAME                     TYPE    VALUE
4------------------------------------ ----------- ------------------------------
5threaded_execution           boolean     FALSE
6SQL> select count(1) from sales2;
7 
8  COUNT(1)
9----------
10  58805952
11 
12SQL> select count(distinct file# || block#)
13  2  from v$bh b, dba_objects o
14  3  where b.objd=o.data_object_id
15  4  and   o.object_name='SALES2'
16  5  and   b.status!='free';
17 
18COUNT(DISTINCTFILE#||BLOCK#)
19----------------------------
20              283624
21 
22SQL> select blocks from user_segments where segment_name='SALES2';
23 
24    BLOCKS
25----------
26    284672

Oracle instance with threaded_execution=true:

1SQL> sho parameters threaded_execution
2 
3NAME                     TYPE    VALUE
4------------------------------------ ----------- ------------------------------
5threaded_execution           boolean     TRUE
6SQL> select count(1) from sales2;
7 
8  COUNT(1)
9----------
10  58805952
11 
12SQL> select count(distinct file# || block#)
13  2  from v$bh b, dba_objects o
14  3  where b.objd=o.data_object_id
15  4  and   o.object_name='SALES2'
16  5  and   b.status!='free';
17 
18COUNT(DISTINCTFILE#||BLOCK#)
19----------------------------
20              283624
21 
22SQL> select blocks from user_segments where segment_name='SALES2';
23 
24    BLOCKS
25----------
26    284672

As you can see, I have the same table (SALES2) on both instances, which is cached in buffer cache.

Now let’s test some kind of query, that access only this one table:

Oracle instance with threaded_execution=false:

1SQL> set timing on
2SQL> get bigsort
3  1  with v_sales as
4  2  (
5  3    select s.*, row_number() over (order by amount_sold desc) as rn
6  4    from sales2 s
7  5  )
8  6  select *
9  7  from v_sales
10  8* where rn<=2
11SQL> /
12 
13   PROD_ID    CUST_ID TIME_ID   CHANNEL_ID   PROMO_ID QUANTITY_SOLD AMOUNT_SOLD
14---------- ---------- --------- ---------- ---------- ------------- -----------
15    RN
16----------
17    18    510 03-JAN-99      3    999         1 1782.72
18     1
19 
20    18   1330 03-JAN-99      3    999         1 1782.72
21     2
22 
23 
24Elapsed: 00:00:11.98

Oracle instance with threaded_execution=true:

1SQL> set timing on
2SQL> get bigsort
3  1  with v_sales as
4  2  (
5  3    select s.*, row_number() over (order by amount_sold desc) as rn
6  4    from sales2 s
7  5  )
8  6  select *
9  7  from v_sales
10  8* where rn<=2
11SQL> /
12 
13   PROD_ID    CUST_ID TIME_ID   CHANNEL_ID   PROMO_ID QUANTITY_SOLD AMOUNT_SOLD
14---------- ---------- --------- ---------- ---------- ------------- -----------
15    RN
16----------
17    18    510 03-JAN-99      3    999         1 1782.72
18     1
19 
20    18   1330 03-JAN-99      3    999         1 1782.72
21     2
22 
23 
24Elapsed: 00:00:12.58

I used perf to analyze those two executions and I found a lot more syscalls, when threaded execution was used. I noticed also that in threaded execution, oracle is using nsmore2recv function. When I googled it, I found a great article by Tanel Poder: https://blog.tanelpoder.com/2008/02/05/oracle-hidden-costs-revealed-part-1/

Using strace revealed similar situation to this described by Tanel – a lot of POLL syscalls.

Oracle instance with threaded_execution=false:

1[root@rico ~]# strace -cp 12168
2Process 12168 attached
3^CProcess 12168 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6  0.00    0.000000           0         2           read
7  0.00    0.000000           0         2           write
8  0.00    0.000000           0         1           semctl
9  0.00    0.000000           0       159           getrusage
10  0.00    0.000000           0        12           times
11  0.00    0.000000           0         3           semtimedop
12------ ----------- ----------- --------- --------- ----------------
13100.00    0.000000                   179           total

Oracle instance with threaded_execution=true:

1[root@rico fd]# strace -cp 12165
2Process 12165 attached
3^CProcess 12165 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 84.22    0.113706           0    980840           poll
7 10.37    0.014000        7000         2           read
8  5.41    0.007310        1218         6           semtimedop
9  0.00    0.000000           0         2           write
10  0.00    0.000000           0         1           semctl
11  0.00    0.000000           0       419           getrusage
12  0.00    0.000000           0        12           times
13------ ----------- ----------- --------- --------- ----------------
14100.00    0.135016                981282           total
15 
16[root@rico fd]# strace -p 12165 -o /tmp/threaded_exec.out
17Process 12165 attached
18^CProcess 12165 detached
19[root@rico fd]# grep poll /tmp/threaded_exec.out | tail
20poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
21poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
22poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
23poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
24poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
25poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
26poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
27poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
28poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
29poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)

The STID 12165 was signed to SPID 8107:

1SQL> get spid
2  1  select spid, stid
3  2  from v$process p, v$session s
4  3  where p.addr=s.paddr
5  4* and   s.sid=sys_context('userenv','sid')
6SQL> /
7 
8SPID             STID
9------------------------ ------------------------
108107             12165

Let’s check the file descriptors for this thread:

1[root@rico ~]# cd /proc/8107/task/12165/fd
2[root@rico fd]# ls -al | grep 63
3lrwx------. 1 oracle oinstall 64 12-17 21:38 63 -> socket:[73968]
4[root@rico fd]# lsof | grep 73968
5ora_scmn_  8107    oracle   63u     IPv6              73968        0t0        TCP localhost:ncube-lm->localhost:32400 (ESTABLISHED)
6[root@rico fd]# ps aux | grep 8107 | grep -v grep
7oracle    8107  4.7 29.0 6155520 2901516 ?     Ssl  20:01   6:54 ora_u005_orclth
8[root@rico fd]#

So it looks like, that the process ora_u00N is using sockets to communicate with its threads. That’s why we have seen the additional syscalls, which were an overhead causing slower execution.

So I raise the question: why and when use threaded execution? If ever?


Contact us

Database Whisperers sp. z o. o. sp. k.
al. Jerozolimskie 200, 3rd floor, room 342
02-486 Warszawa
NIP: 5272744987
REGON:362524978
+48 508 943 051
+48 661 966 009
info@ora-600.pl

Newsletter Sign up to be updated