Oracle 12c – internals of threaded execution


17.12.2015
by Kamil Stawiarski

At UKOUG Tech’15 (Super Sunday) there was a session called "12c: A Closer Look at the Multi-Process Multi-Threaded Model for Oracle on Linux/Unix" by Markus Flechtner. Since then I can’t stop thinking about the performance of this feature. Why Oracle decided to provide it?

I did some tests regarding sort operations and threaded execution was always a bit slower. I decided to dig a little dipper into this.

Oracle instance with threaded_execution=false:

SQL> sho parameters threaded_execution

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
threaded_execution		     boolean	 FALSE
SQL> select count(1) from sales2;

  COUNT(1)
----------
  58805952

SQL> select count(distinct file# || block#)
  2  from v$bh b, dba_objects o
  3  where b.objd=o.data_object_id
  4  and   o.object_name='SALES2'
  5  and   b.status!='free';

COUNT(DISTINCTFILE#||BLOCK#)
----------------------------
		      283624

SQL> select blocks from user_segments where segment_name='SALES2';

    BLOCKS
----------
    284672

Oracle instance with threaded_execution=true:

SQL> sho parameters threaded_execution

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
threaded_execution		     boolean	 TRUE
SQL> select count(1) from sales2;

  COUNT(1)
----------
  58805952

SQL> select count(distinct file# || block#)
  2  from v$bh b, dba_objects o
  3  where b.objd=o.data_object_id
  4  and   o.object_name='SALES2'
  5  and   b.status!='free';

COUNT(DISTINCTFILE#||BLOCK#)
----------------------------
		      283624

SQL> select blocks from user_segments where segment_name='SALES2';

    BLOCKS
----------
    284672

As you can see, I have the same table (SALES2) on both instances, which is cached in buffer cache.

Now let’s test some kind of query, that access only this one table:

Oracle instance with threaded_execution=false:

SQL> set timing on
SQL> get bigsort
  1  with v_sales as
  2  (
  3    select s.*, row_number() over (order by amount_sold desc) as rn
  4    from sales2 s
  5  )
  6  select *
  7  from v_sales
  8* where rn<=2
SQL> /

   PROD_ID    CUST_ID TIME_ID	CHANNEL_ID   PROMO_ID QUANTITY_SOLD AMOUNT_SOLD
---------- ---------- --------- ---------- ---------- ------------- -----------
	RN
----------
	18	  510 03-JAN-99 	 3	  999		  1	1782.72
	 1

	18	 1330 03-JAN-99 	 3	  999		  1	1782.72
	 2


Elapsed: 00:00:11.98

Oracle instance with threaded_execution=true:

SQL> set timing on
SQL> get bigsort
  1  with v_sales as
  2  (
  3    select s.*, row_number() over (order by amount_sold desc) as rn
  4    from sales2 s
  5  )
  6  select *
  7  from v_sales
  8* where rn<=2
SQL> /

   PROD_ID    CUST_ID TIME_ID	CHANNEL_ID   PROMO_ID QUANTITY_SOLD AMOUNT_SOLD
---------- ---------- --------- ---------- ---------- ------------- -----------
	RN
----------
	18	  510 03-JAN-99 	 3	  999		  1	1782.72
	 1

	18	 1330 03-JAN-99 	 3	  999		  1	1782.72
	 2


Elapsed: 00:00:12.58

I used perf to analyze those two executions and I found a lot more syscalls, when threaded execution was used. I noticed also that in threaded execution, oracle is using nsmore2recv function. When I googled it, I found a great article by Tanel Poder: https://blog.tanelpoder.com/2008/02/05/oracle-hidden-costs-revealed-part-1/

Using strace revealed similar situation to this described by Tanel – a lot of POLL syscalls.

Oracle instance with threaded_execution=false:

[root@rico ~]# strace -cp 12168
Process 12168 attached
^CProcess 12168 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         2           read
  0.00    0.000000           0         2           write
  0.00    0.000000           0         1           semctl
  0.00    0.000000           0       159           getrusage
  0.00    0.000000           0        12           times
  0.00    0.000000           0         3           semtimedop
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                   179           total

Oracle instance with threaded_execution=true:

[root@rico fd]# strace -cp 12165
Process 12165 attached
^CProcess 12165 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 84.22    0.113706           0    980840           poll
 10.37    0.014000        7000         2           read
  5.41    0.007310        1218         6           semtimedop
  0.00    0.000000           0         2           write
  0.00    0.000000           0         1           semctl
  0.00    0.000000           0       419           getrusage
  0.00    0.000000           0        12           times
------ ----------- ----------- --------- --------- ----------------
100.00    0.135016                981282           total

[root@rico fd]# strace -p 12165 -o /tmp/threaded_exec.out
Process 12165 attached
^CProcess 12165 detached
[root@rico fd]# grep poll /tmp/threaded_exec.out | tail
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)
poll([{fd=63, events=POLLIN|POLLRDNORM}], 1, 0) = 0 (Timeout)

The STID 12165 was signed to SPID 8107:

SQL> get spid
  1  select spid, stid
  2  from v$process p, v$session s
  3  where p.addr=s.paddr
  4* and   s.sid=sys_context('userenv','sid')
SQL> /

SPID			 STID
------------------------ ------------------------
8107			 12165

Let’s check the file descriptors for this thread:

[root@rico ~]# cd /proc/8107/task/12165/fd
[root@rico fd]# ls -al | grep 63
lrwx------. 1 oracle oinstall 64 12-17 21:38 63 -> socket:[73968]
[root@rico fd]# lsof | grep 73968
ora_scmn_  8107    oracle   63u     IPv6              73968        0t0        TCP localhost:ncube-lm->localhost:32400 (ESTABLISHED)
[root@rico fd]# ps aux | grep 8107 | grep -v grep
oracle    8107  4.7 29.0 6155520 2901516 ?     Ssl  20:01   6:54 ora_u005_orclth
[root@rico fd]#

So it looks like, that the process ora_u00N is using sockets to communicate with its threads. That’s why we have seen the additional syscalls, which were an overhead causing slower execution.

So I raise the question: why and when use threaded execution? If ever?


Contact us

Database Whisperers sp. z o. o. sp. k.
al. Jerozolimskie 200, 3rd floor, room 342
02-486 Warszawa
NIP: 5272744987
REGON:362524978
+48 508 943 051
+48 661 966 009
info@ora-600.pl

Newsletter Sign up to be updated