db file scattered read – MBRC and ASYNC IO on linux


17.07.2015
by Kamil Stawiarski

Recently one of my students asked me if there is any correlation between the AU size in ASM diskgroup and parameter db_file_multiblock_read_count. I made some tests but results are very strange.

I have 4 same tables, located in 4 different diskgroups:

1  1  select t.table_name,  g.name, t.blocks, g.ALLOCATION_UNIT_SIZE
2  2  from dba_tables t, dba_data_files d, v$asm_diskgroup g
3  3  where t.tablespace_name=d.tablespace_name
4  4  and   g.name=substr(d.file_name,2,instr(d.file_name,'/')-2)
5  5  and  t.owner='SH'
6  6* and  t.table_name like 'SALES%'
7SQL> /
8 
9TABLE_NAME NAME                   BLOCKS ALLOCATION_UNIT_SIZE
10---------- ------------------------------ ---------- --------------------
11SALES16M   DATA16M                 93551         16777216
12SALES32M   DATA32M                 93551         33554432
13SALES1M    DATA1M                  93551          1048576
14SALES4M    DATA4M                  93551          4194304

All tables are smaller then a hidden parameter "_small_table_threshold" so table access full on all tables will be using db file scattered read.

1SQL> ;
2  1  select p.ksppinm, v.ksppstvl
3  2  from x$ksppi p, x$ksppcv v
4  3  where p.indx=v.indx
5  4* and p.ksppinm='_small_table_threshold'
6SQL> /
7 
8KSPPINM                  KSPPSTVL
9---------------------------------------- ------------------------------
10_small_table_threshold           101527

Let’s start the main test. Parameter db_file_multiblock_read_count is set to the default value

1[oracle@rico ~]$ sqlplus sh/sh
2 
3SQL*Plus: Release 12.1.0.2.0 Production on Fri Jul 17 10:30:36 2015
4 
5Copyright (c) 1982, 2014, Oracle.  All rights reserved.
6 
7Last Successful login time: Wed Jul 01 2015 11:16:33 +02:00
8 
9Connected to:
10Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
11With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
12and Real Application Testing options
13 
14SQL> sho parameters db_file_multiblock
15 
16NAME                     TYPE    VALUE
17------------------------------------ ----------- ------------------------------
18db_file_multiblock_read_count        integer     128
19SQL>

I will identify the server process, correlated with my session

1SQL> ;
2  1  select spid
3  2  from v$process p, v$session s
4  3  where p.addr=s.paddr
5  4* and s.username='SH'
6SQL> /
7 
8SPID
9------------------------
1010430

Now I will use strace on this server process:

1[root@rico ~]# strace -c -p 10430
2Process 10430 attached - interrupt to quit

I will gather results for the following queries:

1SQL> select count(1) from SALES16M;
2 
3  COUNT(1)
4----------
5  19295703
6 
7SQL> select count(1) from SALES32M;
8 
9  COUNT(1)
10----------
11  19295703
12 
13SQL> select count(1) from SALES1M;
14 
15  COUNT(1)
16----------
17  19295703
18 
19SQL> select count(1) from SALES4M;
20 
21  COUNT(1)
22----------
23  19295703

Results for SALES16M

1[root@rico ~]# strace -c -p 10430
2Process 10430 attached - interrupt to quit
3^CProcess 10430 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6100.00    0.053000          71       750           pread
7  0.00    0.000000           0        31           read
8  0.00    0.000000           0         2           write
9  0.00    0.000000           0        16         7 open
10  0.00    0.000000           0         9           close
11  0.00    0.000000           0         1           stat
12  0.00    0.000000           0         1           fstat
13  0.00    0.000000           0        28           lseek
14  0.00    0.000000           0         3           mmap
15  0.00    0.000000           0         2           munmap
16  0.00    0.000000           0         1           brk
17  0.00    0.000000           0         1           uname
18  0.00    0.000000           0        10           fcntl
19  0.00    0.000000           0         1           getrlimit
20  0.00    0.000000           0       164           getrusage
21  0.00    0.000000           0        28           times
22  0.00    0.000000           0         1           getuid
23  0.00    0.000000           0         1           io_getevents
24  0.00    0.000000           0         1           io_submit
25------ ----------- ----------- --------- --------- ----------------
26100.00    0.053000                  1051         7 total

Results for SALES32M

1[root@rico ~]# strace -c -p 10430
2Process 10430 attached - interrupt to quit
3^CProcess 10430 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 98.16    0.053462          72       747           pread
7  1.84    0.001000         500         2           read
8  0.00    0.000000           0         2           write
9  0.00    0.000000           0         1           open
10  0.00    0.000000           0         1           close
11  0.00    0.000000           0         1           mmap
12  0.00    0.000000           0         3           fcntl
13  0.00    0.000000           0         1           getrlimit
14  0.00    0.000000           0       164           getrusage
15  0.00    0.000000           0        28           times
16------ ----------- ----------- --------- --------- ----------------
17100.00    0.054462                   950           total

Results for SALES1M

1[root@rico ~]# strace -c -p 10430
2Process 10430 attached - interrupt to quit
3^CProcess 10430 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 48.58    0.015000          26       583           io_getevents
7 39.99    0.012349          75       164           pread
8 11.43    0.003528           6       583           io_submit
9  0.00    0.000000           0         3           read
10  0.00    0.000000           0         2           write
11  0.00    0.000000           0         2           open
12  0.00    0.000000           0         2           close
13  0.00    0.000000           0         1           semctl
14  0.00    0.000000           0         3           fcntl
15  0.00    0.000000           0         1           getrlimit
16  0.00    0.000000           0       164           getrusage
17  0.00    0.000000           0        29           times
18  0.00    0.000000           0         3           semtimedop
19------ ----------- ----------- --------- --------- ----------------
20100.00    0.030877                  1540           total

Results for SALES4M

1[root@rico ~]# strace -c -p 10430
2Process 10430 attached - interrupt to quit
3^CProcess 10430 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 95.16    0.043000          65       664           pread
7  2.62    0.001186          14        83           io_submit
8  2.21    0.001000          12        83           io_getevents
9  0.00    0.000000           0         2           read
10  0.00    0.000000           0         2           write
11  0.00    0.000000           0        21           getrusage
12  0.00    0.000000           0        12           times
13------ ----------- ----------- --------- --------- ----------------
14100.00    0.045186                   867           total

As we can see, the ASYNC IO represented by syscalls: io_submit and io_getevents are mostly used while scanning table located in diskgroup with 1MB AU.

At first I thought that this is correlated with db_file_multiblock_read_count. The default value (128) multiplied by 8k (block size for my database) is exactly 1M. So I made a few more tests with altered values for this parameter. First results where promising – I have managed to get ASYNC IO for all of the tables.

But then I made the final (a little stupid :)) test. After creating a session I have altered value of the db_file_multiblock_read_count to 128. Yes: 128. So the same as default:

1[oracle@rico ~]$ sqlplus sh/sh
2 
3SQL*Plus: Release 12.1.0.2.0 Production on Fri Jul 17 10:49:25 2015
4 
5Copyright (c) 1982, 2014, Oracle.  All rights reserved.
6 
7Last Successful login time: Fri Jul 17 2015 10:48:43 +02:00
8 
9Connected to:
10Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
11With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
12and Real Application Testing options
13 
14SQL> sho parameters db_file_multiblock
15 
16NAME                     TYPE    VALUE
17------------------------------------ ----------- ------------------------------
18db_file_multiblock_read_count        integer     128
19SQL> alter session set db_file_multiblock_read_count=128;
20 
21Session altered.

Let’s check the results now:

SALES16M

1[root@rico ~]# strace -c -p 11208
2Process 11208 attached - interrupt to quit
3^CProcess 11208 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 53.96    0.007032           9       747           io_submit
7 46.04    0.006000           8       747           io_getevents
8  0.00    0.000000           0        31           read
9  0.00    0.000000           0         2           write
10  0.00    0.000000           0        16         7 open
11  0.00    0.000000           0         9           close
12  0.00    0.000000           0         1           stat
13  0.00    0.000000           0         1           fstat
14  0.00    0.000000           0        28           lseek
15  0.00    0.000000           0         8           mmap
16  0.00    0.000000           0         1           munmap
17  0.00    0.000000           0         1           brk
18  0.00    0.000000           0         4           pread
19  0.00    0.000000           0         1           uname
20  0.00    0.000000           0        10           fcntl
21  0.00    0.000000           0         1           getrlimit
22  0.00    0.000000           0       164           getrusage
23  0.00    0.000000           0        34           times
24  0.00    0.000000           0         1           getuid
25------ ----------- ----------- --------- --------- ----------------
26100.00    0.013032                  1807         7 total

SALES32M

1[root@rico ~]# strace -c -p 11208
2Process 11208 attached - interrupt to quit
3^CProcess 11208 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 54.18    0.008278          11       746           io_submit
7 45.82    0.007000           9       746           io_getevents
8  0.00    0.000000           0         2           read
9  0.00    0.000000           0         2           write
10  0.00    0.000000           0         1           open
11  0.00    0.000000           0         1           close
12  0.00    0.000000           0         1           pread
13  0.00    0.000000           0         3           fcntl
14  0.00    0.000000           0         1           getrlimit
15  0.00    0.000000           0       164           getrusage
16  0.00    0.000000           0        28           times
17------ ----------- ----------- --------- --------- ----------------
18100.00    0.015278                  1695           total

SALES1M

1[root@rico ~]# strace -c -p 11208
2Process 11208 attached - interrupt to quit
3^CProcess 11208 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 65.36    0.010000          13       749           io_getevents
7 34.64    0.005300           7       746           io_submit
8  0.00    0.000000           0         3           read
9  0.00    0.000000           0         2           write
10  0.00    0.000000           0         2           open
11  0.00    0.000000           0         2           close
12  0.00    0.000000           0         1           pread
13  0.00    0.000000           0         1           semctl
14  0.00    0.000000           0         3           fcntl
15  0.00    0.000000           0         1           getrlimit
16  0.00    0.000000           0       164           getrusage
17  0.00    0.000000           0        28           times
18  0.00    0.000000           0         2           semtimedop
19------ ----------- ----------- --------- --------- ----------------
20100.00    0.015300                  1704           total

SALES4M

1[root@rico ~]# strace -c -p 11208
2Process 11208 attached - interrupt to quit
3^CProcess 11208 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 50.10    0.006000           8       746           io_getevents
7 49.90    0.005975           8       746           io_submit
8  0.00    0.000000           0         2           read
9  0.00    0.000000           0         2           write
10  0.00    0.000000           0         1           pread
11  0.00    0.000000           0        21           getrusage
12  0.00    0.000000           0        12           times
13------ ----------- ----------- --------- --------- ----------------
14100.00    0.011975                  1530           total

So EVERYTHING is being read with ASYNC IO!!! And this not the craziest thing – the other sessions are also using ASYNC IO for those tables now. Without altering the parameter. I have tried to flush buffer_cache and shared_pool, but with no success. The situation goes back to square one after restarting the instance.

I occurs that it takes only one session to set the parameter to any value – even the same value as it already was by default. And when the first query will use db file scattered read in async mode – all the other sessions, querying tables on ASM will also use io_getevents and io_submit instead of pread. (!!!)

1[oracle@rico ~]$ sqlplus sh/sh
2 
3SQL*Plus: Release 12.1.0.2.0 Production on Fri Jul 17 10:59:22 2015
4 
5Copyright (c) 1982, 2014, Oracle.  All rights reserved.
6 
7Last Successful login time: Fri Jul 17 2015 10:56:53 +02:00
8 
9Connected to:
10Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
11With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
12and Real Application Testing options
13 
14SQL> alter session set db_file_multiblock_read_count=128;
15 
16Session altered.
17 
18SQL> select count(1) from sales16m;
19 
20  COUNT(1)
21----------
22  19295703
23 
24SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
25With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
26and Real Application Testing options
27[oracle@rico ~]$ sqlplus sh/sh
28 
29SQL*Plus: Release 12.1.0.2.0 Production on Fri Jul 17 10:59:51 2015
30 
31Copyright (c) 1982, 2014, Oracle.  All rights reserved.
32 
33Last Successful login time: Fri Jul 17 2015 10:59:22 +02:00
34 
35Connected to:
36Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
37With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
38and Real Application Testing options
39 
40SQL> select count(1) from SALES4M;
41 
42  COUNT(1)
43----------
44  19295703
45 
46SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
47With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
48and Real Application Testing options
49[oracle@rico ~]$ sqlplus sh/sh
50 
51SQL*Plus: Release 12.1.0.2.0 Production on Fri Jul 17 11:00:43 2015
52 
53Copyright (c) 1982, 2014, Oracle.  All rights reserved.
54 
55Last Successful login time: Fri Jul 17 2015 10:59:51 +02:00
56 
57Connected to:
58Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
59With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
60and Real Application Testing options
61 
62SQL> select count(1) from sales32m;
63 
64  COUNT(1)
65----------
66  19295703

The appropriate strace results are:

1[root@rico ~]# strace -c -p 11782
2Process 11782 attached - interrupt to quit
3^CProcess 11782 detached
4% time     seconds  usecs/call     calls    errors syscall
5------ ----------- ----------- --------- --------- ----------------
6 53.97    0.006000           8       747           io_getevents
7 45.60    0.005069           7       747           io_submit
8  0.43    0.000048           1        34           times
9  0.00    0.000000           0        31           read
10  0.00    0.000000           0         2           write
11  0.00    0.000000           0        16         7 open
12  0.00    0.000000           0         9           close
13  0.00    0.000000           0         1           stat
14  0.00    0.000000           0         1           fstat
15  0.00    0.000000           0        28           lseek
16  0.00    0.000000           0         7           mmap
17  0.00    0.000000           0         1           munmap
18  0.00    0.000000           0         1           brk
19  0.00    0.000000           0         4           pread
20  0.00    0.000000           0         1           uname
21  0.00    0.000000           0        10           fcntl
22  0.00    0.000000           0         1           getrlimit
23  0.00    0.000000           0       164           getrusage
24  0.00    0.000000           0         1           getuid
25------ ----------- ----------- --------- --------- ----------------
26100.00    0.011117                  1806         7 total
27[root@rico ~]# strace -c -p 11793
28Process 11793 attached - interrupt to quit
29^CProcess 11793 detached
30% time     seconds  usecs/call     calls    errors syscall
31------ ----------- ----------- --------- --------- ----------------
32 51.71    0.007000          10       731           io_getevents
33 48.29    0.006536           9       731           io_submit
34  0.00    0.000000           0        32           read
35  0.00    0.000000           0         2           write
36  0.00    0.000000           0        17         7 open
37  0.00    0.000000           0        10           close
38  0.00    0.000000           0         1           stat
39  0.00    0.000000           0         1           fstat
40  0.00    0.000000           0        28           lseek
41  0.00    0.000000           0         9           mmap
42  0.00    0.000000           0         1           munmap
43  0.00    0.000000           0         1           brk
44  0.00    0.000000           0        17           pread
45  0.00    0.000000           0         1           uname
46  0.00    0.000000           0         1           semctl
47  0.00    0.000000           0        10           fcntl
48  0.00    0.000000           0         1           getrlimit
49  0.00    0.000000           0       209           getrusage
50  0.00    0.000000           0        50           times
51  0.00    0.000000           0         1           getuid
52  0.00    0.000000           0         2           semtimedop
53------ ----------- ----------- --------- --------- ----------------
54100.00    0.013536                  1856         7 total
55[root@rico ~]# strace -c -p 11823
56Process 11823 attached - interrupt to quit
57^CProcess 11823 detached
58% time     seconds  usecs/call     calls    errors syscall
59------ ----------- ----------- --------- --------- ----------------
60 51.65    0.005342           7       731           io_submit
61 48.35    0.005000           7       731           io_getevents
62  0.00    0.000000           0        31           read
63  0.00    0.000000           0         2           write
64  0.00    0.000000           0        16         7 open
65  0.00    0.000000           0         9           close
66  0.00    0.000000           0         1           stat
67  0.00    0.000000           0         1           fstat
68  0.00    0.000000           0        28           lseek
69  0.00    0.000000           0         9           mmap
70  0.00    0.000000           0         1           munmap
71  0.00    0.000000           0         1           brk
72  0.00    0.000000           0        17           pread
73  0.00    0.000000           0         1           uname
74  0.00    0.000000           0        10           fcntl
75  0.00    0.000000           0         1           getrlimit
76  0.00    0.000000           0       209           getrusage
77  0.00    0.000000           0        50           times
78  0.00    0.000000           0         1           getuid
79------ ----------- ----------- --------- --------- ----------------
80100.00    0.010342                  1850         7 total

Results from perf seams to confirm this strange behaviour.

Before setting the parameter:

1[root@rico ~]# perf record -p 13521
2^C[ perf record: Woken up 1 times to write data ]
3[ perf record: Captured and wrote 0.061 MB perf.data (~2644 samples) ]
4 
5[root@rico ~]# perf report | grep aio
6     0.06%  oracle_13521_or  [kernel.kallsyms]   [k] do_aio_read                    
7[root@rico ~]# perf report | grep pread
8     0.06%  oracle_13521_or  libpthread-2.12.so  [.] __pread_nocancel 

After setting the parameter:

1[root@rico ~]# perf record -p 13573
2^C[ perf record: Woken up 1 times to write data ]
3[ perf record: Captured and wrote 0.061 MB perf.data (~2680 samples) ]
4 
5[root@rico ~]# perf report | grep pread
6[root@rico ~]# perf report | grep aio
7     0.18%  oracle_13573_or  libaio.so.1.0.1     [.] 0x0000000000000615             
8     0.06%  oracle_13573_or  oracle              [.] ksfdaio                        
9     0.06%  oracle_13573_or  [kernel.kallsyms]   [k] aio_put_req                    
10     0.06%  oracle_13573_or  [kernel.kallsyms]   [k] aio_run_iocb       

I made tests on: Oracle 12.1.0.2 EE (OEL 6.6) and Oracle 11.2.0.3 EE (OEL 6.3)

So the conclusion is: WTF?!


Contact us

Database Whisperers sp. z o. o. sp. k.
al. Jerozolimskie 200, 3rd floor, room 342
02-486 Warszawa
NIP: 5272744987
REGON:362524978
+48 508 943 051
+48 661 966 009
info@ora-600.pl

Newsletter Sign up to be updated