Exadata – direct path write temp and flash disks


11.02.2016
by Kamil Stawiarski

Some time ago one of my students asked me if temp segments are being written to flash disks on Exadata… Well I wasn’t sure 🙂 But recently I had some time to check it.

Let’s create some query that will generate temp segment:

SQL> ;
  1  with v_cust as
  2  (
  3    select CUST_LAST_NAME, CUST_FIRST_NAME, CUST_CITY,
  4	      count(1) over (partition by cust_city) as cnt
  5    from customers
  6  )
  7  select *
  8  from v_cust
  9* where cnt<=2
SQL> /

no rows selected

SQL> get workarea
  1  select operation_Type, last_tempseg_size/1024/1024
  2  from v$sql_workarea
  3* where sql_id='7c6y1n3tn8rd6'
SQL> /

OPERATION_TYPE		       LAST_TEMPSEG_SIZE/1024/1024
------------------------------ ---------------------------
WINDOW (SORT)					       256
SEGMENT SCAN

Great. Now we have to do some tracing at the cell servers level. The process responsible for all the reads and writes at cells is cellsrv. It’s working in threaded mode, so we will have to follow all forks.

Before we do that, we will save info about all physical disks at each cell to a separate file – we will need this later:

exa1celadm01

[root@exa1celadm01 9317]# cellcli -e list physicaldisk attributes deviceName,diskType > /tmp/cell_disks.out

exa1celadm02

[root@exa1celadm02 9278]# cellcli -e list physicaldisk attributes deviceName,diskType > /tmp/cell_disks.out

exa1celadm03

[root@exa1celadm03 9299]# cellcli -e list physicaldisk attributes deviceName,diskType > /tmp/cell_disks.out

OK. Now we will start the strace at each cell (note that this is my training Exadata and I’m sure that there is no one but me on it right now).

exa1celadm01

[root@exa1celadm01 9317]# cd /proc/9317
[root@exa1celadm01 9317]# strace -fF -o /tmp/cellsrv.trc -p 9317
Process 9317 attached with 139 threads - interrupt to quit

exa1celadm02

[root@exa1celadm02 9278]# cd /proc/9278
[root@exa1celadm02 9278]# strace -fF -o /tmp/cellsrv.trc -p 9278
Process 9278 attached with 139 threads - interrupt to quit

exa1celadm03

[root@exa1celadm03 9299]# cd /proc/9299
[root@exa1celadm03 9299]# strace -fF -o /tmp/cellsrv.trc -p 9299
Process 9299 attached with 139 threads - interrupt to quit

After query execution we can analyze the trace files 🙂

exa1celadm01

[root@exa1celadm01 9317]# grep "pwrite(" /tmp/cellsrv.trc | awk '{print("ls -al task/" $1 "/fd/"substr($2,8,length($2)-8) ) }' | awk '{system($0)}' | awk '{system("grep " $NF " /tmp/cell_disks.out")}' | awk '{names[$0]++;} END {for(dev in names){print dev "\t" names[dev]}}'
	 /dev/sdu 	 FlashDisk	3
	 /dev/sdz 	 FlashDisk	2
	 /dev/sds 	 FlashDisk	3
	 /dev/sdq 	 FlashDisk	3
	 /dev/sdx 	 FlashDisk	3
	 /dev/sdo 	 FlashDisk	3
	 /dev/sdm 	 FlashDisk	2
	 /dev/sdv 	 FlashDisk	3
	 /dev/sdt 	 FlashDisk	2
	 /dev/sdr 	 FlashDisk	2
	 /dev/sdy 	 FlashDisk	3
	 /dev/sdp 	 FlashDisk	2
	 /dev/sdw 	 FlashDisk	3
	 /dev/sdn 	 FlashDisk	3
	 /dev/sdab	 FlashDisk	3
	 /dev/sdaa	 FlashDisk	2

exa1celadm02

[root@exa1celadm02 9278]# grep "pwrite(" /tmp/cellsrv.trc | awk '{print("ls -al task/" $1 "/fd/"substr($2,8,length($2)-8) ) }' | awk '{system($0)}' | awk '{system("grep " $NF " /tmp/cell_disks.out")}' | awk '{names[$0]++;} END {for(dev in names){print dev "\t" names[dev]}}'
	 /dev/sdu 	 FlashDisk	8
	 /dev/sdz 	 FlashDisk	7
	 /dev/sds 	 FlashDisk	8
	 /dev/sdx 	 FlashDisk	7
	 /dev/sdq 	 FlashDisk	7
	 /dev/sdo 	 FlashDisk	7
	 /dev/sdm 	 FlashDisk	7
	 /dev/sdv 	 FlashDisk	7
	 /dev/sdt 	 FlashDisk	7
	 /dev/sdy 	 FlashDisk	7
	 /dev/sdr 	 FlashDisk	8
	 /dev/sdp 	 FlashDisk	7
	 /dev/sdw 	 FlashDisk	7
	 /dev/sdn 	 FlashDisk	7
	 /dev/sdl 	 FlashDisk	7
	 /dev/sdaa	 FlashDisk	7

exa1celadm03

[root@exa1celadm03 9299]# grep "pwrite(" /tmp/cellsrv.trc | awk '{print("ls -al task/" $1 "/fd/"substr($2,8,length($2)-8) ) }' | awk '{system($0)}' | awk '{system("grep " $NF " /tmp/cell_disks.out")}' | awk '{names[$0]++;} END {for(dev in names){print dev "\t" names[dev]}}'
	 /dev/sdu 	 FlashDisk	9
	 /dev/sdz 	 FlashDisk	9
	 /dev/sds 	 FlashDisk	9
	 /dev/sdx 	 FlashDisk	8
	 /dev/sdq 	 FlashDisk	9
	 /dev/sdo 	 FlashDisk	8
	 /dev/sdm 	 FlashDisk	9
	 /dev/sdv 	 FlashDisk	9
	 /dev/sdt 	 FlashDisk	9
	 /dev/sdr 	 FlashDisk	9
	 /dev/sdy 	 FlashDisk	9
	 /dev/sdp 	 FlashDisk	9
	 /dev/sdw 	 FlashDisk	9
	 /dev/sdn 	 FlashDisk	9
	 /dev/sdl 	 FlashDisk	9
	 /dev/sdaa	 FlashDisk	9

We can see that "pwrite" was used only to write on flash disks. I suppose this is the answer we were looking for.


Contact us

Database Whisperers sp. z o. o. sp. k.
al. Jerozolimskie 200, 3rd floor, room 342
02-486 Warszawa
NIP: 5272744987
REGON:362524978
+48 508 943 051
+48 661 966 009
info@ora-600.pl

Newsletter Sign up to be updated