Oracle SPARC M7 – tracing DAX with DTrace and busstat


27.01.2016
by Kamil Stawiarski

Thanks to Oracle I had a possibility to test the new Sparc M7 with DAX coprocessors to boost In-Memory performance.
You can read about it here and here

My first thought was – how to check if and when the DAX coprocessors are being used?

When you have a POC for Exadata, you want to check wait events for CELL SMART SCAN, to be sure that query offloading occurred.
Unfortunately there are no DAX-specific wait events, that you can check at the database level – so how will we know that DAX coprocessors are working? Should we just believe that queries are being offloaded? Well… there’s always DTrace 😉

It’s time for testing! I have three tables, loaded into In-Memory buffer:

1SQL> ;
2  1  select INMEMORY_SIZE/1024/1024 as im_size, bytes/1024/1024 as p_size, segment_name
3  2  from v$im_segments
4  3  where segment_name in ('LINEORDER','CUSTOMER','SUPPLIER')
5  4* and   owner='SSB'
6SQL> /
7 
8   IM_SIZE     P_SIZE SEGMENT_NAME
9---------- ---------- ------------------------------
10  412.1875 1148.03906 CUSTOMER
11  63580.75 202806.672 LINEORDER
12   31.1875  70.484375 SUPPLIER

Let’s start with something simple – what modules/libraries are used, when querying a table stored in memory?
I will use this DTrace script

1#!/usr/sbin/dtrace -s
2pid$1:::entry
3{
4        @[probemod]=count();
5}

to trace the server process, which will execute this query:

1SQL> ;
2  1  select /*+ NOPARALLEL */ sum(LO_QUANTITY)
3  2  from lineorder
4  3  where lo_tax between 1 and 10
5  4* and LO_SHIPMODE like '%MAIL%'

and the results look like this:

1root@m7:~/ks# ./modules.d 18419
2dtrace: script './modules.d' matched 229096 probes
3^C
4 
5  libm.so.2                                                        16
6  LM1`ld.so.1                                                     846
7  libdax_query.so.1                                             53925
8  libshpkm712.so                                                66563
9  libdax.so.1                                                  146204
10  libc.so.1                                                    511366
11  oracle                                                      4763677

WOW – it’s great! We can see that Oracle is using two DAX-specific libraries – libdax.so.1 and libdax_query.so.1.
Those libraries can be found in this path: /usr/lib/sparcv9/
After closer examination we also find out that libdax_query.so.1 contains functions for offloading OZIP operations – that’s why DAX coprocessors are most effective with MEMCOMPRESS FOR QUERY HIGH algorithm.

1root@m7:~/ks# nm /usr/lib/sparcv9/libdax_query.so.1 | grep ozip
2[202]   |               25220|                 472|FUNC |LOCL |0    |15     |check_ozip_hdr
3[224]   |               40688|                 660|FUNC |LOCL |0    |15     |extract_ozip
4[221]   |               40088|                 600|FUNC |LOCL |0    |15     |unpack_ozip

Another great finding was that the fbt provider, provides functions for tracing dax at system level.

1root@m7:~/ks# dtrace -m fbt:dax -l
2   ID   PROVIDER            MODULE                          FUNCTION NAME
378803        fbt               dax                    dax_hma_unload entry
478804        fbt               dax                    dax_hma_unload return
578805        fbt               dax             dax_hma_unload_finish entry
678806        fbt               dax             dax_hma_unload_finish return
778807        fbt               dax                      dax_hma_exit entry
878808        fbt               dax                      dax_hma_exit return
978809        fbt               dax                          dax_open entry
1078810        fbt               dax                          dax_open return
1178811        fbt               dax                         dax_close entry
1278812        fbt               dax                         dax_close return
1378813        fbt               dax                         dax_ioctl entry
1478814        fbt               dax                         dax_ioctl return
1578815        fbt               dax                        dax_devmap entry
1678816        fbt               dax                        dax_devmap return
1778817        fbt               dax                       dax_getinfo entry
1878818        fbt               dax                       dax_getinfo return
1978819        fbt               dax                        dax_attach entry
2078820        fbt               dax                        dax_attach return
2178821        fbt               dax                        dax_detach entry
2278822        fbt               dax                        dax_detach return
2378823        fbt               dax                    dax_devmap_map entry
2478824        fbt               dax                    dax_devmap_map return
2578825        fbt               dax                 dax_devmap_access entry
2678826        fbt               dax                 dax_devmap_access return
2778827        fbt               dax                    dax_devmap_dup entry
2878828        fbt               dax                    dax_devmap_dup return
2978829        fbt               dax                  dax_devmap_unmap entry
3078830        fbt               dax                  dax_devmap_unmap return
3178831        fbt               dax                   dax_minor_alloc entry
3278832        fbt               dax                   dax_minor_alloc return
3378833        fbt               dax                    dax_minor_free entry
3478834        fbt               dax                    dax_minor_free return
3578835        fbt               dax                     dax_minor_get entry
3678836        fbt               dax                     dax_minor_get return
3778837        fbt               dax                    dax_minor_rele entry
3878838        fbt               dax                    dax_minor_rele return
3978839        fbt               dax                    dax_minor_init entry
4078840        fbt               dax                    dax_minor_init return
4178841        fbt               dax                 dax_hsvc_register entry
4278842        fbt               dax                 dax_hsvc_register return
4378843        fbt               dax               dax_hsvc_unregister entry
4478844        fbt               dax               dax_hsvc_unregister return
4578845        fbt               dax                dax_tsd_destructor entry
4678846        fbt               dax                dax_tsd_destructor return
4778847        fbt               dax         dax_set_suspend_callbacks entry
4878848        fbt               dax         dax_set_suspend_callbacks return
4978849        fbt               dax       dax_unset_suspend_callbacks entry
5078850        fbt               dax       dax_unset_suspend_callbacks return
5178851        fbt               dax                       dax_hv_init entry
5278852        fbt               dax                       dax_hv_init return
5378853        fbt               dax                      dax_hma_init entry
5478854        fbt               dax                      dax_hma_init return
5578855        fbt               dax                      dax_hma_fini entry
5678856        fbt               dax                      dax_hma_fini return
5778857        fbt               dax                  dax_state_create entry
5878858        fbt               dax                  dax_state_create return
5978859        fbt               dax       dax_sfmmu_hash_remove_state entry
6078860        fbt               dax       dax_sfmmu_hash_remove_state return
6178861        fbt               dax                 dax_state_destroy entry
6278862        fbt               dax                 dax_state_destroy return
6378863        fbt               dax            dax_ioctl_ccb_thr_init entry
6478864        fbt               dax            dax_ioctl_ccb_thr_init return
6578865        fbt               dax            dax_ioctl_ccb_thr_fini entry
6678866        fbt               dax            dax_ioctl_ccb_thr_fini return
6778867        fbt               dax              dax_ioctl_ca_dequeue entry
6878868        fbt               dax              dax_ioctl_ca_dequeue return
6978869        fbt               dax                      dax_ctx_fini entry
7078870        fbt               dax                      dax_ctx_fini return
7178871        fbt               dax                     dax_ctx_alloc entry
7278872        fbt               dax                     dax_ctx_alloc return
7378873        fbt               dax              dax_state_add_thread entry
7478874        fbt               dax              dax_state_add_thread return
7578875        fbt               dax               dax_state_add_sfmmu entry
7678876        fbt               dax               dax_state_add_sfmmu return
7778877        fbt               dax          dax_sfmmu_hash_add_state entry
7878878        fbt               dax          dax_sfmmu_hash_add_state return
7978879        fbt               dax                    dax_hma_enable entry
8078880        fbt               dax                    dax_hma_enable return
8178881        fbt               dax           dax_state_remove_thread entry
8278882        fbt               dax           dax_state_remove_thread return
8378883        fbt               dax       dax_ctx_flush_decommit_ccbs entry
8478884        fbt               dax       dax_ctx_flush_decommit_ccbs return
8578885        fbt               dax      dax_validate_ca_dequeue_args entry
8678886        fbt               dax      dax_validate_ca_dequeue_args return
8778887        fbt               dax   dax_ccb_buffer_get_contig_block entry
8878888        fbt               dax   dax_ccb_buffer_get_contig_block return
8978889        fbt               dax           dax_ccb_buffer_decommit entry
9078890        fbt               dax           dax_ccb_buffer_decommit return
9178891        fbt               dax                      dax_ccb_info entry
9278892        fbt               dax                      dax_ccb_info return
9378893        fbt               dax                      dax_ccb_kill entry
9478894        fbt               dax                      dax_ccb_kill return
9578895        fbt               dax                dax_ctx_flush_ccbs entry
9678896        fbt               dax                dax_ctx_flush_ccbs return
9778897        fbt               dax              dax_ccb_flush_contig entry
9878898        fbt               dax              dax_ccb_flush_contig return
9978899        fbt               dax                      dax_ccb_wait entry
10078900        fbt               dax                      dax_ccb_wait return
10178901        fbt               dax         dax_sfmmu_hash_ent_create entry
10278902        fbt               dax         dax_sfmmu_hash_ent_create return
10378903        fbt               dax      dax_sfmmu_hash_ent_add_state entry
10478904        fbt               dax      dax_sfmmu_hash_ent_add_state return
10578905        fbt               dax   dax_sfmmu_hash_ent_remove_state entry
10678906        fbt               dax   dax_sfmmu_hash_ent_remove_state return
10778907        fbt               dax       dax_sfmmu_hash_remove_sfmmu entry
10878908        fbt               dax       dax_sfmmu_hash_remove_sfmmu return
10978909        fbt               dax            dax_state_remove_sfmmu entry
11078910        fbt               dax            dax_state_remove_sfmmu return
11178911        fbt               dax        dax_sfmmu_hash_ent_destroy entry
11278912        fbt               dax        dax_sfmmu_hash_ent_destroy return
11378913        fbt               dax      dax_sfmmu_wait_matching_ccbs entry
11478914        fbt               dax      dax_sfmmu_wait_matching_ccbs return
11578915        fbt               dax               dax_thr_search_ccbs entry
11678916        fbt               dax               dax_thr_search_ccbs return
11778917        fbt               dax                  dax_hma_ccb_wait entry
11878918        fbt               dax                  dax_hma_ccb_wait return
11978919        fbt               dax                    dax_ccb_search entry
12078920        fbt               dax                    dax_ccb_search return
12178921        fbt               dax             dax_ccb_search_contig entry
12278922        fbt               dax             dax_ccb_search_contig return
12378923        fbt               dax               dax_ccb_contains_va entry
12478924        fbt               dax               dax_ccb_contains_va return
12578925        fbt               dax                dax_hma_save_state entry
12678926        fbt               dax                dax_hma_save_state return
12778927        fbt               dax                 dax_ccb_flush_all entry
12878928        fbt               dax                 dax_ccb_flush_all return
12978929        fbt               dax             dax_sfmmu_hash_walker entry
13078930        fbt               dax             dax_sfmmu_hash_walker return
13178931        fbt               dax              dax_sfmmu_flush_ccbs entry
13278932        fbt               dax              dax_sfmmu_flush_ccbs return
13378933        fbt               dax               dax_error_decode_cb entry
13478934        fbt               dax               dax_error_decode_cb return
13578935        fbt               dax                dax_pre_suspend_cb entry
13678936        fbt               dax                dax_pre_suspend_cb return
13778937        fbt               dax               dax_post_suspend_cb entry
13878938        fbt               dax               dax_post_suspend_cb return
13978939        fbt               dax                             _info entry

After some investigation, I’ve noticed that most functions responsible for In-Memory execution are called kdmo*, kdzdcol*, kdml* and *ozip*. So I’ve written a short DTrace script to reveal all the functions names and execution times (nanosec) used in my previous query:

1#!/usr/sbin/dtrace -s
2#pragma D option quiet
3 
4fbt:dax::entry,
5pid$1::kdmo*:entry,
6pid$1::kdzdcol*:entry,
7pid$1::kdml*:entry,
8pid$1::*ozip*:entry,
9pid$1::*dax*:entry
10/pid==$1/
11{
12        self->t=timestamp;
13}
14fbt:dax::return,
15pid$1::kdmo*:return,
16pid$1::kdzdcol*:return,
17pid$1::kdml*:return,
18pid$1::*ozip*:return,
19pid$1::*dax*:return
20/pid==$1 && self->t!=0/
21{
22        @ftimes[probemod,probefunc]=sum(timestamp-self->t);
23        self->t=0;
24}
25END
26{
27        printa("%s:%-40s %@d\n",@ftimes);
28}

Here are the results:

1root@m7:~/ks# ./dax_inmem_f.d 18746
2^C
3dax:dax_hma_enable                           1200
4dax:dax_devmap_map                           1380
5dax:dax_state_add_sfmmu                      1395
6dax:dax_devmap_access                        1500
7dax:dax_state_create                         1635
8dax:dax_sfmmu_hash_ent_add_state             1650
9dax:dax_state_add_thread                     1680
10dax:dax_sfmmu_hash_ent_create                2115
11dax:dax_devmap                               2340
12dax:dax_getinfo                              2415
13oracle:kdmoCostOverheadIMCScan                  3915
14oracle:kdmoCostTraversalPerIMCU                 4065
15oracle:kdmoCostIOPerIMCU                        4110
16oracle:kdmoCostMinMaxPerCU                      4140
17oracle:kdmoCostRowTxnJournal                    4140
18oracle:kdmoCostRowStitchPerCol                  4215
19dax:dax_minor_get                            4605
20oracle:kdmoInvalidInMemRowsFrn                  5055
21oracle:kdzdcol_sizeof                           5190
22libdax.so.1:dax_add_queue                            6045
23oracle:kdmoCacheDecompKeyGenCB                  6315
24oracle:kdmoCacheTabKeyCompCB                    7230
25oracle:kdmoCachePredEvalKeyGenCB                8010
26oracle:kdmoMinMaxPruningPossible                8025
27oracle:kdmoCacheMinMaxPruningKeyGenCB           8565
28dax:dax_ctx_alloc                            8655
29dax:dax_hma_unload                           9915
30oracle:kdmoZeroSegStats                         12690
31oracle:kdmoCacheTabKeyGenCB                     15270
32dax:dax_ccb_contains_va                      57000
33dax:dax_ccb_buffer_decommit                  339030
34dax:dax_ccb_buffer_get_contig_block          362925
35dax:dax_validate_ca_dequeue_args             381630
36dax:dax_ioctl                                486945
37oracle:kdzdcol_skip_slots_rowid                 6434055
38oracle:kdzdcol_has_agg_rowset                   7371105
39oracle:kdzdcol_init_rowid                       7478415
40oracle:kdzdcol_eva_likefast_imc_dict            8026725
41oracle:kdzdcol_has_eva                          9398745
42libdax.so.1:dax_query_copy_scan_operand              10169670
43oracle:kdzdcol_agg_cols_imc_dict_fini_agg       10349370
44oracle:kdmoInitSegStatsInt                      11979975
45oracle:kdmoMinMaxProcessOnePred                 13855200
46oracle:kdzdcol_init_unsep                       13935975
47oracle:kdzdcol_reset_unsep                      18861735
48oracle:kdzdcol_get_vals_imc_dict                19224690
49oracle:kdzdcol_isnull_imc_dict                  21221775
50oracle:kdzdcol_dict_binary_search               22327710
51oracle:kdzdcol_skip_slots                       23420385
52libdax.so.1:dax_submit                               25563135
53oracle:kdzdcol_end_imc                          27531735
54oracle:kdzdcol_end                              27557565
55libdax_query.so.1:dax_advantage                            27981420
56libdax_query.so.1:dax_valid                                28049730
57libdax_query.so.1:dax_available                            28829775
58libdax.so.1:dax_ccbca_mark_read                      28991550
59oracle:kdzdcol_dbas_create_valid_vec            29668080
60oracle:kdzdcol_init_imc_dict                    31321560
61oracle:kdzdcol_reset                            35010705
62oracle:kdzdcol_init                             35573295
63oracle:kdzdcol_get_minval                       40057785
64oracle:kdzdcol_prep                             45411450
65oracle:kdmoDecompGetRate                        46526115
66oracle:kdzdcol_has_nulls                        55654635
67oracle:kdzdcol_get_maxval                       59688045
68oracle:kdmoGetHeaderColCU                       64548000
69oracle:kdmlGetColCUPointer                      64854240
70oracle:kdzdcol_get_vals_unsep_one               76925550
71libdax.so.1:dax_log_msg                              80138355
72oracle:kdzdcol_skip_slots_unsep                 554166045

We can also trace DAX usage at the system level, using just DAX module from FBT provider – user stack examination can give us some information about functions which were offloaded. To do that, we can use this DTrace scirpt:

1#!/usr/sbin/dtrace -s
2#pragma D option quiet
3 
4fbt:dax::entry
5{
6        self->t=timestamp;
7}
8fbt:dax::return
9/self->t!=0/
10{
11        @times[ustack(100,500),probemod,probefunc]=quantize(timestamp-self->t);
12        self->t=0;
13}
14tick-1s
15{
16        trunc(@times,10);
17        printa(@times);
18        trunc(@times);
19}
20 
21END
22{
23        printa(@times);
24        trunc(@times);
25}

Sample results will look like this:

1dax                                                 dax_ioctl
2           value  ------------- Distribution ------------- count
3            4096 |                                         0
4            8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
5           16384 |                                         0
6 
7 
8 
9              libc.so.1`ioctl+0x8
10              libdax.so.1`dax_read_results+0x264
11              libdax_query.so.1`dax_query_execute+0x160
12              libdax_query.so.1`extract_ozip+0x210
13              libdax_query.so.1`dax_extract+0x1a0
14              libshpkm712.so`kdzk_ozip_decode_dydi+0x8c
15              oracle`kdzk_ozip_decode+0x100
16  dax                                                 dax_ccb_buffer_get_contig_block
17           value  ------------- Distribution ------------- count
18            1024 |                                         0
19            2048 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
20            4096 |                                         0
21 
22 
23              libc.so.1`ioctl+0x8
24              libdax.so.1`dax_read_results+0x264
25              libdax_query.so.1`dax_query_execute+0x160
26              libdax_query.so.1`extract_ozip+0x210
27              libdax_query.so.1`dax_extract+0x1a0
28              libshpkm712.so`kdzk_ozip_decode_dydi+0x8c
29              oracle`kdzk_ozip_decode+0x100
30  dax                                                 dax_validate_ca_dequeue_args

In the above example we can see that OZIP algorithm was offloaded by DAX coprocessors.

Another way to trace DAX usage is busstat:

1root@m7:~/ks# busstat -e dax
2pic0
3DAX_QRY0_output_valid
4DAX_QRY1_output_valid
5DAX_QRY2_output_valid
6DAX_QRY3_output_valid
7DAX_QRY0_input_valid
8DAX_QRY1_input_valid
9DAX_QRY2_input_valid
10DAX_QRY3_input_valid
11DAX_MSG_MDI_req
12DAX_MSG_MDI_nack
13DAX_MSG_data_xfer
14DAX_MSG_membus_PIO_pull
15DAX_MSG_membus_PIO_req
16DAX_MSG_sequencer_busy_cyc
17DAX_MSG_brkpt_hit
18DAX_STRM_addr_brkpt_hit
19DAX_STRM_qpipe_wr_req
20DAX_STRM_qpipe_rd_req
21DAX_STRM_mpipe_wr_req
22DAX_STRM_mpipe_rd_req
23DAX_STRM_mRTD_RMW_COU
24DAX_STRM_mRTD_RMW_MCU
25DAX_STRM_L3_probe_ack
26DAX_STRM_L3_probe
27DAX_STRM_SCC0_L3_no_credit_cyc
28DAX_STRM_SCC1_L3_no_credit_cyc
29DAX_STRM_SCC2_L3_no_credit_cyc
30DAX_STRM_SCC3_L3_no_credit_cyc
31DAX_STRM_SCC4_L3_no_credit_cyc
32DAX_STRM_SCC5_L3_no_credit_cyc
33DAX_STRM_SCC6_L3_no_credit_cyc
34DAX_STRM_SCC7_L3_no_credit_cyc
35DAX_STRM_COU0_no_wr_credit_cyc
36DAX_STRM_COU1_no_wr_credit_cyc
37DAX_STRM_COU2_no_wr_credit_cyc
38DAX_STRM_COU3_no_wr_credit_cyc
39DAX_STRM_COU0_no_rd_credit_cyc
40DAX_STRM_COU1_no_rd_credit_cyc
41DAX_STRM_COU2_no_rd_credit_cyc
42DAX_STRM_COU3_no_rd_credit_cyc
43DAX_STRM_RMW_req
44DAX_STRM_L3_WRA_req
45DAX_STRM_wr_req
46DAX_STRM_rd_req
47DAX_STRM_L3x_no_credit_cyc
48DAX_STRM_COUx_no_wr_credit_cyc
49DAX_STRM_COUx_no_rd_credit_cyc
50DAX_SCH_pipe_cmd_stall_ops
51DAX_SCH_pipe_cmd_stall_cyc
52DAX_SCH_pipe_cmd_sched
53DAX_SCH_remote_PIO
54DAX_SCH_msgq_full_cyc
55DAX_SCH_queryq_full_cyc
56DAX_SCH_ccb_fetch
57DAX_SCH_msg_cmd_stall_cyc
58DAX_SCH_msg_cmd_sched
59DAX_SCH_query_cmd_stall_cyc
60DAX_SCH_query_cmd_sched
61DAX_no_count
62DAX_all_cyc
63 
64(...SAME OUTPUT FOR PIC1 AND PIC2...)

Those are physical CPU counters that can be used to determine the performance of DAX pipelines. There is no documentation (yet?) for those counters, but thanks to Martin Mueller, I know the meaning of two of them:

  • DAX_SCH_query_cmd_sched (number of query cmds scheduled)
  • DAX_SCH_query_cmd_stall_cyc (# of cycles a query cmd had to stall because of lack resources, a high number would indicate DAX overload)

You can monitor them using this command:

1root@m7:~# busstat -w dax,pic0=DAX_SCH_query_cmd_sched,pic1=DAX_SCH_query_cmd_stall_cyc 5
2time dev                      event0               pic0        event1               pic1        event2               pic2
35    dax0                     DAX_SCH_query_cmd_sched 26          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963253855
45    dax1                     DAX_SCH_query_cmd_sched 26          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963338617
55    dax2                     DAX_SCH_query_cmd_sched 25          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963366741
65    dax3                     DAX_SCH_query_cmd_sched 24          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963394841
75    dax4                     DAX_SCH_query_cmd_sched 21          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963417412
85    dax5                     DAX_SCH_query_cmd_sched 27          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963442430
95    dax6                     DAX_SCH_query_cmd_sched 25          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963464834
105    dax7                     DAX_SCH_query_cmd_sched 28          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8963495126
1110   dax0                     DAX_SCH_query_cmd_sched 38          DAX_SCH_query_cmd_stall_cyc 0           DAX_all_cyc          8961987804

That’s it for now 🙂

P.S.
If you want to learn about dynamic tracing technics for analyzing Oracle behavior – you should read this awesome article by Luca Canali http://db-blog.web.cern.ch/blog/luca-canali/2016-01-linux-perf-probes-oracle-tracing


Contact us

Database Whisperers sp. z o. o. sp. k.
al. Jerozolimskie 200, 3rd floor, room 342
02-486 Warszawa
NIP: 5272744987
REGON:362524978
+48 508 943 051
+48 661 966 009
info@ora-600.pl

Newsletter Sign up to be updated