PL/SQL Profiling 4: Custom Code Timing

This article describes the use of a custom code timing PL/SQL package, Timer_Set, on two example program structures. The examples are designed to illustrate its behaviour over as many different scenarios as possible, while keeping the examples as simple as possible. It’s based on an article published in March 2013 on the hierarchical profiler and updated shortly thereafter with the inclusion of Oracle’s older flat profiler and of custom code timing. In June 2020 installation and source code were put onto GitHub, and the article was restructured into an overview article with the detail on the three profiling methods as separate articles, of which this is the third. Here are links to the other three articles:

PL/SQL Profiling 1: Overview
PL/SQL Profiling 2: Hierarchical Profiler
PL/SQL Profiling 3: Flat Profiler

All source code, including installation scripts, is available on GitHub:
GitHub: Oracle PL/SQL Profiling

For this custom code timing demonstration I created a new version of the package used by the the Oracle profiling demos, and driver scripts (prefixed ts_) for the same examples.

These use an ‘object-oriented’ timing package that I wrote a few years ago Timer_Set: Oracle PL/SQL code timing module on GitHub to instrument at procedure and section level. It is often considered good practice to implement timing and other instrumentation permanently in production code.

In both examples a new timer set object is created, calls are made to increment timers within the set, and at the end a report on the timings is written to log. The way the timer set operates in general is illustrated by a diagram taken from the GitHub module:

Setup

The GitHub project linked to above includes scripts for setup of prerequisites such as grants and tables, and for installation of the custom code used for this demonstration. As described in the overview article, there are two example scripts profiled.

  • Example 1: General. This covers a wide range of scenarios
  • Example 2: Sleep. This covers the particular example of DBMS_Lock.Sleep

Timer Set Data Model

A logical data model is shown below. There are no physical tables involved.

Example 1: General

The example was descibed in PL/SQL Profiling 1: Overview. The driver script is shown below:

SET TIMING ON
PROMPT B1: A_Calls_B 
DECLARE
  l_call_count       PLS_INTEGER := 0;
BEGIN
  Timer_Set_Test.Init;
  Timer_Set_Test.A_Calls_B(l_call_count);

END;
/
PROMPT SQL: Static DB function call
SELECT Timer_Set_Test.DBFunc
  FROM DUAL;

PROMPT B2: Static DB function; dynamic SQL; object constructor
DECLARE
  l_cur              SYS_REFCURSOR;
  l_ret_val          PLS_INTEGER;
  l_tab_count        Table_Count_Type;

BEGIN

  SELECT Timer_Set_Test.DBFunc
    INTO l_ret_val
    FROM DUAL;

  OPEN l_cur FOR 'SELECT Count(*) FROM all_tab_columns'; 
  Timer_Set_Test.Increment_Time('Open cursor');
  FETCH l_cur INTO l_ret_val; 
  Timer_Set_Test.Increment_Time('Fetch from cursor');
  CLOSE l_cur;
  Timer_Set_Test.Increment_Time('Close cursor');

  l_tab_count := Table_Count_Type('EMP');
  Timer_Set_Test.Increment_Time('Construct object');

END;
/
PROMPT B3: R_Calls_R; write times
DECLARE
  l_call_count       PLS_INTEGER := 0;
BEGIN

  Timer_Set_Test.R_Calls_R(l_call_count);

  Timer_Set_Test.Write_Times;

END;
/
SET TIMING OFF

The script is structured as an anonymous block, B1, then a stand-alone SQL query, followed by two more anonymous blocks, B2 and B3. The timer set is constructed in the first block within the call:

  Timer_Set_Test.Init

The script then increments timers at several points, again through calls to Timer_Set_Test, while Timer_Set_Test itself has timer calls inside. The results of the timing are listed at the end by the call:

  Timer_Set_Test.Write_Times;

Results for Example 1: General

Timer Set: Timer_Set_Test, Constructed at 27 Jun 2020 07:52:51, written at 07:52:52
===================================================================================
Timer                      Elapsed         CPU       Calls       Ela/Call       CPU/Call
----------------------  ----------  ----------  ----------  -------------  -------------
A_Calls_B, section one        0.01        0.02           2        0.00600        0.01000
A_Calls_B, section two        0.03        0.02           2        0.01450        0.01000
B_Calls_A: 2                  0.03        0.03           1        0.03100        0.03000
B_Calls_A: 4                  0.06        0.06           1        0.06100        0.06000
DBFunc                        0.08        0.06           2        0.03850        0.03000
Open cursor                   0.00        0.00           1        0.00100        0.00000
Fetch from cursor             0.27        0.29           1        0.27400        0.29000
Close cursor                  0.00        0.00           1        0.00000        0.00000
Construct object              0.03        0.01           1        0.02500        0.01000
R_Calls_R                     0.04        0.03           2        0.02000        0.01500
(Other)                       0.00        0.00           1        0.00100        0.00000
----------------------  ----------  ----------  ----------  -------------  -------------
Total                         0.55        0.52          15        0.03673        0.03467
----------------------  ----------  ----------  ----------  -------------  -------------
[Timer timed (per call in ms): Elapsed: 0.00971, CPU: 0.01068]

Notes on Output

  • Calls, CPU and elapsed times have been captured at the section level for A_Calls_B
  • Observe that, while R_Calls_R and A_Calls_B aggregate over all calls, B_Calls_A records values by call; this is implemented simply by including a value that changes with call in the timer name:
      Increment_Time('B_Calls_A: ' || x_call_no);
  • The output shows how much of the elapsed time comes from CPU usage; in particular, note that R_Calls_R calls an inlined procedure Rest_a_While that does a square root operation in a loop to consume CPU time, and we can see that elapsed and CPU times are the same
  • The timer set object is designed to be very low footprint; here 9 statements (calls to Increment_Time), plus a small global overhead, produced 10 result lines, plus associated information
  • The Total line values are calculated using timing differences betwee reporting and construction of the timer set
  • The (Other) line values are calculated using the diffeences between the Total line values and the sums of the specific line values
  • The Timer timed line allows the overhead of the timing itself to be estimated
  • The ‘object-oriented’ approach allows multiple programs to be be timed at multiple levels, without interference between timings
  • Formatting such as column widths and decimal places can be specified as parameters in the reporting API call, and here take the default values

Example 2: Sleep

The example was described in PL/SQL Profiling 1: Overview. The driver script is shown below:

SET TIMING ON
PROMPT B1: Construct timer set; DBMS_Lock.Sleep, 3 + 6; insert to trigger_tab; write timer set
DECLARE
  l_timer_set       PLS_INTEGER;
BEGIN
  l_timer_set := Timer_Set.Construct('Profiling DBMS_Lock.Sleep');
  DBMS_Lock.Sleep(3);
  Timer_Set.Increment_Time(l_timer_set, '3 second sleep');

  INSERT INTO trigger_tab VALUES (2, 0.5);
  Timer_Set.Increment_Time(l_timer_set, 'INSERT INTO trigger_tab VALUES (2, 0.5)');

  DBMS_Lock.Sleep(6);
  Timer_Set.Increment_Time(l_timer_set, '6 second sleep');
  Utils.W(Timer_Set.Format_Results(l_timer_set));

END;
/
SET TIMING OFF

The script constructs a timer set, then makes calls to a system procedure, DBMS_Lock.Sleep, which sleeps without using CPU time, then inserts to a table with a Before Insert trigger that calls a custom sleep procedure, Utils.Sleep. Utils.Sleep itself calls DBMS_Lock.Sleep to do non-CPU sleeping and also runs a mathematical operation in a loop to use CPU time. Timers are incremented after each main call, and the results of the timing are writtten out at the end.

Results for Example 2: Sleep

Timer Set: Profiling DBMS_Lock.Sleep, Constructed at 27 Jun 2020 07:53:00, written at 07:53:11
==============================================================================================
Timer                                       Elapsed         CPU       Calls       Ela/Call       CPU/Call
---------------------------------------  ----------  ----------  ----------  -------------  -------------
3 second sleep                                 3.00        0.00           1        3.00000        0.00000
INSERT INTO trigger_tab VALUES (2, 0.5)        2.00        1.00           1        1.99900        1.00000
6 second sleep                                 6.00        0.00           1        6.00000        0.00000
(Other)                                        0.00        0.00           1        0.00000        0.00000
---------------------------------------  ----------  ----------  ----------  -------------  -------------
Total                                         11.00        1.00           4        2.74975        0.25000
---------------------------------------  ----------  ----------  ----------  -------------  -------------
[Timer timed (per call in ms): Elapsed: 0.00980, CPU: 0.00980]

Notes on Output

  • The two direct calls to the sleep procedure DBMS_Lock.Sleep consumed elapsed time of 3 and 6 seconds, as specified in the calls, but no CPU time
  • The insert statement consumed 2 seconds elapsed time, of which 1 second comes from CPU time, matching the field values, which the trigger uses to determine elapsed time and fraction of CPU time

Timer Set Custom Code Timing Feature Summary

We can summarise the features of Timer Set custom code timing in the following points:

  • Results are organised as tables of measures for lists of timers in one or more sets
  • Results are reported at the level of code section, and timing is aggregated between calls to API methods
  • Measures reported are elapsed times, numbers of calls, and CPU times
  • External program units may be included in the profiling: calls can be timed; internal profiling requires code changes
  • Profiling is performed, after initial setup, by means of instrumentation within the program units to be profiled, including an API call to return results

I applied my Timer_Set code timing package to some demo PL/SQL APIs in a Github module that also demonstrtates logging and unit testing. This is described here: Oracle PL/SQL API Demos Github Module






 

PL/SQL Profiling 3: Flat Profiler

This article describes the use of Oracle’s flat PL/SQL profiler (DBMS_PROFILER) on two example program structures. The examples are designed to illustrate its behaviour over as many different scenarios as possible, while keeping the examples as simple as possible. It’s based on an article published in March 2013 on the hierarchical profiler and updated shortly thereafter with the inclusion of Oracle’s older flat profiler and of custom code timing. In June 2020 installation and source code were put onto GitHub, and the article was restructured into an overview article with the detail on the three profiling methods as separate articles, of which this is the second. Here are links to the other three articles:

PL/SQL Profiling 1: Overview
PL/SQL Profiling 2: Hierarchical Profiler
PL/SQL Profiling 4: Custom Code Timing

All source code, including installation scripts, is available on GitHub:
GitHub: Oracle PL/SQL Profiling
Tweet with recording attached

Setup

The flat profiler setup and use is described in the Oracle document DBMS_PROFILER. The GitHub project linked to above includes scripts for setup of prerequisites such as grants and tables, and for installation of the custom code used for this demonstration. As described in the overview article, there are two example scripts profiled.

  • Example 1: General. This covers a wide range of scenarios
  • Example 2: Sleep. This covers the particular example of DBMS_Lock.Sleep

PL/SQL Flat Profiler: Data Model

Example 1: General

The example was descibed in PL/SQL Profiling 1: Overview. The driver script is shown below:

SET TIMING ON
PROMPT B1: Start profiling; A_Calls_B 
VAR RUN_ID NUMBER
DECLARE
  l_call_count       PLS_INTEGER := 0;
  l_result           PLS_INTEGER;
BEGIN

  l_result := DBMS_Profiler.Start_Profiler(
          run_comment => 'Profile for small test program with recursion',
          run_number  => :RUN_ID);

  Utils.W('Start: profile run id = ' || :RUN_ID);
  Prof_Test.A_Calls_B(l_call_count);

END;
/
PROMPT SQL: Static DB function call
SELECT Prof_Test.DBFunc
  FROM DUAL;

PROMPT B2: Static DB function; dynamic SQL; object constructor
DECLARE
  l_cur              SYS_REFCURSOR;
  l_ret_val          PLS_INTEGER;
  l_tab_count        Table_Count_Type;

BEGIN

  SELECT Prof_Test.DBFunc
    INTO l_ret_val
    FROM DUAL;

  OPEN l_cur FOR 'SELECT Count(*) FROM all_tab_columns'; 
  FETCH l_cur INTO l_ret_val; 
  CLOSE l_cur;

  l_tab_count := Table_Count_Type('EMP');

END;
/
PROMPT B3: R_Calls_R; stop profiling
DECLARE
  l_call_count       PLS_INTEGER := 0;
BEGIN

  Prof_Test.R_Calls_R(l_call_count);

  Utils.W('Stop: result = ' || DBMS_Profiler.Stop_Profiler);
  
END;
/
SET TIMING OFF
@dprof_queries :RUN_ID

The script is structured as an anonymous block, B1, then a stand-alone SQL query, followed by two more anonymous blocks, B2 and B3. Profiling is started by a call to DBMS_Profiler.Start_Profiler in the first block.

The last block stops the profiling. The custom queries are run at the end from the script dprof_queries.sql, passing in the run identifier that’s been saved in a bind variable.

Results for Example 1: General

The results in this section come from the script dprof_queries.sql that queries the tables populated by the profiler.

Example 1: General – Run Header

The record produced in the run table, PLSQL_PROFILER_RUNS, was:

 Run Id Time     Seconds  Microsecs
------- -------- ------- ----------
      1 07:52:12   0.766     766000

Example 1: General – Profiler Data Summaries

Profiler data overall summary (PLSQL_PROFILER_DATA):

Seconds  Microsecs    Calls
------- ---------- --------
  0.731     730945       70

Profiler data summary by unit (PLSQL_PROFILER_DATA):

Unit                 Unit# Seconds  Microsecs    Calls
-------------------- ----- ------- ---------- --------
<anonymous>              1   0.000         47        3
<anonymous>              3   0.000         63        2
<anonymous>              4   0.000        103        2
<anonymous>              5   0.402     401821        6
<anonymous>              7   0.000        103        2
<anonymous>              8   0.000         87        3
PROF_TEST                2   0.309     308722       48
TABLE_COUNT_TYPE         6   0.020      20000        4

Example 1: General – Profiler Data by Unit

The records produced in the functions table, PLSQL_PROFILER_DATA, are listed below in order of unit name, then unit number and line number. The table is joined to PLSQL_PROFILER_UNITS to get the unit name and other details, which are then used to outer-join to the system view ALL_SOURCE to get the line of source code for stored units (i.e. not for anonymous blocks).

Seconds  Microsecs   Min S   Max S    Calls Unit                 Unit# Type            Line# Line Text
------- ---------- ------- ------- -------- -------------------- ----- --------------- ----- ------------------------------------------------------------------
  0.000          0   0.000   0.000        0 <anonymous>              1 ANONYMOUS BLOCK     1
  0.000          0   0.000   0.000        0 <anonymous>              1 ANONYMOUS BLOCK     2
  0.000          2   0.000   0.000        0 <anonymous>              1 ANONYMOUS BLOCK     6
  0.000         26   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK    10
  0.000         18   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK    11
  0.000          1   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK    13
  0.000         63   0.000   0.000        2 <anonymous>              3 ANONYMOUS BLOCK     1
  0.000        103   0.000   0.000        2 <anonymous>              4 ANONYMOUS BLOCK     1
  0.000          8   0.000   0.000        0 <anonymous>              5 ANONYMOUS BLOCK     1
  0.000        139   0.000   0.000        1 <anonymous>              5 ANONYMOUS BLOCK     8
  0.151     150687   0.151   0.151        1 <anonymous>              5 ANONYMOUS BLOCK    12
  0.251     250899   0.251   0.251        1 <anonymous>              5 ANONYMOUS BLOCK    13
  0.000         30   0.000   0.000        1 <anonymous>              5 ANONYMOUS BLOCK    14
  0.000         55   0.000   0.000        1 <anonymous>              5 ANONYMOUS BLOCK    16
  0.000          3   0.000   0.000        1 <anonymous>              5 ANONYMOUS BLOCK    18
  0.000        103   0.000   0.000        2 <anonymous>              7 ANONYMOUS BLOCK     1
  0.000          4   0.000   0.000        0 <anonymous>              8 ANONYMOUS BLOCK     1
  0.000          0   0.000   0.000        1 <anonymous>              8 ANONYMOUS BLOCK     2
  0.000         44   0.000   0.000        1 <anonymous>              8 ANONYMOUS BLOCK     5
  0.000         39   0.000   0.000        1 <anonymous>              8 ANONYMOUS BLOCK     7
  0.000          0   0.000   0.000        0 <anonymous>              8 ANONYMOUS BLOCK     9
  0.000          1   0.000   0.000        0 PROF_TEST                2 PACKAGE BODY        1 PACKAGE BODY Prof_Test AS
  0.000          0   0.000   0.000        1 PROF_TEST                2 PACKAGE BODY       44 g_num NUMBER := 0;
  0.000          3   0.000   0.000        0 PROF_TEST                2 PACKAGE BODY       71 PROCEDURE A_Calls_B(
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       75 x_call_no := x_call_no + 1;
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       76 PRAGMA INLINE (Rest_a_While, 'YES');
  0.017      16691   0.004   0.013        2 PROF_TEST                2 PACKAGE BODY       77 Rest_a_While(1000 * x_call_no);
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       79 PRAGMA INLINE (Rest_a_While, 'YES'); -- Both pragmas are required
  0.034      33515   0.009   0.025        2 PROF_TEST                2 PACKAGE BODY       80 Rest_a_While(2000 * x_call_no);
  0.000          1   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       82 IF x_call_no < 4 THEN
  0.000          1   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       83 B_Calls_A(x_call_no);
  0.000          2   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       86 END A_Calls_B;
  0.000          3   0.000   0.000        0 PROF_TEST                2 PACKAGE BODY       94 PROCEDURE B_Calls_A(
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       98 x_call_no := x_call_no + 1;
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY       99 PRAGMA INLINE (Rest_a_While, 'YES');
  0.126     126418   0.044   0.083        2 PROF_TEST                2 PACKAGE BODY      100 Rest_a_While(5000 * x_call_no);
  0.000          1   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      102 IF x_call_no < 4 THEN
  0.000          0   0.000   0.000        1 PROF_TEST                2 PACKAGE BODY      103 A_Calls_B(x_call_no);
  0.000          1   0.000   0.000        1 PROF_TEST                2 PACKAGE BODY      106 END B_Calls_A;
  0.000          3   0.000   0.000        0 PROF_TEST                2 PACKAGE BODY      114 PROCEDURE R_Calls_R(
  0.000         40   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      118 DBMS_Output.Put_Line('In R_Calls_R, x_call_no = ' || x_call_no);
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      119 x_call_no := x_call_no + 1;
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      120 PRAGMA INLINE (Rest_a_While, 'YES');
  0.041      40688   0.015   0.026        2 PROF_TEST                2 PACKAGE BODY      121 Rest_a_While(3000 * x_call_no);
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      123 IF x_call_no < 2 THEN
  0.000          0   0.000   0.000        1 PROF_TEST                2 PACKAGE BODY      124 R_Calls_R(x_call_no);
  0.000          2   0.000   0.000        1 PROF_TEST                2 PACKAGE BODY      127 END R_Calls_R;
  0.000          3   0.000   0.000        0 PROF_TEST                2 PACKAGE BODY      134 FUNCTION DBFunc
  0.000          0   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      138 PRAGMA INLINE (Rest_a_While, 'YES');
  0.091      91346   0.042   0.049        2 PROF_TEST                2 PACKAGE BODY      139 Rest_a_While(10000);
  0.000          2   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      141 RETURN 99;
  0.000          2   0.000   0.000        2 PROF_TEST                2 PACKAGE BODY      143 END DBFunc;
  0.000          0   0.000   0.000        1 PROF_TEST                2 PACKAGE BODY      145 END Prof_Test;
  0.000          3   0.000   0.000        0 TABLE_COUNT_TYPE         6 TYPE BODY          50 CONSTRUCTOR FUNCTION Table_Count_Type(
  0.000          1   0.000   0.000        1 TABLE_COUNT_TYPE         6 TYPE BODY          54 SELF.search_str := p_search_str;
  0.020      19991   0.020   0.020        1 TABLE_COUNT_TYPE         6 TYPE BODY          55 SELECT Count(*)
  0.000          3   0.000   0.000        1 TABLE_COUNT_TYPE         6 TYPE BODY          60 RETURN;
  0.000          2   0.000   0.000        1 TABLE_COUNT_TYPE         6 TYPE BODY          62 END;

58 rows selected.

Notes on Output

The manual has notes on interpreting output, DBMS_PROFILER Operational Notes, and we can add some notes here on the output above.

Unit Names and Numbers

Unit numbers appear to be consecutive integers generated at run time in the order in which the units are encountered. Unit names for saved source code units such as packages and types are the names of the stored units. Anonymous blocks of PL/SQL are not stored as saved units and so do not have their own names, and are assigned the name <anonymous>, but separate top-level blocks have distinct unit numbers.

Named Units and Source Lines

Named units such as package and type bodies have source code lines accessible in the system view ALL_SOURCE, and these are displayed in the output above. See the query section below for the join code.

Anonymous Blocks

We can't include the source text for anonymous blocks as it is not stored in the database. However, we can generally work out which unit numbers correspond to which blocks manually, and then use the line numbers to identify the corresponding source lines, with the line numbers being counted from the first line in the block as line 1. For example, line 11 in the output above for unit 1 used about 1ms and can be seen to correspond to the 11'th line in the block labelled B1 in the driver script:

  Prof_Test.A_Calls_B(l_call_count);
PRAGMA INLINE

Calls to the Rest_a_While procedure in package Prof_Test are preceded by the inline pragma that causes the procedure code to be inlined by the PL/SQL engine. In this case we do not see any references to lines numbers 49-59 in Prof_Test where the procedure is located, only to the calling statements on lines 75, 78, 98, 119 and 137.

Example 2: Sleep

The example was descibed in PL/SQL Profiling 1: Overview. The driver script is shown below:

SET TIMING ON
VAR RUN_ID NUMBER
PROMPT B1: Start profiling; DBMS_Lock.Sleep, 3 + 6; stop profiling
DECLARE
  l_result           PLS_INTEGER;
BEGIN

  l_result := DBMS_Profiler.Start_Profiler(
          run_comment => 'Profile for DBMS_Lock.Sleep example',
          run_number  => :RUN_ID);

  Utils.W('Start: profile run id = ' || :RUN_ID);

  DBMS_Lock.Sleep(3);

  INSERT INTO trigger_tab VALUES (2, 0.5);

  DBMS_Lock.Sleep(6);

  Utils.W('Stop: result = ' || DBMS_Profiler.Stop_Profiler);

END;
/
SET TIMING OFF
@dprof_queries :RUN_ID

The script runs the start profiling procedure, then makes calls to a system procedure, DBMS_Lock.Sleep, which sleeps without using CPU time, then inserts to a table with a Before Insert trigger that calls a custom sleep procedure, Utils.Sleep, and finally calls a custom utility that stops the profiling and analyses the trace file created. Utils.Sleep itself calls DBMS_Lock.Sleep to do non-CPU sleeping and also runs a mathematical operation in a loop to use CPU time. The custom queries are run at the end from the script hprof_queries.sql, passing in the run identifier that's been saved in a bind variable.

Results for Example 2: Sleep

The results in this section come from the script dprof_queries.sql that queries the tables populated by the profiler.

This second example was added after I came across an AskTom post concerning a discrepancy between reported times at the aggregate level and detail levels for the flat profiler. I posted a suggestion that using the hierarchical profiler might resolve the problem Try the hierarchical profiler..., and then added this second example to my original article in 2013 (since extended).

Example 2: Sleep - Run Header

The record produced in the run table, PLSQL_PROFILER_RUNS, was:

 Run Id Time     Seconds  Microsecs
------- -------- ------- ----------
      2 07:52:22  11.000   11000000

Example 2: Sleep - Profiler Data Summaries

Profiler data overall summary (PLSQL_PROFILER_DATA):

Seconds  Microsecs    Calls
------- ---------- --------
  0.000        265        7

Profiler data summary by unit (PLSQL_PROFILER_DATA):

Unit                 Unit# Seconds  Microsecs    Calls
-------------------- ----- ------- ---------- --------
<anonymous>              1   0.000        242        5
SLEEP_BI                 2   0.000         23        2

Example 2: Sleep - Profiler Data by Unit and Line

The records produced in the functions table, PLSQL_PROFILER_DATA, are listed below in order of unit name, then unit number and line number.

Seconds  Microsecs   Min S   Max S    Calls Unit                 Unit# Type            Line# Line Text
------- ---------- ------- ------- -------- -------------------- ----- --------------- ----- ------------------------------------------------------------------
  0.000          0   0.000   0.000        0 <anonymous>              1 ANONYMOUS BLOCK     1
  0.000          2   0.000   0.000        0 <anonymous>              1 ANONYMOUS BLOCK     5
  0.000         23   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK     9
  0.000         10   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK    11
  0.000        199   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK    13
  0.000          3   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK    15
  0.000          5   0.000   0.000        1 <anonymous>              1 ANONYMOUS BLOCK    17
  0.000          0   0.000   0.000        0 <anonymous>              1 ANONYMOUS BLOCK    19
  0.000          2   0.000   0.000        0 SLEEP_BI                 2 TRIGGER             1 TRIGGER sleep_bi
  0.000         20   0.000   0.000        1 SLEEP_BI                 2 TRIGGER             2 BEFORE INSERT
  0.000          1   0.000   0.000        1 SLEEP_BI                 2 TRIGGER             4 FOR EACH ROW

11 rows selected.

Notes on Output

Calls to Units with EXECUTE ONLY Access

The manual states "you cannot use the package to profile units for which EXECUTE ONLY access has been granted". In this example there are calls to two units where this applies: the system package DBMS_Lock, and the custom utility package Utils, which is in a different schema (lib) from the one (app) in which the script is run.

In the output above we can see the lines from which the calls are made but nothing within the units called.

Aggregate/Detail Timing Discrepancy

As was noted in the AskTom thread referenced above, where the flat profiler does not provide data for program units, such as DBMS_Lock.Sleep, the timings at the detail level do not add up to the overall time recorded in the runs table. As there were three calls using elapsed time of 11 seconds in total the total recorded in the runs table is 11 seconds, while this 11 seconds is missing from the detail records, which add up to only 265 microseconds in total.

Queries

The queries are in the script dprof_queries.sql. All queries are for a given RUNID, passed in as a sqlplus parameter.

Run Header Query

SELECT runid,
       To_Char(run_date, 'hh24:mi:ss') run_date,
       Round(run_total_time/1000000000, 3) seconds,
       Round(run_total_time/1000, 0)  micro_s
  FROM plsql_profiler_runs
 WHERE runid = &RUN_ID
  • RUN_TOTAL_TIME in nanoseconds is converted to seconds and microseconds

Profiler Data Overall Summary Query

SELECT Round(Sum(dat.total_time/1000000000), 3) seconds,
       Round(Sum(dat.total_time/1000), 0)  micro_s,
       Sum(dat.total_occur) calls
  FROM plsql_profiler_data dat
 WHERE dat.runid = &RUN_ID
  • This query sums the times and calls in PLSQL_PROFILER_DATA for the given RUNID

Profiler Data Summary by Unit Query

SELECT unt.unit_name,
       dat.unit_number,
       Round(Sum(dat.total_time/1000000000), 3) seconds,
       Round(Sum(dat.total_time/1000), 0)  micro_s,
       Sum(dat.total_occur) calls
  FROM plsql_profiler_data dat
  JOIN plsql_profiler_units unt
    ON unt.runid = dat.runid
   AND unt.unit_number = dat.unit_number
 WHERE dat.runid = &RUN_ID
 GROUP BY unt.unit_name,
          dat.unit_number
 ORDER BY 1, 2, 3
  • This query sums the times and calls in PLSQL_PROFILER_DATA by unit name and number, for the given RUNID

Lines Query

SELECT Round(dat.total_time/1000000000, 3) seconds,
       Round(dat.total_time/1000, 0)  micro_s,
       Round(dat.min_time/1000000000, 3) min_secs,
       Round(dat.max_time/1000000000, 3) max_secs,
       dat.total_occur calls,
       unt.unit_name,
       dat.unit_number,
       unt.unit_type,
       dat.line#,
       Trim(src.text) text
  FROM plsql_profiler_data dat
  JOIN plsql_profiler_units unt
    ON unt.runid            = dat.runid
   AND unt.unit_number      = dat.unit_number
  LEFT JOIN all_source      src
    ON src.type             != 'ANONYMOUS BLOCK'
   AND src.name             = unt.unit_name 
   AND src.line             = dat.line#
   AND src.owner            = unt.unit_owner
   AND src.type             = unt.unit_type
 WHERE dat.runid            = &RUN_ID
 ORDER BY unt.unit_name, unt.unit_number, dat.line#
  • This query lists the times and calls in PLSQL_PROFILER_DATA, for the given RUNID
  • PLSQL_PROFILER_UNITS is joined on UNIT_NUMBER
  • The view ALL_SOURCE is outer-joined on UNIT_NAME, UNIT_OWNER, UNIT_TYPE, LINE# to give the source line for stored source
  • Anonymous blocks do not have any saved source lines

Flat Profiler Feature Summary

We can summarise the features of the Flat Profiler in the following points:

  • Results are organised as lists of measures by line
  • Results are reported at the line level
  • Measures reported are elapsed times and numbers of calls, but not CPU times
  • External program units may not be included in the profiling (they are included only if the user can debug the unit)
  • Profiling is performed, after initial setup, by means of before and after API calls, followed by querying of results in tables






 

PL/SQL Profiling 2: Hierarchical Profiler

This article describes the use of Oracle’s hierarchical PL/SQL profiler (DBMS_HProf) on two example program structures. The examples are designed to illustrate its behaviour over as many different scenarios as possible, while keeping the examples as simple as possible. It’s based on an article published in March 2013 on the hierarchical profiler and updated shortly thereafter with the inclusion of Oracle’s older flat profiler and of custom code timing. In June 2020 installation and source code were put onto GitHub, and the article was restructured into an overview article with the detail on the three profiling methods as separate articles, of which this is the first. Here are links to the other three articles:

PL/SQL Profiling 1: Overview
PL/SQL Profiling 3: Flat Profiler
PL/SQL Profiling 4: Custom Code Timing

All source code, including installation scripts, is available on GitHub:
GitHub: Oracle PL/SQL Profiling
Tweet with recording attached

Setup

The hierarchical profiler setup and use is described in the Oracle document Using the PL/SQL Hierarchical Profiler. The GitHub project linked to above includes scripts for setup of prerequisites such as grants and tables, and for installation of the custom code used for this demonstration. As described in the overview article, there are two example scripts profiled.

  • Example 1: General. This covers a wide range of scenarios
  • Example 2: Sleep. This covers the particular example of DBMS_Lock.Sleep

PL/SQL Hierarchical Profiler: Data Model

Example 1: General

The example was descibed in PL/SQL Profiling 1: Overview. The driver script is shown below:

SET TIMING ON
PROMPT B1: Start profiling; A_Calls_B 
DECLARE
  l_call_count       PLS_INTEGER := 0;
BEGIN

  HProf_Utils.Start_Profiling;
  Prof_Test.A_Calls_B(l_call_count);

END;
/
PROMPT SQL: Static DB function call
SELECT Prof_Test.DBFunc
  FROM DUAL;

PROMPT B2: Static DB function; dynamic SQL; object constructor
DECLARE
  l_cur              SYS_REFCURSOR;
  l_ret_val          PLS_INTEGER;
  l_tab_count        Table_Count_Type;

BEGIN

  SELECT Prof_Test.DBFunc
    INTO l_ret_val
    FROM DUAL;

  OPEN l_cur FOR 'SELECT Count(*) FROM all_tab_columns'; 
  FETCH l_cur INTO l_ret_val; 
  CLOSE l_cur;

  l_tab_count := Table_Count_Type('EMP');

END;
/
PROMPT B3: R_Calls_R; stop profiling
VAR RUN_ID NUMBER
DECLARE
  l_call_count       PLS_INTEGER := 0;
BEGIN

  Prof_Test.R_Calls_R(l_call_count);

  :RUN_ID := HProf_Utils.Stop_Profiling(
    p_run_comment => 'Profile for small test program with recursion',
    p_filename    => 'hp_example_general.html');
  Utils.W('Run id is ' || :RUN_ID);
END;
/
SET TIMING OFF
@hprof_queries :RUN_ID

The script is structured as an anonymous block, B1, then a stand-alone SQL query, followed by two more anonymous blocks, B2 and B3. Profiling is started by a call to DBMS_Profiler.Start_Profiler in the first block.

The last block includes a call to a custom utility, HProf_Utils.Stop_Profiling, that stops the profiling and analyses the trace file created in two ways:

  • Writing a (standard) HTML report on the results using a DBMS_HProf call
  • Writing the results to standard tables created during installation using a DBMS_HProf call

The custom queries are run at the end from the script hprof_queries.sql, passing in the run identifier that’s been saved in a bind variable.

Results for Example 1: General

The results in this section come from the script hprof_queries.sql that queries the tables populated by the analyse step.

The record produced in the run table, DBMSHP_RUNS, was:

 Run Id Run Time                      Microsecs Seconds Comment
------- ---------------------------- ---------- ------- ------------------------------------------------------------
     12 02-JUN-20 06.30.37.128000 AM     823833   0.820 Profile for small test program with recursion

The records produced in the functions table, DBMSHP_FUNCTION_INFO, were:

Owner Module            Sy Function                  Line# Subtree MicroS Function MicroS Calls
----- ---------------- --- ------------------------- ----- -------------- --------------- -----
APP   HPROF_UTILS        4 STOP_PROFILING               61             26              26     1
APP   PROF_TEST          5 A_CALLS_B                    71         128199            9631     1
APP   PROF_TEST          6 A_CALLS_B@1                  71          87951           27332     1
APP   PROF_TEST          7 B_CALLS_A                    63         118568           30617     1
APP   PROF_TEST          8 B_CALLS_A@1                  63          60619           60619     1
APP   PROF_TEST          9 DBFUNC                      134          62114           62114     2
APP   PROF_TEST         10 R_CALLS_R                   114          27684            9234     1
APP   PROF_TEST         11 R_CALLS_R@1                 114          18428           18426     1
APP   PROF_TEST         12 __pkg_init                    0              3               3     1
APP   TABLE_COUNT_TYPE  13 TABLE_COUNT_TYPE             50         521332              31     1
APP   TABLE_COUNT_TYPE  22 __static_sql_exec_line55     55         521301          521301     1
SYS   DBMS_HPROF        14 STOP_PROFILING              747              0               0     1
SYS   DBMS_HPROF        23 __static_sql_exec_line700   700             67              67     1
SYS   DBMS_OUTPUT       15 GET_LINES                   180             79              79     3
SYS   DBMS_OUTPUT       16 NEW_LINE                    117              0               0     2
SYS   DBMS_OUTPUT       17 PUT                          77             23              23     2
SYS   DBMS_OUTPUT       18 PUT_LINE                    109             24               1     2
                         1 __anonymous_block             0        1818326             590     5
                         2 __plsql_vm                    0        1848468              17     6
                         3 __plsql_vm@1                  0          31993               4     1
                        19 __dyn_sql_exec_line12        12         133537          133537     1
                        20 __sql_fetch_line13           13        1100678         1100678     1
                        21 __static_sql_exec_line8       8          34400            2407     1

23 rows selected.

The “Subtree MicroS” and “Function MicroS” values are the total times in microseconds for the subtree including function, and function-only processing, respectively.

In the “Function” column, as well as procedure and function names in upper case, there are a number of names with a prefix “__”. The meanings of these are outlined in a table in the Oracle document mentioned above and reproduced below:


Notice that some of the function names have a suffix “@1”. These correspond to recursive calls, as will be clear from the network diagram later.

The records produced in the functions parent-child table, DBMSHP_PARENT_CHILD_INFO, (with joins to DBMSHP_FUNCTION_INFO to get parent and child function details) were:

Parent: Module            Sy Function                       Child:  Module            Sy Function                       Subtree MicroS Function MicroS Calls
------- ---------------- --- ------------------------------ ------- ---------------- --- ------------------------------ -------------- --------------- -----
APP     HPROF_UTILS        4 STOP_PROFILING                 SYS     DBMS_HPROF        14 STOP_PROFILING                              0               0     1
APP     PROF_TEST          5 A_CALLS_B                      APP     PROF_TEST          7 B_CALLS_A                              118568           30617     1
APP     PROF_TEST          7 B_CALLS_A                      APP     PROF_TEST          6 A_CALLS_B@1                             87951           27332     1
APP     PROF_TEST         11 R_CALLS_R@1                    SYS     DBMS_OUTPUT       18 PUT_LINE                                    2               0     1
APP     PROF_TEST          6 A_CALLS_B@1                    APP     PROF_TEST          8 B_CALLS_A@1                             60619           60619     1
APP     PROF_TEST         10 R_CALLS_R                      APP     PROF_TEST         11 R_CALLS_R@1                             18428           18426     1
APP     PROF_TEST         10 R_CALLS_R                      SYS     DBMS_OUTPUT       18 PUT_LINE                                   22               1     1
APP     TABLE_COUNT_TYPE  13 TABLE_COUNT_TYPE               APP     TABLE_COUNT_TYPE  22 __static_sql_exec_line55               521301          521301     1
SYS     DBMS_OUTPUT       18 PUT_LINE                       SYS     DBMS_OUTPUT       17 PUT                                        23              23     2
SYS     DBMS_OUTPUT       18 PUT_LINE                       SYS     DBMS_OUTPUT       16 NEW_LINE                                    0               0     2
                           1 __anonymous_block                                        21 __static_sql_exec_line8                 34400            2407     1
                           1 __anonymous_block                                        20 __sql_fetch_line13                    1100678         1100678     1
                           1 __anonymous_block                                        19 __dyn_sql_exec_line12                  133537          133537     1
                           1 __anonymous_block              SYS     DBMS_OUTPUT       15 GET_LINES                                  79              79     3
                           1 __anonymous_block              APP     TABLE_COUNT_TYPE  13 TABLE_COUNT_TYPE                       521332              31     1
                           1 __anonymous_block              APP     PROF_TEST         10 R_CALLS_R                               27684            9234     1
                           3 __plsql_vm@1                   APP     PROF_TEST          9 DBFUNC                                  31989           31989     1
                           2 __plsql_vm                     APP     PROF_TEST          9 DBFUNC                                  30125           30125     1
                           1 __anonymous_block              APP     HPROF_UTILS        4 STOP_PROFILING                             26              26     1
                           2 __plsql_vm                                                1 __anonymous_block                     1818326             590     5
                          21 __static_sql_exec_line8                                   3 __plsql_vm@1                            31993               4     1

21 rows selected.

The “Subtree MicroS” and “Function MicroS” values are the total times in microseconds for the subtree including function, and function-only processing, respectively, for the child function while called from all instances of the parent.

Network Diagrams for Example 1: General

The DBMSHP_PARENT_CHILD_INFO holds parent-child links between functions, while the function detail information is held in DBMSHP_FUNCTION_INFO. The following three columns occur in both tables (descriptions from the Oracle doc):

  • SUBTREE_ELAPSED_TIME – Elapsed time, in microseconds, for subprogram, including time spent in descendant subprograms
  • FUNCTION_ELAPSED_TIME – Elapsed time, in microseconds, for subprogram, excluding time spent in descendant subprograms
  • CALLS – Number of calls to subprogram

In the DBMSHP_FUNCTION_INFO they are aggregates for the subprogram overall, whereas in the DBMSHP_PARENT_CHILD_INFO they are aggregates for the subprogram when called from its parent.

The data in the tables represent a directed network (or ‘digraph’) of nodes and links. A root node is defined as a node that has no parent node, and we may have isolated root nodes that do not appear in the links table. It can sometimes be useful when querying network data to add a dummy node as a parent of all the root nodes, creating a single root for the network and ensuring all the non-dummy nodes appear as child nodes in the adjusted links set. Here is a diagram of the adjusted network for this example, with just the links and the function numbers included:


Notice that there are four (real) root nodes, two of which are isolated. Notice also that there are two loops in the network, whereby nodes (functions) 9 and 19 can be reached by two different paths each. However, there are no cycles since no node is both ancestor and descendant of another.

We can display more of the information the HProf utility produces in extended versions of the network diagram, where we’ll separate the network into its four subnetworks, shown in two diagrams.



This diagram, of the first subnetwork, has the function names and shows the total elapsed times used by the functions within the nodes, and the elapsed times used by the functions when called by their parents within the links.

Loops and Hierarchies

The diagram shows two loops, where there are two routes between the loop start and end points, indicated by different colours. The second loop has two child nodes coming from the end point, and hierarchical queries (both CONNECT BY and recursive subquery factors in Oracle) cause the links to them to be duplicated. We’ll see how to filter out the duplicates in the queries section below.


This diagram is for the other three subnetworks.

Network Query Output for Example 1: General

Function tree                   Sy Owner Module           Inst.  Subtree MicroS Function MicroS Calls Row
------------------------------ --- ----- ---------------- ------ -------------- --------------- ----- ---
__plsql_vm                       2                                      1848468              17     6   1
  __anonymous_block              1                                      1818326             590     5   2
    __sql_fetch_line13          20                                      1100678         1100678     1   3
    TABLE_COUNT_TYPE            13 APP   TABLE_COUNT_TYPE                521332              31     1   4
      __static_sql_exec_line55  22 APP   TABLE_COUNT_TYPE                521301          521301     1   5
    __dyn_sql_exec_line12       19                                       133537          133537     1   6
    __static_sql_exec_line8     21                                        34400            2407     1   7
      __plsql_vm@1               3                                        31993               4     1   8
        DBFUNC                   9 APP   PROF_TEST        1 of 2          31989           31989     1   9
    R_CALLS_R                   10 APP   PROF_TEST                        27684            9234     1  10
      R_CALLS_R@1               11 APP   PROF_TEST                        18428           18426     1  11
        PUT_LINE                18 SYS   DBMS_OUTPUT      1 of 2              2               0     1  12
          PUT                   17 SYS   DBMS_OUTPUT      1 of 2             23              23     2  13
          NEW_LINE              16 SYS   DBMS_OUTPUT      1 of 2              0               0     2  14
      PUT_LINE                  18 SYS   DBMS_OUTPUT      2 of 2             22               1     1  15
    GET_LINES                   15 SYS   DBMS_OUTPUT                         79              79     3  18
    STOP_PROFILING               4 APP   HPROF_UTILS                         26              26     1  19
      STOP_PROFILING            14 SYS   DBMS_HPROF                           0               0     1  20
  DBFUNC                         9 APP   PROF_TEST        2 of 2          30125           30125     1  21
A_CALLS_B                        5 APP   PROF_TEST                       128199            9631     1  22
  B_CALLS_A                      7 APP   PROF_TEST                       118568           30617     1  23
    A_CALLS_B@1                  6 APP   PROF_TEST                        87951           27332     1  24
      B_CALLS_A@1                8 APP   PROF_TEST                        60619           60619     1  25
__static_sql_exec_line700       23 SYS   DBMS_HPROF                          67              67     1  26
__pkg_init                      12 APP   PROF_TEST                            3               3     1  27

25 rows selected.

The report query script hprof_queries.sql queries the network in two ways, using Oracle’s proprietary Connect By syntax, and also using the ANSI standard recursive subquery factoring. The output above comes from the recursive subquery factoring query. The query code is discussed in a later section.

Notes on Output

Recursive Calls – “@1” Suffix

The hierarchical profiler appends a suffix “@1” on to functions called recursively, as in A_CALLS_B@1 and B_CALLS_A@1 which is an example of mutual recursion between A_CALLS_B and B_CALLS_A.

Anonymous Block (__anonymous_block)

This function corresponds to invocations of anonymous blocks, obviously enough. However, there is an apparent anomaly in the number of calls listed, 5, because the driving program has only three such blocks, and there are none in the called PL/SQL code. I would surmise that the apparent discrepancy arises from the enabling of SERVEROUTPUT, which appears to result in a secondary block being associated with each explicit SQL*Plus block, that issues a call to GET_LINES to process buffered output.

PL/SQL Engine (__plsql_vm)

This function corresponds to external invocations of PL/SQL such as from a SQL*Plus session. There are 6 calls, 5 of them presumably being linked with the external anonymous blocks, and the 6’th with DBFUNC, where a PL/SQL function is called from a SQL statement from SQL*Plus.

Notice that the SQL statement calling a database function from within PL/SQL generates the recursive call to the engine, __plsql_vm@1

Package Initialization (__pkg_init)

The Prof_Test package has a package variable g_num that is initialized to 0 on first invocation, which gives rise to the __pkg_init function in the output.

Second Root (A_CALLS_B)

The above function does not have the __plsql_vm/__anonymous_block ancestry that might be expected because profiling only started within the enclosing block.

Inlined Procedure (Rest_a_While)

I wrote a small procedure, Rest_a_While, to generate some elapsed time in the recursive procedures, but preceded it with the INLINE pragma, an optimization feature new in 11g. This had the desired effect of removing the calls from the profiling output and including the times in the calling procedures. Rest_a_While does not make the obvious call to DBMS_Lock.Sleep because that procedure cannot be inlined. subprogram inlining in 11g provides some analysis of the inlining feature.

Link Duplication

As mentioned earlier, hierarchical queries cause links to be duplicated below any loops, as the queries follow all paths. However, in the output above we have a single record for each of the links, including the additional root node dummy links. This is achieved by filtering out the duplicates in an additional subquery. The “Inst.” column lists the instance number of a function having more than one instance, and the children of any such function are only listed once with the gaps in the “Row” column indicating where duplicates have been suppressed (rows 13 and 14 are omitted here).

Example 2: Sleep

The example was descibed in PL/SQL Profiling 1: Overview. The driver script is shown below:

SET TIMING ON
VAR RUN_ID NUMBER
BEGIN

  HProf_Utils.Start_Profiling;
  DBMS_Lock.Sleep(3);

  DBMS_Lock.Sleep(6);

  :RUN_ID := HProf_Utils.Stop_Profiling(
    p_run_comment => 'Profile for DBMS_Lock.Sleep',
    p_filename    => 'hp_example_sleep.html');
  Utils.W('Run id is ' || :RUN_ID);
END;
/
SET TIMING OFF
@hprof_queries :RUN_ID

The script runs the start profiling procedure, then makes calls to a system procedure, DBMS_Lock.Sleep, which sleeps without using CPU time, then inserts to a table with a Before Insert trigger that calls a custom sleep procedure, Utils.Sleep, and finally calls a custom utility that stops the profiling and analyses the trace file created. Utils.Sleep itself calls DBMS_Lock.Sleep to do non-CPU sleeping and also runs a mathematical operation in a loop to use CPU time. The trace file is analysed in two ways:

  • Writing a (standard) HTML report on the results using a DBMS_HProf call
  • Writing the results to standard tables created during installation using a DBMS_HProf call

The custom queries are run at the end from the script hprof_queries.sql, passing in the run identifier that’s been saved in a bind variable.

Results for Example 2: Sleep

The results in this section come from the script hprof_queries.sql that queries the tables populated by the analyse step.

The record produced in the run table, DBMSHP_RUNS, was:

 Run Id Run Time                      Microsecs Seconds Comment
------- ---------------------------- ---------- ------- ------------------------------------------------------------
     22 18-JUN-20 12.49.56.816000 PM   10997855  11.000 Profile for DBMS_Lock.Sleep

The records produced in the functions table, DBMSHP_FUNCTION_INFO, were:

Owner Module            Sy Function                  Line# Subtree MicroS Function MicroS Calls
----- ---------------- --- ------------------------- ----- -------------- --------------- -----
APP   HPROF_UTILS        2 STOP_PROFILING               61             87              87     1
APP   SLEEP_BI           3 SLEEP_BI                      0        1999602             327     1
LIB   UTILS              4 SLEEP                       351        1999268          999837     1
LIB   UTILS              5 __pkg_init                    0              2               2     1
LIB   UTILS              6 __pkg_init                    0              5               5     1
SYS   DBMS_HPROF         7 STOP_PROFILING              747              0               0     1
SYS   DBMS_HPROF        11 __static_sql_exec_line700   700             77              77     1
SYS   DBMS_LOCK          8 SLEEP                       207        9997626         9997626     3
SYS   DBMS_LOCK          9 __pkg_init                    0              3               3     1
                         1 __plsql_vm                    0        1999606               4     1
                        10 __static_sql_exec_line6       6        2005266            5660     1

11 rows selected.

The “Subtree MicroS” and “Function MicroS” values are the total times in microseconds for the subtree including function, and function-only processing, respectively.

The records produced in the functions parent-child table, DBMSHP_PARENT_CHILD_INFO, (with joins to DBMSHP_FUNCTION_INFO to get parent and child function details) were:

Parent: Module            Sy Function                       Child:  Module            Sy Function                       Subtree MicroS Function MicroS Calls
------- ---------------- --- ------------------------------ ------- ---------------- --- ------------------------------ -------------- --------------- -----
APP     HPROF_UTILS        2 STOP_PROFILING                 SYS     DBMS_HPROF         7 STOP_PROFILING                              0               0     1
APP     SLEEP_BI           3 SLEEP_BI                       LIB     UTILS              5 __pkg_init                                  2               2     1
APP     SLEEP_BI           3 SLEEP_BI                       LIB     UTILS              6 __pkg_init                                  5               5     1
APP     SLEEP_BI           3 SLEEP_BI                       LIB     UTILS              4 SLEEP                                 1999268          999837     1
LIB     UTILS              4 SLEEP                          SYS     DBMS_LOCK          8 SLEEP                                  999431          999431     1
                          10 __static_sql_exec_line6                                   1 __plsql_vm                            1999606               4     1
                           1 __plsql_vm                     APP     SLEEP_BI           3 SLEEP_BI                              1999602             327     1

7 rows selected.

The “Subtree MicroS” and “Function MicroS” values are the total times in microseconds for the subtree including function, and function-only processing, respectively, for the child function while called from all instances of the parent.

Network Diagrams for Example 2: Sleep



The network diagram shows 5 nodes appearing as (real) roots, of which one (8) is also linked as a child.


The extended network diagram with function names and times included shows 9 seconds of time used by the SLEEP function in two root-level calls
and 1 second in a third call as the child of the custom Utils Sleep function.

Network Query Output for Example 2: Sleep

Function tree                   Sy Owner Module           Inst.  Subtree MicroS Function MicroS Calls Row
------------------------------ --- ----- ---------------- ------ -------------- --------------- ----- ---
SLEEP                            8 SYS   DBMS_LOCK        1 of 2        8998195         8998195     2   1
__static_sql_exec_line6         10                                      2005266            5660     1   2
  __plsql_vm                     1                                      1999606               4     1   3
    SLEEP_BI                     3 APP   SLEEP_BI                       1999602             327     1   4
      SLEEP                      4 LIB   UTILS                          1999268          999837     1   5
        SLEEP                    8 SYS   DBMS_LOCK        2 of 2         999431          999431     1   6
      __pkg_init                 6 LIB   UTILS                                5               5     1   7
      __pkg_init                 5 LIB   UTILS                                2               2     1   8
STOP_PROFILING                   2 APP   HPROF_UTILS                         87              87     1   9
  STOP_PROFILING                 7 SYS   DBMS_HPROF                           0               0     1  10
__static_sql_exec_line700       11 SYS   DBMS_HPROF                          77              77     1  11
__pkg_init                       9 SYS   DBMS_LOCK                            3               3     1  12

12 rows selected.

Oracle HProf HTML Reports

A standard HTML report can be produced after the profiling is stopped, by a call to one of the Analyze methods. The custom procedure below calls this method to write the report to a CLOB field, then calls DBMS_XSLProcessor.CLOB2File to write the contents to file.

The second Analyze call populates the tables read by the custom queries.

FUNCTION Stop_Profiling(
            p_run_comment                  VARCHAR2,                       -- run comment
            p_filename                     VARCHAR2) RETURN PLS_INTEGER IS -- file name for html report
  l_report_clob   CLOB;
BEGIN

  DBMS_HProf.Stop_Profiling;
  DBMS_HProf.Analyze(trace_id => g_trace_id, report_clob => l_report_clob);
  DBMS_XSLProcessor.CLOB2File(cl => l_report_clob, flocation => 'INPUT_DIR', fname => p_filename);
  RETURN DBMS_HProf.Analyze(trace_id => g_trace_id, run_comment => p_run_comment);

END Stop_Profiling;

The driver scripts specify the file name hp_example_general.html and hp_example_sleep.html, and these are written to the folder in the Oracle directory INPUT_DIR. The files for the examples are included in the associated GitHub project. The report starts with a list of the included subreports, as in the screenshot below:


Here is a screenshot for the first of the subreports for the general example:

Network Queries

The queries are in the script hprof_queries.sql, and we discuss the network queries in this section. The script also contains simple queries against the base tables. All queries are for a given RUNID, passed in as a sqlplus parameter.

Connect By Query

WITH links (node_fr, node_to, owner, module, function, sub_t, fun_t, calls) AS (
    SELECT  0, fni.symbolid, fni.owner, fni.module, fni.function, fni.subtree_elapsed_time, fni.function_elapsed_time, fni.calls
      FROM dbmshp_function_info fni
     WHERE fni.runid = &RUN_ID
       AND NOT EXISTS (SELECT 1 FROM dbmshp_parent_child_info pci WHERE pci.childsymid = fni.symbolid AND pci.runid = fni.runid)
     UNION ALL
    SELECT pci.parentsymid,
           pci.childsymid,
           fi.owner,
           fi.module ,
           fi.function,
           pci.subtree_elapsed_time,
           pci.function_elapsed_time,
           pci.calls
      FROM dbmshp_parent_child_info  pci
      JOIN dbmshp_function_info      fi 
        ON pci.runid                 = fi.runid 
       AND pci.childsymid            = fi.symbolid
     WHERE pci.runid                 = &RUN_ID
)
SELECT RPAD(' ', (LEVEL-1)*2, ' ') || function ftree,
       node_to sy,
       owner,
       module,
       sub_t,
       fun_t,
       calls
  FROM links
CONNECT BY PRIOR node_to = node_fr
START WITH node_fr = 0
ORDER SIBLINGS BY sub_t DESC, fun_t DESC, calls DESC, node_to

Notes on Connect By Query

Links Subquery

In the basic network diagrams we added a dummy root node so that all real nodes appear as child nodes, and this is implemented in the links subquery here as a union between new links to the real root nodes and the original links. This allows the main query connecting the revised links set to cover all nodes.

Sibling Ordering

The ORDER SIBLINGS BY clause allows us to order sibling nodes by subtree elapsed time descending, with additional tie-break columns.

Link Duplication

The network for the first example has two loops, and the second loop has two child nodes coming from the end point. Hierarchical queries (both CONNECT BY and recursive subquery factors in Oracle) cause the links below the loop to be duplicated as all paths are followed. This is seen in the output below for the first example, with 28 rows compared with 26 rows for the output displayed earlier, using a query with filtering implemented. The query can be extended to filter out the duplicates using analytic functions, as is implemented for the recursive subquery factoring query.

Function tree                   Sy Owner Module           Subtree MicroS Function MicroS Calls
------------------------------ --- ----- ---------------- -------------- --------------- -----
__plsql_vm                       2                               1848468              17     6
  __anonymous_block              1                               1818326             590     5
    __sql_fetch_line13          20                               1100678         1100678     1
    TABLE_COUNT_TYPE            13 APP   TABLE_COUNT_TYPE         521332              31     1
      __static_sql_exec_line55  22 APP   TABLE_COUNT_TYPE         521301          521301     1
    __dyn_sql_exec_line12       19                                133537          133537     1
    __static_sql_exec_line8     21                                 34400            2407     1
      __plsql_vm@1               3                                 31993               4     1
        DBFUNC                   9 APP   PROF_TEST                 31989           31989     1
    R_CALLS_R                   10 APP   PROF_TEST                 27684            9234     1
      R_CALLS_R@1               11 APP   PROF_TEST                 18428           18426     1
        PUT_LINE                18 SYS   DBMS_OUTPUT                   2               0     1
          PUT                   17 SYS   DBMS_OUTPUT                  23              23     2
          NEW_LINE              16 SYS   DBMS_OUTPUT                   0               0     2
      PUT_LINE                  18 SYS   DBMS_OUTPUT                  22               1     1
        PUT                     17 SYS   DBMS_OUTPUT                  23              23     2
        NEW_LINE                16 SYS   DBMS_OUTPUT                   0               0     2
    GET_LINES                   15 SYS   DBMS_OUTPUT                  79              79     3
    STOP_PROFILING               4 APP   HPROF_UTILS                  26              26     1
      STOP_PROFILING            14 SYS   DBMS_HPROF                    0               0     1
  DBFUNC                         9 APP   PROF_TEST                 30125           30125     1
A_CALLS_B                        5 APP   PROF_TEST                128199            9631     1
  B_CALLS_A                      7 APP   PROF_TEST                118568           30617     1
    A_CALLS_B@1                  6 APP   PROF_TEST                 87951           27332     1
      B_CALLS_A@1                8 APP   PROF_TEST                 60619           60619     1
__static_sql_exec_line700       23 SYS   DBMS_HPROF                   67              67     1
__pkg_init                      12 APP   PROF_TEST                     3               3     1

27 rows selected.

Recursive Subquery Factoring Query

WITH pci_sums (childsymid, subtree_elapsed_time, function_elapsed_time, calls) AS (
    SELECT childsymid, Sum(subtree_elapsed_time), 
                       Sum(function_elapsed_time), 
                       Sum(calls)
      FROM dbmshp_parent_child_info pci
     WHERE runid = &RUN_ID
     GROUP BY childsymid
), full_tree (runid, lev, node_id, sub_t, fun_t, calls, link_id) AS (
    SELECT fni.runid, 0, fni.symbolid,
           fni.subtree_elapsed_time - Nvl(psm.subtree_elapsed_time, 0),
           fni.function_elapsed_time - Nvl(psm.function_elapsed_time, 0),
           fni.calls - Nvl(psm.calls, 0), 'root' || ROWNUM
      FROM dbmshp_function_info fni
      LEFT JOIN pci_sums psm
        ON psm.childsymid = fni.symbolid
     WHERE fni.runid = &RUN_ID
       AND fni.calls - Nvl(psm.calls, 0) > 0
     UNION ALL
    SELECT ftr.runid, 
           ftr.lev + 1, 
           pci.childsymid, 
           pci.subtree_elapsed_time, 
           pci.function_elapsed_time, 
           pci.calls,
           pci.parentsymid || '-' || pci.childsymid
      FROM full_tree ftr
      JOIN dbmshp_parent_child_info pci
        ON pci.parentsymid = ftr.node_id
       AND pci.runid = ftr.runid
) SEARCH DEPTH FIRST BY sub_t DESC, fun_t DESC, calls DESC, node_id SET rn
, tree_ranked AS (
    SELECT runid, node_id, lev, rn, 
           sub_t, fun_t, calls, 
           Row_Number () OVER (PARTITION BY node_id ORDER BY rn) node_rn,
           Count (*) OVER (PARTITION BY node_id) node_cnt,
           Row_Number () OVER (PARTITION BY link_id ORDER BY rn) link_rn
      FROM full_tree
)
SELECT RPad (' ', trr.lev*2, ' ') || fni.function ftree,
       fni.symbolid sy, fni.owner, fni.module,
       CASE WHEN trr.node_cnt > 1 THEN trr.node_rn || ' of ' || trr.node_cnt END "Inst.",
       trr.sub_t, trr.fun_t, trr.calls, 
       trr.rn "Row"
  FROM tree_ranked trr
  JOIN dbmshp_function_info fni
    ON fni.symbolid = trr.node_id
   AND fni.runid = trr.runid
 WHERE trr.link_rn = 1
 ORDER BY trr.rn

Notes on Recursive Subquery Factoring Query

Root Nodes and Child Sums Subquery

A function may be called both as a child of another function, and also at the top level, as the second example showed. We detect this situation by counting the number of calls for the functions in the links table (dbmshp_parent_child_info) and comparing with the number of calls in the functions table (dbmshp_function_info). The difference corresponds to the number of root-level calls, and the root-level timings are the timing differences between the function timings and the sums of the link timings.

Link Duplication

As noted earlier, hierarchical queries (both Connect By and recursive subquery factors in Oracle) cause the links below any loops to be duplicated as all paths are followed. In this query an additional subquery has been added, tree_ranked, that ranks the nodes and links by order of appearance. The node rank is used just for information in the main block, while the link rank is used to eliminate duplicate links. Gaps in the “Row” column indicate where duplicates have been suppressed.

It’s worth remembering this because it’s a general feature of SQL for querying hierarchies, and one that seems not to be widely understood. For larger hierarchies it can cause serious performance problems, and may justify a PL/SQL programmed solution that need not suffer the same problem, such as: PL/SQL Pipelined Function for Network Analysis.

Here is a query structure diagram for the recursive subquery factoring query:


I wrote about query diagramming in August 2012: Query Structure Diagramming

Hierarchical Profiler Feature Summary

We can summarise the features of the Hierarchical Profiler in the following points:

  • Results are organised as lists of measures by program unit and by ‘unit-within-caller’, and can be represented in a call hierarchy
  • Results are reported at the level of program unit call or ‘tracked operation’; all calls to a given program unit within a single parent unit are aggregated
  • Measures reported are elapsed times and numbers of calls, but not CPU times
  • External program units are included in the profiling
  • Profiling is performed, after initial setup, by means of before and after API calls, followed by querying of results in tables, or viewing of HTML reports