SQL for Period Aggregate Reporting

[There is a recording on this article here: Tweet, and also in my GitHub project, Small SQL projects.]

In financial reporting there is often a requirement to report on sales performance across multiple time periods. For example we might want to know total sales of products for a given month, then for the 3 month period, the year to date, and the 12 month period up to that month, etc.. We might also want to break down associated costs of selling across the same periods. There are several ways to approach implementing such reports in SQL, and in real world applications that may have many columns and periods to report on, different approaches will have different levels of complexity and performance characteristics. In this article I will describe three approaches, starting with perhaps the most obvious one, which is to write a Group By query for each period type and combine them all in a union. This obvious approach involves quite a few repeated scans of the source table, so we will describe two other approaches that are less obvious, but avoid so many scans and should be better performing. We’ll use a simple set of generated test data to illustrate the three queries, and will provide the execution plans associated with each one.

Test Data

We take two products and generate three years of monthly sales and cost values randomly between fixed limits in each case. Here is the generation code, for the table and first product (script c_sales_history.sql):

CREATE TABLE sales_history (
        prod_code               VARCHAR2(30) NOT NULL,
        month_dt                DATE NOT NULL,
        sales_value             NUMBER(10,0),
        sales_cost              NUMBER(10,0),
        CONSTRAINT slh_pk       PRIMARY KEY (prod_code, month_dt)
)
/
PROMPT 3 years random values for PROD_ONE
INSERT INTO sales_history
WITH month_gen AS (
    SELECT LEVEL rn, Add_Months(Trunc(SYSDATE, 'MONTH'), LEVEL - 36) dt
      FROM DUAL
   CONNECT BY LEVEL < 37
)
SELECT 'PROD_ONE', dt,
       DBMS_Random.Value (low => 1000, high => 10000),
       DBMS_Random.Value (low => 100, high => 1000)
  FROM month_gen
/

Here is the table data generated, with running sums of the two measures, that can help in testing the queries:

Sales History Report with Running Sums

Product    Month          Value Value to Date     Cost Cost to Date
---------- ----------- -------- ------------- -------- ------------
PROD_ONE   01-MAR-18      9,004         9,004      800          800
           01-APR-18      8,757        17,761      766        1,566
           01-MAY-18      5,748        23,509      835        2,401
           01-JUN-18      6,931        30,440      436        2,837
           01-JUL-18      4,717        35,157      115        2,952
           01-AUG-18      2,309        37,466      478        3,430
           01-SEP-18      1,415        38,881      253        3,683
           01-OCT-18      8,767        47,648      122        3,805
           01-NOV-18      4,560        52,208      547        4,352
           01-DEC-18      2,656        54,864      783        5,135
           01-JAN-19      9,984        64,848      401        5,536
           01-FEB-19      8,131        72,979      381        5,917
           01-MAR-19      2,063        75,042      191        6,108
           01-APR-19      2,933        77,975      147        6,255
           01-MAY-19      7,058        85,033      997        7,252
           01-JUN-19      7,941        92,974      392        7,644
           01-JUL-19      2,270        95,244      949        8,593
           01-AUG-19      2,586        97,830      921        9,514
           01-SEP-19      8,912       106,742      109        9,623
           01-OCT-19      4,018       110,760      272        9,895
           01-NOV-19      8,056       118,816      807       10,702
           01-DEC-19      8,497       127,313      880       11,582
           01-JAN-20      5,266       132,579      888       12,470
           01-FEB-20      1,636       134,215      532       13,002
           01-MAR-20      5,241       139,456      287       13,289
           01-APR-20      8,519       147,975      579       13,868
           01-MAY-20      4,034       152,009      245       14,113
           01-JUN-20      4,633       156,642      107       14,220
           01-JUL-20      8,906       165,548      169       14,389
           01-AUG-20      8,126       173,674      281       14,670
           01-SEP-20      8,319       181,993      487       15,157
           01-OCT-20      7,273       189,266      599       15,756
           01-NOV-20      9,646       198,912      390       16,146
           01-DEC-20      3,988       202,900      933       17,079
           01-JAN-21      2,626       205,526      278       17,357
           01-FEB-21      2,943       208,469      262       17,619
PROD_TWO   01-MAR-18      3,546         3,546      115          115
           01-APR-18      6,627        10,173      525          640
           01-MAY-18      2,892        13,065      877        1,517
           01-JUN-18      8,132        21,197      630        2,147
           01-JUL-18      3,778        24,975      145        2,292
           01-AUG-18      6,258        31,233      109        2,401
           01-SEP-18      2,425        33,658      257        2,658
           01-OCT-18      3,983        37,641      118        2,776
           01-NOV-18      1,181        38,822      971        3,747
           01-DEC-18      2,015        40,837      723        4,470
           01-JAN-19      4,966        45,803      129        4,599
           01-FEB-19      7,711        53,514      770        5,369
           01-MAR-19      1,966        55,480      249        5,618
           01-APR-19      8,129        63,609      689        6,307
           01-MAY-19      1,669        65,278      358        6,665
           01-JUN-19      6,936        72,214      779        7,444
           01-JUL-19      3,023        75,237      515        7,959
           01-AUG-19      1,039        76,276      593        8,552
           01-SEP-19      1,220        77,496      276        8,828
           01-OCT-19      6,371        83,867      268        9,096
           01-NOV-19      2,142        86,009      438        9,534
           01-DEC-19      2,897        88,906      974       10,508
           01-JAN-20      6,446        95,352      126       10,634
           01-FEB-20      6,163       101,515      983       11,617
           01-MAR-20      3,607       105,122      500       12,117
           01-APR-20      6,558       111,680      659       12,776
           01-MAY-20      3,075       114,755      965       13,741
           01-JUN-20      9,317       124,072      451       14,192
           01-JUL-20      2,639       126,711      476       14,668
           01-AUG-20      2,143       128,854      356       15,024
           01-SEP-20      2,347       131,201      507       15,531
           01-OCT-20      4,090       135,291      580       16,111
           01-NOV-20      5,598       140,889      630       16,741
           01-DEC-20      6,018       146,907      264       17,005
           01-JAN-21      4,339       151,246      935       17,940
           01-FEB-21      2,860       154,106      232       18,172

72 rows selected.

Group By Union Query

Here is the first query:

SELECT /*+ gather_plan_statistics XPLAN_MARKER_UGB */ 
     month_dt, prod_code, 'P1 - 1 Month' per_tp, sales_value, sales_cost
  FROM sales_history
UNION ALL
SELECT drv.month_dt, drv.prod_code, 'P2 - 3 Months', Sum(msr.sales_value), Sum(msr.sales_cost)
  FROM sales_history drv
  JOIN sales_history msr
  ON msr.prod_code = drv.prod_code
   AND msr.month_dt BETWEEN Add_Months (drv.month_dt, -2) AND drv.month_dt
 GROUP BY drv.prod_code, drv.month_dt
UNION ALL
SELECT drv.month_dt, drv.prod_code, 'P3 - YTD', Sum(msr.sales_value), Sum(msr.sales_cost)
  FROM sales_history drv
  JOIN sales_history msr
  ON msr.prod_code = drv.prod_code
   AND msr.month_dt BETWEEN Trunc(drv.month_dt, 'YEAR') AND drv.month_dt
 GROUP BY drv.prod_code, drv.month_dt
UNION ALL
SELECT drv.month_dt, drv.prod_code, 'P4 - 1 Year', Sum(msr.sales_value), Sum(msr.sales_cost)
  FROM sales_history drv
  JOIN sales_history msr
  ON msr.prod_code = drv.prod_code
   AND msr.month_dt BETWEEN Add_Months (drv.month_dt, -11) AND drv.month_dt
 GROUP BY drv.prod_code, drv.month_dt
 ORDER BY 1, 2, 3

Notes on Group By Union Query

  • First union member subquery does not aggregate, and includes the label for the period type
  • The remainig aggregation subqueries drive from one scan of the table and join a second instance
  • The second instance has the date range for the period type in its join condition
  • The gather_plan_statistics hint allows capture of plan statistics
  • The XPLAN_MARKER_UGB is a string used to identify the SQL id to pass to the API for displaying the plan

Here are the results from the first query (script period_agg_report_queries.sql), which are the same for the other two queries.

Periods Report by Union of Group Bys

Month       Product    Period             Value     Cost
----------- ---------- --------------- -------- --------
01-MAR-18   PROD_ONE   P1 - 1 Month       9,004      800
                       P2 - 3 Months      9,004      800
                       P3 - YTD           9,004      800
                       P4 - 1 Year        9,004      800
            PROD_TWO   P1 - 1 Month       3,546      115
                       P2 - 3 Months      3,546      115
                       P3 - YTD           3,546      115
                       P4 - 1 Year        3,546      115
01-APR-18   PROD_ONE   P1 - 1 Month       8,757      766
                       P2 - 3 Months     17,761    1,566
                       P3 - YTD          17,761    1,566
                       P4 - 1 Year       17,761    1,566
            PROD_TWO   P1 - 1 Month       6,627      525
                       P2 - 3 Months     10,173      640
                       P3 - YTD          10,173      640
                       P4 - 1 Year       10,173      640
01-MAY-18   PROD_ONE   P1 - 1 Month       5,748      835
                       P2 - 3 Months     23,509    2,401
                       P3 - YTD          23,509    2,401
                       P4 - 1 Year       23,509    2,401
            PROD_TWO   P1 - 1 Month       2,892      877
                       P2 - 3 Months     13,065    1,517
                       P3 - YTD          13,065    1,517
                       P4 - 1 Year       13,065    1,517
01-JUN-18   PROD_ONE   P1 - 1 Month       6,931      436
                       P2 - 3 Months     21,436    2,037
                       P3 - YTD          30,440    2,837
                       P4 - 1 Year       30,440    2,837
            PROD_TWO   P1 - 1 Month       8,132      630
                       P2 - 3 Months     17,651    2,032
                       P3 - YTD          21,197    2,147
                       P4 - 1 Year       21,197    2,147
01-JUL-18   PROD_ONE   P1 - 1 Month       4,717      115
                       P2 - 3 Months     17,396    1,386
                       P3 - YTD          35,157    2,952
                       P4 - 1 Year       35,157    2,952
            PROD_TWO   P1 - 1 Month       3,778      145
                       P2 - 3 Months     14,802    1,652
                       P3 - YTD          24,975    2,292
                       P4 - 1 Year       24,975    2,292
01-AUG-18   PROD_ONE   P1 - 1 Month       2,309      478
                       P2 - 3 Months     13,957    1,029
                       P3 - YTD          37,466    3,430
                       P4 - 1 Year       37,466    3,430
            PROD_TWO   P1 - 1 Month       6,258      109
                       P2 - 3 Months     18,168      884
                       P3 - YTD          31,233    2,401
                       P4 - 1 Year       31,233    2,401
01-SEP-18   PROD_ONE   P1 - 1 Month       1,415      253
                       P2 - 3 Months      8,441      846
                       P3 - YTD          38,881    3,683
                       P4 - 1 Year       38,881    3,683
            PROD_TWO   P1 - 1 Month       2,425      257
                       P2 - 3 Months     12,461      511
                       P3 - YTD          33,658    2,658
                       P4 - 1 Year       33,658    2,658
01-OCT-18   PROD_ONE   P1 - 1 Month       8,767      122
                       P2 - 3 Months     12,491      853
                       P3 - YTD          47,648    3,805
                       P4 - 1 Year       47,648    3,805
            PROD_TWO   P1 - 1 Month       3,983      118
                       P2 - 3 Months     12,666      484
                       P3 - YTD          37,641    2,776
                       P4 - 1 Year       37,641    2,776
01-NOV-18   PROD_ONE   P1 - 1 Month       4,560      547
                       P2 - 3 Months     14,742      922
                       P3 - YTD          52,208    4,352
                       P4 - 1 Year       52,208    4,352
            PROD_TWO   P1 - 1 Month       1,181      971
                       P2 - 3 Months      7,589    1,346
                       P3 - YTD          38,822    3,747
                       P4 - 1 Year       38,822    3,747
01-DEC-18   PROD_ONE   P1 - 1 Month       2,656      783
                       P2 - 3 Months     15,983    1,452
                       P3 - YTD          54,864    5,135
                       P4 - 1 Year       54,864    5,135
            PROD_TWO   P1 - 1 Month       2,015      723
                       P2 - 3 Months      7,179    1,812
                       P3 - YTD          40,837    4,470
                       P4 - 1 Year       40,837    4,470
01-JAN-19   PROD_ONE   P1 - 1 Month       9,984      401
                       P2 - 3 Months     17,200    1,731
                       P3 - YTD           9,984      401
                       P4 - 1 Year       64,848    5,536
            PROD_TWO   P1 - 1 Month       4,966      129
                       P2 - 3 Months      8,162    1,823
                       P3 - YTD           4,966      129
                       P4 - 1 Year       45,803    4,599
01-FEB-19   PROD_ONE   P1 - 1 Month       8,131      381
                       P2 - 3 Months     20,771    1,565
                       P3 - YTD          18,115      782
                       P4 - 1 Year       72,979    5,917
            PROD_TWO   P1 - 1 Month       7,711      770
                       P2 - 3 Months     14,692    1,622
                       P3 - YTD          12,677      899
                       P4 - 1 Year       53,514    5,369
01-MAR-19   PROD_ONE   P1 - 1 Month       2,063      191
                       P2 - 3 Months     20,178      973
                       P3 - YTD          20,178      973
                       P4 - 1 Year       66,038    5,308
            PROD_TWO   P1 - 1 Month       1,966      249
                       P2 - 3 Months     14,643    1,148
                       P3 - YTD          14,643    1,148
                       P4 - 1 Year       51,934    5,503
01-APR-19   PROD_ONE   P1 - 1 Month       2,933      147
                       P2 - 3 Months     13,127      719
                       P3 - YTD          23,111    1,120
                       P4 - 1 Year       60,214    4,689
            PROD_TWO   P1 - 1 Month       8,129      689
                       P2 - 3 Months     17,806    1,708
                       P3 - YTD          22,772    1,837
                       P4 - 1 Year       53,436    5,667
01-MAY-19   PROD_ONE   P1 - 1 Month       7,058      997
                       P2 - 3 Months     12,054    1,335
                       P3 - YTD          30,169    2,117
                       P4 - 1 Year       61,524    4,851
            PROD_TWO   P1 - 1 Month       1,669      358
                       P2 - 3 Months     11,764    1,296
                       P3 - YTD          24,441    2,195
                       P4 - 1 Year       52,213    5,148
01-JUN-19   PROD_ONE   P1 - 1 Month       7,941      392
                       P2 - 3 Months     17,932    1,536
                       P3 - YTD          38,110    2,509
                       P4 - 1 Year       62,534    4,807
            PROD_TWO   P1 - 1 Month       6,936      779
                       P2 - 3 Months     16,734    1,826
                       P3 - YTD          31,377    2,974
                       P4 - 1 Year       51,017    5,297
01-JUL-19   PROD_ONE   P1 - 1 Month       2,270      949
                       P2 - 3 Months     17,269    2,338
                       P3 - YTD          40,380    3,458
                       P4 - 1 Year       60,087    5,641
            PROD_TWO   P1 - 1 Month       3,023      515
                       P2 - 3 Months     11,628    1,652
                       P3 - YTD          34,400    3,489
                       P4 - 1 Year       50,262    5,667
01-AUG-19   PROD_ONE   P1 - 1 Month       2,586      921
                       P2 - 3 Months     12,797    2,262
                       P3 - YTD          42,966    4,379
                       P4 - 1 Year       60,364    6,084
            PROD_TWO   P1 - 1 Month       1,039      593
                       P2 - 3 Months     10,998    1,887
                       P3 - YTD          35,439    4,082
                       P4 - 1 Year       45,043    6,151
01-SEP-19   PROD_ONE   P1 - 1 Month       8,912      109
                       P2 - 3 Months     13,768    1,979
                       P3 - YTD          51,878    4,488
                       P4 - 1 Year       67,861    5,940
            PROD_TWO   P1 - 1 Month       1,220      276
                       P2 - 3 Months      5,282    1,384
                       P3 - YTD          36,659    4,358
                       P4 - 1 Year       43,838    6,170
01-OCT-19   PROD_ONE   P1 - 1 Month       4,018      272
                       P2 - 3 Months     15,516    1,302
                       P3 - YTD          55,896    4,760
                       P4 - 1 Year       63,112    6,090
            PROD_TWO   P1 - 1 Month       6,371      268
                       P2 - 3 Months      8,630    1,137
                       P3 - YTD          43,030    4,626
                       P4 - 1 Year       46,226    6,320
01-NOV-19   PROD_ONE   P1 - 1 Month       8,056      807
                       P2 - 3 Months     20,986    1,188
                       P3 - YTD          63,952    5,567
                       P4 - 1 Year       66,608    6,350
            PROD_TWO   P1 - 1 Month       2,142      438
                       P2 - 3 Months      9,733      982
                       P3 - YTD          45,172    5,064
                       P4 - 1 Year       47,187    5,787
01-DEC-19   PROD_ONE   P1 - 1 Month       8,497      880
                       P2 - 3 Months     20,571    1,959
                       P3 - YTD          72,449    6,447
                       P4 - 1 Year       72,449    6,447
            PROD_TWO   P1 - 1 Month       2,897      974
                       P2 - 3 Months     11,410    1,680
                       P3 - YTD          48,069    6,038
                       P4 - 1 Year       48,069    6,038
01-JAN-20   PROD_ONE   P1 - 1 Month       5,266      888
                       P2 - 3 Months     21,819    2,575
                       P3 - YTD           5,266      888
                       P4 - 1 Year       67,731    6,934
            PROD_TWO   P1 - 1 Month       6,446      126
                       P2 - 3 Months     11,485    1,538
                       P3 - YTD           6,446      126
                       P4 - 1 Year       49,549    6,035
01-FEB-20   PROD_ONE   P1 - 1 Month       1,636      532
                       P2 - 3 Months     15,399    2,300
                       P3 - YTD           6,902    1,420
                       P4 - 1 Year       61,236    7,085
            PROD_TWO   P1 - 1 Month       6,163      983
                       P2 - 3 Months     15,506    2,083
                       P3 - YTD          12,609    1,109
                       P4 - 1 Year       48,001    6,248
01-MAR-20   PROD_ONE   P1 - 1 Month       5,241      287
                       P2 - 3 Months     12,143    1,707
                       P3 - YTD          12,143    1,707
                       P4 - 1 Year       64,414    7,181
            PROD_TWO   P1 - 1 Month       3,607      500
                       P2 - 3 Months     16,216    1,609
                       P3 - YTD          16,216    1,609
                       P4 - 1 Year       49,642    6,499
01-APR-20   PROD_ONE   P1 - 1 Month       8,519      579
                       P2 - 3 Months     15,396    1,398
                       P3 - YTD          20,662    2,286
                       P4 - 1 Year       70,000    7,613
            PROD_TWO   P1 - 1 Month       6,558      659
                       P2 - 3 Months     16,328    2,142
                       P3 - YTD          22,774    2,268
                       P4 - 1 Year       48,071    6,469
01-MAY-20   PROD_ONE   P1 - 1 Month       4,034      245
                       P2 - 3 Months     17,794    1,111
                       P3 - YTD          24,696    2,531
                       P4 - 1 Year       66,976    6,861
            PROD_TWO   P1 - 1 Month       3,075      965
                       P2 - 3 Months     13,240    2,124
                       P3 - YTD          25,849    3,233
                       P4 - 1 Year       49,477    7,076
01-JUN-20   PROD_ONE   P1 - 1 Month       4,633      107
                       P2 - 3 Months     17,186      931
                       P3 - YTD          29,329    2,638
                       P4 - 1 Year       63,668    6,576
            PROD_TWO   P1 - 1 Month       9,317      451
                       P2 - 3 Months     18,950    2,075
                       P3 - YTD          35,166    3,684
                       P4 - 1 Year       51,858    6,748
01-JUL-20   PROD_ONE   P1 - 1 Month       8,906      169
                       P2 - 3 Months     17,573      521
                       P3 - YTD          38,235    2,807
                       P4 - 1 Year       70,304    5,796
            PROD_TWO   P1 - 1 Month       2,639      476
                       P2 - 3 Months     15,031    1,892
                       P3 - YTD          37,805    4,160
                       P4 - 1 Year       51,474    6,709
01-AUG-20   PROD_ONE   P1 - 1 Month       8,126      281
                       P2 - 3 Months     21,665      557
                       P3 - YTD          46,361    3,088
                       P4 - 1 Year       75,844    5,156
            PROD_TWO   P1 - 1 Month       2,143      356
                       P2 - 3 Months     14,099    1,283
                       P3 - YTD          39,948    4,516
                       P4 - 1 Year       52,578    6,472
01-SEP-20   PROD_ONE   P1 - 1 Month       8,319      487
                       P2 - 3 Months     25,351      937
                       P3 - YTD          54,680    3,575
                       P4 - 1 Year       75,251    5,534
            PROD_TWO   P1 - 1 Month       2,347      507
                       P2 - 3 Months      7,129    1,339
                       P3 - YTD          42,295    5,023
                       P4 - 1 Year       53,705    6,703
01-OCT-20   PROD_ONE   P1 - 1 Month       7,273      599
                       P2 - 3 Months     23,718    1,367
                       P3 - YTD          61,953    4,174
                       P4 - 1 Year       78,506    5,861
            PROD_TWO   P1 - 1 Month       4,090      580
                       P2 - 3 Months      8,580    1,443
                       P3 - YTD          46,385    5,603
                       P4 - 1 Year       51,424    7,015
01-NOV-20   PROD_ONE   P1 - 1 Month       9,646      390
                       P2 - 3 Months     25,238    1,476
                       P3 - YTD          71,599    4,564
                       P4 - 1 Year       80,096    5,444
            PROD_TWO   P1 - 1 Month       5,598      630
                       P2 - 3 Months     12,035    1,717
                       P3 - YTD          51,983    6,233
                       P4 - 1 Year       54,880    7,207
01-DEC-20   PROD_ONE   P1 - 1 Month       3,988      933
                       P2 - 3 Months     20,907    1,922
                       P3 - YTD          75,587    5,497
                       P4 - 1 Year       75,587    5,497
            PROD_TWO   P1 - 1 Month       6,018      264
                       P2 - 3 Months     15,706    1,474
                       P3 - YTD          58,001    6,497
                       P4 - 1 Year       58,001    6,497
01-JAN-21   PROD_ONE   P1 - 1 Month       2,626      278
                       P2 - 3 Months     16,260    1,601
                       P3 - YTD           2,626      278
                       P4 - 1 Year       72,947    4,887
            PROD_TWO   P1 - 1 Month       4,339      935
                       P2 - 3 Months     15,955    1,829
                       P3 - YTD           4,339      935
                       P4 - 1 Year       55,894    7,306
01-FEB-21   PROD_ONE   P1 - 1 Month       2,943      262
                       P2 - 3 Months      9,557    1,473
                       P3 - YTD           5,569      540
                       P4 - 1 Year       74,254    4,617
            PROD_TWO   P1 - 1 Month       2,860      232
                       P2 - 3 Months     13,217    1,431
                       P3 - YTD           7,199    1,167
                       P4 - 1 Year       52,591    6,555

288 rows selected.

Here is the execution plan:

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |               |      1 |        |    288 |00:00:00.01 |      27 |       |       |          |
|   1 |  SORT ORDER BY        |               |      1 |    225 |    288 |00:00:00.01 |      27 | 27648 | 27648 |24576  (0)|
|   2 |   UNION-ALL           |               |      1 |        |    288 |00:00:00.01 |      27 |       |       |          |
|   3 |    TABLE ACCESS FULL  | SALES_HISTORY |      1 |     72 |     72 |00:00:00.01 |       6 |       |       |          |
|   4 |    HASH GROUP BY      |               |      1 |     51 |     72 |00:00:00.01 |       7 |   934K|   934K| 1401K (0)|
|*  5 |     HASH JOIN         |               |      1 |     67 |    210 |00:00:00.01 |       7 |  1506K|  1506K|  787K (0)|
|   6 |      INDEX FULL SCAN  | SLH_PK        |      1 |     72 |     72 |00:00:00.01 |       1 |       |       |          |
|   7 |      TABLE ACCESS FULL| SALES_HISTORY |      1 |     72 |     72 |00:00:00.01 |       6 |       |       |          |
|   8 |    HASH GROUP BY      |               |      1 |     51 |     72 |00:00:00.01 |       7 |   934K|   934K| 1401K (0)|
|*  9 |     HASH JOIN         |               |      1 |     67 |    428 |00:00:00.01 |       7 |  1506K|  1506K|  789K (0)|
|  10 |      INDEX FULL SCAN  | SLH_PK        |      1 |     72 |     72 |00:00:00.01 |       1 |       |       |          |
|  11 |      TABLE ACCESS FULL| SALES_HISTORY |      1 |     72 |     72 |00:00:00.01 |       6 |       |       |          |
|  12 |    HASH GROUP BY      |               |      1 |     51 |     72 |00:00:00.01 |       7 |   934K|   934K| 1408K (0)|
|* 13 |     HASH JOIN         |               |      1 |     67 |    732 |00:00:00.01 |       7 |  1506K|  1506K|  787K (0)|
|  14 |      INDEX FULL SCAN  | SLH_PK        |      1 |     72 |     72 |00:00:00.01 |       1 |       |       |          |
|  15 |      TABLE ACCESS FULL| SALES_HISTORY |      1 |     72 |     72 |00:00:00.01 |       6 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("MSR"."PROD_CODE"="DRV"."PROD_CODE")
filter(("MSR"."MONTH_DT"<="DRV"."MONTH_DT" AND "MSR"."MONTH_DT">=ADD_MONTHS(INTERNAL_FUNCTION("DRV"."MONTH_DT
"),-2)))
9 - access("MSR"."PROD_CODE"="DRV"."PROD_CODE")
filter(("MSR"."MONTH_DT"<="DRV"."MONTH_DT" AND "MSR"."MONTH_DT">=TRUNC(INTERNAL_FUNCTION("DRV"."MONTH_DT"),'f
myear')))
13 - access("MSR"."PROD_CODE"="DRV"."PROD_CODE")
filter(("MSR"."MONTH_DT"<="DRV"."MONTH_DT" AND "MSR"."MONTH_DT">=ADD_MONTHS(INTERNAL_FUNCTION("DRV"."MONTH_DT
"),-11)))

Notes on Group By Union Query Execution Plan

  • There are 4 full table scans, and 3 index full scans
  • The Buffers value of 27 is a measure of the work done, in logical I/O operations

Analytic Functions and Unpivot Query

Here is the second query:

WITH period_aggs AS (
  SELECT /*+ gather_plan_statistics XPLAN_MARKER_AAG */ 
       month_dt, prod_code, sales_value, 
       Sum(sales_value) OVER (PARTITION BY prod_code ORDER BY month_dt
                  RANGE BETWEEN INTERVAL '2' MONTH PRECEDING AND CURRENT ROW)     sales_value_3m, 
       Sum(sales_value) OVER (PARTITION BY prod_code, Trunc(month_dt, 'YEAR') ORDER BY month_dt
                  RANGE BETWEEN INTERVAL '11' MONTH PRECEDING AND CURRENT ROW)    sales_value_ytd, 
       Sum(sales_value) OVER (PARTITION BY prod_code ORDER BY month_dt
                  RANGE BETWEEN INTERVAL '11' MONTH PRECEDING AND CURRENT ROW)    sales_value_1y, 
       sales_cost,
       Sum(sales_cost) OVER (PARTITION BY prod_code ORDER BY month_dt
                  RANGE BETWEEN INTERVAL '2' MONTH PRECEDING AND CURRENT ROW)     sales_cost_3m, 
       Sum(sales_cost) OVER (PARTITION BY prod_code, Trunc(month_dt, 'YEAR') ORDER BY month_dt
                  RANGE BETWEEN INTERVAL '11' MONTH PRECEDING AND CURRENT ROW)    sales_cost_ytd, 
       Sum(sales_cost) OVER (PARTITION BY prod_code ORDER BY month_dt
                  RANGE BETWEEN INTERVAL '11' MONTH PRECEDING AND CURRENT ROW)    sales_cost_1y
    FROM sales_history
)
SELECT *
  FROM period_aggs
UNPIVOT (
    (sales_value, sales_cost)
    FOR per_tp IN (
      (sales_value, sales_cost)         AS 'P1 - 1 Month',
      (sales_value_3m, sales_cost_3m)   AS 'P2 - 3 Months',
      (sales_value_ytd, sales_cost_ytd) AS 'P3 - YTD',
      (sales_value_1y, sales_cost_1y)   AS 'P4 - 1 Year'
    )
)
 ORDER BY 1, 2, 3

Notes on Analytic Functions and Unpivot Query

  • For each measure a column is added for each period type to do the aggregation via analytic functions
  • The UNPIVOT clause in the main query converts the period type columns into rows with column pair as specified in the first line
  • The column name pair is specified in the first line for the unpivoted row values
  • The ‘FOR per_tp IN’ clauses specifies the name of the period type column with values given in the rows below

Here is the execution plan:

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |               |      1 |        |    288 |00:00:00.01 |       2 |       |       |          |
|   1 |  SORT ORDER BY                    |               |      1 |    288 |    288 |00:00:00.01 |       2 | 27648 | 27648 |24576  (0)|
|*  2 |   VIEW                            |               |      1 |    288 |    288 |00:00:00.01 |       2 |       |       |          |
|   3 |    UNPIVOT                        |               |      1 |        |    288 |00:00:00.01 |       2 |       |       |          |
|   4 |     VIEW                          |               |      1 |     72 |     72 |00:00:00.01 |       2 |       |       |          |
|   5 |      WINDOW SORT                  |               |      1 |     72 |     72 |00:00:00.01 |       2 | 13312 | 13312 |12288  (0)|
|   6 |       WINDOW BUFFER               |               |      1 |     72 |     72 |00:00:00.01 |       2 |  6144 |  6144 | 6144  (0)|
|   7 |        TABLE ACCESS BY INDEX ROWID| SALES_HISTORY |      1 |     72 |     72 |00:00:00.01 |       2 |       |       |          |
|   8 |         INDEX FULL SCAN           | SLH_PK        |      1 |     72 |     72 |00:00:00.01 |       1 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("unpivot_view_006"."SALES_VALUE" IS NOT NULL OR "unpivot_view_006"."SALES_COST" IS NOT NULL))

Notes on Execution Plan for Analytic Functions and Unpivot Query

  • There is a single index full scan and a single table access by index rowid
  • The number of buffers is only 2
  • The plan suggests that this query will be a lot more efficient than the first one

Single Group By with CASE Expressions Query

Here is the third query:

WITH period_list AS (
  SELECT month_dt, prod_code, COLUMN_VALUE per_tp
    FROM TABLE(SYS.ODCIVarchar2List(
          'P1 - 1 Month',
          'P2 - 3 Months',
          'P3 - YTD',
          'P4 - 1 Year')
      )
  CROSS JOIN (SELECT month_dt, prod_code FROM sales_history)
)
SELECT /*+ gather_plan_statistics XPLAN_MARKER_GBC */
       drv.month_dt, drv.prod_code, drv.per_tp,
       Sum( CASE WHEN ( per_tp = 'P1 - 1 Month'  AND msr.month_dt = drv.month_dt ) OR 
                      ( per_tp = 'P2 - 3 Months' AND msr.month_dt >= Add_Months (drv.month_dt, -2) ) OR 
                      ( per_tp = 'P3 - YTD'      AND Trunc (msr.month_dt, 'YEAR') = Trunc (drv.month_dt, 'YEAR') ) OR 
                      ( per_tp = 'P4 - 1 Year'   AND msr.month_dt >= Add_Months (drv.month_dt, -11) )
                 THEN msr.sales_value END) sales_value,
       Sum( CASE WHEN ( per_tp = 'P1 - 1 Month'  AND msr.month_dt = drv.month_dt ) OR 
                      ( per_tp = 'P2 - 3 Months' AND msr.month_dt >= Add_Months (drv.month_dt, -2) ) OR 
                      ( per_tp = 'P3 - YTD'      AND Trunc (msr.month_dt, 'YEAR') = Trunc (drv.month_dt, 'YEAR') ) OR 
                      ( per_tp = 'P4 - 1 Year'   AND msr.month_dt >= Add_Months (drv.month_dt, -11) )
                 THEN msr.sales_cost END) sales_cost
  FROM period_list drv
  JOIN sales_history msr
  ON msr.prod_code = drv.prod_code
   AND msr.month_dt <= drv.month_dt
 GROUP BY drv.prod_code, drv.month_dt, drv.per_tp
 ORDER BY 1, 2, 3

Notes on Single Group By with CASE Expressions Query

  • In the first subquery we add in the period type values for each product and month
  • The main query then includes the extra column in its grouping fields
  • The main query drives from the first subquery, joining the table to aggregate over, and including only records not later than the driving record
  • The CASE expressions within the Sums ensure that a measure is counted in the sum only if its date on the joined table falls in the required range for the period type, relative to the date in the driving subquery
------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |               |      1 |        |    288 |00:00:00.01 |       7 |       |       |          |
|   1 |  SORT GROUP BY                            |               |      1 |     51 |    288 |00:00:00.01 |       7 | 40960 | 40960 |36864  (0)|
|*  2 |   HASH JOIN                               |               |      1 |     10M|   5328 |00:00:00.01 |       7 |  1476K|  1476K|  826K (0)|
|   3 |    INDEX FULL SCAN                        | SLH_PK        |      1 |     72 |     72 |00:00:00.01 |       1 |       |       |          |
|   4 |    MERGE JOIN CARTESIAN                   |               |      1 |    588K|    288 |00:00:00.01 |       6 |       |       |          |
|   5 |     TABLE ACCESS FULL                     | SALES_HISTORY |      1 |     72 |     72 |00:00:00.01 |       6 |       |       |          |
|   6 |     BUFFER SORT                           |               |     72 |   8168 |    288 |00:00:00.01 |       0 |  2048 |  2048 | 2048  (0)|
|   7 |      COLLECTION ITERATOR CONSTRUCTOR FETCH|               |      1 |   8168 |      4 |00:00:00.01 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("MSR"."PROD_CODE"="PROD_CODE")
filter("MSR"."MONTH_DT"<="MONTH_DT")

Notes on Execution Plan for Single Group By with CASE Expressions Query

  • There is a single index full scan and a single full table access
  • The number of buffers is 7
  • The plan suggests that this query will also be a lot more efficient than the first one
  • The data set is too small to be conclusive regarding performance comparison with the second query

See also:






SQL for Continuum and Contiguity Grouping

A question was asked some time ago on Oracle’s SQL&PL/SQL forum (Solution design and performance – SQL or PL/SQL?), the gist of which the poster specified thus:

I have a set of records which keep track of a status at a given date for a range of points on a given road. Basically, I need the current bits to “shine through”.

The thread, despite the open-minded title, gives a nice illustration of the odd preference that people often have for complicated PL/SQL solutions to problems amenable to simpler SQL. I provided an SQL solution in that thread for this problem, and in this article I have taken a simpler, more general form of the problem defined there, and provide SQL solutions with diagrams illustrating how they work.

The class of problem can be characterised as follows:

  • Records have range fields, and we want to aggregate separately at each point within the 1-dimensional range domains (or ‘continuum’), independently by grouping key
  • Within the groups we want to aggregate only the first or last records, ordering by some non-key fields

The first point above can be seen as equivalent to adding in to the main grouping key an implicit field specifying the domain leg, i.e. the region between the break points defined by the record ranges, and splitting the records by leg. We might term this form of aggregation continuum aggregation, or perhaps vertical aggregation if we view the range as distance, the internal ordering as by time, and visualise it graphically as in my diagrams below. Once this vertical aggregation is established, it is natural to think of adding a second aggregation, that might be termed contiguity aggregation, or with the same visualisation, horizontal aggregation:

  • Contiguous legs having the same values for the aggregated attributes will be grouped together

Here is an extract on grouping from Oracle® Database SQL Language Reference:

Oracle applies the aggregate functions to each group of rows and returns a single result row for each group.

The aggregate functions MIN, MAX, SUM, AVG, COUNT, VARIANCE, and STDDEV, when followed by the KEEP keyword, can be used in conjunction with the FIRST or LAST function to operate on a set of values from a set of rows that rank as the FIRST or LAST with respect to a given sorting specification. Refer to FIRST for more information.

A simple use-case for the KEEP form of grouping is to list contacts with their most recent telephone number when multiple numbers are stored in a separate table. A few years ago, I realised that in most examples that I see developers write their own code instead of using the built-in constructs, and I wrote an article comparing the performance and complexity of the various alternatives for this requirement (as well as alternatives for SQL pivoting), SQL Pivot and Prune Queries – Keeping an Eye on Performance.

Update, 20 November 2013: Added a Postgres version of the SQL solution.

Test Problem

Data Model

We take for our example problem a very simple model consisting of roads and road events where the events occur over a stretch of road at a given time, and event type forms the horizontal grouping attribute.

Road - ERD2

Test Data

Input data - Roads

ROAD_ID ROAD_DESC       START_POINT  END_POINT
------- --------------- ----------- ----------
      1 The Strand                0        200
      2 Piccadilly                0        100

Input data - road_events

ROAD_ID     REV_ID E_TYPE START_POINT  END_POINT E_DATE
------- ---------- ------ ----------- ---------- ---------
      1          1 OPEN             0         10 01-JAN-07
                 2 OPEN            20         50 01-JAN-08
                 3 CLOSED         130        160 01-FEB-08
                 4 CLOSED          55         85 05-JUN-08
                 5 OTHER           45        115 01-JAN-09
                 6 OPEN            60        100 12-FEB-11
                 7 CLOSED         115        145 12-FEB-12
      2          8 CLOSED          10         30 01-JAN-10
                 9 OPEN            40         50 01-JAN-11
                10 OPEN            50         70 01-JAN-12

10 rows selected.

The following diagrams display the data and solutions for our test problem.

Road - Road 1

Road - Road 2

SQL Solution for ‘Point’ Problem

To start with, here is a standard SQL solution for the ‘zero-dimensional’ problem, where the most recent record is required for each road:

Zero-Dimensional Solution

ROAD_DESC          R_START      R_END    E_START      E_END E_DATE    E_TYPE
--------------- ---------- ---------- ---------- ---------- --------- ------
Piccadilly               0        100         50         70 01-JAN-12 OPEN
The Strand               0        200        115        145 12-FEB-12 CLOSED

  1  SELECT r.road_desc, r.start_point r_start, r.end_point r_end,
  2         Max (e.start_point) KEEP (DENSE_RANK LAST ORDER BY e.event_date) e_start,
  3         Max (e.end_point) KEEP (DENSE_RANK LAST ORDER BY e.event_date) e_end,
  4         Max (e.event_date) KEEP (DENSE_RANK LAST ORDER BY e.event_date) e_date,
  5         Max (e.event_type) KEEP (DENSE_RANK LAST ORDER BY e.event_date) e_type
  6    FROM road_events e
  7    JOIN roads r
  8      ON r.id = e.road_id
  9   GROUP BY r.road_desc, r.start_point, r.end_point
 10*  ORDER BY 1, 2, 3

The SQL is more complicated for the continuum problem, and we provide two versions, that differ in the final stage, of horizontal grouping by contiguous event type. The first uses a differencing method popular in Oracle 11g and earlier versions for this common type of grouping problem; the second uses a new feature from Oracle 12c, row pattern matching. [I have also used another technique for this type of grouping, in an article on contiguity and other range-based grouping problems in June 2011, Forming Range-Based Break Groups With Advanced SQL.]

SQL Solution for ‘Continuum’ Problem, Contiguity Grouping by Differences

SQL (Contiguity by Differences)

WITH breaks AS  (
        SELECT road_id, start_point bp FROM road_events
         UNION
        SELECT road_id, end_point FROM road_events
         UNION
        SELECT id, start_point FROM roads
         UNION
        SELECT id, end_point FROM roads
), legs AS (
        SELECT road_id, bp leg_start, Lead (bp) OVER (PARTITION BY road_id ORDER BY bp) leg_end
          FROM breaks
), latest_events AS ( 
        SELECT l.road_id, l.leg_start, l.leg_end,
               Max (e.id) KEEP (DENSE_RANK LAST ORDER BY e.event_date) event_id,
               Nvl (Max (e.event_type) KEEP (DENSE_RANK LAST ORDER BY e.event_date), '(none)') event_type
          FROM legs l
          LEFT JOIN road_events e
            ON e.road_id = l.road_id
           AND e.start_point <= l.leg_start
	   AND e.end_point >= l.leg_end
         WHERE l.leg_end IS NOT NULL
         GROUP BY l.road_id, l.leg_start, l.leg_end
), latest_events_group AS ( 
        SELECT road_id,
               leg_start,
               leg_end,
               event_id,
               event_type,
               Dense_Rank () OVER (PARTITION BY road_id ORDER BY leg_start, leg_end) -
               Dense_Rank () OVER (PARTITION BY road_id, event_type ORDER BY leg_start, leg_end) group_no
          FROM latest_events
)
SELECT l.road_id, r.road_desc,
       Min (l.leg_start)        sec_start,
       Max (l.leg_end)          sec_end,
       l.event_type             e_type,
       l.group_no
  FROM latest_events_group l
  JOIN roads r
    ON r.id = l.road_id
 GROUP BY l.road_id,
        r.road_desc, 
        l.event_type,
        l.group_no
ORDER BY 1, 2, 3
/

ROAD_ID ROAD_DESC        SEC_START    SEC_END E_TYPE   GROUP_NO
------- --------------- ---------- ---------- ------ ----------
      1 The Strand               0         10 OPEN            0
                                10         20 (none)          1
                                20         45 OPEN            1
                                45         60 OTHER           3
                                60        100 OPEN            4
                               100        115 OTHER           5
                               115        160 CLOSED          9
                               160        200 (none)         11
      2 Piccadilly               0         10 (none)          0
                                10         30 CLOSED          1
                                30         40 (none)          1
                                40         70 OPEN            3
                                70        100 (none)          3

13 rows selected.

Query Structure Diagram (Contiguity by Differences)Road, V1.5 - QSD-DiffExecution Plan (Contiguity by Differences)

--------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |               |      1 |        |     13 |00:00:00.01 |      25 |       |       |          |
|   1 |  SORT ORDER BY                  |               |      1 |      2 |     13 |00:00:00.01 |      25 |  2048 |  2048 | 2048  (0)|
|   2 |   HASH GROUP BY                 |               |      1 |      2 |     13 |00:00:00.01 |      25 |   900K|   900K| 1343K (0)|
|   3 |    MERGE JOIN                   |               |      1 |     24 |     19 |00:00:00.01 |      25 |       |       |          |
|   4 |     TABLE ACCESS BY INDEX ROWID | ROADS         |      1 |      2 |      2 |00:00:00.01 |       2 |       |       |          |
|   5 |      INDEX FULL SCAN            | ROAD_PK       |      1 |      2 |      2 |00:00:00.01 |       1 |       |       |          |
|*  6 |     SORT JOIN                   |               |      2 |     24 |     19 |00:00:00.01 |      23 |  2048 |  2048 | 2048  (0)|
|   7 |      VIEW                       |               |      1 |     24 |     19 |00:00:00.01 |      23 |       |       |          |
|   8 |       WINDOW SORT               |               |      1 |     24 |     19 |00:00:00.01 |      23 |  2048 |  2048 | 2048  (0)|
|   9 |        WINDOW NOSORT            |               |      1 |     24 |     19 |00:00:00.01 |      23 | 73728 | 73728 |          |
|  10 |         SORT GROUP BY           |               |      1 |     24 |     19 |00:00:00.01 |      23 |  4096 |  4096 | 4096  (0)|
|* 11 |          HASH JOIN OUTER        |               |      1 |     24 |     25 |00:00:00.01 |      23 |  1696K|  1696K|  540K (0)|
|* 12 |           VIEW                  |               |      1 |     24 |     19 |00:00:00.01 |      16 |       |       |          |
|  13 |            WINDOW SORT          |               |      1 |     24 |     21 |00:00:00.01 |      16 |  2048 |  2048 | 2048  (0)|
|  14 |             VIEW                |               |      1 |     24 |     21 |00:00:00.01 |      16 |       |       |          |
|  15 |              SORT UNIQUE        |               |      1 |     24 |     21 |00:00:00.01 |      16 |  2048 |  2048 | 2048  (0)|
|  16 |               UNION-ALL         |               |      1 |        |     24 |00:00:00.01 |      16 |       |       |          |
|  17 |                INDEX FULL SCAN  | ROAD_EVENT_N1 |      1 |     10 |     10 |00:00:00.01 |       1 |       |       |          |
|  18 |                INDEX FULL SCAN  | ROAD_EVENT_N3 |      1 |     10 |     10 |00:00:00.01 |       1 |       |       |          |
|  19 |                TABLE ACCESS FULL| ROADS         |      1 |      2 |      2 |00:00:00.01 |       7 |       |       |          |
|  20 |                TABLE ACCESS FULL| ROADS         |      1 |      2 |      2 |00:00:00.01 |       7 |       |       |          |
|  21 |           TABLE ACCESS FULL     | ROAD_EVENTS   |      1 |     10 |     10 |00:00:00.01 |       7 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("R"."ID"="L"."ROAD_ID")
       filter("R"."ID"="L"."ROAD_ID")
  11 - access("E"."ROAD_ID"="L"."ROAD_ID")
       filter(("E"."START_POINT"<="L"."LEG_START" AND "E"."END_POINT">="L"."LEG_END"))
  12 - filter("L"."LEG_END" IS NOT NULL)

SQL Solution for ‘Continuum’ Problem, Contiguity Grouping by 12c Row Pattern Matching

SQL (Contiguity by Pattern Matching)

WITH breaks AS  (
        SELECT road_id, start_point bp FROM road_events
         UNION
        SELECT road_id, end_point FROM road_events
         UNION
        SELECT id, start_point FROM roads
         UNION
        SELECT id, end_point FROM roads
), legs AS (
        SELECT road_id, bp leg_start, Lead (bp) OVER (PARTITION BY road_id ORDER BY bp) leg_end
          FROM breaks
), latest_events AS ( 
        SELECT l.road_id, r.road_desc, l.leg_start, l.leg_end,
               Max (e.id) KEEP (DENSE_RANK LAST ORDER BY e.event_date) event_id,
               Nvl (Max (e.event_type) KEEP (DENSE_RANK LAST ORDER BY e.event_date), '(none)') event_type
          FROM legs l
          JOIN roads r
            ON r.id = l.road_id
          LEFT JOIN road_events e
            ON e.road_id = l.road_id
           AND e.start_point <= l.leg_start
	   AND e.end_point >= l.leg_end
         WHERE l.leg_end IS NOT NULL
         GROUP BY l.road_id, r.road_desc, l.leg_start, l.leg_end
)
SELECT m.road_id, m.road_desc, m.sec_start, m.sec_end, m.event_type e_type
  FROM latest_events
 MATCH_RECOGNIZE (
   PARTITION BY road_id, road_desc
   ORDER BY leg_start, leg_end
   MEASURES FIRST (leg_start) sec_start,
            LAST (leg_end) sec_end,
            LAST (event_type) event_type
   PATTERN (strt sm*)
   DEFINE sm AS PREV(sm.event_type) = sm.event_type
 ) m
ORDER BY 1, 2, 3
/

ROAD_ID ROAD_DESC        SEC_START    SEC_END E_TYPE
------- --------------- ---------- ---------- ------
      1 The Strand               0         10 OPEN
                                10         20 (none)
                                20         45 OPEN
                                45         60 OTHER
                                60        100 OPEN
                               100        115 OTHER
                               115        160 CLOSED
                               160        200 (none)
      2 Piccadilly               0         10 (none)
                                10         30 CLOSED
                                30         40 (none)
                                40         70 OPEN
                                70        100 (none)

13 rows selected.

Query Structure Diagram (Contiguity by Pattern Matching)Road, V1.5 - QSD-MRExecution Plan (Contiguity by Pattern Matching)

-------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                        | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                 |               |      1 |        |     13 |00:00:00.01 |      25 |       |       |          |
|   1 |  SORT ORDER BY                                   |               |      1 |      2 |     13 |00:00:00.01 |      25 |  2048 |  2048 | 2048  (0)|
|   2 |   VIEW                                           |               |      1 |      2 |     13 |00:00:00.01 |      25 |       |       |          |
|   3 |    MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO|               |      1 |      2 |     13 |00:00:00.01 |      25 |  2048 |  2048 | 2048  (0)|
|   4 |     VIEW                                         |               |      1 |      2 |     19 |00:00:00.01 |      25 |       |       |          |
|   5 |      SORT GROUP BY                               |               |      1 |      2 |     19 |00:00:00.01 |      25 |  4096 |  4096 | 4096  (0)|
|*  6 |       HASH JOIN OUTER                            |               |      1 |     24 |     25 |00:00:00.01 |      25 |   987K|   987K|  525K (0)|
|   7 |        MERGE JOIN                                |               |      1 |     24 |     19 |00:00:00.01 |      18 |       |       |          |
|   8 |         TABLE ACCESS BY INDEX ROWID              | ROADS         |      1 |      2 |      2 |00:00:00.01 |       2 |       |       |          |
|   9 |          INDEX FULL SCAN                         | ROAD_PK       |      1 |      2 |      2 |00:00:00.01 |       1 |       |       |          |
|* 10 |         SORT JOIN                                |               |      2 |     24 |     19 |00:00:00.01 |      16 |  2048 |  2048 | 2048  (0)|
|* 11 |          VIEW                                    |               |      1 |     24 |     19 |00:00:00.01 |      16 |       |       |          |
|  12 |           WINDOW SORT                            |               |      1 |     24 |     21 |00:00:00.01 |      16 |  2048 |  2048 | 2048  (0)|
|  13 |            VIEW                                  |               |      1 |     24 |     21 |00:00:00.01 |      16 |       |       |          |
|  14 |             SORT UNIQUE                          |               |      1 |     24 |     21 |00:00:00.01 |      16 |  2048 |  2048 | 2048  (0)|
|  15 |              UNION-ALL                           |               |      1 |        |     24 |00:00:00.01 |      16 |       |       |          |
|  16 |               INDEX FULL SCAN                    | ROAD_EVENT_N1 |      1 |     10 |     10 |00:00:00.01 |       1 |       |       |          |
|  17 |               INDEX FULL SCAN                    | ROAD_EVENT_N3 |      1 |     10 |     10 |00:00:00.01 |       1 |       |       |          |
|  18 |               TABLE ACCESS FULL                  | ROADS         |      1 |      2 |      2 |00:00:00.01 |       7 |       |       |          |
|  19 |               TABLE ACCESS FULL                  | ROADS         |      1 |      2 |      2 |00:00:00.01 |       7 |       |       |          |
|  20 |        TABLE ACCESS FULL                         | ROAD_EVENTS   |      1 |     10 |     10 |00:00:00.01 |       7 |       |       |          |
-------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("E"."ROAD_ID"="L"."ROAD_ID")
       filter(("E"."START_POINT"<="L"."LEG_START" AND "E"."END_POINT">="L"."LEG_END"))
  10 - access("R"."ID"="L"."ROAD_ID")
       filter("R"."ID"="L"."ROAD_ID")
  11 - filter("L"."LEG_END" IS NOT NULL)

Note
-----
   - this is an adaptive plan

SQL Solution for ‘Continuum’ Problem, Contiguity Grouping – Postgres

The Oracle v11 (and earlier versions) solution, with contiguity by differences, can be converted to work in Postgres, as shown below.

SQL (Continuum by Row_Number, Contiguity by Differences – Postgres)

WITH breaks AS  (
        SELECT road_id, start_point bp FROM road_events
         UNION
        SELECT road_id, end_point FROM road_events
         UNION
        SELECT id, start_point FROM roads
         UNION
        SELECT id, end_point FROM roads
), legs AS (
        SELECT road_id, bp leg_start, Lead (bp) OVER (PARTITION BY road_id ORDER BY bp) leg_end
          FROM breaks
), ranked_events AS ( 
        SELECT l.road_id, l.leg_start, l.leg_end,
               e.id event_id, Coalesce (e.event_type, '(none)') event_type,
               Row_Number() OVER (PARTITION BY l.road_id, l.leg_start ORDER BY e.event_date DESC) rnk
          FROM legs l
          LEFT JOIN road_events e
            ON e.road_id = l.road_id
           AND e.start_point <= l.leg_start            AND e.end_point >= l.leg_end
         WHERE l.leg_end IS NOT NULL
), latest_events_group AS ( 
        SELECT road_id,
               leg_start,
               leg_end,
               event_id,
               event_type,
               Dense_Rank () OVER (PARTITION BY road_id ORDER BY leg_start, leg_end) -
               Dense_Rank () OVER (PARTITION BY road_id, event_type ORDER BY leg_start, leg_end) group_no
          FROM ranked_events
         WHERE rnk = 1
)
SELECT l.road_id, r.road_desc,
       Min (l.leg_start)        sec_start,
       Max (l.leg_end)          sec_end,
       l.event_type             e_type,
       l.group_no
  FROM latest_events_group l
  JOIN roads r
    ON r.id = l.road_id
 GROUP BY l.road_id,
        r.road_desc, 
        l.event_type,
        l.group_no
ORDER BY 1, 2, 3;

Continuum/contiguity Solution with Row_Number...
 road_id | road_desc  | sec_start | sec_end | e_type | group_no 
---------+------------+-----------+---------+--------+----------
       1 | The Strand |         0 |      10 | OPEN   |        0
       1 | The Strand |        10 |      20 | (none) |        1
       1 | The Strand |        20 |      45 | OPEN   |        1
       1 | The Strand |        45 |      60 | OTHER  |        3
       1 | The Strand |        60 |     100 | OPEN   |        4
       1 | The Strand |       100 |     115 | OTHER  |        5
       1 | The Strand |       115 |     160 | CLOSED |        9
       1 | The Strand |       160 |     200 | (none) |       11
       2 | Piccadilly |         0 |      10 | (none) |        0
       2 | Piccadilly |        10 |      30 | CLOSED |        1
       2 | Piccadilly |        30 |      40 | (none) |        1
       2 | Piccadilly |        40 |      70 | OPEN   |        3
       2 | Piccadilly |        70 |     100 | (none) |        3
(13 rows)

SELECT Version();

Postgres version...
                           version                           
-------------------------------------------------------------
 PostgreSQL 9.3.1, compiled by Visual C++ build 1600, 64-bit
(1 row)

Notes on Conversion of SQL to Postgres

  • Postgres does not have an exact equivalent of Oracle’s KEEP/FIRST grouping functionality, but it can be emulated via the analytic function Row_Number within a subquery
  • Coalesce, which is also available in Oracle, replaces Nvl, which does not exist in Postgres
  • Oracle’s execution plans were obtained using DBMS_XPlan after running the queries, while the Postgres version was obtained by running the query prefaced by ‘EXPLAIN ANALYZE ‘
  • There is no Postgres equivalent of Oracle v12’s row pattern matching

Query Structure Diagram (Continuum by Row_Number, Contiguity by Differences – Postgres) Road, V1.5 - QSD-PGExecution Plan (Continuum by Row_Number, Contiguity by Differences – Postgres)

Prefacing the query with ‘EXPLAIN ANALYZE ‘ gives:

Explaining Continuum/contiguity Solution with Row_Number...
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=619.45..619.47 rows=8 width=270) (actual time=0.235..0.235 rows=13 loops=1)
   Sort Key: l.road_id, r.road_desc, (min(l.leg_start))
   Sort Method: quicksort  Memory: 26kB
   CTE breaks
     ->  HashAggregate  (cost=79.50..95.30 rows=1580 width=8) (actual time=0.013..0.020 rows=21 loops=1)
           ->  Append  (cost=0.00..71.60 rows=1580 width=8) (actual time=0.002..0.006 rows=24 loops=1)
                 ->  Seq Scan on road_events  (cost=0.00..14.80 rows=480 width=8) (actual time=0.001..0.003 rows=10 loops=1)
                 ->  Seq Scan on road_events road_events_1  (cost=0.00..14.80 rows=480 width=8) (actual time=0.000..0.002 rows=10 loops=1)
                 ->  Seq Scan on roads  (cost=0.00..13.10 rows=310 width=8) (actual time=0.000..0.001 rows=2 loops=1)
                 ->  Seq Scan on roads roads_1  (cost=0.00..13.10 rows=310 width=8) (actual time=0.000..0.000 rows=2 loops=1)
   CTE legs
     ->  WindowAgg  (cost=115.54..147.14 rows=1580 width=8) (actual time=0.030..0.041 rows=21 loops=1)
           ->  Sort  (cost=115.54..119.49 rows=1580 width=8) (actual time=0.028..0.029 rows=21 loops=1)
                 Sort Key: breaks.road_id, breaks.bp
                 Sort Method: quicksort  Memory: 25kB
                 ->  CTE Scan on breaks  (cost=0.00..31.60 rows=1580 width=8) (actual time=0.014..0.023 rows=21 loops=1)
   CTE ranked_events
     ->  WindowAgg  (cost=290.71..326.08 rows=1572 width=138) (actual time=0.089..0.104 rows=25 loops=1)
           ->  Sort  (cost=290.71..294.64 rows=1572 width=138) (actual time=0.088..0.089 rows=25 loops=1)
                 Sort Key: l_1.road_id, l_1.leg_start, e.event_date
                 Sort Method: quicksort  Memory: 26kB
                 ->  Hash Left Join  (cost=20.80..207.25 rows=1572 width=138) (actual time=0.044..0.079 rows=25 loops=1)
                       Hash Cond: (l_1.road_id = e.road_id)
                       Join Filter: ((e.start_point <= l_1.leg_start) AND (e.end_point >= l_1.leg_end))
                       Rows Removed by Join Filter: 89
                       ->  CTE Scan on legs l_1  (cost=0.00..31.60 rows=1572 width=12) (actual time=0.031..0.048 rows=19 loops=1)
                             Filter: (leg_end IS NOT NULL)
                             Rows Removed by Filter: 2
                       ->  Hash  (cost=14.80..14.80 rows=480 width=138) (actual time=0.005..0.005 rows=10 loops=1)
                             Buckets: 1024  Batches: 1  Memory Usage: 1kB
                             ->  Seq Scan on road_events e  (cost=0.00..14.80 rows=480 width=138) (actual time=0.001..0.002 rows=10 loops=1)
   CTE latest_events_group
     ->  WindowAgg  (cost=35.79..36.01 rows=8 width=48) (actual time=0.165..0.181 rows=19 loops=1)
           ->  Sort  (cost=35.79..35.81 rows=8 width=48) (actual time=0.164..0.165 rows=19 loops=1)
                 Sort Key: ranked_events.road_id, ranked_events.event_type, ranked_events.leg_start, ranked_events.leg_end
                 Sort Method: quicksort  Memory: 26kB
                 ->  WindowAgg  (cost=35.49..35.67 rows=8 width=48) (actual time=0.124..0.138 rows=19 loops=1)
                       ->  Sort  (cost=35.49..35.51 rows=8 width=48) (actual time=0.123..0.124 rows=19 loops=1)
                             Sort Key: ranked_events.road_id, ranked_events.leg_start, ranked_events.leg_end
                             Sort Method: quicksort  Memory: 26kB
                             ->  CTE Scan on ranked_events  (cost=0.00..35.37 rows=8 width=48) (actual time=0.090..0.117 rows=19 loops=1)
                                   Filter: (rnk = 1)
                                   Rows Removed by Filter: 6
   ->  HashAggregate  (cost=14.72..14.80 rows=8 width=270) (actual time=0.218..0.221 rows=13 loops=1)
         ->  Hash Join  (cost=0.26..14.60 rows=8 width=270) (actual time=0.203..0.207 rows=19 loops=1)
               Hash Cond: (r.id = l.road_id)
               ->  Seq Scan on roads r  (cost=0.00..13.10 rows=310 width=222) (actual time=0.002..0.002 rows=2 loops=1)
               ->  Hash  (cost=0.16..0.16 rows=8 width=52) (actual time=0.192..0.192 rows=19 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 2kB
                     ->  CTE Scan on latest_events_group l  (cost=0.00..0.16 rows=8 width=52) (actual time=0.167..0.190 rows=19 loops=1)
 Total runtime: 0.432 ms
(51 rows)

Code

Here is the code: Road-SQL