# SQL for the Fantasy Football Knapsack Problem

On 21 June 2013, a poster on OTN (FilippeSoaresRoza) asked how to find the best fantasy football team in SQL, in the thread Processing Cost - How to catch a soccer team with the highest combined score?. I saw that this was another knapsack problem, of the single-container type. I had solved that problem on the forum before, and here, in A Simple SQL Solution for the Knapsack Problem (SKP-1), so I decided to adapt the solution for this case. This is in fact a more general form of the problem, in which the items have categories, with constraints on the number of items in each category and on the overall number of items.

The first solution I posted was exact, as in the above article, and performed well enough on the simple sample data, returning in a few seconds. However, the poster reported that the query was still running on his full data set after a couple of hours. I therefore decided to look for a mechanism to reduce the work done by the query on what is a hard combinatorial problem, and to return 'good' solutions in a practical amount of time, but without guaranteeing optimality. (I recently provided solutions like this for a related problem, SQL for the Balanced Number Partitioning Problem.)

This article provides the SQL that does this, and also a PL/SQL package containing a pipelined function that applies a slightly different algorithm; the latter is also practical, although it proved less efficient on my test problems.
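For reference, the problem outlined in the introduction can be written as a small integer programme. This is only a sketch, in notation introduced here for illustration rather than taken from the forum thread: $x_i \in \{0,1\}$ marks whether player $i$ is picked, $v_i$ and $p_i$ are that player's points and price, $P$ is the price limit, $N$ is the team size, and $S_c$ is the set of players in position $c$, which must supply between $m_c$ and $M_c$ team members.

$$
\begin{aligned}
\max_{x \in \{0,1\}^n} \quad & \sum_{i=1}^{n} v_i x_i \\
\text{subject to} \quad & \sum_{i=1}^{n} p_i x_i \le P, \\
& \sum_{i=1}^{n} x_i = N, \\
& m_c \le \sum_{i \in S_c} x_i \le M_c \quad \text{for each position } c.
\end{aligned}
$$

Dropping the last two constraints leaves the ordinary single-container knapsack problem, which is why the fantasy football case is the more general one.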

## Test Problems

I used two test problems.

### Test Problem 1: Brazilian League
The first problem was supplied by the OTN poster and appears to be based on a Brazilian league. It has 114 players in six positions (one being coach), with twelve players forming a team. The problem is to find the team with the maximum total player points within a given maximum price, while satisfying the positional constraints:

Input positions

ID MIN_PLAYERS MAX_PLAYERS
-- ----------- -----------
AL          12          12
CB           2           3
CO           1           1
FW           1           3
GK           1           1
MF           3           5
WB           0           2

7 rows selected.

Input players

ID  CLUB_NAME                      PLAYER_NAME                    PO      PRICE AVG_POINTS APPEARANCES      PRF_R      VFP_R      PRC_R
--- ------------------------------ ------------------------------ -- ---------- ---------- ----------- ---------- ---------- ----------
038 Portuguesa                     Ivan                           WB        755       1320         100          1          2         20
001 Atlético-PR                    Éderson                        FW       1712       1012         500          2         22         97
002 Vitória                        Maxi Biancucchi                FW       1962       1005         400          3         33        103
003 Fluminense                     Rafael Sobis                   FW       2303        955         400          4         47        112
098 Fluminense                     Digão                          CB        931        927         300          5          5         34
058 Internacional                  Fred                           MF       3028        892         500          6         92        114
059 Grêmio                         Zé Roberto                     MF       2593        878         400          7         73        113
039 Vasco                          Elsinho                        WB       1468        850         400          8         25         83
004 Bahia                          Fernandão                      FW       1328        822         500          9         19         70
060 Internacional                  Otavinho                       MF        762        807         300         10          4         21
078 Flamengo                       Jaime De Almeida               CO       1156        803         100         11         12         52
022 Vitória                        Wilson                         GK       1239        794         500         12         17         59
021 Cruzeiro                       Fábio                          GK       2090        794         500         12         59        106
023 Coritiba                       Vanderlei                      GK       1858        776         500         14         45        101
005 São Paulo                      Luis Fabiano                   FW       2154        758         400         15         67        107
040 Cruzeiro                       Egídio                         WB       1482        752         500         16         34         84
041 Fluminense                     Carlinhos                      WB       1240        693         300         17         26         60
099 Flamengo                       Samir                          CB        267        680         100         18          1          1
061 Vasco                          Carlos Alberto                 MF       1501        675         200         19         42         85
006 Botafogo                       Rafael Marques                 FW       1974        668         500         20         74        105
062 Cruzeiro                       Nilton                         MF       2239        646         500         21         95        110
100 Cruzeiro                       Dedé                           CB       2254        640         500         22         97        111
063 Coritiba                       Júnior Urso                    MF       1438        622         500         23         43         81
064 Crisciúma                      João Vitor                     MF       1327        604         500         24         41         69
101 São Paulo                      Lúcio                          CB       2171        602         500         25         99        108
007 Cruzeiro                       Dagoberto                      FW       2211        594         500         26        102        109
102 Grêmio                         Bressan                        CB       1085        590         400         27         28         48
103 Atlético-PR                    Manoel                         CB       1699        588         500         28         70         96
065 Corinthians                    Guilherme                      MF        883        587         400         29         14         32
104 Ponte Preta                    Cléber                         CB       1461        578         500         30         55         82
008 Náutico                        Rogério                        FW       1062        570         500         31         29         44
066 Corinthians                    Ralf                           MF       1965        570         500         31         93        104
067 Vitória                        Escudero                       MF       1638        568         500         33         68         93
068 Portuguesa                     Correa                         MF        844        560         400         34         15         26
042 Náutico                        Auremir                        WB        773        548         400         35         11         22
079 Cruzeiro                       Marcelo Oliveira               CO       1611        543         500         36         75         92
080 Fluminense                     Abel Braga                     CO       1751        536         400         37         84         98
105 Cruzeiro                       Bruno Rodrigo                  CB       1547        528         500         38         72         88
043 Cruzeiro                       Mayke                          WB        374        525         200         39          3          3
069 Portuguesa                     Souza                          MF       1262        517         400         40         49         62
070 Coritiba                       Alex                           MF       1698        508         500         41         88         95
009 Flamengo                       Hernane                        FW       1387        498         500         42         65         75
071 Grêmio                         Souza                          MF       1380        498         400         42         64         74
106 Santos                         Edu Dracena                    CB       1682        497         300         44         90         94
010 Crisciúma                      Lins                           FW       1840        490         500         45        103        100
011 Santos                         Neilton                        FW        638        488         400         46          9         11
012 Fluminense                     Samuel                         FW       1001        487         300         47         36         37
072 Ponte Preta                    Cicinho                        MF       1142        472         500         48         48         51
024 Atlético-MG                    Victor                         GK       1163        467         400         49         52         53
045 Atlético-MG                    Richarlyson                    WB       1020        467         300         49         40         38
044 Portuguesa                     Luis Ricardo                   WB        858        467         300         49         27         28
013 Ponte Preta                    Chiquinho                      FW        997        464         500         52         38         36
081 Internacional                  Dunga                          CO       1422        463         500         53         79         80
047 São Paulo                      Juan                           WB        789        457         300         54         24         23
046 Internacional                  Fabrício                       WB        876        457         400         54         31         30
014 Atlético-MG                    Luan                           FW       1318        455         400         56         71         67
048 São Paulo                      Paulo Miranda                  WB       1053        454         500         57         44         41
049 Flamengo                       João Paulo                     WB        715        453         300         58         18         19
050 São Paulo                      Rodrigo Caio                   WB       1192        452         500         59         60         56
025 Bahia                          Marcelo Lomba                  GK       1364        450         500         60         78         71
073 Botafogo                       Fellype Gabriel                MF        860        447         400         61         32         29
082 Vitória                        Caio Júnior                    CO       1140        445         500         62         56         50
015 Ponte Preta                    William                        FW       1393        444         500         63         81         76
107 Náutico                        William Alves                  CB        556        443         300         64          8          8
083 Grêmio                         Vanderlei Luxemburgo           CO       1577        442         400         65         98         89
084 São Paulo                      Ney Franco                     CO       1515        439         500         66         94         86
074 Atlético-PR                    João Paulo                     MF       1056        438         500         67         46         42
026 Botafogo                       Renan                          GK        677        437         400         68         16         13
075 Vasco                          Sandro Silva                   MF       1076        428         500         69         53         46
108 Fluminense                     Gum                            CB       1218        422         400         70         69         58
085 Náutico                        Levi Gomes                     CO        708        420         200         71         21         18
109 Flamengo                       Wallace                        CB        429        420         200         71          6          4
051 Coritiba                       Victor Ferraz                  WB       1304        420         500         71         80         65
076 Santos                         Cícero                         MF       1415        418         500         74         91         78
027 Flamengo                       Felipe                         GK       1526        414         500         75        101         87
077 Fluminense                     Wagner                         MF        855        413         300         76         37         27
052 Bahia                          Jussandro                      WB        694        410         500         77         23         16
110 Náutico                        João Filipe                    CB        547        410         400         77         10          7
016 Botafogo                       Vitinho                        FW       1020        404         500         79         54         38
053 Santos                         Rafael Galhardo                WB       1288        404         500         79         83         64
111 Grêmio                         Werley                         CB       1590        403         400         81        105         90
055 Náutico                        Maranhão                       WB        653        402         500         82         20         12
054 Goiás                          William Matheus                WB        587        402         500         82         13          9
112 Corinthians                    Gil                            CB       1323        398         500         84         85         68
113 Vitória                        Gabriel Paulista               CB       1177        394         500         85         76         54
086 Atlético-PR                    Ricardo Drubscky               CO        796        392         500         86         35         24
087 Coritiba                       Marquinhos Santos              CO       1059        389         500         87         62         43
017 Coritiba                       Deivid                         FW       1590        376         500         88        107         90
028 Grêmio                         Dida                           GK       1132        375         400         89         77         49
114 Goiás                          Ernando                        CB       1024        374         500         90         63         40
029 Corinthians                    Cássio                         GK       1251        374         500         90         89         61
018 Grêmio                         Barcos                         FW       1896        367         400         92        110        102
088 Vasco                          Paulo Autuori                  CO       1313        361         500         93        100         66
030 Vasco                          Michel Alves                   GK        899        348         500         94         57         33
019 Atlético-MG                    Jô                             FW       1393        340         200         95        106         76
056 Internacional                  Gabriel                        WB       1181        338         500         96         96         55
057 Goiás                          Vítor                          WB        877        336         500         97         58         31
089 Portuguesa                     Edson Pimenta                  CO        367        326         400         98          7          2
090 Botafogo                       Oswaldo De Oliveira            CO       1077        323         500         99         87         47
031 Crisciúma                      Bruno                          GK       1066        320         500        100         86         45
092 Santos                         Claudinei Oliveira             CO       1192        317         300        101        104         56
091 Corinthians                    Tite                           CO       1368        317         500        101        108         73
020 São Paulo                      Osvaldo                        FW       1364        312         500        103        109         71
032 Internacional                  Muriel                         GK        981        310         400        104         82         35
033 Santos                         Rafael                         GK       1782        300         500        105        112         99
093 Bahia                          Cristóvão Borges               CO        827        292         500        106         66         25
094 Crisciúma                      Vadão                          CO        704        286         500        107         50         17
095 Goiás                          Enderson Moreira               CO        680        253         500        108         61         14
034 Atlético-PR                    Weverton                       GK        616        248         500        109         51         10
035 Fluminense                     Ricardo Berna                  GK        460        242         400        110         30          6
096 Atlético-MG                    Cuca                           CO       1262        232         400        111        111         62
036 Portuguesa                     Gledson                        GK        452        210         400        112         39          5
037 São Paulo                      Rogério Ceni                   GK       1420        117         400        113        114         79
097 Ponte Preta                    Zé Sérgio                      CO        685         75         100        114        113         15

114 rows selected.


Note that I dropped the poster's formation-based data model in favour of the above, more general one. I used AL as a code for team size, and chose the maximum price arbitrarily, although the choice does influence the results. I also multiplied the points and prices by a factor of 100 to allow me to work in integers.
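As a minimal sketch of how the positions table drives the constraints, the query below checks one hand-picked team from the listing above against each positions row, with the AL row counting every player. The names positions, team and position_id are assumptions for illustration, not the schema used for the solutions, and the team was chosen only to show a feasible line-up, not a good one. It assumes Oracle 11.2 or later for the column lists in the WITH clause.

```sql
WITH positions (id, min_players, max_players) AS (
    SELECT 'AL', 12, 12 FROM dual UNION ALL
    SELECT 'CB',  2,  3 FROM dual UNION ALL
    SELECT 'CO',  1,  1 FROM dual UNION ALL
    SELECT 'FW',  1,  3 FROM dual UNION ALL
    SELECT 'GK',  1,  1 FROM dual UNION ALL
    SELECT 'MF',  3,  5 FROM dual UNION ALL
    SELECT 'WB',  0,  2 FROM dual
),
-- Hypothetical candidate team: ids and positions copied from the player listing
team (player_id, position_id) AS (
    SELECT '022', 'GK' FROM dual UNION ALL
    SELECT '078', 'CO' FROM dual UNION ALL
    SELECT '098', 'CB' FROM dual UNION ALL
    SELECT '099', 'CB' FROM dual UNION ALL
    SELECT '038', 'WB' FROM dual UNION ALL
    SELECT '039', 'WB' FROM dual UNION ALL
    SELECT '001', 'FW' FROM dual UNION ALL
    SELECT '002', 'FW' FROM dual UNION ALL
    SELECT '004', 'FW' FROM dual UNION ALL
    SELECT '058', 'MF' FROM dual UNION ALL
    SELECT '059', 'MF' FROM dual UNION ALL
    SELECT '060', 'MF' FROM dual
)
SELECT pos.id, pos.min_players, pos.max_players,
       -- The AL row joins to every team member; other rows join by position
       COUNT(t.player_id) AS n_players,
       CASE WHEN COUNT(t.player_id) BETWEEN pos.min_players AND pos.max_players
            THEN 'OK' ELSE 'VIOLATED' END AS status
  FROM positions pos
  LEFT JOIN team t
    ON t.position_id = pos.id OR pos.id = 'AL'
 GROUP BY pos.id, pos.min_players, pos.max_players
 ORDER BY pos.id;
```

For this team every row reports OK. The price limit is a separate check: sum the team's prices and compare the total with the chosen maximum.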

### Test Problem 2: English Premier League
The second problem is based on the English Premier League, and I got the data from a 'scraping' web-site, https://scraperwiki.com/scrapers/fantasy_premier_league_player_stats/. There are some data quality issues, but the data is good enough for technical testing. I summed the players' points over the last season and took their values in the last week as prices.
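The aggregation just described might look like the following sketch. The source layout is an assumption: a hypothetical player_weeks row source with one row per player per game week. The rows are made-up values for illustration, not scraped data. Oracle's KEEP (DENSE_RANK LAST ...) clause picks the value from the latest week, and the HAVING clause drops players with no points.

```sql
-- Hypothetical weekly data: one row per player per game week (made-up values)
WITH player_weeks (player_id, week_no, points, player_value) AS (
    SELECT 1, 36, 2, 50 FROM dual UNION ALL
    SELECT 1, 37, 0, 51 FROM dual UNION ALL
    SELECT 1, 38, 6, 49 FROM dual UNION ALL
    SELECT 2, 36, 0, 40 FROM dual UNION ALL
    SELECT 2, 37, 0, 40 FROM dual UNION ALL
    SELECT 2, 38, 0, 39 FROM dual
)
SELECT player_id,
       SUM(points) AS total_points,                    -- season total
       MAX(player_value)
           KEEP (DENSE_RANK LAST ORDER BY week_no) AS price  -- value in the final week
  FROM player_weeks
 GROUP BY player_id
HAVING SUM(points) > 0                                 -- exclude zero-point players
 ORDER BY player_id;
```

With these rows, only player 1 is returned, with 8 total points and a price of 49.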

After excluding zero-point players, there remained 576 players in four positions, with eleven players forming a team. The problem is the same, with the positional constraints:

Input positions

ID MIN_PLAYERS MAX_PLAYERS
-- ----------- -----------
AL          11          11
DF           3           5
FW           1           3
GK           1           1
MF           2           5

Input players

        ID CLUB_NAME       PLAYER_NAME          PO      PRICE AVG_POINTS APPEARANCES      PRF_R      VFP_R      PRC_R
---------- --------------- -------------------- -- ---------- ---------- ----------- ---------- ---------- ----------
661 Tottenham       Gareth Bale          MF        111        240          38          1         36        573
286 Liverpool       Luis Suarez          FW        105        213          38          2         62        572
30 Arsenal         Santi Santi Cazorla  MF         97        198          36          3         57        569
149 Chelsea         Juan Mata            MF        102        190          36          4         90        571
265 Liverpool       Steven Gerrard       MF         92        187          38          5         60        562
533 Southampton     Rickie Lambert       FW         69        178          37          6          5        513
165 Everton         Leighton Baines      DF         78        173          38          7         25        537
318 Man City        Carlos Tevez         FW         92        172          38          8         87        562
139 Chelsea         Eden Hazard          MF         96        171          35          9         99        568
641 Swansea         Miguel Michu         MF         79        169          36         10         41        542
177 Everton         Marouane Fellaini    MF         73        168          38         11         14        527
47 Aston Villa     Christian Benteke    FW         74        166          35         12         20        530
204 Fulham          Dimitar Berbatov     FW         71        161          37         13         18        523
720 West Brom       Romelu Lukaku        FW         66        157          37         14          8        498
314 Man City        David Silva          MF         92        154          38         15        117        562
298 Man City        Joe Hart             GK         69        154          38         15         22        513
549 Stoke City      Asmir Begovic        GK         56        154          40         15          1        421
428 Norwich         Robert Snodgrass     MF         62        152          38         18          7        474
332 Man Utd         Patrice Evra         DF         73        152          38         18         51        527
126 Chelsea         Demba Ba             FW         78        149          37         20         77        537
616 Sunderland      Stephane Sessegnon   MF         67        148          38         21         26        504
575 Stoke City      Jonathan Walters     MF         63        147          40         22         13        482
770 West Ham        Kevin Nolan          MF         61        145          36         23          9        465
354 Man Utd         Wayne Rooney         FW        116        141          37         24        201        575
322 Man City        Yaya Yaya Toure      MF         82        141          37         24        108        548
268 Liverpool       Glen Johnson         DF         65        141          37         24         35        491
760 West Ham        Jussi Jaaskelainen   GK         52        139          36         27          2        361
198 Everton         Steven Pienaar       MF         66        139          38         27         46        498
609 Sunderland      Simon Mignolet       GK         53        139          38         27          3        379
598 Sunderland      Adam Johnson         MF         68        138          37         30         61        508
726 West Brom       James Morrison       MF         57        135          39         31         11        429
569 Stoke City      Ryan Shawcross       DF         56        133          40         32         10        421
248 Liverpool       Daniel Agger         DF         64        133          38         32         52        488
270 Liverpool       Sanchez Jose Enrique DF         61        133          37         32         33        465
239 Fulham          Mark Schwarzer       GK         51        133          38         32          4        344
800 Wigan           Arouna Kone          FW         69        131          37         36         78        513
684 Tottenham       Aaron Lennon         MF         71        131          38         36         91        523
161 Chelsea         Fernando Torres      FW         93        131          36         36        165        566
552 Stoke City      Peter Crouch         FW         60        131          40         36         31        458
594 Sunderland      Steven Fletcher      FW         67        131          36         36         71        504
295 Man City        Edin Dzeko           FW         68        130          38         41         76        508
700 Tottenham       Jan Vertonghen       DF         68        129          37         42         80        508
132 Chelsea         Petr Cech            GK         64        129          38         42         64        488
144 Chelsea         Frank Lampard        MF         85        128          36         44        150        553
289 Man City        Sergio Aguero        FW        111        127          39         45        217        573
278 Liverpool       Jose Reina           GK         58        126          38         46         34        438
628 Swansea         Jonathan De Guzman   MF         57        122          36         47         39        429
667 Tottenham       Jermain Defoe        FW         79        122          37         47        137        542
723 West Brom       Gareth McAuley       DF         52        122          38         47         12        361
802 Wigan           Shaun Maloney        MF         54        121          37         50         21        395
405 Norwich         Sebastien Bassong    DF         53        121          37         50         16        379
186 Everton         Phil Jagielka        DF         59        120          38         52         59        449
558 Stoke City      Robert Huth          DF         55        120          40         52         32        409
353 Man Utd         Rafael Rafael        DF         61        119          38         54         72        465
771 West Ham        Joey O'Brien         DF         48        119          36         54          6        269
196 Everton         Leon Osman           MF         62        119          37         54         75        474
650 Swansea         Wayne Routledge      MF         53        118          36         57         24        379
323 Man City        Pablo Zabaleta       DF         64        117          38         58         92        488
669 Tottenham       Clint Dempsey        MF         89        116          37         59        186        557
612 Sunderland      John O'Shea          DF         51        115          38         60         19        344
374 Newcastle       Papiss Cisse         FW         87        115          39         60        182        556
142 Chelsea         Branislav Ivanovic   DF         69        114          36         62        120        513
211 Fulham          Damien Duff          MF         58        114          38         62         69        438
364 Man Utd         David de Gea         GK         58        114          38         62         69        438
185 Everton         Tim Howard           GK         53        113          38         65         43        379
154 Chelsea         Emboaba Oscar        MF         79        113          35         65        163        542
602 Sunderland      Sebastian Larsson    MF         59        112          38         67         79        449
719 West Brom       Shane Long           FW         58        110          38         68         81        438
413 Norwich         Grant Holt           FW         59        110          38         68         89        449
713 West Brom       Ben Foster           GK         51        109          39         70         42        344
536 Southampton     Jason Puncheon       MF         47        107          37         71         17        238
232 Fulham          Sascha Riether       DF         48        107          37         71         23        269
145 Chelsea         David Luiz           DF         67        107          36         71        132        504
784 Wigan           Jean Beausejour      MF         53        106          38         74         65        379
60 Aston Villa     Bradley Guzan        GK         48        106          38         74         27        269
293 Man City        Gael Clichy          DF         58        106          38         74         93        438
476 QPR             Adel Taarabt         MF         53        105          38         77         66        379
804 Wigan           James McCarthy       MF         48        105          38         77         30        269
595 Sunderland      Craig Gardner        MF         49        104          38         79         44        293
131 Chelsea         Gary Cahill          DF         60        104          38         79        106        458
423 Norwich         Anthony Pilkington   MF         55        104          38         79         82        409
541 Southampton     Morgan Schneiderlin  MF         48        103          37         82         38        269
412 Norwich         Javier Garrido       DF         47        103          38         82         29        238
414 Norwich         Wes Hoolahan         MF         55        103          38         82         86        409
134 Chelsea         Ashley Cole A        DF         63        103          36         82        123        482
701 Tottenham       Kyle Walker          DF         61        103          37         82        115        465
540 Southampton     Jay Rodriguez        FW         52        103          37         82         67        361
173 Everton         Sylvain Distin       DF         54        102          38         88         83        395
236 Fulham          Bryan Ruiz           FW         50        102          38         88         58        313
501 Reading         Jobi McAnuff         MF         47        101          36         90         37        238
776 West Ham        Winston Reid         DF         48        101          36         90         47        269
576 Stoke City      Glenn Whelan         MF         49        101          40         90         54        293
233 Fulham          John Arne Riise      DF         52        100          38         93         74        361
358 Man Utd         Antonio Valencia     MF         82        100          38         93        200        548
331 Man Utd         Jonny Evans J        DF         53         99          37         95         88        379
285 Liverpool       Daniel Sturridge     FW         74         99          35         95        178        530
187 Everton         Nikica Jelavic       FW         77         98          37         97        190        534
498 Reading         Adam Le Fondre       FW         44         97          36         98         28        156
261 Liverpool       Stewart Downing      MF         57         97          37         98        109        429
417 Norwich         Bradley Johnson      MF         47         97          38         98         53        238
420 Norwich         Russell Martin R     DF         42         96          38        101         15         82
648 Swansea         Angel Rangel         DF         47         96          36        101         56        238
769 West Ham        Mark Noble           MF         46         96          36        101         50        216
267 Liverpool       Jordan Henderson     MF         48         95          38        104         68        269
156 Chelsea         Nascimento Ramires   MF         62         95          35        104        140        474
635 Swansea         Pablo Hernandez      MF         59         95          33        104        130        449
606 Sunderland      James McClean        MF         56         95          39        104        111        421
372 Newcastle       Yohan Cabaye         MF         65         94          38        108        158        491
327 Man Utd         Michael Carrick      MF         59         94          38        108        133        449
631 Swansea         Nathan Dyer          MF         50         94          36        108         84        313
305 Man City        James Milner         MF         61         93          38        111        143        465
532 Southampton     Adam Lallana         MF         56         93          37        111        118        421
550 Stoke City      Geoff Cameron        DF         43         92          38        113         40        114
627 Swansea         Ben Davies           DF         44         92          35        113         49        156
334 Man Utd         Rio Ferdinand        DF         58         92          38        113        135        438
705 West Brom       Chris Brunt          MF         53         92          37        113        105        379
731 West Brom       Jonas Olsson         DF         49         92          39        113         85        293
786 Wigan           Emmerson Boyce       DF         47         91          38        118         73        238
338 Man Utd         Javier Hernandez     FW         65         90          37        119        170        491
789 Wigan           Franco Di Santo      FW         52         90          38        119        107        361
191 Everton         Kevin Mirallas       FW         66         90          36        119        173        498
658 Swansea         Ashley Williams      DF         49         89          36        122         94        293
657 Swansea         Michel Vorm          GK         51         89          37        122        104        344
686 Tottenham       Hugo Lloris          GK         58         89          34        122        139        438
282 Liverpool       Martin Skrtel        DF         56         89          38        122        134        421
761 West Ham        Matthew Jarvis       MF         55         89          35        122        129        409
164 Everton         Victor Anichebe      FW         43         88          38        127         55        114
735 West Brom       Liam Ridgewell       DF         48         87          38        128         97        269
754 West Ham        Guy Demel            DF         41         87          36        128         45         60
433 Norwich         Michael Turner       DF         41         86          38        130         48         60
747 West Ham        Andy Carroll         FW         82         86          36        130        237        548
547 Stoke City      Charlie Adam         MF         65         85          38        132        184        491
291 Man City        Gareth Barry         MF         52         85          38        132        124        361
537 Southampton     Gaston Ramirez       MF         52         85          34        132        124        361
302 Man City        Vincent Kompany      DF         70         85          38        132        202        521
383 Newcastle       Jonas Gutierrez      MF         55         84          39        136        142        409
341 Man Utd         Shinji Kagawa        MF         79         84          37        136        235        542
306 Man City        Samir Nasri          MF         81         83          37        138        243        546
125 Chelsea         Cesar Azpilicueta    DF         56         83          34        138        154        421
519 Southampton     Nathaniel Clyne      DF         41         83          37        138         63         60
 78 Aston Villa     Ashley Westwood      MF         49         83          35        138        112        293
445 QPR             Soares Cesar         GK         47         83          35        138        100        238
241 Fulham          Steve Sidwell        MF         49         83          38        138        112        293
755 West Ham        Mohamed Diame        MF         47         83          36        138        100        238
564 Stoke City      Steven Nzonzi        MF         50         81          35        145        128        313
284 Liverpool       Raheem Sterling      MF         46         81          37        145        102        216
359 Man Utd         Robin Van Persie     FW        137         80          12        147        330        576
727 West Brom       Youssouf Mulumbu     MF         53         80          39        147        149        379
 77 Aston Villa     Andreas Weimann      FW         51         80          31        147        136        344
782 Wigan           Ali Al-Habsi         GK         49         80          38        147        126        293
597 Sunderland      Danny Graham         FW         54         79          38        151        156        395
803 Wigan           James McArthur       MF         54         78          38        152        159        395
224 Fulham          Alex Kacaniklic      MF         43         78          38        152         95        114
591 Sunderland      Carlos Cuellar       DF         43         78          37        152         95        114
303 Man City        Joleon Lescott       DF         58         77          38        155        179        438
668 Tottenham       Mousa Dembele        MF         58         77          37        155        179        438
696 Tottenham       Gylfi Sigurdsson     MF         78         76          37        157        260        537
511 Reading         Hal Robson-Kanu      MF         42         76          36        157         98         82
307 Man City        Matija Nastasic      DF         53         76          34        157        161        379
386 Newcastle       Tim Krul             GK         51         75          38        160        155        344
522 Southampton     Steven Davis         MF         45         74          37        161        122        183
730 West Brom       Peter Odemwingie     FW         69         74          39        161        233        513
503 Reading         Garath McCleary      MF         44         73          36        163        119        156
221 Fulham          Brede Hangeland      DF         48         73          38        163        146        269
589 Sunderland      Jack Colback         MF         45         73          38        163        127        183
370 Newcastle       Hatem Ben Arfa       MF         73         72          38        166        252        527
392 Newcastle       Davide Santon        DF         47         72          39        166        141        238
415 Norwich         Jonathan Howson      MF         45         72          39        166        131        183
496 Reading         Jimmy Kebe           MF         41         72          36        166        103         60
659 Tottenham       Emmanuel Adebayor    FW         91         71          37        170        297        560
235 Fulham          Hugo Rodallega       FW         54         71          37        170        183        395
172 Everton         Seamus Coleman       MF         46         71          38        170        138        216
792 Wigan           Maynor Figueroa      DF         43         71          38        170        121        114
509 Reading         Pavel Pogrebnyak     FW         42         71          36        170        114         82
560 Stoke City      Kenwyne Jones        FW         50         70          40        175        166        313
194 Everton         Steven Naismith      FW         59         70          37        175        207        449
328 Man Utd         Tom Cleverley        MF         56         70          37        175        194        421
469 QPR             Ryan Nelsen          DF         41         69          38        178        116         60
259 Liverpool       Phillippe Coutinho   MF         71         69          13        178        261        523
481 QPR             Bobby Zamora         FW         61         68          39        180        226        465
546 Southampton     Maya Yoshida         DF         45         68          34        180        148        183
524 Southampton     Jose Fonte           DF         40         68          37        180        110         34
425 Norwich         John Ruddy           GK         44         67          38        183        145        156
360 Man Utd         Nemanja Vidic        DF         66         67          38        183        245        498
781 West Ham        Ricardo Vaz Te       FW         51         66          36        185        188        344
663 Tottenham       Steven Caulker       DF         44         66          37        185        151        156
337 Man Utd         Ryan Giggs           MF         60         65          38        187        232        458
 13 Arsenal         Olivier Giroud       FW         77         65          18        187        281        534
255 Liverpool       Jamie Carragher      DF         50         65          38        187        187        313
467 QPR             Stephane Mbia        DF         49         65          35        187        181        293
499 Reading         Mikele Leigertwood   MF         45         65          36        187        159        183
462 QPR             Clint Hill           DF         43         64          38        192        153        114
520 Southampton     Jack Cork            MF         44         64          37        192        157        156
666 Tottenham       Michael Dawson       DF         45         64          38        192        164        183
624 Swansea         Leon Britton         MF         42         64          36        192        144         82
715 West Brom       Zoltan Gera          MF         47         64          39        192        175        238
561 Stoke City      Michael Kightly      MF         51         64          38        192        193        344
625 Swansea          Chico               DF         46         64          36        192        169        216
718 West Brom       Billy Jones          DF         44         63          38        199        162        156
387 Newcastle       Sylvain Marveaux     MF         41         62          38        200        147         60
463 QPR             David Hoilett        MF         56         62          38        200        229        421
556 Stoke City      Matthew Etherington  MF         59         61          40        202        241        449
230 Fulham          Mladen Petric        FW         54         61          37        202        221        395
478 QPR             Armand Traore        DF         48         61          38        202        191        269
636 Swansea         Sung-Yeung Ki        MF         60         60          34        205        246        458
544 Southampton     Luke Shaw            DF         40         60          37        205        151         34
779 West Ham        Matthew Taylor       MF         46         60          36        205        185        216
614 Sunderland      Danny Rose           MF         44         60          38        205        173        156
407 Norwich         Mark Bunn            GK         43         60          34        205        167        114
160 Chelsea         John Terry           DF         65         59          36        210        269        491
796 Wigan           Jordi Gomez          MF         52         59          38        210        220        361
363 Man Utd         Ashley Young         MF         82         58          37        213        307        548
431 Norwich         Alexander Tettey     MF         43         58          36        213        176        114
710 West Brom       Graham Dorrans       MF         50         58          39        213        213        313
695 Tottenham       Raniere Sandro       MF         47         58          38        213        198        238
 75 Aston Villa     Ron Vlaar            DF         45         58          38        213        189        183
361 Man Utd         Danny Welbeck        FW         78         56          37        218        304        537
466 QPR             Jamie Mackie         FW         50         56          38        218        223        313
744 West Brom       Claudio Yacob        MF         49         56          37        218        218        293
516 Southampton     Artur Boruc          GK         45         56          30        218        196        183
297 Man City        Francisco Garcia     MF         50         56          34        218        223        313
559 Stoke City      Cameron Jerome       FW         50         55          38        223        231        313
787 Wigan           Gary Caldwell        DF         47         55          38        223        211        238
301 Man City        Aleksandar Kolarov   DF         55         55          38        223        246        409
 41 Aston Villa     Gabriel Agbonlahor   FW         68         54          21        226        295        508
 38 Arsenal         Theo Walcott         MF         90         53          12        228        329        559
474 QPR             Loic Remy            FW         54         53          16        228        253        395
394 Newcastle       Moussa Sissoko       MF         54         53          15        228        253        395
274 Liverpool       Leiva Lucas          MF         46         53          38        228        216        216
506 Reading         Sean Morrison        DF         38         53          29        228        168          4
495 Reading         Jem Karacan          MF         42         53          36        228        192         82
153 Chelsea         Victor Moses         MF         62         53          35        228        277        474
249 Liverpool       Joe Allen            MF         45         52          37        235        214        183
406 Norwich         Elliott Bennett      MF         47         52          40        235        230        238
375 Newcastle       Fabricio Coloccini   DF         49         51          38        237        240        293
656 Swansea         Gerhard Tremmel      GK         41         51          38        237        197         60
470 QPR             Nedum Onuoha         DF         38         51          39        237        177          4
 12 Arsenal         Kieran Gibbs         DF         53         51          15        237        262        379
 26 Arsenal         Lukas Podolski       FW         81         50          12        241        322        546
674 Tottenham       William Gallas       DF         50         50          38        241        246        313
 23 Arsenal         Nacho Monreal        DF         52         50          13        241        263        361
750 West Ham        Carlton Cole         FW         44         50          36        241        219        156
222 Fulham          Aaron Hughes         DF         40         50          38        241        194         34
772 West Ham        Gary O'Neil          MF         43         50          36        241        212        114
382 Newcastle       Yoan Gouffran        FW         62         50          15        241        290        474
477 QPR             Andros Townsend      MF         44         49          38        248        227        156
369 Newcastle       Vurnon Anita         MF         44         49          38        248        227        156
446 QPR             Djibril Cisse        FW         58         49          39        248        280        438
182 Everton         Johnny Heitinga      DF         50         49          38        248        255        313
225 Fulham          Giorgos Karagounis   MF         47         49          34        248        239        238
356 Man Utd         Chris Smalling       DF         45         48          38        253        234        183
570 Stoke City      Ryan Shotton         MF         46         48          38        253        238        216
319 Man City        Kolo Toure           DF         51         48          39        253        267        344
492 Reading         Danny Guthrie        MF         41         48          36        253        210         60
578 Stoke City      Andy Wilkinson       DF         40         48          40        253        204         34
502 Reading         Alex McCarthy        GK         40         48          36        253        204         34
690 Tottenham       Kyle Naughton        DF         39         48          37        253        199         19
810 Wigan           Ivan Ramis           DF         42         47          37        260        225         82
457 QPR             Esteban Granero      MF         52         47          36        260        270        361
180 Everton         Darron Gibson        MF         47         47          38        260        246        238
393 Newcastle       Danny Simpson        DF         46         46          38        263        246        216
440 QPR             Jose Bosingwa        DF         48         46          38        263        264        269
780 West Ham        James Tomkins        DF         41         46          36        263        222         60
579 Stoke City      Marc Wilson          DF         39         46          38        263        209         19
491 Reading         Chris Gunter         DF         38         46          36        263        203          4
240 Fulham          Philippe Senderos    DF         47         45          38        268        265        238
202 Fulham          Chris Baird          DF         39         45          38        268        215         19
288 Liverpool       Andre Wisdom         DF         38         45          32        268        208          4
207 Fulham          Ashkan Dejagah       MF         55         45          34        268        288        409
152 Chelsea          Mikel               MF         43         44          36        272        244        114
398 Newcastle       Steven Taylor S      DF         46         44          38        272        266        216
493 Reading         Ian Harte            DF         37         44          36        272        206          1
471 QPR             Ji-Sung Park         MF         52         44          38        272        279        361
479 QPR             Shaun Wright-Phillips MF         48         44          39        272        268        269
691 Tottenham       Scott Parker         MF         52         43          37        277        285        361
 11 Arsenal         Yao Gervinho         MF         68         42          12        278        321        508
577 Stoke City      Dean Whitehead       MF         42         42          40        278        246         82
453 QPR             Fabio Fabio          DF         40         42          38        278        236         34
  2 Arsenal         Mikel Arteta         MF         75         41          13        281        339        533
325 Man Utd         Oliveira Anderson    MF         51         41          38        281        292        344
712 West Brom       Marc-Antoine Fortune FW         48         41          38        281        278        269
752 West Ham        James Collins        DF         46         41          14        281        274        216
195 Everton         Phil Neville         MF         41         40          38        285        256         60
660 Tottenham       Benoit Assou-Ekotto  DF         60         40          38        285        313        458
 21 Arsenal         Per Mertesacker      DF         53         39          13        287        303        379
513 Reading         Nicky Shorey         DF         38         39          36        287        242          4
 16 Arsenal         Carl Jenkinson       DF         40         39          13        287        257         34
531 Southampton     Jos Hooiveld         DF         40         39          37        287        257         34
436 Norwich         Steven Whittaker     DF         40         39          38        287        257         34
643 Swansea         Luke Moore           FW         43         38          36        292        275        114
399 Newcastle       Cheick Tiote         MF         48         38          38        292        296        269
326 Man Utd         Alexander Buttner    DF         50         37          36        294        302        313
458 QPR             Rob Green            GK         41         37          38        294        271         60
494 Reading         Noel Hunt            FW         46         37          36        294        291        216
678 Tottenham       Tom Huddlestone      MF         45         37          37        294        287        183
389 Newcastle       James Perch          DF         44         37          38        294        283        156
655 Swansea         Dwight Tiendalli     DF         45         36          32        299        293        183
 69 Aston Villa     Matthew Lowton       DF         45         36          12        299        293        183
 37 Arsenal         Thomas Vermaelen     DF         67         36          12        299        343        504
128 Chelsea         Ryan Bertrand        DF         39         35          36        302        272         19
751 West Ham        Joe Cole             MF         51         35          37        302        310        344
805 Wigan           Callum McManaman     FW         45         35          38        302        298        183
401 Newcastle       Mike Williamson      DF         40         35          38        302        276         34
508 Reading         Alex Pearce          DF         38         34          36        306        273          4
350 Man Utd         Luis Nani            MF         82         34          38        306        369        548
368 Newcastle       Shola Ameobi         FW         51         34          38        306        313        344
610 Sunderland      Alfred N'Diaye       MF         42         34          17        306        289         82
584 Sunderland      Titus Bramble        DF         40         33          38        310        286         34
311 Man City        Micah Richards       DF         57         33          38        310        332        429
618 Sunderland      David Vaughan        MF         49         33          38        310        312        293
766 West Ham        George McCartney     DF         38         32          36        313        282          4
528 Southampton     Guly Guilherme       MF         47         32          37        313        311        238
753 West Ham        Jack Collison        MF         46         32          36        313        309        216
464 QPR             Jermaine Jenas       MF         42         32          38        313        300         82
489 Reading         Kaspars Gorkss       DF         37         31          36        317        284          1
521 Southampton     Kelvin Davis         GK         41         31          37        317        301         60
 19 Arsenal         Vito Mannone         GK         40         31          36        317        299         34
448 QPR             Shaun Derry          MF         42         30          38        320        306         82
764 West Ham        Modibo Maiga         FW         50         30          36        320        324        313
818 Wigan           Ben Watson           MF         50         30          38        320        324        313
402 Newcastle       Mapou Yanga-Mbiwa    DF         49         29          15        323        328        293
737 West Brom       Markus Rosenberg     FW         59         29          37        323        357        449
355 Man Utd         Paul Scholes         MF         50         29          38        323        331        313
426 Norwich         Ryan Ryan Bennett    DF         39         28          38        326        304         19
526 Southampton     Daniel Fox           DF         40         28          37        326        308         34
 51 Aston Villa     Ciaran Clark         DF         44         28          12        326        318        156
290 Man City        Mario Balotelli      FW         86         28          38        326        389        554
454 QPR             Alejandro Faurlin    MF         47         28          38        326        326        238
304 Man City        Sisenando Maicon     DF         62         28          34        326        363        474
 45 Aston Villa     Joe Bennett          DF         44         28          36        326        318        156
791 Wigan           Roger Espinoza       MF         41         27          16        333        315         60
 44 Aston Villa     Barry Bannan         MF         47         27          12        333        333        238
651 Swansea         Itay Shechter        FW         50         27          35        333        342        313
 46 Aston Villa     Darren Bent          FW         78         27          18        333        384        537
280 Liverpool       Nuri Sahin           MF         54         27          35        333        352        395
 27 Arsenal         Aaron Ramsey         MF         54         27          12        333        352        395
231 Fulham          Kieran Richardson    MF         53         27          36        333        351        379
527 Southampton     Paulo Gazzaniga      GK         40         26          37        340        316         34
497 Reading         Stephen Kelly        DF         40         26          38        340        316         34
783 Wigan           Antolin Alcaraz      DF         42         26          38        340        320         82
340 Man Utd         Phil Jones           DF         57         26          38        340        362        429
421 Norwich         Steve Morison        FW         49         26          38        340        345        293
346 Man Utd         Anders Lindegaard    GK         51         26          38        340        350        344
312 Man City        Jack Rodwell         MF         46         26          37        340        336        216
388 Newcastle       Gabriel Obertan      MF         41         25          39        347        323         60
728 West Brom       Boaz Myhill          GK         44         25          39        347        334        156
582 Sunderland      Phil Bardsley        DF         44         25          37        347        334        156
 56 Aston Villa     Karim El Ahmadi      MF         42         25          16        347        327         82
798 Wigan           David Jones          MF         43         24          39        351        337        114
514 Reading         Jay Tabb             MF         43         24          36        351        337        114
269 Liverpool       Brad Jones           GK         44         24          37        351        340        156
 62 Aston Villa     Brett Holman         MF         55         24          12        351        364        409
482 Reading         Hope Akpan           MF         45         24          17        351        344        183
376 Newcastle       Mathieu Debuchy      DF         47         24          17        351        348        238
416 Norwich         Simeon Jackson       FW         47         24          38        351        348        238
449 QPR             Samba Diakite        MF         44         24          39        351        340        156
672 Tottenham       Brad Friedel         GK         48         23          37        359        360        269
642 Swansea         Garry Monk           DF         42         22          36        360        346         82
  5 Arsenal         Alex Chamberlain     MF         69         22          13        360        391        513
621 Swansea         Kemy Agustien        MF         45         22          36        360        358        183
545 Southampton     James Ward-Prowse    MF         43         22          37        360        347        114
371 Newcastle       Gael Bigirimana      MF         43         21          38        364        359        114
281 Liverpool       Jonjo Shelvey        MF         51         21          39        364        371        344
212 Fulham          Urby Emanuelson      MF         46         21          13        364        361        216
377 Newcastle       Rob Elliot           GK         40         20          38        367        352         34
734 West Brom       Steven Reid          MF         47         20          39        367        367        238
812 Wigan           Joel Robles          GK         40         20          15        367        352         34
815 Wigan           Ronnie Stam          DF         38         19          38        370        352          4
 17 Arsenal         Laurent Koscielny    DF         53         19          12        370        382        379
 64 Aston Villa     Stephen Ireland      MF         50         19          12        370        377        313
733 West Brom       Goran Popov          DF         44         19          34        370        366        156
510 Reading         Jason Roberts        FW         45         18          36        374        372        183
263 Liverpool        Fernandez Saez      FW         47         18          33        374        375        238
254 Liverpool       Fabio Borini         FW         72         18          37        374        409        526
315 Man City        Scott Sinclair       MF         60         18          37        374        392        458
197 Everton         Bryan Oviedo         MF         48         18          34        374        379        269
455 QPR             Anton Ferdinand      DF         41         17          39        379        369         60
 67 Aston Villa     Eric Lichaj          DF         43         17          12        379        373        114
588 Sunderland      Lee Cattermole       MF         43         17          38        379        373        114
654 Swansea         Neil Taylor          DF         45         17          36        379        378        183
677 Tottenham       Lewis Holtby         MF         63         17          14        379        401        482
586 Sunderland      Fraizer Campbell     FW         49         17          37        379        383        293
742 West Brom       Jerome Thomas        MF         51         17          39        379        386        344
  7 Arsenal         Vassiriki Diaby      MF         61         17          12        379        398        465
127 Chelsea         Yossi Benayoun       MF         61         17          36        379        398        465
 29 Arsenal         Bacary Sagna         DF         47         17          15        379        381        238
404 Norwich         Leon Barnett         DF         37         16          38        389        365          1
210 Fulham          Mahamadou Diarra     MF         47         16          37        389        385        238
 74 Aston Villa     Yacouba Sylla        MF         42         16          14        389        376         82
599 Sunderland      Matthew Kilgallon    DF         38         16          37        389        368          4
685 Tottenham       Jake Livermore       MF         41         15          39        393        380         60
209 Fulham          Clint Dempsey        MF         92         15           1        393        436        562
385 Newcastle       Steve Harper         GK         45         15          38        393        386        183
228 Fulham          Stanislav Manolev    DF         42         14          13        396        386         82
378 Newcastle       Shane Ferguson       MF         43         14          38        396        389        114
183 Everton         Tony Hibbert         DF         50         14          38        396        396        313
535 Southampton     Emmanuel Mayuka      FW         48         14          35        396        394        269
441 QPR             Jay Bothroyd         FW         47         14          40        396        393        238
 48 Aston Villa     Jordan Bowery        FW         45         13          36        401        395        183
330 Man Utd         Jonathan Evans J     DF         48         13           1        401        400        269
620 Sunderland      Connor Wickham       FW         50         13          37        401        405        313
213 Fulham          Eyong Enoh           MF         50         13          13        401        405        313
 31 Arsenal         Clarindo Santos      DF         49         13          14        401        403        293
192 Everton         Jan Mucha            GK         43         12          38        406        397        114
217 Fulham          Emmanuel Frimpong    MF         45         12          15        406        402        183
103 Bolton          Mark Davies M        MF         48         12           2        406        409        269
148 Chelsea         Marko Marin          MF         66         12          35        406        429        498
184 Everton         Thomas Hitzlsperger  MF         50         12          30        406        412        313
335 Man Utd         Darren Fletcher      MF         54         12          38        406        416        395
565 Stoke City      Michael Owen         FW         50         12          35        406        412        313
740 West Brom       Gabriel Tamas        DF         42         11          39        413        404         82
391 Newcastle       Sammy Sammy Ameobi   FW         43         11          39        413        408        114
72 Aston Villa     Charles N'Zogbia     MF         66         11          13        413        434        498
114 Bolton          Martin Petrov        MF         52         11           2        413        419        361
88 Blackburn       David Hoilett        FW         55         11           3        413        423        409
475 QPR             Tommy Smith          FW         45         11          40        413        411        183
43 Aston Villa     Nathan Baker         DF         39         10          12        419        407         19
662 Tottenham       Tom Carroll          MF         42         10          37        419        414         82
39 Arsenal         Jack Wilshere        MF         63         10          12        419        441        482
35 Arsenal         Wojciech Szczesny    GK         53         10          12        419        427        379
85 Blackburn       Morten Gamst Gamst Pedersen MF         62         10           2        419        439        474
615 Sunderland      Louis Saha           FW         49         10          37        419        422        293
205 Fulham          Matthew Briggs       DF         39          9          38        425        415         19
158 Chelsea         Oriol Romeu          MF         41          9          37        425        417         60
574 Stoke City      Matthew Upson        DF         41          9          39        425        417         60
216 Fulham          Kerim Frei           MF         43          9          37        425        420        114
613 Sunderland      Kieran Richardson    MF         58          9           1        425        444        438
585 Sunderland      Wes Brown            DF         46          9          38        425        424        216
201 Everton         Apostolos Vellios    FW         47          9          38        425        425        238
223 Fulham          Andrew Johnson A     FW         47          9           1        425        425        238
257 Liverpool       Sebastian Coates     DF         44          9          39        425        421        156
450 QPR             Kieron Dyer          MF         44          8          39        434        429        156
785 Wigan           Mauro Boselli        FW         50          8          37        434        440        313
54 Aston Villa     Fabian Delph         MF         46          8          12        434        432        216
486 Reading         Shaun Cummings       DF         38          7          36        437        428          4
273 Liverpool       Dirk Kuyt            MF         94          7           1        437        478        567
166 Everton         Ross Barkley         MF         41          7          39        437        433         60
797 Wigan           Angelo Henriquez     FW         42          7          16        437        434         82
743 West Brom       George Thorne        MF         43          7          39        437        437        114
430 Norwich         Andrew Surman        MF         43          7          38        437        437        114
580 Stoke City      Jonathan Woodgate    DF         45          7           2        437        442        183
352 Man Utd         Nick Powell          MF         45          7          37        437        442        183
465 QPR             Andrew Johnson       FW         46          7          38        437        446        216
6 Arsenal         Francis Coquelin     MF         47          7          13        437        448        238
82 Blackburn       Scott Dann           DF         47          7           1        437        448        238
272 Liverpool       Martin Kelly         DF         51          7          38        437        452        344
42 Aston Villa     Marc Albrighton      MF         52          7          12        437        453        361
109 Bolton          Ivan Klasnic         FW         59          7           2        437        456        449
515 Reading         Stuart Taylor        GK         40          7          35        437        431         34
162 Chelsea         Ross Turnbull        GK         39          6          36        452        445         19
409 Norwich         Lee Camp             GK         40          6          15        452        447         34
310 Man City        Karim Rekik          DF         43          6          38        452        450        114
1 Arsenal         Andrey Arshavin      MF         65          6          12        452        470        491
637 Swansea         Roland Lamah         MF         50          6          14        452        455        313
229 Fulham          Danny Murphy         MF         61          6           1        452        464        465
9 Arsenal         Lukasz Fabianski     GK         43          6          12        452        450        114
539 Southampton     Frazer Richardson    DF         41          5          37        459        454         60
680 Tottenham       Harry Kane           FW         43          5          38        459        457        114
151 Chelsea         Raul Meireles        MF         63          5          36        459        474        482
84 Blackburn       Mauro Formica        MF         49          5           2        459        462        293
244 Fulham          David Stockdale      GK         43          5          38        459        457        114
709 West Brom       Craig Dawson         DF         38          4          39        464        459          4
439 QPR             Tal Ben Haim         DF         39          4          17        464        460         19
801 Wigan           Adrian Lopez         DF         39          4          38        464        460         19
777 West Ham        Jordan Spence        DF         40          4          29        464        463         34
566 Stoke City      Wilson Palacios      MF         41          4          39        464        465         60
52 Aston Villa     Simon Dawkins        MF         42          4          14        464        466         82
699 Tottenham       Rafael Van der Vaart MF         89          4          37        464        509        557
717 West Brom       Gonzalo Jara         DF         43          4          39        464        468        114
61 Aston Villa     Chris Herd           MF         43          4          12        464        468        114
294 Man City        Nigel De Jong        MF         44          4          38        464        471        156
58 Aston Villa     Shay Given           GK         45          4          12        464        472        183
107 Bolton          Jussi Jaaskelainen   GK         48          4           2        464        473        269
251 Liverpool       Oussama Assaidi      MF         57          4          36        464        483        429
110 Bolton          Zat Knight           DF         42          4           2        464        466         82
538 Southampton     Ben Reeves           DF         38          3          37        478        475          4
567 Stoke City      Jermaine Pennant     MF         50          3          38        478        491        313
122 Chelsea         Nathan Ake           DF         40          3          20        478        477         34
384 Newcastle       Massadio Haidara     DF         41          3          15        478        479         60
608 Sunderland      David Meyler         MF         42          3          38        478        480         82
140 Chelsea         Henrique Hilario     GK         42          3          38        478        480         82
518 Southampton     Richard Chaplow      MF         42          3          37        478        480         82
309 Man City        Abdul Razak          MF         43          3          39        478        484        114
806 Wigan           Ryo Miyaichi         MF         43          3          12        478        484        114
774 West Ham        Emanuel Pogatetz     DF         43          3          14        478        484        114
373 Newcastle       Adam Campbell        FW         45          3          20        478        487        183
756 West Ham        Alou Diarra          MF         45          3          36        478        487        183
381 Newcastle       Dan Gosling          MF         46          3          38        478        489        216
181 Everton         Magaye Gueye         FW         46          3          38        478        489        216
568 Stoke City      Danny Pugh           MF         50          3           2        478        491        313
607 Sunderland      James McFadden       FW         50          3          28        478        491        313
487 Reading         Daniel Daniel Carrico DF         38          3          17        478        475          4
653 Swansea         Alan Tate            DF         38          2          36        495        494          4
321 Man City        Gnegneri Yaya Toure  MF         77          2           1        495        530        534
459 QPR             Michael Harriman     DF         39          2          38        495        496         19
795 Wigan           Roman Golobart       DF         40          2          24        495        497         34
605 Sunderland      Kader Mangane        DF         40          2          16        495        497         34
400 Newcastle       Haris Vuckic         MF         42          2          40        495        499         82
119 Bolton          Gretar Rafn Steinsson DF         42          2           2        495        499         82
155 Chelsea         Lucas Piazon         MF         42          2          35        495        499         82
226 Fulham          Pajtim Kasami        MF         42          2          38        495        499         82
523 Southampton     Steve De Ridder      MF         42          2          37        495        499         82
411 Norwich         David Fox            MF         43          2          38        495        504        114
118 Bolton          Paul Robinson        DF         43          2           2        495        504        114
116 Bolton          Nigel Reo-Coker      MF         44          2           1        495        506        156
833 Wolves          Karl Henry           MF         44          2           1        495        506        156
113 Bolton          Fabrice Muamba       MF         44          2           1        495        506        156
707 West Brom       Simon Cox            FW         45          2          38        495        510        183
390 Newcastle       Nile Ranger          MF         45          2          26        495        510        183
832 Wolves          Wayne Hennessey      GK         46          2           1        495        512        216
593 Sunderland      Ahmed Elmohamady     MF         49          2          38        495        513        293
543 Southampton     Billy Sharp          FW         49          2          37        495        513        293
634 Swansea         Danny Graham         FW         50          2           1        495        515        313
679 Tottenham       Younes Kaboul        DF         50          2          38        495        515        313
91 Blackburn       Marcus Marcus Olsson MF         50          2           1        495        515        313
738 West Brom       Paul Scharner        MF         51          2           2        495        518        344
115 Bolton          Darren Pratley       MF         52          2           2        495        519        361
208 Fulham          Moussa Dembele       FW         52          2           1        495        519        361
95 Blackburn       Jason Roberts        FW         53          2           3        495        521        379
283 Liverpool       Jay Spearing         MF         54          2          38        495        522        395
175 Everton         Royston Drenthe      MF         54          2           1        495        522        395
102 Bolton          Kevin Davies K       FW         57          2           2        495        524        429
835 Wolves          Matthew Jarvis       MF         57          2           1        495        524        429
682 Tottenham       Niko Kranjcar        MF         60          2           1        495        526        458
83 Blackburn       David Dunn           MF         62          2           3        495        527        474
157 Chelsea          Ramires             MF         69          2           1        495        528        513
694 Tottenham       Louis Saha           FW         69          2           1        495        528        513
775 West Ham        Daniel Potts         DF         38          2          36        495        494          4
396 Newcastle       James Tavernier      DF         39          1          38        531        531         19
136 Chelsea         Didier Drogba        FW        101          1           1        531        576        570
73 Aston Villa     Enda Stevens         DF         39          1          12        531        531         19
138 Chelsea         Paulo Ferreira       DF         40          1          36        531        534         34
242 Fulham          Alex Smith           DF         40          1          34        531        534         34
227 Fulham          Stephen Kelly        DF         40          1           1        531        534         34
333 Man Utd         Fabio Fabio          DF         42          1           1        531        537         82
793 Wigan           Fraser Fyvie         MF         42          1          37        531        537         82
176 Everton         Shane Duffy          DF         42          1          38        531        537         82
57 Aston Villa     Gary Gardner         MF         42          1          13        531        537         82
14 Arsenal         Serge Gnabry         MF         43          1          29        531        541        114
443 QPR             DJ Campbell          FW         43          1          40        531        541        114
555 Stoke City      Maurice Edu          MF         43          1          36        531        541        114
848 Wolves          Stephen Ward         DF         43          1           1        531        541        114
820 Wolves          Christophe Berra     DF         43          1           1        531        541        114
846 Wolves          Richard Stearman     DF         43          1           1        531        541        114
758 West Ham        Robert Hall          FW         43          1          34        531        541        114
512 Reading         Dominic Samuel       FW         44          1          23        531        548        156
419 Norwich         Chris Martin C       FW         44          1          38        531        548        156
814 Wigan           Conor Sammon         FW         45          1          38        531        550        183
633 Swansea         Mark Gower           MF         45          1          36        531        550        183
553 Stoke City      Rory Delap           MF         45          1          40        531        550        183
837 Wolves          Eggert Jonsson       MF         45          1           1        531        550        183
206 Fulham          Simon Davies         MF         46          1          38        531        554        216
670 Tottenham       Yago Falque          MF         46          1          37        531        554        216
96 Blackburn       Ruben Rochina        FW         47          1           3        531        556        238
790 Wigan           Mohamed Diame        MF         48          1           1        531        557        269
397 Newcastle       Ryan Taylor R        DF         48          1          38        531        557        269
583 Sunderland      Phillip Bardsley     DF         48          1           1        531        557        269
741 West Brom       Somen Tchoyi         MF         48          1           2        531        557        269
825 Wolves          David Edwards        MF         49          1           1        531        561        293
53 Aston Villa     Nathan Delfouneso    FW         50          1          21        531        562        313
841 Wolves          Nenad Milijas        MF         52          1           1        531        563        361
827 Wolves          Steven Fletcher      FW         52          1           1        531        563        361
839 Wolves          Michael Kightly      MF         54          1           1        531        565        395
130 Chelsea         Jose Bosingwa        DF         55          1           1        531        566        409
86 Blackburn       David Goodwillie     FW         55          1           2        531        566        409
342 Man Utd         Will Keane           FW         55          1           2        531        566        409
823 Wolves          Kevin Doyle          FW         57          1           1        531        569        429
137 Chelsea         Michael Essien       MF         63          1          36        531        570        482
171 Everton         Tim Cahill           MF         65          1          38        531        571        491
692 Tottenham       Roman Pavlyuchenko   FW         70          1           1        531        572        521
146 Chelsea         Romelu Lukaku        FW         74          1           3        531        573        530
247 Liverpool       Charlie Adam         MF         86          1           1        531        574        554
256 Liverpool       Andy Carroll         FW         91          1           1        531        575        560
98 Bolton          Marcos Alonso        DF         39          1           1        531        531         19

576 rows selected.


SQL Solution with Recursive Subquery Factoring

SQL

Note that, for now, I have retained the fantasy league table and column names, but generic items and categories could equally take the place of players and positions: the solution itself is generic.

VAR KEEP_NUM NUMBER
VAR MAX_PRICE NUMBER
BEGIN
:KEEP_NUM := 40;
:MAX_PRICE := 900;
END;
/
PROMPT Top ten solutions
WITH  /* XS_EPL */ position_counts AS (
SELECT Max (CASE id WHEN 'AL' THEN min_players END) team_size
FROM positions
), pos_runs AS (
SELECT id, Sum (CASE WHEN id != 'AL' THEN min_players END) OVER (ORDER BY id DESC) num_remain, min_players, max_players
FROM positions
), players_ranked AS (
SELECT id,
position_id,
price,
avg_points,
appearances,
Row_Number() OVER (ORDER BY position_id, avg_points DESC) rnk,
Min (price) OVER () min_price
FROM players
), rsf (path_rnk, nxt_id, lev, tot_price, tot_profit, pos_id, n_pos, team_size, min_players, pos_path, path) AS (
SELECT 0, 0, 0, 0, 0, 'AL', 0, c.team_size, 0, CAST (NULL AS VARCHAR2(400)) pos_path, CAST (NULL AS VARCHAR2(400)) path
FROM position_counts c
UNION ALL
SELECT Row_Number() OVER (PARTITION BY r.pos_path || p.position_id ORDER BY r.tot_profit + p.avg_points DESC),
p.rnk,
r.lev + 1,
r.tot_price + p.price,
r.tot_profit + p.avg_points,
p.position_id,
CASE p.position_id WHEN r.pos_id THEN r.n_pos + 1 ELSE 1 END,
r.team_size,
m1.min_players,
r.pos_path || p.position_id,
r.path || LPad (p.id, 3, '0')
FROM rsf r
JOIN players_ranked p
ON p.rnk > r.nxt_id
JOIN pos_runs m1
ON m1.id = p.position_id
AND CASE p.position_id WHEN r.pos_id THEN r.n_pos + 1 ELSE 1 END <= m1.max_players
AND r.team_size - r.lev - 1 >= m1.num_remain - CASE p.position_id WHEN r.pos_id THEN r.n_pos + 1 ELSE 1 END
AND (r.lev = 0 OR p.position_id = r.pos_id OR r.n_pos >= r.min_players)
WHERE r.tot_price + p.price + (r.team_size - r.lev - 1) * p.min_price <= :MAX_PRICE
AND r.path_rnk < :KEEP_NUM
AND r.lev < r.team_size
), paths_ranked AS (
SELECT tot_price,
tot_profit,
team_size,
Row_Number () OVER (ORDER BY tot_profit DESC, tot_price) r_profit,
path
FROM rsf
WHERE lev = team_size
), top_ten_paths AS (
SELECT tot_price,
tot_profit,
r_profit,
path,
player_index
FROM paths_ranked
CROSS JOIN (SELECT LEVEL player_index FROM position_counts CONNECT BY LEVEL <= team_size)
WHERE r_profit <= 10
), top_ten_teams AS (
SELECT tot_price,
tot_profit,
r_profit,
path,
player_index,
Substr (path, (player_index - 1) * 3 + 1, 3) player_id
FROM top_ten_paths
)
SELECT  /*+ GATHER_PLAN_STATISTICS */  t.tot_profit,
t.tot_price,
t.r_profit rnk,
p.position_id,
t.player_id p_id,
p.player_name,
p.club_name,
p.price,
p.avg_points
FROM top_ten_teams t
JOIN players p
ON p.id = t.player_id
ORDER BY t.tot_profit DESC, t.tot_price, t.path, p.position_id, t.player_index
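
The path column built by the query above is a fixed-width string: each player id is left-padded to three characters before being concatenated, which is what lets the last subqueries slice it apart again by index. A small Python sketch of that round trip follows. The helper names encode_path and decode_path are hypothetical, and the sketch assumes ids of at most three digits, as in the listing above.

```python
def encode_path(ids, width=3):
    """Concatenate ids as fixed-width, zero-padded strings (cf. LPad (p.id, 3, '0'))."""
    return "".join(str(i).zfill(width) for i in ids)


def decode_path(path, width=3):
    """Slice the path back into id strings (cf. Substr (path, (player_index - 1) * 3 + 1, 3))."""
    return [path[k:k + width] for k in range(0, len(path), width)]


path = encode_path([29, 404, 6])
player_ids = [int(s) for s in decode_path(path)]
```

The fixed width is what makes the index arithmetic work; an id wider than the chosen width would break the slicing, so the width has to be sized to the largest id.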

How It Works

The solution approach is based on the method used to provide exact solutions for knapsack problems in my earlier article, but with a number of extensions to cater for the new category constraints, and to reduce searching to manageable proportions.

• position_counts subquery: Gets the team size
• pos_runs subquery: Computes the running sums of the item category minima going backwards by category id
• players_ranked subquery: Computes a unique rank for the items, ordered by category, then profit descending
• rsf subquery: A recursive subquery that returns a set of item sets in the form of strings of the concatenated item ids
• rsf anchor branch: Initialises the recursion with a single record
• rsf recursive branch: Items are joined having strictly higher rank, and such that the constraints are not violated, both at the current position and with any possible extrapolations
• Row_Number is used to rank the records by overall profit, and the where clause keeps only those records from the previous iteration whose rank number is less than an input figure (:KEEP_NUM); discarding the lower-ranked records is what makes the computation practical; the ranking is partitioned by the category path, which is important to avoid closing off solution paths too early
• Item category minima are treated differently from the maxima; once a category is in a position, the subsequent positions are required to be of the same category until the minimum number is reached
• paths_ranked subquery: Excludes records that are not of full length, and ranks those that are by profit
• top_ten_paths subquery: Selects the top ten paths and cross-joins them with a row-generator to provide an indexed set of records with set size cardinality for each path
• top_ten_teams subquery: Builds the item records for each of the best sets by extracting the item id from the paths according to index
• Main query: Joins items table to provide additional attributes
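
The steps above can also be pictured procedurally. The following is a minimal Python sketch of the same idea, not a translation of the query: the function name best_teams, the dict layout of the players and the tiny sample data are all illustrative assumptions. The feasibility test is my own, slightly stricter, formulation rather than a transcription of the join conditions, and the 'AL' row is passed separately as team_size.

```python
from collections import defaultdict


def best_teams(players, positions, team_size, max_price, keep_num, top_n=10):
    """Level-wise search that keeps only the best partial teams per category path.

    players:   list of dicts with keys id, pos, price, points (assumed layout)
    positions: dict pos -> (min_players, max_players), without the 'AL' row
    """
    # Unique rank: by category, then points descending (cf. players_ranked).
    ranked = sorted(players, key=lambda p: (p["pos"], -p["points"]))
    min_price = min(p["price"] for p in ranked)
    cats = sorted(positions)
    cat_idx = {c: k for k, c in enumerate(cats)}
    # Sum of the minima of the categories sorting strictly after each category.
    later_min = {c: sum(positions[d][0] for d in cats[k + 1:])
                 for k, c in enumerate(cats)}

    # State: (points, price, next_rank, pos, n_pos, pos_path, ids)
    states = [(0, 0, 0, None, 0, (), ())]
    for lev in range(team_size):
        slots_left = team_size - lev - 1
        groups = defaultdict(list)
        for pts, price, nxt, pos, n_pos, pos_path, ids in states:
            for i in range(nxt, len(ranked)):  # strictly higher rank only
                p = ranked[i]
                c = p["pos"]
                lo, hi = positions[c]
                n_new = n_pos + 1 if c == pos else 1
                if pos is not None and c != pos and n_pos < positions[pos][0]:
                    continue  # previous category is still below its minimum
                first_skipped = cat_idx[pos] + 1 if pos is not None else 0
                if any(positions[d][0] > 0 for d in cats[first_skipped:cat_idx[c]]):
                    continue  # a skipped category has a non-zero minimum
                if n_new > hi or slots_left < max(0, lo - n_new) + later_min[c]:
                    continue  # maximum exceeded, or too few slots for the minima
                if price + p["price"] + slots_left * min_price > max_price:
                    continue  # cannot finish within budget even with cheapest items
                new_path = pos_path + (c,)
                groups[new_path].append((pts + p["points"], price + p["price"],
                                         i + 1, c, n_new, new_path, ids + (p["id"],)))
        states = []
        for cands in groups.values():
            cands.sort(key=lambda s: -s[0])
            # Full-length teams are all kept; shorter ones only if rank < keep_num.
            states.extend(cands if slots_left == 0 else cands[:keep_num - 1])
    states.sort(key=lambda s: (-s[0], s[1]))
    return [(s[0], s[1], s[6]) for s in states[:top_n]]


# Made-up sample data, for illustration only.
POSITIONS = {"DF": (1, 2), "FW": (1, 2), "GK": (1, 1)}
PLAYERS = [
    {"id": 1, "pos": "GK", "price": 5, "points": 10},
    {"id": 2, "pos": "GK", "price": 4, "points": 8},
    {"id": 3, "pos": "DF", "price": 6, "points": 12},
    {"id": 4, "pos": "DF", "price": 4, "points": 9},
    {"id": 5, "pos": "DF", "price": 3, "points": 5},
    {"id": 6, "pos": "FW", "price": 9, "points": 20},
    {"id": 7, "pos": "FW", "price": 5, "points": 11},
    {"id": 8, "pos": "FW", "price": 4, "points": 7},
]
teams = best_teams(PLAYERS, POSITIONS, team_size=4, max_price=22, keep_num=1000)
```

With keep_num set high enough that nothing is discarded, the search is exhaustive, and on this sample the best team is players 4, 6, 7 and 2 with 48 points at a price of 22. Lowering keep_num trades that guarantee for speed, and on a problem as small as this one an aggressive setting can even discard every path that leads to a valid team.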

PL/SQL Recursive Solution

This is a version in the form of a pipelined function.

SQL

SELECT  /*+ GATHER_PLAN_STATISTICS XP_EPL */
t.sol_profit,
t.sol_price,
Dense_Rank() OVER (ORDER BY t.sol_profit DESC, t.sol_price) RNK,
p.position_id,
t.item_id,
p.club_name,
p.player_name,
p.price,
p.avg_points
FROM TABLE (Item_Cats.Best_N_Sets (
p_keep_size => 10,
p_max_calls => 100000,
p_n_size => 10,
p_max_price => 900,
p_cat_cur => CURSOR (
SELECT id, min_players, max_players
FROM positions
ORDER BY CASE WHEN id != 'AL' THEN 0 END, id
),
p_item_cur => CURSOR (
SELECT id, price, avg_points, position_id
FROM players
ORDER BY position_id, avg_points DESC
)
)
) t
JOIN players p
ON p.id = t.item_id
ORDER BY t.sol_profit DESC, t.sol_price, p.position_id, t.item_id


Package

CREATE OR REPLACE PACKAGE Item_Cats AS
/**************************************************************************************************

Author:         Brendan Furey
Date:           7 July 2013
Description:    Brendan's pipelined function solution for the knapsack problem with one container,
and items having categories with validity bands, as described at
http://aprogrammerwrites.eu/?p=878 (SQL for the Fantasy Football Knapsack Problem)

***************************************************************************************************/
TYPE sol_detail_rec_type IS RECORD (
set_id                  NUMBER,
item_id                 VARCHAR2(100),
sol_price               NUMBER,
sol_profit              NUMBER
);
TYPE sol_detail_list_type IS VARRAY(100) OF sol_detail_rec_type;

FUNCTION Best_N_Sets (  p_keep_size     PLS_INTEGER,
p_max_calls     PLS_INTEGER,
p_n_size        PLS_INTEGER,
p_max_price     PLS_INTEGER,
p_cat_cur       SYS_REFCURSOR,
p_item_cur      SYS_REFCURSOR) RETURN sol_detail_list_type PIPELINED;

END Item_Cats;
/
SHO ERR

CREATE OR REPLACE PACKAGE BODY Item_Cats AS

c_cat_all           CONSTANT VARCHAR2(3) := 'AL';
c_hash_renew_point  CONSTANT PLS_INTEGER := 1000;
--
-- Bulk collect array types
--
TYPE cat_rec_type IS RECORD (
id                      VARCHAR2(3),
min_items               PLS_INTEGER,
max_items               PLS_INTEGER
);
TYPE cat_list_type IS VARRAY(100) OF cat_rec_type;

TYPE item_cat_rec_type IS RECORD (
id                      VARCHAR2(10),
price                   PLS_INTEGER,
profit                  PLS_INTEGER,
cat_id                  VARCHAR2(3)
);
TYPE item_cat_list_type IS VARRAY(1000) OF item_cat_rec_type;

TYPE chr_hash_type IS TABLE OF PLS_INTEGER INDEX BY VARCHAR2(30);
--
-- Input data LOL types
--
TYPE num_range_rec_type IS RECORD (
item_beg                PLS_INTEGER,
item_end                PLS_INTEGER
);
TYPE num_range_list_type IS VARRAY(1000) OF num_range_rec_type;
TYPE num_list_type IS VARRAY(100) OF PLS_INTEGER;
--
-- Solution types
--
TYPE id_list_type IS VARRAY(100) OF VARCHAR2(10);
TYPE sol_rec_type IS RECORD (                       -- trial solution and record in retained array
item_list               id_list_type,
price                   PLS_INTEGER,
profit                  PLS_INTEGER
);
TYPE sol_list_type IS VARRAY(100) OF sol_rec_type;  -- retained solutions

TYPE int_hash_type IS TABLE OF PLS_INTEGER INDEX BY PLS_INTEGER;

g_keep_size                 PLS_INTEGER;
g_max_calls                 PLS_INTEGER;
g_n_size                    PLS_INTEGER;
g_max_price                 PLS_INTEGER;

g_cat_hash                  chr_hash_type;
g_item_range_list           num_range_list_type := num_range_list_type();
g_hash_buffer               int_hash_type;
g_profit_hash               int_hash_type;
g_trial_sol                 sol_rec_type;
g_sol_list                  sol_list_type := sol_list_type();
g_cat_list                  cat_list_type;
g_item_cat_list             item_cat_list_type;

g_n_cats                    PLS_INTEGER;
g_n_items                   PLS_INTEGER;
g_set_size                  PLS_INTEGER;
g_nth_profit                PLS_INTEGER := 0;
g_min_item_price            PLS_INTEGER := 1000000;
g_max_item_profit           PLS_INTEGER := 0;
g_min_price_togo            num_list_type := num_list_type();
g_max_profit_togo           num_list_type := num_list_type();
g_n_recursive_calls         PLS_INTEGER := 0;
g_n_sols                    PLS_INTEGER := 0;

PROCEDURE Write_Log (p_line VARCHAR2, p_debug_level PLS_INTEGER DEFAULT 0) IS
BEGIN

IF Utils.g_debug_level >= p_debug_level THEN
Utils.Write_Log (p_line);
END IF;

END Write_Log;

FUNCTION Dedup_Hash (p_card PLS_INTEGER, p_key PLS_INTEGER, p_hash int_hash_type) RETURN PLS_INTEGER IS
l_trial_key       PLS_INTEGER := p_card * p_key;
BEGIN

LOOP

IF p_hash.EXISTS (l_trial_key) THEN
l_trial_key := l_trial_key + 1;
ELSE
EXIT;
END IF;

END LOOP;
RETURN l_trial_key;

END Dedup_Hash;

PROCEDURE Pop_Arrays (p_cat_cur SYS_REFCURSOR, p_item_cur SYS_REFCURSOR) IS
n_cat                     PLS_INTEGER := 0;
l_price                   PLS_INTEGER;
l_profit                  PLS_INTEGER;

l_last_cat                VARCHAR2(30) := '???';

l_item_price_hash         int_hash_type;
l_item_profit_hash        int_hash_type;

BEGIN

FETCH p_cat_cur BULK COLLECT INTO g_cat_list;
CLOSE p_cat_cur;
Write_Log ('Collected ' || g_cat_list.COUNT || ' cats');

FETCH p_item_cur BULK COLLECT INTO g_item_cat_list;
CLOSE p_item_cur;
Write_Log ('Collected ' || g_item_cat_list.COUNT || ' items');

g_n_cats := g_cat_list.COUNT - 1;
Write_Log (g_n_cats || ' cats');
g_item_range_list.EXTEND (g_n_cats);
FOR i IN 1..g_cat_list.COUNT LOOP

IF g_cat_list(i).id = c_cat_all THEN
g_set_size := g_cat_list(i).min_items;
ELSE
g_cat_hash (g_cat_list(i).id) := i;
END IF;

END LOOP;
g_cat_list.TRIM;

FOR i IN 1..g_item_cat_list.COUNT LOOP

IF g_item_cat_list(i).price < g_min_item_price THEN
g_min_item_price := g_item_cat_list(i).price;
END IF;

IF g_item_cat_list(i).profit > g_max_item_profit THEN
g_max_item_profit := g_item_cat_list(i).profit;
END IF;
l_item_price_hash (Dedup_Hash (p_card => g_item_cat_list.COUNT, p_key => g_item_cat_list(i).price, p_hash => l_item_price_hash)) := i;
l_item_profit_hash (Dedup_Hash (p_card => g_item_cat_list.COUNT, p_key => g_item_cat_list(i).profit, p_hash => l_item_profit_hash)) := i;

IF g_item_cat_list(i).cat_id != l_last_cat THEN
--
-- Cat has changed, so start the item range for the new cat, and close the item
--  range for the previous cat
--
n_cat := n_cat + 1;
g_item_range_list (n_cat).item_beg := i;
IF i > 1 THEN
g_item_range_list (n_cat - 1).item_end := i - 1;
END IF;
l_last_cat := g_item_cat_list(i).cat_id;

END IF;

END LOOP;

g_n_items := g_item_cat_list.COUNT;
g_item_range_list (g_n_cats).item_end := g_n_items;
g_min_price_togo.EXTEND (g_set_size);
g_max_profit_togo.EXTEND (g_set_size);
l_price := l_item_price_hash.FIRST;
l_profit := l_item_profit_hash.LAST;
Write_Log ('Hash first price min / profit max ' || l_price || ' / ' || l_profit);
g_min_price_togo (g_set_size) := 0;
g_max_profit_togo (g_set_size) := 0;

FOR i IN 1..g_set_size - 1 LOOP

g_min_price_togo (g_set_size - i) := g_min_price_togo (g_set_size - i + 1) + l_price / g_item_cat_list.COUNT;
g_max_profit_togo (g_set_size - i) := g_max_profit_togo (g_set_size - i + 1) + l_profit / g_item_cat_list.COUNT;

l_price := l_item_price_hash.NEXT (l_price);
l_profit := l_item_profit_hash.PRIOR (l_profit);
Write_Log ((g_set_size - i) || ': price min / profit max ' || g_min_price_togo (g_set_size - i) || ' / ' || g_max_profit_togo (g_set_size - i));

END LOOP;

Write_Log ('Price min / profit max ' || g_min_item_price || ' / ' || g_max_item_profit);
FOR i IN 1..g_n_cats LOOP

Utils.Write_Log ('Cat ' || i || ' : ' || g_cat_list(i).id || ' - ' || g_cat_list(i).min_items || ' - ' || g_cat_list(i).max_items || ' - ' || g_item_range_list(i).item_beg || ' - ' || g_item_range_list(i).item_end);

END LOOP;

g_sol_list.EXTEND (g_n_size);
FOR i IN 1..g_n_size LOOP
g_profit_hash (i) := i;
END LOOP;
g_nth_profit := g_profit_hash.FIRST;
g_trial_sol.price := 0;
g_trial_sol.profit := 0;

END Pop_Arrays;

PROCEDURE Get_Best_Item_List (p_position PLS_INTEGER, p_item_index_beg PLS_INTEGER, p_item_index_end PLS_INTEGER, x_item_hash IN OUT NOCOPY int_hash_type) IS

PROCEDURE Check_Item (p_item_index PLS_INTEGER) IS

l_item_rec         item_cat_rec_type := g_item_cat_list (p_item_index);
l_item_list        num_list_type;
Item_Failed        EXCEPTION;
l_item_str         VARCHAR2(200) := LPad (l_item_rec.id, (p_position)*3, '.') || '-' || l_item_rec.cat_id  || '-' || l_item_rec.price  || '-' || l_item_rec.profit;

FUNCTION Price_LB (p_position PLS_INTEGER) RETURN PLS_INTEGER IS
BEGIN

RETURN g_min_price_togo (p_position);

END Price_LB;

FUNCTION Profit_UB (p_position PLS_INTEGER) RETURN PLS_INTEGER IS
BEGIN

RETURN g_max_profit_togo (p_position);

END Profit_UB;

BEGIN

IF l_item_rec.price + g_trial_sol.price + Price_LB (p_position) > g_max_price THEN

l_item_str := l_item_str || ' [price failed ' || (l_item_rec.price + g_trial_sol.price) || ']';
IF (g_set_size - p_position) = 0 THEN
Write_Log ('Solution fails with price of ' || (l_item_rec.price + g_trial_sol.price), 1);
END IF;
RAISE Item_Failed;
END IF;

IF l_item_rec.profit + g_trial_sol.profit + Profit_UB (p_position) <= g_nth_profit THEN
l_item_str := l_item_str || ' [profit failed ' || (l_item_rec.profit + g_trial_sol.profit) || ', nth = ' || g_nth_profit || ']';
IF (g_set_size - p_position) = 0 THEN
Write_Log ('Solution fails with profit of ' || (l_item_rec.profit + g_trial_sol.profit), 1);
g_n_sols := g_n_sols + 1;
END IF;
RAISE Item_Failed;
END IF;

x_item_hash (Dedup_Hash (p_card => g_keep_size, p_key => l_item_rec.profit + g_trial_sol.profit, p_hash => x_item_hash)) := p_item_index;

EXCEPTION

WHEN Item_Failed THEN
Write_Log (l_item_str, 2);

END Check_Item;

BEGIN

FOR i IN p_item_index_beg..p_item_index_end LOOP

Check_Item (i);

END LOOP;

END Get_Best_Item_List;

l_nth_index        PLS_INTEGER;
BEGIN

g_n_sols := g_n_sols + 1;
l_nth_index := g_profit_hash (g_profit_hash.FIRST);
Write_Log ('Solution replaces in position ' || l_nth_index || ' profit is ' || g_trial_sol.profit || ' price is ' || g_trial_sol.price, 1);

g_profit_hash.DELETE (g_profit_hash.FIRST);
g_profit_hash (Dedup_Hash (p_card => g_n_size, p_key => g_trial_sol.profit, p_hash => g_profit_hash)) := l_nth_index;

g_sol_list (l_nth_index) := g_trial_sol;
g_nth_profit := g_profit_hash.FIRST / g_n_size;

IF Mod (g_n_sols, c_hash_renew_point) = 0 THEN -- Not sure if this works, but it is intended to clear memory overhang
g_hash_buffer :=  g_profit_hash;
g_profit_hash :=  g_hash_buffer;
END IF;

PROCEDURE Add_Item_To_Trial (p_position PLS_INTEGER, p_item_index PLS_INTEGER) IS

l_item_rec          item_cat_rec_type := g_item_cat_list (p_item_index);

BEGIN

g_trial_sol.price := g_trial_sol.price + l_item_rec.price;
g_trial_sol.profit := g_trial_sol.profit + l_item_rec.profit;

IF g_trial_sol.item_list IS NULL THEN
g_trial_sol.item_list := id_list_type (l_item_rec.id);
ELSE
g_trial_sol.item_list.EXTEND;
g_trial_sol.item_list (p_position) := l_item_rec.id;
END IF;

IF p_position = g_set_size THEN

END IF;

FUNCTION Try_Position (p_position PLS_INTEGER, p_n_curr_cat PLS_INTEGER, p_cat_index_beg PLS_INTEGER, p_item_index_beg PLS_INTEGER) RETURN BOOLEAN IS

l_item_hash       int_hash_type;
l_item_index      PLS_INTEGER;
l_cat_index_beg   PLS_INTEGER := p_cat_index_beg;
l_item_index_beg  PLS_INTEGER := p_item_index_beg;
l_n_curr_cat      PLS_INTEGER := p_n_curr_cat;
l_profit          PLS_INTEGER;
BEGIN

g_n_recursive_calls := g_n_recursive_calls + 1;
IF g_n_recursive_calls > g_max_calls THEN
Write_Log (LPad ('*', p_position, '*') || 'Truncating search after ' || g_max_calls || ' recursive calls***');
RETURN TRUE;
END IF;

IF p_n_curr_cat = g_cat_list (p_cat_index_beg).max_items THEN
--
-- passed in the cat we were on in last position
-- check max not passed, if so go to next cat and reset item range
--
Write_Log ('Maxed Cat ' || p_cat_index_beg || ': ' || p_n_curr_cat || '-' || g_cat_list (p_cat_index_beg).max_items, 5);
l_cat_index_beg := p_cat_index_beg + 1;
IF l_cat_index_beg > g_n_cats THEN
RETURN FALSE;
END IF;
l_item_index_beg := g_item_range_list (l_cat_index_beg).item_beg;
l_n_curr_cat := 0;

END IF;

FOR j IN l_cat_index_beg..g_n_cats LOOP

IF l_item_index_beg < g_item_range_list (j).item_beg THEN
l_item_index_beg := g_item_range_list (j).item_beg;
END IF;

Write_Log ('Start Cat ' || j || ': ' || l_n_curr_cat || '-' || g_cat_list(j).min_items, 5);
l_n_curr_cat := l_n_curr_cat + 1;
l_item_hash.DELETE;
Get_Best_Item_List (p_position => p_position, p_item_index_beg => l_item_index_beg, p_item_index_end => g_item_range_list(j).item_end, x_item_hash => l_item_hash);
IF l_item_hash IS NOT NULL THEN

l_profit := l_item_hash.LAST;
FOR i IN 1..Least (g_keep_size, l_item_hash.COUNT) LOOP
l_item_index := l_item_hash (l_profit);
Write_Log (LPad (g_item_cat_list (l_item_index).id, (p_position)*3, '.') || '-' || g_item_cat_list (l_item_index).cat_id  || '-' || g_item_cat_list (l_item_index).price  || '-' || g_item_cat_list (l_item_index).profit, 1);
Add_Item_To_Trial (p_position => p_position, p_item_index => l_item_index);
IF p_position < g_set_size THEN

IF Try_Position (p_position => p_position + 1, p_n_curr_cat => l_n_curr_cat, p_cat_index_beg => j, p_item_index_beg => l_item_index + 1) THEN RETURN TRUE; END IF;

END IF;

IF g_trial_sol.item_list IS NOT NULL AND g_trial_sol.item_list.COUNT = p_position THEN
g_trial_sol.item_list.TRIM;
g_trial_sol.price := g_trial_sol.price - g_item_cat_list (l_item_index).price;
g_trial_sol.profit := g_trial_sol.profit - g_item_cat_list (l_item_index).profit;
END IF;

l_profit := l_item_hash.PRIOR (l_profit);

END LOOP;

ELSE
Write_Log ('No items found');
END IF;
--
--  Don't look at any more cats if we are not past the minimum for the current one at this position
--
Write_Log ('Cat ' || j || ': ' || l_n_curr_cat || '-' || g_cat_list(j).min_items, 5);
IF l_n_curr_cat <= g_cat_list(j).min_items THEN
EXIT;
END IF;

l_n_curr_cat := 0;

END LOOP;
RETURN FALSE;

END Try_Position;

FUNCTION Best_N_Sets (  p_keep_size     PLS_INTEGER,
p_max_calls     PLS_INTEGER,
p_n_size        PLS_INTEGER,
p_max_price     PLS_INTEGER,
p_cat_cur       SYS_REFCURSOR,
p_item_cur      SYS_REFCURSOR) RETURN sol_detail_list_type PIPELINED IS

l_sol_detail_rec          sol_detail_rec_type;
l_position                PLS_INTEGER := 1;
BEGIN

g_keep_size := p_keep_size; g_max_calls := p_max_calls; g_n_size := p_n_size; g_max_price := p_max_price;

Pop_Arrays (p_cat_cur, p_item_cur);

IF Try_Position (p_position => 1, p_n_curr_cat => 0, p_cat_index_beg => 1, p_item_index_beg => 1) THEN NULL; END IF;

FOR i IN 1..g_n_size LOOP

l_sol_detail_rec.set_id := i;
l_sol_detail_rec.sol_price := g_sol_list(i).price;
l_sol_detail_rec.sol_profit := g_sol_list(i).profit;

IF g_sol_list(i).item_list IS NOT NULL THEN

FOR j IN 1..g_sol_list(i).item_list.COUNT LOOP

l_sol_detail_rec.item_id := g_sol_list(i).item_list(j);
PIPE ROW (l_sol_detail_rec);

END LOOP;

END IF;

END LOOP;

Write_Log (g_n_sols || ' solutions found in ' || g_n_recursive_calls || ' recursive calls');

RETURN;

END Best_N_Sets;

END Item_Cats;
/
SHO ERR


How It Works

The solution uses a modified depth-first recursion that follows the same idea as the SQL method: items are added in strictly increasing order of category and profit ranking. Constraints are also treated in a similar way to the SQL solution.
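The two bound tests that Check_Item applies to each candidate item can be illustrated in isolation. The block below is only a sketch of that idea: the helper name Is_Worth_Trying, the Show procedure and all of the numbers are made up for illustration, and the real package takes the "to go" bounds from g_min_price_togo and g_max_profit_togo rather than from parameters.

```plsql
SET SERVEROUTPUT ON
DECLARE
  c_max_price   CONSTANT PLS_INTEGER := 20000;  -- illustrative budget
  c_nth_profit  CONSTANT PLS_INTEGER := 10800;  -- illustrative profit of the worst solution currently retained

  -- Hypothetical helper: FALSE means the candidate cannot lead to a solution worth keeping
  FUNCTION Is_Worth_Trying (p_item_price       PLS_INTEGER,
                            p_item_profit      PLS_INTEGER,
                            p_trial_price      PLS_INTEGER,
                            p_trial_profit     PLS_INTEGER,
                            p_min_price_togo   PLS_INTEGER,
                            p_max_profit_togo  PLS_INTEGER) RETURN BOOLEAN IS
  BEGIN
    -- Even the cheapest possible completion would break the budget
    IF p_item_price + p_trial_price + p_min_price_togo > c_max_price THEN
      RETURN FALSE;
    END IF;
    -- Even the most profitable possible completion would not beat the nth best so far
    IF p_item_profit + p_trial_profit + p_max_profit_togo <= c_nth_profit THEN
      RETURN FALSE;
    END IF;
    RETURN TRUE;
  END Is_Worth_Trying;

  PROCEDURE Show (p_label VARCHAR2, p_result BOOLEAN) IS
  BEGIN
    DBMS_OUTPUT.PUT_LINE (p_label || ': ' || CASE WHEN p_result THEN 'keep' ELSE 'prune' END);
  END Show;
BEGIN
  Show ('Too expensive', Is_Worth_Trying (3000, 900, 15000, 8000, 2500, 2000)); -- 20500 > 20000
  Show ('Too weak',      Is_Worth_Trying (1000, 500, 15000, 8000, 2500, 2000)); -- 10500 <= 10800
  Show ('Promising',     Is_Worth_Trying (1000, 900, 15000, 8000, 2500, 2000)); -- 18500 and 10900
END;
/
```

With these made-up figures the first two candidates are pruned and the third is kept. Both bounds are optimistic, so a candidate that passes may still fail further down the recursion.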

• The package is completely generic, with the items and categories being specified by means of input cursors
• The depth-first search is modified by ranking the next set of feasible items, partitioned by category, and progressing only the best of them (up to the keep size), which limits the number of branches followed
• Hashes (associative arrays in Oracle) are used for ranking
• A function, Dedup_Hash, is used to allow for duplicate hash keys; it stores as key the actual key multiplied by the ranking set cardinality, then adds one repeatedly until the key no longer duplicates an existing one
• The recursion is truncated if the number of recursive calls exceeds an input limit
• The input cursors are read into arrays and all subsequent processing is in memory; this is not a scalability problem, because only the current best solutions are retained. I also reset the hashes after a given number of updates
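The ranking and de-duplication ideas in the list above can be sketched in a few lines. This is a stand-alone illustration, not the package's own Dedup_Hash: the function name Dedup_Key, the cardinality of 10 and the sample values are assumptions chosen for the example.

```plsql
SET SERVEROUTPUT ON
DECLARE
  TYPE int_hash_type IS TABLE OF PLS_INTEGER INDEX BY PLS_INTEGER;
  l_hash   int_hash_type;
  l_key    PLS_INTEGER;
  c_card   CONSTANT PLS_INTEGER := 10;  -- assumed cardinality of the ranking set

  -- Hypothetical stand-in for Dedup_Hash: scale the key, then step up until the slot is free
  FUNCTION Dedup_Key (p_card PLS_INTEGER, p_key PLS_INTEGER, p_hash int_hash_type) RETURN PLS_INTEGER IS
    l_scaled_key PLS_INTEGER := p_key * p_card;
  BEGIN
    WHILE p_hash.EXISTS (l_scaled_key) LOOP
      l_scaled_key := l_scaled_key + 1;
    END LOOP;
    RETURN l_scaled_key;
  END Dedup_Key;
BEGIN
  -- Three items, two of which share the profit 927
  l_hash (Dedup_Key (c_card, 927, l_hash)) := 1;  -- stored under 9270
  l_hash (Dedup_Key (c_card, 680, l_hash)) := 2;  -- stored under 6800
  l_hash (Dedup_Key (c_card, 927, l_hash)) := 3;  -- 9270 is taken, so stored under 9271

  -- An associative array is ordered by key, so walking back from LAST ranks by profit
  l_key := l_hash.LAST;
  WHILE l_key IS NOT NULL LOOP
    DBMS_OUTPUT.PUT_LINE ('profit ' || Trunc (l_key / c_card) || ' -> item ' || l_hash (l_key));
    l_key := l_hash.PRIOR (l_key);
  END LOOP;
END;
/
```

The loop lists item 3, then item 1 (both with profit 927), then item 2. Scaling by the cardinality leaves room for that many entries with the same key before the scaled keys would run into the range of the next key value, which is why the multiplier is tied to the size of the ranking set.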

Results

Test Problem 1: Brazilian League

The pipelined function solved this in 5 seconds, while the SQL solution solved it in 21 seconds. The solutions were identical, as follows:

TOT_PROFIT  TOT_PRICE        RNK PO P_ID PLAYER_NAME                    CLUB_NAME                           PRICE AVG_POINTS
---------- ---------- ---------- -- ---- ------------------------------ ------------------------------ ---------- ----------
     10923      18176          1 CB 098  Digão                          Fluminense                            931        927
                                    099  Samir                          Flamengo                              267        680
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 022  Wilson                         Vitória                              1239        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850
                19027          2 CB 098  Digão                          Fluminense                            931        927
                                    099  Samir                          Flamengo                              267        680
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 021  Fábio                          Cruzeiro                             2090        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850
     10905      18795          3 CB 098  Digão                          Fluminense                            931        927
                                    099  Samir                          Flamengo                              267        680
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 023  Vanderlei                      Coritiba                             1858        776
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850
     10833      18994          4 CB 098  Digão                          Fluminense                            931        927
                                    102  Bressan                        Grêmio                               1085        590
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 022  Wilson                         Vitória                              1239        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850
                19845          5 CB 098  Digão                          Fluminense                            931        927
                                    102  Bressan                        Grêmio                               1085        590
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 021  Fábio                          Cruzeiro                             2090        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850
     10831      19608          6 CB 098  Digão                          Fluminense                            931        927
                                    103  Manoel                         Atlético-PR                          1699        588
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 022  Wilson                         Vitória                              1239        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850
     10825      18190          7 CB 098  Digão                          Fluminense                            931        927
                                    099  Samir                          Flamengo                              267        680
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 022  Wilson                         Vitória                              1239        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    040  Egídio                         Cruzeiro                             1482        752
                19041          8 CB 098  Digão                          Fluminense                            931        927
                                    099  Samir                          Flamengo                              267        680
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 021  Fábio                          Cruzeiro                             2090        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    040  Egídio                         Cruzeiro                             1482        752
     10821      19370          9 CB 098  Digão                          Fluminense                            931        927
                                    104  Cléber                         Ponte Preta                          1461        578
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 022  Wilson                         Vitória                              1239        794
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850
     10815      19613         10 CB 098  Digão                          Fluminense                            931        927
                                    102  Bressan                        Grêmio                               1085        590
                                 CO 078  Jaime De AlMFda                Flamengo                             1156        803
                                 FW 001  Éderson                        Atlético-PR                          1712       1012
                                    002  Maxi Biancucchi                Vitória                              1962       1005
                                    003  Rafael Sobis                   Fluminense                           2303        955
                                 GK 023  Vanderlei                      Coritiba                             1858        776
                                 MF 058  Fred                           Internacional                        3028        892
                                    059  Zé Roberto                     Grêmio                               2593        878
                                    060  Otavinho                       Internacional                         762        807
                                 WB 038  Ivan                           Portuguesa                            755       1320
                                    039  Elsinho                        Vasco                                1468        850

120 rows selected.


Test Problem 2: English Premier League

I used a maximum price of 900. For the SQL solution I used a keep parameter of 40, meaning that the best 40 records by partition are retained during recursion; for the pipelined function I used a value of 10. The keep parameter operates differently in the two cases, so it does not need to have the same value.

The SQL solution took 98 seconds, while the pipelined function took 290 seconds. Both methods found the same best solution, but the tenth best was marginally better for the SQL, at 1965, compared with 1962 for the pipelined function (which truncated after 100000 recursive calls).
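As a rough guide, a call to the pipelined function with these settings might look something like the sketch below. The parameter order follows the Best_N_Sets signature in the package and the selected columns are the fields it pipes. The table names positions and players, their column names, the column order the two cursors need to supply, and the value 10 for p_n_size are all assumptions made for illustration. Pop_Arrays may also expect the cursors in a particular order, so check these against your own schema and the package before using it.

```sql
-- Sketch only: table and column names below are hypothetical
SELECT t.set_id, t.sol_profit, t.sol_price, t.item_id
  FROM TABLE (Item_Cats.Best_N_Sets (
                  10,      -- p_keep_size: best items progressed per category at each position
                  100000,  -- p_max_calls: truncate the search after this many recursive calls
                  10,      -- p_n_size: number of best solutions to retain (assumed value)
                  900,     -- p_max_price: maximum total price
                  CURSOR (SELECT id, min_players, max_players FROM positions),     -- p_cat_cur (hypothetical query)
                  CURSOR (SELECT id, position_id, price, avg_points FROM players)  -- p_item_cur (hypothetical query)
              )) t
 ORDER BY t.sol_profit DESC, t.set_id, t.item_id;
```

Positional notation is used here for simplicity; each returned row is one item of one retained solution, so a join back to the player table would be needed to show names and clubs as in the listings that follow.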

[For followers of Manchester United, and David Moyes, it may be of interest to note that all of the best solutions included both Leighton Baines and Patrice Evra, and the best also had Marouane Fellaini. Will these players be united also in the real world next season?]

SQL Solution

TOT_PROFIT  TOT_PRICE        RNK PO P_ID PLAYER_NAME                    CLUB_NAME            PRICE AVG_POINTS
---------- ---------- ---------- -- ---- ------------------------------ --------------- ---------- ----------
      1973        898          1 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    268  Glen Johnson                   Liverpool               65        141
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    265  Steven Gerrard                 Liverpool               92        187
                                    641  Miguel Michu                   Swansea                 79        169
                                    177  Marouane Fellaini              Everton                 73        168
      1971        899          2 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    268  Glen Johnson                   Liverpool               65        141
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    047  Christian Benteke              Aston Villa             74        166
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    265  Steven Gerrard                 Liverpool               92        187
                                    641  Miguel Michu                   Swansea                 79        169
      1970        893          3 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    268  Glen Johnson                   Liverpool               65        141
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    047  Christian Benteke              Aston Villa             74        166
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    265  Steven Gerrard                 Liverpool               92        187
                                    177  Marouane Fellaini              Everton                 73        168
      1968        899          4 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    268  Glen Johnson                   Liverpool               65        141
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    047  Christian Benteke              Aston Villa             74        166
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    265  Steven Gerrard                 Liverpool               92        187
                                    177  Marouane Fellaini              Everton                 73        168
                                    428  Robert Snodgrass               Norwich                 62        152
                               5 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    569  Ryan Shawcross                 Stoke City              56        133
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    149  Juan Mata                      Chelsea                102        190
                                    641  Miguel Michu                   Swansea                 79        169
                                    177  Marouane Fellaini              Everton                 73        168
                  900          6 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    268  Glen Johnson                   Liverpool               65        141
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    204  Dimitar Berbatov               Fulham                  71        161
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    149  Juan Mata                      Chelsea                102        190
                                    177  Marouane Fellaini              Everton                 73        168
      1966        896          7 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    268  Glen Johnson                   Liverpool               65        141
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    204  Dimitar Berbatov               Fulham                  71        161
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    265  Steven Gerrard                 Liverpool               92        187
                                    641  Miguel Michu                   Swansea                 79        169
                  900          8 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    569  Ryan Shawcross                 Stoke City              56        133
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    047  Christian Benteke              Aston Villa             74        166
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    149  Juan Mata                      Chelsea                102        190
                                    641  Miguel Michu                   Swansea                 79        169
      1965        890          9 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    268  Glen Johnson                   Liverpool               65        141
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    204  Dimitar Berbatov               Fulham                  71        161
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    265  Steven Gerrard                 Liverpool               92        187
                                    177  Marouane Fellaini              Everton                 73        168
                  894         10 DF 165  Leighton Baines                Everton                 78        173
                                    332  Patrice Evra                   Man Utd                 73        152
                                    569  Ryan Shawcross                 Stoke City              56        133
                                 FW 286  Luis Suarez                    Liverpool              105        213
                                    533  Rickie Lambert                 Southampton             69        178
                                    047  Christian Benteke              Aston Villa             74        166
                                 GK 549  Asmir Begovic                  Stoke City              56        154
                                 MF 661  Gareth Bale                    Tottenham              111        240
                                    030  Santi Santi Cazorla            Arsenal                 97        198
                                    149  Juan Mata                      Chelsea                102        190
                                    177  Marouane Fellaini              Everton                 73        168

110 rows selected.

Elapsed: 00:01:38.26


Pipelined Function Solution

SOL_PROFIT  SOL_PRICE        RNK PO ITEM_ID    CLUB_NAME       PLAYER_NAME               PRICE AVG_POINTS
---------- ---------- ---------- -- ---------- --------------- -------------------- ---------- ----------
      1973        898          1 DF 165        Everton         Leighton Baines              78        173
                                    268        Liverpool       Glen Johnson                 65        141
                                    332        Man Utd         Patrice Evra                 73        152
                                 FW 286        Liverpool       Luis Suarez                 105        213
                                    533        Southampton     Rickie Lambert               69        178
                                 GK 549        Stoke City      Asmir Begovic                56        154
                                 MF 177        Everton         Marouane Fellaini            73        168
                                    265        Liverpool       Steven Gerrard               92        187
                                    30         Arsenal         Santi Santi Cazorla          97        198
                                    641        Swansea         Miguel Michu                 79        169
                                    661        Tottenham       Gareth Bale                 111        240
      1971        899          2 DF 165        Everton         Leighton Baines              78        173
                                    268        Liverpool       Glen Johnson                 65        141
                                    332        Man Utd         Patrice Evra                 73        152
                                 FW 286        Liverpool       Luis Suarez                 105        213
                                    47         Aston Villa     Christian Benteke            74        166
                                    533        Southampton     Rickie Lambert               69        178
                                 GK 549        Stoke City      Asmir Begovic                56        154
                                 MF 265        Liverpool       Steven Gerrard               92        187
                                    30         Arsenal         Santi Santi Cazorla          97        198
                                    641        Swansea         Miguel Michu                 79        169
                                    661        Tottenham       Gareth Bale                 111        240
      1970        893          3 DF 165        Everton         Leighton Baines              78        173
                                    268        Liverpool       Glen Johnson                 65        141
                                    332        Man Utd         Patrice Evra                 73        152
                                 FW 286        Liverpool       Luis Suarez                 105        213
                                    47         Aston Villa     Christian Benteke            74        166
                                    533        Southampton     Rickie Lambert               69        178
                                 GK 549        Stoke City      Asmir Begovic                56        154
                                 MF 177        Everton         Marouane Fellaini            73        168
                                    265        Liverpool       Steven Gerrard               92        187
                                    30         Arsenal         Santi Santi Cazorla          97        198
                                    661        Tottenham       Gareth Bale                 111        240
      1968        900          4 DF 165        Everton         Leighton Baines              78        173
                                    268        Liverpool       Glen Johnson                 65        141
                                    332        Man Utd         Patrice Evra                 73        152
                                 FW 204        Fulham          Dimitar Berbatov             71        161
                                    286        Liverpool       Luis Suarez                 105        213
                                    533        Southampton     Rickie Lambert               69        178
                                 GK 549        Stoke City      Asmir Begovic                56        154
                                 MF 149        Chelsea         Juan Mata                   102        190
                                    177        Everton         Marouane Fellaini            73        168
                                    30         Arsenal         Santi Santi Cazorla          97        198
                                    661        Tottenham       Gareth Bale                 111        240
      1966        896          5 DF 165        Everton         Leighton Baines              78        173
                                    268        Liverpool       Glen Johnson                 65        141
                                    332        Man Utd         Patrice Evra                 73        152
                                 FW 204        Fulham          Dimitar Berbatov             71        161
                                    286        Liverpool       Luis Suarez                 105        213
                                    533        Southampton     Rickie Lambert               69        178
                                 GK 549        Stoke City      Asmir Begovic                56        154
                                 MF 265        Liverpool       Steven Gerrard               92        187
                                    30         Arsenal         Santi Santi Cazorla          97        198
                                    641        Swansea         Miguel Michu                 79        169
                                    661        Tottenham       Gareth Bale                 111        240
      1965        890          6 DF 165        Everton         Leighton Baines              78        173
                                    268        Liverpool       Glen Johnson                 65        141
                                    332        Man Utd         Patrice Evra                 73        152
                                 FW 204        Fulham          Dimitar Berbatov             71        161
                                    286        Liverpool       Luis Suarez                 105        213
                                    533        Southampton     Rickie Lambert               69        178
                                 GK 549        Stoke City      Asmir Begovic                56        154
                                 MF 177        Everton         Marouane Fellaini            73        168
                                    265        Liverpool       Steven Gerrard               92        187
                                    30         Arsenal         Santi Santi Cazorla          97        198
                                    661        Tottenham       Gareth Bale                 111        240
                  897          7 DF 165        Everton         Leighton Baines              78        173
248        Liverpool       Daniel Agger                 64        133
332        Man Utd         Patrice Evra                 73        152
FW 286        Liverpool       Luis Suarez                 105        213
533        Southampton     Rickie Lambert               69        178
GK 549        Stoke City      Asmir Begovic                56        154
MF 177        Everton         Marouane Fellaini            73        168
265        Liverpool       Steven Gerrard               92        187
30         Arsenal         Santi Santi Cazorla          97        198
641        Swansea         Miguel Michu                 79        169
661        Tottenham       Gareth Bale                 111        240
1964        895          8 DF 165        Everton         Leighton Baines              78        173
268        Liverpool       Glen Johnson                 65        141
332        Man Utd         Patrice Evra                 73        152
FW 286        Liverpool       Luis Suarez                 105        213
533        Southampton     Rickie Lambert               69        178
720        West Brom       Romelu Lukaku                66        157
GK 549        Stoke City      Asmir Begovic                56        154
MF 149        Chelsea         Juan Mata                   102        190
177        Everton         Marouane Fellaini            73        168
30         Arsenal         Santi Santi Cazorla          97        198
661        Tottenham       Gareth Bale                 111        240
1963        898          9 DF 165        Everton         Leighton Baines              78        173
248        Liverpool       Daniel Agger                 64        133
332        Man Utd         Patrice Evra                 73        152
FW 286        Liverpool       Luis Suarez                 105        213
47         Aston Villa     Christian Benteke            74        166
533        Southampton     Rickie Lambert               69        178
GK 549        Stoke City      Asmir Begovic                56        154
MF 265        Liverpool       Steven Gerrard               92        187
30         Arsenal         Santi Santi Cazorla          97        198
641        Swansea         Miguel Michu                 79        169
661        Tottenham       Gareth Bale                 111        240
1962        891         10 DF 165        Everton         Leighton Baines              78        173
268        Liverpool       Glen Johnson                 65        141
332        Man Utd         Patrice Evra                 73        152
FW 286        Liverpool       Luis Suarez                 105        213
533        Southampton     Rickie Lambert               69        178
720        West Brom       Romelu Lukaku                66        157
GK 549        Stoke City      Asmir Begovic                56        154
MF 265        Liverpool       Steven Gerrard               92        187
30         Arsenal         Santi Santi Cazorla          97        198
641        Swansea         Miguel Michu                 79        169
661        Tottenham       Gareth Bale                 111        240

110 rows selected.

Elapsed: 00:04:49.92


Conclusions

My idea for using recursive subquery factoring to solve combinatorial optimisation problems, such as knapsack problems, described in other articles on my blog, was previously only practical for small problems. The extensions described here render it a practical proposition even for larger problems. It is also relatively simple compared with procedural approaches.

# SQL for the Balanced Number Partitioning Problem

I noticed a post on AskTom recently that referred to an SQL solution to a version of the so-called Bin Fitting problem, where even distribution is the aim. The solution, How do I solve a Bin Fitting problem in an SQL statement?, uses Oracle's Model clause, and, as the poster of the link observed, has the drawback that the number of bins is embedded in the query structure. I thought it might be interesting to find solutions without that drawback, so that the number of bins could be passed to the query as a bind variable. I came up with three solutions using different techniques, starting here.

An interesting article in American Scientist, The Easiest Hard Problem, notes that the problem is NP-complete, or certifiably hard, but that simple greedy heuristics often produce a good solution, including one used by schoolboys to pick football teams. The article uses the more descriptive term for the problem of balanced number partitioning, and notes some practical applications. The Model clause solution implements a multiple-bin version of the main Greedy Algorithm, while my non-Model SQL solutions implement variants of it that allow other techniques to be used, one of which is very simple and fast: this implements the team picking heuristic for multiple teams.

Another poster, Stew Ashton, suggested a simple change to my Model solution that improved performance, and I use this modified version here. He also suggested that using PL/SQL might be faster, and I have added my own simple PL/SQL implementation of the Greedy Algorithm, as well as a second version of the recursive subquery factoring solution that performs better than the first.

This article explains the solutions, considers two simple examples to illustrate them, and reports on performance testing across dimensions of number of items and number of bins. These show that the solutions exhibit either linear or quadratic variation in execution time with number of items, and some methods are sensitive to the number of bins while others are not.

After I had posted my solutions on the AskTom thread, I came across a thread on OTN, need help to resolve this issue, that requested a solution to a form of bin fitting problem where the bins have fixed capacity and the number of bins required must be determined. I realised that my solutions could easily be extended to add that feature, and posted extended versions of two of the solutions there. I have added a section here for this.

Updated, 5 June 2013: added Model and RSF diagrams

Greedy Algorithm Variants

Say there are N bins and M items.

Greedy Algorithm (GDY)
Set bin sizes zero
Loop over items in descending order of size

• Add item to current smallest bin
• Calculate new bin size

End Loop
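
The loop above can be sketched in a few lines of Python, purely as an illustration; the article's own implementations are in SQL and PL/SQL. The function name, the zero-based bin indexes and the four sample values are assumptions made for this sketch.

```python
def greedy_partition(values, n_bins):
    """GDY: place each item, largest first, in the currently smallest bin."""
    totals = [0] * n_bins
    assignment = []  # (item value, zero-based bin index)
    for value in sorted(values, reverse=True):
        smallest = totals.index(min(totals))  # first bin holding the minimum total
        totals[smallest] += value
        assignment.append((value, smallest))
    return assignment, totals


# Hypothetical sample data: four items, two bins.
assignment, totals = greedy_partition([3, 1, 1, 1], 2)
print(totals)  # [3, 3]
```

On this sample the two bins end up equal, because the bin sizes are re-examined after every single item.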

Greedy Algorithm with Batched Rebalancing (GBR)
Set bin sizes zero
Loop over items in descending order of size in batches of N items

• Assign batch to N bins, with bins in ascending order of size
• Calculate new bin sizes

End Loop
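
A minimal Python sketch of the batched variant follows, again only as an illustration rather than the article's code. The function name, the zero-based bin indexes, the tie-breaking by bin index and the sample values are assumptions.

```python
def greedy_batched(values, n_bins):
    """GBR: rank the bins once per batch of n_bins items, not once per item."""
    ordered = sorted(values, reverse=True)
    totals = [0] * n_bins
    assignment = []  # (item value, zero-based bin index)
    for start in range(0, len(ordered), n_bins):
        batch = ordered[start:start + n_bins]
        # Bins in ascending order of current size (ties broken by bin index).
        bins_ascending = sorted(range(n_bins), key=lambda b: (totals[b], b))
        for value, b in zip(batch, bins_ascending):
            totals[b] += value  # largest item of the batch goes to the smallest bin
            assignment.append((value, b))
    return assignment, totals


# Hypothetical sample data: four items, two bins.
assignment, totals = greedy_batched([3, 1, 1, 1], 2)
print(totals)  # [4, 2]
```

Because the bins are only re-ranked between batches, the second batch here puts one item in each bin, leaving a difference of two on this sample.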

Greedy Algorithm with No Rebalancing - or, Team Picking Algorithm (TPA)
Assign items to bins cyclically by bin sequence in descending order of item size
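
As a rough Python sketch (not the article's code), cyclic assignment needs no running totals at all to decide where an item goes. The bin formula mirrors the `Mod (rn, N) + 1` expression used in the plain SQL solution later in the article; the function name and sample values are assumptions.

```python
def team_picking(values, n_bins):
    """TPA: assign items to bins cyclically in descending order of size."""
    ordered = sorted(values, reverse=True)
    totals = [0] * n_bins  # totals[b - 1] holds the total for one-based bin b
    assignment = []        # (item value, one-based bin number)
    for rank, value in enumerate(ordered, start=1):
        b = rank % n_bins + 1  # cycles through the bins, starting at bin 2
        totals[b - 1] += value
        assignment.append((value, b))
    return assignment, totals


# Hypothetical sample data: four items, two bins.
assignment, totals = team_picking([3, 1, 1, 1], 2)
print(totals)  # [2, 4]
```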

Two Examples

Example: Four Items

Here we see that the Greedy Algorithm finds the perfect solution, with no difference in bin size, but the two variants have a difference of two.

Example: Six Items

Here we see that none of the algorithms finds the perfect solution. Both the standard Greedy Algorithm and its batched variant give a difference of two, while the variant without rebalancing gives a difference of four.

SQL Solutions

Original Model for GDY
See the link above for the SQL for the problem with three bins only.

The author has two measures for each bin and implements the GDY algorithm using CASE expressions and aggregation within the rules. The idea is to iterate over the items in descending order of size, setting the item bin to the bin with current smallest value. I use the word 'bin' for his 'bucket'. Some notes:

• Dimension by row number, ordered by item value
• Add measures for the iteration, it, and number of iterations required, counter
• Add measures for the bin name, bucket_name, and current minimum bin value, min_tmp (only first entry used)
• Add measures for each item bin value, bucket_1-3, being the item value if it's in that bin, else zero
• Add measures for each bin running sum, pbucket_1-3, being the current value of each bin (only first two entries used)
• The current minimum bin value, min_tmp[1], is computed as the least of the running sums
• The current item bin value is set to the item value for the bin whose value matches the minimum just computed, and null for the others
• The current bin name is set similarly to be the bin matching the minimum
• The new running sums are computed for each bin

Brendan's Generic Model for GDY

SELECT item_name, bin, item_value, Max (bin_value) OVER (PARTITION BY bin) bin_value
  FROM (
    SELECT * FROM items
     MODEL
      DIMENSION BY (Row_Number() OVER (ORDER BY item_value DESC) rn)
      MEASURES (item_name,
                item_value,
                Row_Number() OVER (ORDER BY item_value DESC) bin,
                item_value bin_value,
                Row_Number() OVER (ORDER BY item_value DESC) rn_m,
                0 min_bin,
                Count(*) OVER () - :N_BINS - 1 n_iters
      )
      RULES ITERATE(100000) UNTIL (ITERATION_NUMBER >= n_iters[1]) (
        min_bin[1] = Min(rn_m) KEEP (DENSE_RANK FIRST ORDER BY bin_value)[rn <= :N_BINS],
        bin[ITERATION_NUMBER + :N_BINS + 1] = min_bin[1],
        bin_value[min_bin[1]] = bin_value[CV()] + Nvl (item_value[ITERATION_NUMBER + :N_BINS + 1], 0)
      )
  )
 WHERE item_name IS NOT NULL
 ORDER BY item_value DESC


My Model solution works for any number of bins, passing the number of bins as a bind variable. The key idea here is to use values in the first N rows of a generic bin value measure to store all the running bin values, rather than as individual measures. I have included two modifications suggested by Stew in the AskTom thread.

• Dimension by row number, ordered by item value
• Initialise a bin measure to the row number (the first N items will remain fixed)
• Initialise a bin value measure to item value (only first N entries used)
• Add the row number as a measure, rn_m, in addition to a dimension, for referencing purposes
• Add a min_bin measure for current minimum bin index (first entry only)
• Add a measure for the number of iterations required, n_iters
• The first N items are correctly binned in the measure initialisation
• Set the minimum bin index using the aggregate Min function with KEEP clause over the first N rows of bin value
• Set the bin for the current item to this index
• Update the bin value for the corresponding bin only

Recursive Subquery Factor for GBR

WITH bins AS (
  SELECT LEVEL bin, :N_BINS n_bins FROM DUAL CONNECT BY LEVEL <= :N_BINS
), items_desc AS (
  SELECT item_name, item_value, Row_Number () OVER (ORDER BY item_value DESC) rn
    FROM items
), rsf (bin, item_name, item_value, bin_value, lev, bin_rank, n_bins) AS (
  SELECT b.bin,
         i.item_name,
         i.item_value,
         i.item_value,
         1,
         b.n_bins - i.rn + 1,
         b.n_bins
    FROM bins b
    JOIN items_desc i
      ON i.rn = b.bin
   UNION ALL
  SELECT r.bin,
         i.item_name,
         i.item_value,
         r.bin_value + i.item_value,
         r.lev + 1,
         Row_Number () OVER (ORDER BY r.bin_value + i.item_value),
         r.n_bins
    FROM rsf r
    JOIN items_desc i
      ON i.rn = r.bin_rank + r.lev * r.n_bins
)
SELECT r.item_name,
       r.bin, r.item_value, r.bin_value
  FROM rsf r
 ORDER BY item_value DESC

The idea here is to use recursive subquery factors to iterate through the items in batches of N items, assigning each item to a bin according to the rank of the bin on the previous iteration.

• Initial subquery factors form record sets for the bins and for the items with their ranks in descending order of value
• The anchor branch assigns bins to the first N items, assigning the item values to a bin value field, and setting the bin rank in ascending order of this bin value
• The recursive branch joins the batch of items to the record in the previous batch whose bin rank matches that of the item in the reverse sense (so largest item goes to smallest bin etc.)
• The analytic Row_Number function computes the updated bin ranks, and the bin values are updated by simple addition
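
The index arithmetic in the join is easier to follow when traced by hand. The Python sketch below emulates the anchor and recursive branches level by level; it is an illustration only, and the dictionary keys, the function name, the tie-breaking by bin number and the sample values are assumptions.

```python
def rsf_batched(values, n_bins):
    """Emulate the anchor and recursive branches of the GBR query."""
    items = sorted(values, reverse=True)  # items[rn - 1] has rank rn
    # Anchor branch: item rn goes to bin rn, so bin 1 (largest item) ranks last.
    level = [
        {"bin": rn, "item_value": items[rn - 1], "bin_value": items[rn - 1],
         "bin_rank": n_bins - rn + 1}
        for rn in range(1, min(n_bins, len(items)) + 1)
    ]
    output, lev = list(level), 1
    while level:
        joined = []
        for r in level:
            rn = r["bin_rank"] + lev * n_bins  # join condition from the query
            if rn <= len(items):  # inner join: a bin with no matching item drops out
                joined.append({"bin": r["bin"], "item_value": items[rn - 1],
                               "bin_value": r["bin_value"] + items[rn - 1]})
        # Row_Number () OVER (ORDER BY bin_value): rank 1 is the smallest bin.
        # Ties are broken by bin number here; the SQL leaves tie order unspecified.
        ranked = sorted(joined, key=lambda row: (row["bin_value"], row["bin"]))
        for rank, row in enumerate(ranked, start=1):
            row["bin_rank"] = rank
        output.extend(joined)
        level, lev = joined, lev + 1
    return output


rows = rsf_batched([3, 1, 1, 1], 2)  # hypothetical sample data
final = {}
for row in rows:
    final[row["bin"]] = row["bin_value"]  # later levels overwrite earlier ones
print(final)  # {1: 4, 2: 2}
```

As in the query, each output row carries the running bin value at the time its item was added, so the final bin total is the value on the last row for that bin.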

Recursive Subquery Factor for GBR with Temporary Table
Create Table and Index

DROP TABLE items_desc_temp
/
CREATE GLOBAL TEMPORARY TABLE items_desc_temp (
  item_name  VARCHAR2(30) NOT NULL,
  item_value NUMBER(8) NOT NULL,
  rn         NUMBER
)
ON COMMIT DELETE ROWS
/
CREATE INDEX items_desc_temp_N1 ON items_desc_temp (rn)
/

Insert into Temporary Table

INSERT INTO items_desc_temp
SELECT item_name, item_value, Row_Number () OVER (ORDER BY item_value DESC) rn
  FROM items;

RSF Query with Temporary Table

WITH bins AS (
  SELECT LEVEL bin, :N_BINS n_bins FROM DUAL CONNECT BY LEVEL <= :N_BINS
), rsf (bin, item_name, item_value, bin_value, lev, bin_rank, n_bins) AS (
  SELECT b.bin,
         i.item_name,
         i.item_value,
         i.item_value,
         1,
         b.n_bins - i.rn + 1,
         b.n_bins
    FROM bins b
    JOIN items_desc_temp i
      ON i.rn = b.bin
   UNION ALL
  SELECT r.bin,
         i.item_name,
         i.item_value,
         r.bin_value + i.item_value,
         r.lev + 1,
         Row_Number () OVER (ORDER BY r.bin_value + i.item_value),
         r.n_bins
    FROM rsf r
    JOIN items_desc_temp i
      ON i.rn = r.bin_rank + r.lev * r.n_bins
)
SELECT item_name, bin, item_value, bin_value
  FROM rsf
 ORDER BY item_value DESC

The idea here is that in the initial RSF query a subquery factor of items was joined on a calculated field, so the whole record set had to be read, and performance could be improved by putting that initial record set into an indexed temporary table ahead of the main query. We'll see in the performance testing section that this changes quadratic variation with problem size into linear variation.

Plain Old SQL Solution for TPA

WITH items_desc AS (
  SELECT item_name, item_value,
         Mod (Row_Number () OVER (ORDER BY item_value DESC), :N_BINS) + 1 bin
    FROM items
)
SELECT item_name, bin, item_value, Sum (item_value) OVER (PARTITION BY bin) bin_total
  FROM items_desc
 ORDER BY item_value DESC


The idea here is that TPA can be implemented in simple SQL using analytic functions.

• The subquery factor assigns the bins by taking the item rank in descending order of value and applying the modulo (N) function
• The main query also returns the bin totals, computed by an analytic sum partitioned by bin

Pipelined Function for GDY
Package

CREATE OR REPLACE PACKAGE Bin_Fit AS

  TYPE bin_fit_rec_type IS RECORD (item_name VARCHAR2(100), item_value NUMBER, bin NUMBER);
  TYPE bin_fit_list_type IS VARRAY(1000) OF bin_fit_rec_type;

  TYPE bin_fit_cur_rec_type IS RECORD (item_name VARCHAR2(100), item_value NUMBER);
  TYPE bin_fit_cur_type IS REF CURSOR RETURN bin_fit_cur_rec_type;

  FUNCTION Items_Binned (p_items_cur bin_fit_cur_type, p_n_bins PLS_INTEGER) RETURN bin_fit_list_type PIPELINED;

END Bin_Fit;
/
CREATE OR REPLACE PACKAGE BODY Bin_Fit AS

  c_big_value                 CONSTANT NUMBER := 100000000;
  TYPE bin_fit_cur_list_type  IS VARRAY(100) OF bin_fit_cur_rec_type;

  FUNCTION Items_Binned (p_items_cur bin_fit_cur_type, p_n_bins PLS_INTEGER) RETURN bin_fit_list_type PIPELINED IS

    l_min_bin                 PLS_INTEGER := 1;
    l_min_bin_val             NUMBER;
    l_bins                    SYS.ODCINumberList := SYS.ODCINumberList();
    l_bin_fit_cur_rec         bin_fit_cur_rec_type;
    l_bin_fit_rec             bin_fit_rec_type;
    l_bin_fit_cur_list        bin_fit_cur_list_type;

  BEGIN

    -- Initialise every bin total to zero
    l_bins.Extend (p_n_bins);
    FOR i IN 1..p_n_bins LOOP
      l_bins(i) := 0;
    END LOOP;

    LOOP

      -- Fetch the next batch of items from the cursor
      FETCH p_items_cur BULK COLLECT INTO l_bin_fit_cur_list LIMIT 100;
      EXIT WHEN l_bin_fit_cur_list.COUNT = 0;

      FOR j IN 1..l_bin_fit_cur_list.COUNT LOOP

        -- Assign the item to the current smallest bin and pipe it out
        l_bin_fit_rec.item_name := l_bin_fit_cur_list(j).item_name;
        l_bin_fit_rec.item_value := l_bin_fit_cur_list(j).item_value;
        l_bin_fit_rec.bin := l_min_bin;

        PIPE ROW (l_bin_fit_rec);
        l_bins(l_min_bin) := l_bins(l_min_bin) + l_bin_fit_cur_list(j).item_value;

        -- Find the new smallest bin by scanning all the bin totals
        l_min_bin_val := c_big_value;
        FOR i IN 1..p_n_bins LOOP

          IF l_bins(i) < l_min_bin_val THEN
            l_min_bin := i;
            l_min_bin_val := l_bins(i);
          END IF;

        END LOOP;

      END LOOP;

    END LOOP;

    -- A pipelined function ends with a RETURN that carries no value
    RETURN;

  END Items_Binned;

END Bin_Fit;
/


SQL Query

SELECT item_name, bin, item_value, Sum (item_value) OVER (PARTITION BY bin) bin_value
  FROM TABLE (Bin_Fit.Items_Binned (
                CURSOR (SELECT item_name, item_value FROM items ORDER BY item_value DESC),
                :N_BINS))
 ORDER BY item_value DESC


The idea here is that procedural algorithms can often be implemented more efficiently in PL/SQL than in SQL.

• The first parameter to the function is a strongly-typed reference cursor
• The SQL call passes in a SELECT statement wrapped in the CURSOR keyword, so the function can be used for any set of records that returns name and numeric value pairs
• The item records are fetched in batches of 100 using the LIMIT clause to improve efficiency

Performance Testing
I tested performance of the various queries using my own benchmarking framework across grids of data points, using two data sets so that the queries could be split into two groups according to their performance.

Query Modifications for Performance Testing

• The RSF query with staging table was run within a pipelined function in order to easily include the insert in the timings
• A system context was used to pass the bind variables as the framework runs the queries from PL/SQL, not from SQL*Plus
• I found that calculating the bin values using analytic sums, as in the code above, affected performance, so I removed this for clarity of results, outputting only item name, value and bin

Test Data Sets
For a given depth parameter, d, random numbers were inserted within the range 0-d for d-1 records. The insert was:

INSERT INTO items
SELECT 'item-' || n, DBMS_Random.Value (0, p_point_deep)
  FROM (SELECT LEVEL n FROM DUAL CONNECT BY LEVEL < p_point_deep);

The number of bins was passed as a width parameter, but note that the original, linked Model solution, MODO, hard-codes the number of bins to 3.

Test Results

Data Set 1 - Small
This was used for the following queries:

• MODO - Original Model for GDY
• MODB - Brendan's Generic Model for GDY
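• RSFQ - Recursive Subquery Factor for GBR

This table gives the CPU times in seconds across the data set (MODO always uses three bins, hence its W3 columns):

Depth         W3         W3         W3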
Run Type=MODO
D1000       1.03       1.77       1.05
D2000       3.98       6.46       5.38
D4000      15.79       20.7      25.58
D8000      63.18      88.75      92.27
D16000      364.2     347.74     351.99
Run Type=MODB
Depth         W3         W6        W12
D1000        .27        .42        .27
D2000          1       1.58       1.59
D4000       3.86        3.8       6.19
D8000      23.26      24.57      17.19
D16000      82.29      92.04      96.02
Run Type=RSFQ
D1000       3.24       3.17       1.53
D2000       8.58       9.68       8.02
D4000      25.65      24.07      23.17
D8000      111.3     108.25      98.33
D16000     471.17     407.65     399.99


The results show:

• Quadratic variation of CPU time with number of items
• Little variation of CPU time with number of bins, although RSFQ seems to show some decline
• RSFQ is slightly slower than MODO, while my version of Model, MODB is about 4 times faster than MODO

Data Set 2 - Large
This was used for the following queries:

• RSFT - Recursive Subquery Factor for GBR with Temporary Table
• POSS - Plain Old SQL Solution for TPA
• PLFN - Pipelined Function for GDY

This table gives the CPU times in seconds across the data set:

  Depth       W100      W1000     W10000
Run Type=PLFN
D20000        .31       1.92      19.25
D40000        .65       3.87      55.78
D80000       1.28       7.72      92.83
D160000       2.67      16.59     214.96
D320000       5.29      38.68      418.7
D640000      11.61      84.57      823.9
Run Type=POSS
D20000        .09        .13        .13
D40000        .18        .21        .18
D80000        .27        .36         .6
D160000        .74       1.07        .83
D320000       1.36       1.58       1.58
D640000       3.13       3.97       4.04
Run Type=RSFT
D20000        .78        .78        .84
D40000       1.41       1.54        1.7
D80000       3.02       3.39       4.88
D160000       6.11       9.56       8.42
D320000      13.05      18.93      20.84
D640000      41.62      40.98      41.09


The results show:

• Linear variation of CPU time with number of items
• Little variation of CPU time with number of bins for POSS and RSFT, but roughly linear variation for PLFN
• These linear methods are much faster than the earlier quadratic ones for larger numbers of items
• Because its time is approximately proportional to the number of bins, PLFN is faster than RSFT for small numbers of bins but becomes slower as the number of bins grows; in the table above it is still faster at 100 bins but slower at 1000 bins
• The proportionality to number of bins for PLFN presumably arises from the step to find the bin of minimum value
• The lack of proportionality to number of bins for RSFT may appear surprising, since it sorts the bins on every iteration. However, while the work for each sort is likely to be proportional to the number of bins, the number of iterations is inversely proportional to it, so the two effects cancel out
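
To make the point about the minimum-finding step concrete, here is a Python sketch of the same greedy assignment with the linear scan replaced by a priority queue, so that each item costs roughly log N rather than N. This is a swapped-in technique shown for illustration only; it is not one of the methods tested in this article, and the function name and sample values are assumptions.

```python
import heapq


def greedy_partition_heap(values, n_bins):
    """GDY with a min-heap of (bin total, bin index) instead of a scan of all bins."""
    heap = [(0, b) for b in range(n_bins)]  # already in heap order
    assignment = []  # (item value, zero-based bin index)
    for value in sorted(values, reverse=True):
        total, b = heapq.heappop(heap)  # current smallest bin
        assignment.append((value, b))
        heapq.heappush(heap, (total + value, b))
    totals = [0] * n_bins
    for total, b in heap:
        totals[b] = total
    return assignment, totals


# Hypothetical sample data: four items, two bins.
assignment, totals = greedy_partition_heap([3, 1, 1, 1], 2)
print(totals)  # [3, 3]
```

Ties are resolved in favour of the lower bin index, which matches the first-minimum rule of a left-to-right scan.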

Solution Quality

The methods reported above implement three underlying algorithms, none of which guarantees an optimal solution. In order to get an idea of how the quality compares, I created new versions of the second set of queries using analytic functions to output the difference between minimum and maximum bin values, with percentage of the maximum also output. I ran these on the same grid, and report below the results for the four corners.

Point            PLFN            RSFT            POSS
                 Diff/%          Diff/%          Diff/%
W100/D20000      72/.004%        72/.004%        19,825/1%
W100/D640000     60/.000003%     60/.000003%     633,499/.03%
W10000/D20000    189/.9%         180/.9%         19,995/67%
W10000/D640000   695/.003%       695/.003%       639,933/3%

The results indicate that GDY (Greedy Algorithm) and GBR (Greedy Algorithm with Batched Rebalancing) generally give very similar quality results, while TPA (Team Picking Algorithm) tends to be quite a lot worse.
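
The quality measure used in the table, the difference between the largest and smallest bin and that difference as a percentage of the largest, is simple to state in code. The Python sketch below is illustrative only; the function name and the sample bin totals are assumptions.

```python
def spread(bin_totals):
    """Return (max - min, that difference as a percentage of the max)."""
    largest, smallest = max(bin_totals), min(bin_totals)
    diff = largest - smallest
    pct = 100.0 * diff / largest if largest else 0.0
    return diff, pct


print(spread([4, 2]))  # (2, 50.0)
print(spread([3, 3]))  # (0, 0.0)
```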

Extended Problem: Finding the Number of Bins Required

An important extension to the problem is when the bins have fixed capacity, and it is desired to find the minimum number of bins, then spread the items evenly between them. As mentioned at the start, I posted extensions to two of my solutions on an OTN thread, and I reproduce them here. It turns out to be quite easy to make the extension. The remainder of this section is just lifted from my OTN post and refers to the table of the original poster.

Start OTN Extract
So how do we determine the number of bins? The total quantity divided by bin capacity, rounded up, gives a lower bound on the number of bins needed. The actual number required may be larger, but mostly it will be within a very small range from the lower bound, I believe (I suspect it will nearly always be the lower bound). A good practical solution, therefore, would be to compute the solutions for a base number, plus one or more increments, and this can be done with negligible extra work (although Model might be an exception, I haven't tried it). Then the bin totals can be computed, and the first solution that meets the constraints can be used. I took two bin sets here.

SQL POS

WITH items AS (
  SELECT sl_pm_code item_name, sl_wt item_amt, sl_qty item_qty,
         Ceil (Sum(sl_qty) OVER () / :MAX_QTY) n_bins
    FROM ow_ship_det
), items_desc AS (
  SELECT item_name, item_amt, item_qty, n_bins,
         Mod (Row_Number () OVER (ORDER BY item_qty DESC), n_bins) bin_1,
         Mod (Row_Number () OVER (ORDER BY item_qty DESC), n_bins + 1) bin_2
    FROM items
)
SELECT item_name, item_amt, item_qty,
       CASE bin_1 WHEN 0 THEN n_bins ELSE bin_1 END bin_1,
       CASE bin_2 WHEN 0 THEN n_bins + 1 ELSE bin_2 END bin_2,
       Sum (item_amt) OVER (PARTITION BY bin_1) bin_1_amt,
       Sum (item_qty) OVER (PARTITION BY bin_1) bin_1_qty,
       Sum (item_amt) OVER (PARTITION BY bin_2) bin_2_amt,
       Sum (item_qty) OVER (PARTITION BY bin_2) bin_2_qty
  FROM items_desc
 ORDER BY item_qty DESC, bin_1, bin_2

SQL Pipelined

SELECT osd.sl_pm_code item_name, osd.sl_wt item_amt, osd.sl_qty item_qty,
       tab.bin_1, tab.bin_2,
       Sum (osd.sl_wt) OVER (PARTITION BY tab.bin_1) bin_1_amt,
       Sum (osd.sl_qty) OVER (PARTITION BY tab.bin_1) bin_1_qty,
       Sum (osd.sl_wt) OVER (PARTITION BY tab.bin_2) bin_2_amt,
       Sum (osd.sl_qty) OVER (PARTITION BY tab.bin_2) bin_2_qty
  FROM ow_ship_det osd
  JOIN TABLE (Bin_Even.Items_Binned (
                CURSOR (SELECT sl_pm_code item_name, sl_qty item_value,
                               Sum(sl_qty) OVER () item_total
                          FROM ow_ship_det
                         ORDER BY sl_qty DESC, sl_wt DESC),
                :MAX_QTY)) tab
    ON tab.item_name = osd.sl_pm_code
 ORDER BY osd.sl_qty DESC, tab.bin_1

Pipelined Function

CREATE OR REPLACE PACKAGE Bin_Even AS

  TYPE bin_even_rec_type IS RECORD (item_name VARCHAR2(100), item_value NUMBER, bin_1 NUMBER, bin_2 NUMBER);
  TYPE bin_even_list_type IS VARRAY(1000) OF bin_even_rec_type;

  TYPE bin_even_cur_rec_type IS RECORD (item_name VARCHAR2(100), item_value NUMBER, item_total NUMBER);
  TYPE bin_even_cur_type IS REF CURSOR RETURN bin_even_cur_rec_type;

  FUNCTION Items_Binned (p_items_cur bin_even_cur_type, p_bin_max NUMBER) RETURN bin_even_list_type PIPELINED;

END Bin_Even;
/
SHO ERR
CREATE OR REPLACE PACKAGE BODY Bin_Even AS

  c_big_value                 CONSTANT NUMBER := 100000000;
  c_n_bin_sets                CONSTANT NUMBER := 2;

  TYPE bin_even_cur_list_type IS VARRAY(100) OF bin_even_cur_rec_type;
  TYPE num_lol_list_type      IS VARRAY(100) OF SYS.ODCINumberList;

  FUNCTION Items_Binned (p_items_cur bin_even_cur_type, p_bin_max NUMBER) RETURN bin_even_list_type PIPELINED IS

    l_min_bin                 SYS.ODCINumberList := SYS.ODCINumberList (1, 1);
    l_min_bin_val             SYS.ODCINumberList := SYS.ODCINumberList (c_big_value, c_big_value);
    l_bins                    num_lol_list_type := num_lol_list_type (SYS.ODCINumberList(), SYS.ODCINumberList());

    l_bin_even_cur_rec        bin_even_cur_rec_type;
    l_bin_even_rec            bin_even_rec_type;
    l_bin_even_cur_list       bin_even_cur_list_type;

    l_n_bins                  PLS_INTEGER;
    l_n_bins_base             PLS_INTEGER;
    l_is_first_fetch          BOOLEAN := TRUE;

  BEGIN

    LOOP

      FETCH p_items_cur BULK COLLECT INTO l_bin_even_cur_list LIMIT 100;
      EXIT WHEN l_bin_even_cur_list.COUNT = 0;
      IF l_is_first_fetch THEN

        -- Base is one less than the lower bound, as it is incremented before use
        l_n_bins_base := Ceil (l_bin_even_cur_list(1).item_total / p_bin_max) - 1;

        l_is_first_fetch := FALSE;

        -- Initialise the bin totals for each bin set (lower bound, lower bound + 1)
        l_n_bins := l_n_bins_base;
        FOR i IN 1..c_n_bin_sets LOOP

          l_n_bins := l_n_bins + 1;
          l_bins(i).Extend (l_n_bins);
          FOR k IN 1..l_n_bins LOOP
            l_bins(i)(k) := 0;
          END LOOP;

        END LOOP;

      END IF;

      FOR j IN 1..l_bin_even_cur_list.COUNT LOOP

        -- Assign the item to the current smallest bin in each bin set
        l_bin_even_rec.item_name := l_bin_even_cur_list(j).item_name;
        l_bin_even_rec.item_value := l_bin_even_cur_list(j).item_value;
        l_bin_even_rec.bin_1 := l_min_bin(1);
        l_bin_even_rec.bin_2 := l_min_bin(2);

        PIPE ROW (l_bin_even_rec);

        -- Update the totals and find the new smallest bin for each bin set
        l_n_bins := l_n_bins_base;
        FOR i IN 1..c_n_bin_sets LOOP
          l_n_bins := l_n_bins + 1;
          l_bins(i)(l_min_bin(i)) := l_bins(i)(l_min_bin(i)) + l_bin_even_cur_list(j).item_value;

          l_min_bin_val(i) := c_big_value;
          FOR k IN 1..l_n_bins LOOP

            IF l_bins(i)(k) < l_min_bin_val(i) THEN
              l_min_bin(i) := k;
              l_min_bin_val(i) := l_bins(i)(k);
            END IF;

          END LOOP;

        END LOOP;

      END LOOP;

    END LOOP;

    -- A pipelined function ends with a RETURN that carries no value
    RETURN;

  END Items_Binned;

END Bin_Even;
/

Output POS
Note BIN_1 means bin set 1, which turns out to have 4 bins, while bin set 2 then necessarily has 5.

ITEM_NAME         ITEM_AMT   ITEM_QTY      BIN_1      BIN_2  BIN_1_AMT  BIN_1_QTY  BIN_2_AMT  BIN_2_QTY
--------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1239606-1080          4024        266          1          1      25562        995      17482        827
1239606-1045          1880        192          2          2      19394        886      14568        732
1239606-1044          1567        160          3          3      18115        835      14097        688
1239606-1081          2118        140          4          4      18988        793      17130        657
1239606-2094          5741         96          1          5      25562        995      18782        605
...
1239606-2107            80          3          4          2      18988        793      14568        732
1239606-2084           122          3          4          3      18988        793      14097        688
1239606-2110           210          2          2          3      19394        886      14097        688
1239606-4022           212          2          3          4      18115        835      17130        657
1239606-4021           212          2          4          5      18988        793      18782        605

Output Pipelined

ITEM_NAME         ITEM_AMT   ITEM_QTY      BIN_1      BIN_2  BIN_1_AMT  BIN_1_QTY  BIN_2_AMT  BIN_2_QTY
--------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1239606-1080          4024        266          1          1      20627        878      15805        703
1239606-1045          1880        192          2          2      18220        877      16176        703
1239606-1044          1567        160          3          3      20425        878      15651        701
1239606-1081          2118        140          4          4      22787        876      14797        701
1239606-2094          5741         96          4          5      22787        876      19630        701
...
1239606-2089            80          3          4          1      22787        876      15805        703
1239606-2112           141          3          4          2      22787        876      16176        703
1239606-4022           212          2          1          1      20627        878      15805        703
1239606-4021           212          2          2          1      18220        877      15805        703
1239606-2110           210          2          3          2      20425        878      16176        703

End OTN Extract

Conclusions

• Various solutions for the balanced number partitioning problem have been presented, using Oracle's Model clause, Recursive Subquery Factoring, Pipelined Functions and simple SQL
• The performance characteristics of these solutions have been tested across a range of data sets
• As is often the case, the best solution depends on the shape and size of the data set
• A simple extension has been shown to allow determining the number of bins required in the bin-fitting interpretation of the problem
• Replacing a WITH clause with a staging table can be a useful technique to allow indexed scans

# SQL for Network Grouping

I noticed an interesting question posted on OTN this (European) Saturday morning, Hierarchical query to combine two groupings into one broad joint grouping. I quickly realised that the problem posed was an example of a very general class of network problems that arises quite often:

Given a set of nodes and a rule for pair-wise (non-directional) linking, obtain the set of implied networks

Usually, in response to such a problem someone will suggest a CONNECT BY query solution. Unfortunately, although hierarchical SQL techniques can be used theoretically to resolve these non-hierarchical networks, they tend to be extremely inefficient for networks of any size and are therefore often impractical. There are two problems in particular:

1. Non-hierarchical networks have no root nodes, so the traversal needs to be repeated from every node in the network set
2. Hierarchical queries retrieve all possible routes through a network
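
As a rough illustration of the second problem, the Python sketch below counts the simple routes (no node revisited) that start from one node in a small network where every pair of nodes is linked. The function name count_routes and the fully linked test networks are assumptions made for the illustration; they are not taken from the OTN thread or from Oracle's implementation.

```python
from itertools import combinations

def count_routes(n_nodes):
    """Count every simple route (no node revisited) that starts at node 0
    in a network where each pair of nodes is linked."""
    links = {frozenset(pair) for pair in combinations(range(n_nodes), 2)}

    def walk(node, visited):
        total = 0
        for nxt in range(n_nodes):
            if nxt not in visited and frozenset((node, nxt)) in links:
                # One route ends at nxt, plus every route continuing beyond it
                total += 1 + walk(nxt, visited | {nxt})
        return total

    return walk(0, {0})

route_counts = {n: count_routes(n) for n in range(2, 8)}
for n, routes in route_counts.items():
    print(f"{n} nodes: {routes} routes from a single start node")
```

The count grows much faster than the number of nodes, and the first problem multiplies it again by the number of starting nodes.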

I illustrated the second problem in my last post, Notes on Profiling Oracle PL/SQL, and I intend to write a longer article on the subject of networks at a later date. The most efficient way to traverse generalised networks in Oracle involves the use of PL/SQL, such as in my Scribd article of June 2010, An Oracle Network Traversal PL SQL Program. For this article, though, I will stick to SQL-only techniques and will write down three solutions in a general format whereby the tables of the specific problem are read by initial subquery factors links_v and nodes_v that are used in the rest of the queries. I'll save detailed explanation and performance analysis for the later article (see update at end of this section).

The three queries use two hierarchical methods and a method involving the Model clause:

1. CONNECT BY: This is the least efficient
2. Recursive subquery factors: This is more efficient than the first but still suffers from the two problems above
3. Model clause: This is intended to bypass the performance problems of hierarchical queries, but is still slower than PL/SQL

[Update, 2 September 2015] I have since written two articles on related subjects. The first, PL/SQL Pipelined Function for Network Analysis, describes a PL/SQL program that traverses all networks and lists their structure. The second, Recursive SQL for Network Analysis, and Duality, uses a series of examples to illustrate and explain the different characteristics of the first two recursive SQL methods.

Problem Definition
Data Structure
I have taken the data structure of the OTN poster, made all fields character, and added three more records comprising a second isolated node (10) and a subnetwork of nodes 08 and 09. ITEM_ID is taken to be the primary key.

SQL> SELECT *
2    FROM item_groups
3  /

ITEM_ID    GROUP1   GROUP2
---------- -------- --------
01         A        100
02         A        100
03         A        101
04         B        100
05         B        102
06         C        103
07         D        101
08         E        104
09         E        105
10         F        106

10 rows selected.


Grouping Structure
The poster defines two items to be linked if they share the same value for either GROUP1 or GROUP2 attributes (which could obviously be generalised to any number of attributes), and items are in the same group if they can be connected by a chain of links. Observe that if there were only one grouping attribute then the problem would be trivial as that would itself group the items. Having more than one makes it more interesting and more difficult.

A real world example of such networks can be seen to be sibling networks if one takes people as the nodes and father and mother as the attributes.
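
The linking rule can be sketched outside the database. The Python below is a minimal illustration using the ten sample rows listed above; the function name build_links is hypothetical, and the link identifiers follow the 'item id, hyphen, sequence' pattern used by the queries later in this article.

```python
# Sample rows from the item_groups listing above: (item_id, group1, group2)
ITEM_GROUPS = [
    ("01", "A", "100"), ("02", "A", "100"), ("03", "A", "101"),
    ("04", "B", "100"), ("05", "B", "102"), ("06", "C", "103"),
    ("07", "D", "101"), ("08", "E", "104"), ("09", "E", "105"),
    ("10", "F", "106"),
]

def build_links(rows):
    """Return (link_id, node_fr, node_to) for each pair sharing group1 or group2."""
    links = []
    ordered = sorted(rows)
    for i, (id_fr, g1_fr, g2_fr) in enumerate(ordered):
        seq = 0
        # Only pair with later items, so each link is generated once
        for id_to, g1_to, g2_to in ordered[i + 1:]:
            if g1_to == g1_fr or g2_to == g2_fr:
                seq += 1
                links.append((f"{id_fr}-{seq}", id_fr, id_to))
    return links

links = build_links(ITEM_GROUPS)
linked = {node for _, node_fr, node_to in links for node in (node_fr, node_to)}
unlinked = sorted(row[0] for row in ITEM_GROUPS if row[0] not in linked)

for link in links:
    print(link)
print("Unlinked nodes:", unlinked)
```

On the sample data this gives eight links and leaves items 06 and 10 unlinked.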

Network Diagram

CONNECT BY Solution
SQL

WITH links_v AS (
SELECT t_fr.item_id node_id_fr,
t_to.item_id node_id_to,
t_fr.item_id || '-' || Row_Number() OVER (PARTITION BY t_fr.item_id ORDER BY t_to.item_id) link_id
FROM item_groups t_fr
JOIN item_groups t_to
ON t_to.item_id > t_fr.item_id
AND (t_to.group1 = t_fr.group1 OR t_to.group2 = t_fr.group2)
), nodes_v AS (
SELECT item_id node_id
FROM item_groups
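), tree AS (
-- Walk from every link to all links sharing a node; root_id records the starting link
SELECT link_id, CONNECT_BY_ROOT (link_id) root_id
FROM links_v
CONNECT BY NOCYCLE (node_id_fr = PRIOR node_id_to OR node_id_to = PRIOR node_id_fr OR
node_id_fr = PRIOR node_id_fr OR node_id_to = PRIOR node_id_to)
), group_by_link AS (
-- Network id per link is the smallest root link id reaching it (group_by_link is an assumed subquery name)
SELECT DISTINCT Min (root_id) OVER (PARTITION BY link_id) group_id, link_id
FROM tree
), linked_nodes AS (
SELECT g.group_id, l.node_id_fr node_id
FROM group_by_link g
JOIN links_v l
ON l.link_id = g.link_id
UNION
SELECT g.group_id, l.node_id_to
FROM group_by_link g
JOIN links_v l
ON l.link_id = g.link_id
)
SELECT l.group_id "Network", l.node_id "Node"
FROM linked_nodes l
UNION ALL
-- Unlinked nodes share one dummy network id (the literal '00' is assumed)
SELECT '00', n.node_id
FROM nodes_v n
WHERE n.node_id NOT IN (SELECT node_id FROM linked_nodes)
ORDER BY 1, 2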


Output

All networks by CONNECT BY - Unlinked nodes share network id 0
Network       Node
------------- ----
00            06
              10
01-1          01
              02
              03
              04
              05
              07
08-1          08
              09

10 rows selected.


Notes on CONNECT BY Solution

• For convenience I have grouped the unlinked nodes into one dummy network; it's easy to assign them individual identifiers if desired

Recursive Subquery Factors (RSF) Solution
SQL

WITH links_v AS (
SELECT t_fr.item_id node_id_fr,
t_to.item_id node_id_to,
t_fr.item_id || '-' || Row_Number() OVER (PARTITION BY t_fr.item_id ORDER BY t_to.item_id) link_id
FROM item_groups t_fr
JOIN item_groups t_to
ON t_to.item_id > t_fr.item_id
AND (t_to.group1 = t_fr.group1 OR t_to.group2 = t_fr.group2)
), nodes_v AS (
SELECT item_id node_id
FROM item_groups
), rsf (node_id, id, root_id) AS (
SELECT node_id, NULL, node_id
FROM nodes_v
UNION ALL
SELECT CASE WHEN l.node_id_to = r.node_id THEN l.node_id_fr ELSE l.node_id_to END,
l.link_id, r.root_id
FROM rsf r
JOIN links_v l
ON (l.node_id_fr = r.node_id OR l.node_id_to = r.node_id)
AND l.link_id != Nvl (r.id, '0')
) CYCLE node_id SET is_cycle TO '*' DEFAULT ' '
SELECT DISTINCT Min (root_id) OVER (PARTITION BY node_id) "Network", node_id "Node"
FROM rsf
ORDER BY 1, 2


Output

All networks by RSF - Unlinked nodes have their own network ids
Network       Node
------------- ----
01            01
              02
              03
              04
              05
              07
06            06
08            08
              09
10            10

10 rows selected.


Notes on Recursive Subquery Factors (RSF) Solution

• Here I have given the unlinked nodes their own network identifiers; they could equally have been grouped together under a dummy network
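
The idea behind the RSF query, starting from every node and labelling each node with the smallest root that reaches it, can be sketched in Python as follows. This is an illustration only: the function name networks_by_min_root is hypothetical, and the breadth-first walk visits each node once per root, whereas the SQL may follow many routes to the same node.

```python
from collections import deque

NODES = ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10"]
# Node pairs implied by the sample data (shared GROUP1 or GROUP2)
LINKS = [("01", "02"), ("01", "03"), ("01", "04"), ("02", "03"),
         ("02", "04"), ("03", "07"), ("04", "05"), ("08", "09")]

def networks_by_min_root(nodes, links):
    neighbours = {node: set() for node in nodes}
    for node_fr, node_to in links:
        neighbours[node_fr].add(node_to)
        neighbours[node_to].add(node_fr)
    network = {}
    for root in nodes:  # every node is used as a root in turn
        seen, queue = {root}, deque([root])
        while queue:
            node = queue.popleft()
            # Keep the smallest root id that reaches this node
            network[node] = min(network.get(node, root), root)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
    return network

network = networks_by_min_root(NODES, LINKS)
for node in NODES:
    print(network[node], node)
```

The network labels agree with the RSF output above, with unlinked nodes 06 and 10 labelled by their own ids.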

Model Clause Solution
SQL

WITH links_v AS (
SELECT t_fr.item_id node_id_fr,
t_to.item_id node_id_to,
t_fr.item_id || '-' || Row_Number() OVER (PARTITION BY t_fr.item_id ORDER BY t_to.item_id) link_id
FROM item_groups t_fr
JOIN item_groups t_to
ON t_to.item_id > t_fr.item_id
AND (t_to.group1 = t_fr.group1 OR t_to.group2 = t_fr.group2)
), nodes_v AS (
SELECT item_id node_id
FROM item_groups
), lnk_iter AS (
SELECT *
FROM links_v
CROSS JOIN (SELECT 0 iter FROM DUAL UNION SELECT 1 FROM DUAL)
), mod AS (
SELECT *
FROM lnk_iter
MODEL
DIMENSION BY (Row_Number() OVER (PARTITION BY iter ORDER BY link_id) rn, iter)
MEASURES (Row_Number() OVER (PARTITION BY iter ORDER BY link_id) id_rn, link_id id,
node_id_fr nd1, node_id_to nd2,
1 lnk_cur,
CAST ('x' AS VARCHAR2(100)) nd1_cur,
CAST ('x' AS VARCHAR2(100)) nd2_cur,
0 net_cur,
CAST (NULL AS NUMBER) net,
CAST (NULL AS NUMBER) lnk_prc,
1 not_done,
0 itnum)
RULES UPSERT ALL
ITERATE(100000) UNTIL (lnk_cur[1, Mod (iteration_number+1, 2)] IS NULL)
(
itnum[ANY, ANY] = iteration_number,
not_done[ANY, Mod (iteration_number+1, 2)] = Count (CASE WHEN net IS NULL THEN 1 END)[ANY, Mod (iteration_number, 2)],
lnk_cur[ANY, Mod (iteration_number+1, 2)] =
CASE WHEN not_done[CV(), Mod (iteration_number+1, 2)] > 0 THEN
Nvl (Min (CASE WHEN lnk_prc IS NULL AND net = net_cur THEN id_rn END)[ANY, Mod (iteration_number, 2)],
Min (CASE WHEN net IS NULL THEN id_rn END)[ANY, Mod (iteration_number, 2)])
END,
lnk_prc[ANY, Mod (iteration_number+1, 2)] = lnk_prc[CV(), Mod (iteration_number, 2)],
lnk_prc[lnk_cur[1, Mod (iteration_number+1, 2)], Mod (iteration_number+1, 2)] = 1,
net_cur[ANY, Mod (iteration_number+1, 2)] =
CASE WHEN Min (CASE WHEN lnk_prc IS NULL AND net = net_cur THEN id_rn END)[ANY, Mod (iteration_number, 2)] IS NULL THEN
net_cur[CV(), Mod (iteration_number, 2)] + 1
ELSE
net_cur[CV(), Mod (iteration_number, 2)]
END,
nd1_cur[ANY, Mod (iteration_number+1, 2)] = nd1[lnk_cur[CV(), Mod (iteration_number+1, 2)], Mod (iteration_number, 2)],
nd2_cur[ANY, Mod (iteration_number+1, 2)] = nd2[lnk_cur[CV(), Mod (iteration_number+1, 2)], Mod (iteration_number, 2)],
net[ANY, Mod (iteration_number+1, 2)] =
CASE WHEN (nd1[CV(),Mod (iteration_number+1, 2)] IN (nd1_cur[CV(),Mod (iteration_number+1, 2)], nd2_cur[CV(),Mod (iteration_number+1, 2)]) OR
nd2[CV(),Mod (iteration_number+1, 2)] IN (nd1_cur[CV(),Mod (iteration_number+1, 2)], nd2_cur[CV(),Mod (iteration_number+1, 2)]))
AND net[CV(),Mod (iteration_number, 2)] IS NULL THEN
net_cur[CV(),Mod (iteration_number+1, 2)]
ELSE
net[CV(),Mod (iteration_number, 2)]
END
)
)
SELECT To_Char (net) "Network", nd1 "Node"
FROM mod
WHERE not_done = 0
UNION
SELECT To_Char (net), nd2
FROM mod
WHERE not_done = 0
UNION ALL
SELECT '00', n.node_id
FROM nodes_v n
WHERE n.node_id NOT IN (SELECT nd1 FROM mod WHERE nd1 IS NOT NULL UNION SELECT nd2 FROM mod WHERE nd2 IS NOT NULL)
ORDER BY 1, 2


Output

All networks by Model - Unlinked nodes share network id 00
Network       Node
------------- ----
00            06
              10
1             01
              02
              03
              04
              05
              07
2             08
              09

10 rows selected.


Notes on Model Clause Solution

• For convenience I have grouped the unlinked nodes into one dummy network; it's easy to assign them individual identifiers if desired
• My Cyclic Iteration technique used here appears to be novel
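
The link-by-link processing that the Model rules appear to perform can be sketched procedurally. The Python below is a simplified reading of those rules, not a translation of the two-slice iteration itself; the function name assign_networks and the variable names are chosen for the illustration.

```python
# Links for the sample data as (link_id, node_fr, node_to), ordered by link_id
LINKS = [
    ("01-1", "01", "02"), ("01-2", "01", "03"), ("01-3", "01", "04"),
    ("02-1", "02", "03"), ("02-2", "02", "04"), ("03-1", "03", "07"),
    ("04-1", "04", "05"), ("08-1", "08", "09"),
]

def assign_networks(links):
    net = [None] * len(links)         # network number per link, None until assigned
    processed = [False] * len(links)  # has the link been used as the current link?
    net_cur = 0
    while any(n is None for n in net):
        pending = [i for i in range(len(links))
                   if net[i] == net_cur and not processed[i]]
        if pending:
            cur = pending[0]          # next unprocessed link in the current network
        else:
            net_cur += 1              # current network exhausted, start a new one
            cur = net.index(None)
        processed[cur] = True
        cur_nodes = set(links[cur][1:])
        # Unassigned links touching either node of the current link join its network
        for i, (_, node_fr, node_to) in enumerate(links):
            if net[i] is None and (node_fr in cur_nodes or node_to in cur_nodes):
                net[i] = net_cur
    return {link_id: n for (link_id, _, _), n in zip(links, net)}

link_networks = assign_networks(LINKS)
node_networks = {}
for link_id, node_fr, node_to in LINKS:
    node_networks[node_fr] = node_networks[node_to] = link_networks[link_id]
print(node_networks)
```

Each pass handles one link, so the number of passes is bounded by the number of links rather than by the number of routes through the network.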

Conclusions

• It is always advisable with a new problem in SQL to consider whether it falls into a general class of problems for which solutions have already been found
• Three solution methods for network resolution in pure SQL have been presented and demonstrated on a small test problem; the performance issues mentioned should be considered carefully before applying them on larger problems
• The Model clause solution is likely to be the most efficient on larger, looped networks, but if better performance is required then PL/SQL recursion-based methods would be faster
• For smaller problems with few loops the simpler method of recursive subquery factors may be preferred, or, for versions prior to v11.2, CONNECT BY

# Notes on Profiling Oracle PL/SQL

'Everything should be made as simple as possible, but not simpler'

This phrase is often attributed to Albert Einstein, although the attribution is apparently questionable:
Everything Should Be Made as Simple as Possible, But Not Simpler. In any case it's not a bad approach to follow, even if the quote did come from a non-Oracle guy :).

I recently started looking at the hierarchical profiler tool with a view to using it in an upcoming project. In order to understand the tool properly, I felt it would be a good idea to start by using it to profile a test program that would be as simple as possible while covering as wide a range of scenarios as possible. This article documents the results of that profiling, highlighting the different scenarios covered, discusses the output from the profiler, and includes a query I wrote to display the function call tree.

The article goes on to illustrate profiling through manual code instrumentation, and by the old flat profiler (DBMS_Profiler) on the same test program, concluding that each method has its own strengths and weaknesses.

Setup
The hierarchical profiler setup and use is described in Oracle® Database Advanced Application Developer's Guide 11g Release 2 (11.2), and some code snippets are available here: PL/SQL Hierarchical Profiler in Oracle Database 11g Release 1

Scenarios
The test program consists of a driving script, Test_Rep_p.sql (attached), that calls a package (HProf_Test) and an object type (Table_Count_Type), both defined in the attached script, HProf_Test_Code.sql. The test program covers the following scenarios:

• Multiple root calls (__plsql_vm, A_CALLS_B)
• Recursive procedure calls (procedure calling itself: R_CALLS_R)
• Mutually recursive procedure calls (procedures call each other: A_CALLS_B and B_CALLS_A)
• Procedure called by multiple procedures (child with multiple parents: PUT_LINE)
• Procedure 'inlined' within PL/SQL (Rest_a_While)
• Static SQL within PL/SQL (__static_sql_exec_line8)
• Dynamic SQL within PL/SQL (__dyn_sql_exec_line12)
• Database function called from SQL in SQL*Plus (DBFUNC)
• Database function called from SQL in PL/SQL (DBFUNC)
• Object constructor call (TABLE_COUNT_TYPE)

Call Structure Diagram

Raw Results
The attached script Test_Rep_h.sql was used to report on the results. The record produced in the run table, DBMSHP_RUNS, was:

     RUNID RUN_TIMESTAMP                   MICRO_S    SECONDS RUN_COMMENT
---------- ---------------------------- ---------- ---------- ------------------------------------------------------------
        11 04-MAR-13 07.07.36.803000        890719        .89 Profile for small test program with recursion

The records produced in the functions table, DBMSHP_FUNCTION_INFO, were:

OWNER MODULE               FUNCTION                         ID  LINE#      SUB_T      FUN_T  CALLS
----- -------------------- ------------------------------ ---- ------ ---------- ---------- ------
NET   HPROF_TEST           A_CALLS_B                         4     40      62340       4450      1
NET   HPROF_TEST           A_CALLS_B@1                       5     40      43729      13663      1
NET   HPROF_TEST           B_CALLS_A                         6     38      57890      14161      1
NET   HPROF_TEST           B_CALLS_A@1                       7     38      30066      30066      1
NET   HPROF_TEST           DBFUNC                            8     84      32629      32629      2
NET   HPROF_TEST           R_CALLS_R                         9     70      12823       4159      1
NET   HPROF_TEST           R_CALLS_R@1                      10     70       8633       8618      1
NET   HPROF_TEST           STOP_PROFILING                   11     16         21         21      1
NET   TABLE_COUNT_TYPE     TABLE_COUNT_TYPE                 12      3      55049         82      1
NET   TABLE_COUNT_TYPE     __static_sql_exec_line6          22      6      54967      54967      1
SYS   DBMS_HPROF           STOP_PROFILING                   13     59          0          0      1
SYS   DBMS_OUTPUT          GET_LINE                         14    129          8          8      3
SYS   DBMS_OUTPUT          GET_LINES                        15    160         68         60      3
SYS   DBMS_OUTPUT          NEW_LINE                         16    117          7          7      2
SYS   DBMS_OUTPUT          PUT                              17     77         28         28      2
SYS   DBMS_OUTPUT          PUT_LINE                         18    109         46         11      2
                           __anonymous_block                 1      0     809839        521      5
                           __dyn_sql_exec_line12            19     12        226        226      1
                           __plsql_vm                        2      0     828379         58      6
                           __plsql_vm@1                      3      0      14158         11      1
                           __sql_fetch_line13               20     13     726713     726713      1
                           __static_sql_exec_line8          21      8      14418        260      1

22 rows selected.

The SUB_T and FUN_T values are the total times in microseconds for the subtree including function, and for function-only processing, respectively.

The records produced in the functions parent-child table, DBMSHP_PARENT_CHILD_INFO, were:

OWNER_P MODULE_P             FUNCTION_P                     OWNER_C MODULE_C             FUNCTION_C                          SUB_T      FUN_T  CALLS
------- -------------------- ------------------------------ ------- -------------------- ------------------------------ ---------- ---------- ------
NET     HPROF_TEST           STOP_PROFILING                 SYS     DBMS_HPROF           STOP_PROFILING                          0          0      1
NET     HPROF_TEST           R_CALLS_R@1                    SYS     DBMS_OUTPUT          PUT_LINE                               15          6      1
NET     HPROF_TEST           R_CALLS_R                      SYS     DBMS_OUTPUT          PUT_LINE                               31          5      1
NET     HPROF_TEST           R_CALLS_R                      NET     HPROF_TEST           R_CALLS_R@1                          8633       8618      1
NET     HPROF_TEST           B_CALLS_A                      NET     HPROF_TEST           A_CALLS_B@1                         43729      13663      1
NET     HPROF_TEST           A_CALLS_B@1                    NET     HPROF_TEST           B_CALLS_A@1                         30066      30066      1
NET     HPROF_TEST           A_CALLS_B                      NET     HPROF_TEST           B_CALLS_A                           57890      14161      1
NET     TABLE_COUNT_TYPE     TABLE_COUNT_TYPE               NET     TABLE_COUNT_TYPE     __static_sql_exec_line6             54967      54967      1
SYS     DBMS_OUTPUT          PUT_LINE                       SYS     DBMS_OUTPUT          NEW_LINE                                7          7      2
SYS     DBMS_OUTPUT          PUT_LINE                       SYS     DBMS_OUTPUT          PUT                                    28         28      2
SYS     DBMS_OUTPUT          GET_LINES                      SYS     DBMS_OUTPUT          GET_LINE                                8          8      3
                             __anonymous_block              NET     HPROF_TEST           STOP_PROFILING                         21         21      1
                             __anonymous_block              SYS     DBMS_OUTPUT          GET_LINES                              68         60      3
                             __anonymous_block                                           __dyn_sql_exec_line12                 226        226      1
                             __anonymous_block              NET     HPROF_TEST           R_CALLS_R                           12823       4159      1
                             __anonymous_block                                           __static_sql_exec_line8             14418        260      1
                             __plsql_vm                     NET     HPROF_TEST           DBFUNC                              18482      18482      1
                             __anonymous_block                                           __sql_fetch_line13                 726713     726713      1
                             __static_sql_exec_line8                                     __plsql_vm@1                        14158         11      1
                             __plsql_vm                                                  __anonymous_block                  809839        521      5
                             __plsql_vm@1                   NET     HPROF_TEST           DBFUNC                              14147      14147      1
                             __anonymous_block              NET     TABLE_COUNT_TYPE     TABLE_COUNT_TYPE                    55049         82      1

22 rows selected.

The SUB_T and FUN_T values are the total times in microseconds for the subtree including function, and function-only processing, respectively, for the child function while called from all instances of the parent.
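
As a quick cross-check of how the two listings relate, the Python sketch below uses a subset of the figures above. It tests two relationships that hold for this run: a function's totals equal the sums of its records as a child across all parents, and its subtree time equals its own time plus its children's subtree times. The helper names are hypothetical, and the second relationship is an observation on this data rather than a documented guarantee.

```python
# function -> (sub_t, fun_t), from the DBMSHP_FUNCTION_INFO listing
FUNCTIONS = {
    "__anonymous_block": (809839, 521),
    "R_CALLS_R": (12823, 4159),
    "DBFUNC": (32629, 32629),
    "PUT_LINE": (46, 11),
}
# (parent, child, sub_t, fun_t), from the DBMSHP_PARENT_CHILD_INFO listing
PARENT_CHILD = [
    ("__anonymous_block", "STOP_PROFILING", 21, 21),
    ("__anonymous_block", "GET_LINES", 68, 60),
    ("__anonymous_block", "__dyn_sql_exec_line12", 226, 226),
    ("__anonymous_block", "R_CALLS_R", 12823, 4159),
    ("__anonymous_block", "__static_sql_exec_line8", 14418, 260),
    ("__anonymous_block", "__sql_fetch_line13", 726713, 726713),
    ("__anonymous_block", "TABLE_COUNT_TYPE", 55049, 82),
    ("R_CALLS_R", "R_CALLS_R@1", 8633, 8618),
    ("R_CALLS_R", "PUT_LINE", 31, 5),
    ("R_CALLS_R@1", "PUT_LINE", 15, 6),
    ("__plsql_vm", "DBFUNC", 18482, 18482),
    ("__plsql_vm@1", "DBFUNC", 14147, 14147),
    ("PUT_LINE", "NEW_LINE", 7, 7),
    ("PUT_LINE", "PUT", 28, 28),
]

def total_as_child(name):
    """Sum the (sub_t, fun_t) records of a function across all of its parents."""
    rows = [(s, f) for _, child, s, f in PARENT_CHILD if child == name]
    return (sum(s for s, _ in rows), sum(f for _, f in rows))

def subtree_from_children(name):
    """Own function time plus the subtree times of the direct children."""
    fun_t = FUNCTIONS[name][1]
    return fun_t + sum(s for parent, _, s, _ in PARENT_CHILD if parent == name)

checks = {name: subtree_from_children(name) == sub_t
          for name, (sub_t, _) in FUNCTIONS.items()}
print(total_as_child("DBFUNC"), total_as_child("PUT_LINE"))
print(checks)
```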

Function Call Tree
The raw data above can be used to identify processing bottlenecks at a function level, but it's also useful to process the data in order to display the function hierarchies, both for performance tuning and also for understanding the program structure. This is not quite as trivial as it may seem. The oracle-base article provides an SQL statement that attempts to do this:

SELECT RPAD(' ', level*2, ' ') || fi.owner || '.' || fi.module AS name,
fi.function,
pci.subtree_elapsed_time,
pci.function_elapsed_time,
pci.calls
FROM   dbmshp_parent_child_info pci
JOIN dbmshp_function_info fi ON pci.runid = fi.runid AND pci.childsymid = fi.symbolid
WHERE  pci.runid = :RUN_ID
CONNECT BY PRIOR childsymid = parentsymid
START WITH pci.parentsymid = :START_ID

Here, bind variables replace the original hard-coded values. On running this query I often got the following result:

ERROR at line 1:
ORA-01436: CONNECT BY loop in user data

On the run used in this article, the query returned 157 records, which is obviously incorrect. There is of course a NOCYCLE keyword that can be used to return results in the case of loops. However, it is not worth adding in this case, because there are in fact no loops in the data (at least no cyclic loops - apparent loops are discussed later). Oracle avoids loops by treating a function call that is a descendant of itself as a call to a new function, identified by suffixes @1, @2 etc., as we can see from the recursive procedures above (e.g. R_CALLS_R@1 is the second call of R_CALLS_R, this one from itself). The problem here is that the query is incorrect in its handling of runid, with the result that the tree-walk traverses records from other runs as well as the intended one. A further problem is that there may be several roots, and it would be best to calculate these within a subquery. We can correct these problems by the following query:

SELECT RPAD(' ', level*2, ' ') || fi.owner || '.' || fi.module AS name,
fi.symbolid || ': ' || fi.function function,
pci.subtree_elapsed_time sub_t,
pci.function_elapsed_time fun_t,
pci.calls
FROM dbmshp_parent_child_info	pci
JOIN dbmshp_function_info		fi
ON pci.runid	              = fi.runid
AND pci.childsymid	       = fi.symbolid
WHERE pci.runid                   = :RUN_ID
CONNECT BY PRIOR pci.childsymid    = pci.parentsymid
AND pci.runid	              = :RUN_ID
START WITH pci.parentsymid         IN (SELECT f.symbolid FROM dbmshp_function_info f WHERE NOT EXISTS
(SELECT 1 FROM dbmshp_parent_child_info i WHERE i.childsymid = f.symbolid AND i.runid = :RUN_ID) AND f.runid = :RUN_ID)
AND pci.runid	              = :RUN_ID
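
The effect of restricting the walk itself to one run, rather than only filtering afterwards, can be seen in a small Python sketch. The rows below are hypothetical (they are not profiler output), and the sketch assumes that symbol ids are only unique within a run, which is what allows an unrestricted walk to cross between runs.

```python
# Hypothetical parent-child rows for two runs: (runid, parentsymid, childsymid)
LINKS = [
    (1, 1, 2), (1, 2, 3),             # run 1: 1 -> 2 -> 3
    (2, 1, 2), (2, 1, 3), (2, 2, 4),  # run 2: 1 -> 2 -> 4 and 1 -> 3
]

def tree_walk(links, run_id, start_id, restrict_to_run):
    visited = []

    def usable(row):
        return row[0] == run_id or not restrict_to_run

    def visit(row):
        visited.append(row)
        # Follow every row whose parent is this row's child
        for nxt in links:
            if nxt[1] == row[2] and usable(nxt):
                visit(nxt)

    for row in links:
        if row[1] == start_id and usable(row):
            visit(row)
    # The WHERE-style filter on run is applied only after the walk has finished
    return [row for row in visited if row[0] == run_id]

loose = tree_walk(LINKS, run_id=1, start_id=1, restrict_to_run=False)
strict = tree_walk(LINKS, run_id=1, start_id=1, restrict_to_run=True)
print(len(loose), loose)
print(len(strict), strict)
```

In the unrestricted walk the run 1 row (1, 2, 3) is reached twice, once through a run 2 parent, so it is returned twice even after filtering on run.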

This query returns the results:

NAME                           FUNCTION                            SUB_T      FUN_T  CALLS
------------------------------ ------------------------------ ---------- ---------- ------
  .                            1: __anonymous_block              809,839        521      5
    NET.HPROF_TEST             9: R_CALLS_R                       12,823      4,159      1
      NET.HPROF_TEST           10: R_CALLS_R@1                     8,633      8,618      1
        SYS.DBMS_OUTPUT        18: PUT_LINE                           15          6      1
          SYS.DBMS_OUTPUT      16: NEW_LINE                            7          7      2
          SYS.DBMS_OUTPUT      17: PUT                                28         28      2
      SYS.DBMS_OUTPUT          18: PUT_LINE                           31          5      1
        SYS.DBMS_OUTPUT        16: NEW_LINE                            7          7      2
        SYS.DBMS_OUTPUT        17: PUT                                28         28      2
    NET.HPROF_TEST             11: STOP_PROFILING                     21         21      1
      SYS.DBMS_HPROF           13: STOP_PROFILING                      0          0      1
    NET.TABLE_COUNT_TYPE       12: TABLE_COUNT_TYPE               55,049         82      1
      NET.TABLE_COUNT_TYPE     22: __static_sql_exec_line6        54,967     54,967      1
    SYS.DBMS_OUTPUT            15: GET_LINES                          68         60      3
      SYS.DBMS_OUTPUT          14: GET_LINE                            8          8      3
    .                          19: __dyn_sql_exec_line12             226        226      1
    .                          20: __sql_fetch_line13            726,713    726,713      1
    .                          21: __static_sql_exec_line8        14,418        260      1
      .                        3: __plsql_vm@1                    14,158         11      1
        NET.HPROF_TEST         8: DBFUNC                          14,147     14,147      1
  NET.HPROF_TEST               8: DBFUNC                          18,482     18,482      1
  NET.HPROF_TEST               6: B_CALLS_A                       57,890     14,161      1
    NET.HPROF_TEST             5: A_CALLS_B@1                     43,729     13,663      1
      NET.HPROF_TEST           7: B_CALLS_A@1                     30,066     30,066      1
24 rows selected.

This is better, but we can identify some further issues.

Missing Roots
The true root results are missing: For example, A_CALLS_B is missing. This arises because the query is traversing the link records (DBMSHP_PARENT_CHILD_INFO), while the root information is stored in the nodes (DBMSHP_FUNCTION_INFO). This suggests a change from the CONNECT BY syntax to Oracle's v11.2 recursive subquery factoring syntax, which allows you easily to start from the nodes, then traverse recursively via the links. (Incidentally, moving the start of profiling to its own block would result in A_CALLS_B appearing under __anonymous_block, but I prefer to retain the current structure in order to deal with the general case in which multiple roots are possible.)

Notice that function PUT_LINE is reported separately under R_CALLS_R and R_CALLS_R@1, and the timings differ. Also, its own child calls appear under each of its instances, but in those cases the timings are identical. The reason for this is that in the first case, there are separate records of the times used in each call, whereas in the second, the child calls have only a single record giving the total times across both instances of the parent call. The call from R_CALLS_R shows (31 - 5 = ) 26µs used in child calls, while the call from R_CALLS_R@1 shows (15 - 6 = ) 9µs. The child calls show totals of (7 + 28 = ) 35µs, equalling the sum across the parent calls.

At this point it is worth looking at this from the more general perspective of a hierarchical data structure where parents can have multiple children and children multiple parents, with one or more roots. If a network diagram were constructed there would be loops apparent indicating multiple routes between nodes. In these situations, Oracle's hierarchical queries effectively traverse all routes, and this is what causes the link duplication (in other scenarios this behaviour can cause big performance problems, but probably not here). Oracle's cycle detection mechanism does not trigger because the loops do not result in any node being a descendant of itself (as noted above, extra nodes are generated by the profiler to avoid this).

It seems to me better to avoid this duplication, and also to signal those cases where times are not aggregated up the tree. We can achieve this by the use of analytic functions. Note that, although the query below refers to the specific tables and attributes for this problem, the proposed solution could be used for any member of this general class of problem. The new query, which orders sibling records by descending subtree elapsed time, is:

WITH last_run AS (
SELECT Max (runid) runid FROM dbmshp_runs
), full_tree (runid, lev, node_id, sub_t, fun_t, calls, link_id) AS (
SELECT fni.runid, 0, fni.symbolid, fni.subtree_elapsed_time, fni.function_elapsed_time, fni.calls, 'root' || ROWNUM
FROM dbmshp_function_info fni
JOIN last_run lrn
ON lrn.runid = fni.runid
WHERE NOT EXISTS (SELECT 1 FROM dbmshp_parent_child_info pci WHERE pci.childsymid = fni.symbolid AND pci.runid = fni.runid)
UNION ALL
SELECT ftr.runid,
ftr.lev + 1,
pci.childsymid,
pci.subtree_elapsed_time,
pci.function_elapsed_time,
pci.calls,
pci.parentsymid || '-' || pci.childsymid
FROM full_tree ftr
JOIN dbmshp_parent_child_info pci
ON pci.parentsymid = ftr.node_id
AND pci.runid = ftr.runid
) SEARCH DEPTH FIRST BY sub_t DESC, fun_t DESC, calls DESC, node_id SET rn
, tree_ranked AS (
SELECT runid, node_id, lev, rn,
sub_t, fun_t, calls,
Row_Number () OVER (PARTITION BY node_id ORDER BY rn) node_rn,
Count (*) OVER (PARTITION BY node_id) node_cnt,
-- link_rn (assumed name) ranks repeat visits to the same link; only the first is kept below
Row_Number () OVER (PARTITION BY link_id ORDER BY rn) link_rn
FROM full_tree
)
SELECT RPad (' ', trr.lev*2, ' ') || fni.function "Function tree",
fni.symbolid sy, fni.owner, fni.module,
CASE WHEN trr.node_cnt > 1 THEN trr.node_rn || ' of ' || trr.node_cnt END "Inst.",
trr.sub_t, trr.fun_t, trr.calls,
trr.rn "Row"
FROM tree_ranked trr
JOIN dbmshp_function_info fni
ON fni.symbolid = trr.node_id
AND fni.runid = trr.runid
WHERE trr.link_rn = 1
ORDER BY trr.rn

Query Structure Diagram

The results are then:

Function tree                        SY OWNER MODULE               Inst.         SUB_T      FUN_T  CALLS  Row
----------------------------------- --- ----- -------------------- -------- ---------- ---------- ------ ----
__plsql_vm                            2                                        828,379         58      6    1
  __anonymous_block                   1                                        809,839        521      5    2
    __sql_fetch_line13               20                                        726,713    726,713      1    3
    TABLE_COUNT_TYPE                 12 NET   TABLE_COUNT_TYPE                  55,049         82      1    4
      __static_sql_exec_line6        22 NET   TABLE_COUNT_TYPE                  54,967     54,967      1    5
    __static_sql_exec_line8          21                                         14,418        260      1    6
      __plsql_vm@1                    3                                         14,158         11      1    7
        DBFUNC                        8 NET   HPROF_TEST           1 of 2       14,147     14,147      1    8
    R_CALLS_R                         9 NET   HPROF_TEST                        12,823      4,159      1    9
      R_CALLS_R@1                    10 NET   HPROF_TEST                         8,633      8,618      1   10
        PUT_LINE                     18 SYS   DBMS_OUTPUT          1 of 2           15          6      1   11
          PUT                        17 SYS   DBMS_OUTPUT          1 of 2           28         28      2   12
          NEW_LINE                   16 SYS   DBMS_OUTPUT          1 of 2            7          7      2   13
      PUT_LINE                       18 SYS   DBMS_OUTPUT          2 of 2           31          5      1   14
    __dyn_sql_exec_line12            19                                            226        226      1   17
    GET_LINES                        15 SYS   DBMS_OUTPUT                           68         60      3   18
      GET_LINE                       14 SYS   DBMS_OUTPUT                            8          8      3   19
    STOP_PROFILING                   11 NET   HPROF_TEST                            21         21      1   20
      STOP_PROFILING                 13 SYS   DBMS_HPROF                             0          0      1   21
  DBFUNC                              8 NET   HPROF_TEST           2 of 2       18,482     18,482      1   22
A_CALLS_B                             4 NET   HPROF_TEST                        62,340      4,450      1   23
  B_CALLS_A                           6 NET   HPROF_TEST                        57,890     14,161      1   24
    A_CALLS_B@1                       5 NET   HPROF_TEST                        43,729     13,663      1   25
      B_CALLS_A@1                     7 NET   HPROF_TEST                        30,066     30,066      1   26

24 rows selected.

Notice that we now have a single record for each of the 22 links, plus the two root nodes. Also, the "Inst." column lists the instance number of a function having more than one instance, and the children of any such function are only listed once with the gaps in the "Row" column indicating where duplicates have been suppressed.

Network Diagrams
It may be interesting to display the call tree in two diagrams, one for each root.
Root __plsql_vm

Root A_CALLS_B

Notes on Tree Output
Anonymous Block (__anonymous_block)
This function seems to correspond to invocations of anonymous blocks, obviously enough. However, there is an apparent anomaly in the number of calls listed, 6, because the driving program has only three such blocks, and there are none in the called PL/SQL code. I would surmise that the apparent discrepancy arises from the enabling of SERVEROUTPUT, which appears to result in a secondary block being associated with each explicit SQL*Plus block, that issues a call to GET_LINES to process buffered output.

PL/SQL Engine (__plsql_vm)
This function seems to correspond to external invocations of PL/SQL such as from a SQL*Plus session. There are 7 calls, 6 of them presumably being linked with the external anonymous blocks, and the seventh with DBFUNC, where a PL/SQL function is called from a SQL statement from SQL*Plus.

Notice that the SQL statement calling a database function from within PL/SQL generates the recursive call to the engine, __plsql_vm@1.

Second Root (A_CALLS_B)
The above function does not have the __plsql_vm/__anonymous_block ancestry that might be expected because profiling only started within the enclosing block.

Inlined Procedure (Rest_a_While)
I wrote a small procedure, Rest_a_While, to generate some elapsed time in the recursive procedures, but preceded calls to it with the INLINE pragma, a new optimisation feature in 11g. This had the desired effect of removing the calls from the profiling output and including the times in the calling procedures. Rest_a_While does not make the obvious call to DBMS_Lock.Sleep because that procedure cannot be inlined. Subprogram inlining in 11g provides some analysis of the inlining feature.
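
As an illustration only, an inlining request might look like the sketch below. The procedure Demo_Inline and the body given here for Rest_a_While are hypothetical stand-ins, since the original code is in the attached scripts and is not reproduced in this article. PRAGMA INLINE itself is standard 11g syntax, and it only takes effect when PLSQL_OPTIMIZE_LEVEL is 2 or higher.

```sql
CREATE OR REPLACE PROCEDURE Demo_Inline (p_call_no PLS_INTEGER) IS

  -- Hypothetical stand-in: burns CPU in a loop rather than calling DBMS_Lock.Sleep
  PROCEDURE Rest_a_While (p_units PLS_INTEGER) IS
    l_dummy NUMBER := 0;
  BEGIN
    FOR i IN 1..p_units * 10000 LOOP
      l_dummy := l_dummy + Sqrt (i);
    END LOOP;
  END Rest_a_While;

BEGIN

  -- Ask the compiler to inline the call made in the immediately following statement
  PRAGMA INLINE (Rest_a_While, 'YES');
  Rest_a_While (p_call_no);

END Demo_Inline;
/
```

If the request is honoured, the profiler should attribute the loop time to Demo_Inline itself rather than to a separate Rest_a_While entry.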

Sibling Ordering
We have ordered siblings by descending subtree elapsed time, using the SEARCH clause. It would be nice to have the option to order the siblings by initial invocation time, but Oracle does not provide the data to do this.
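
The sibling ordering can be tried out in isolation with a toy hierarchy. The sketch below uses made-up node names and times held in a subquery factor called links (an assumption for illustration, not profiler data), and orders siblings by descending sub_t with the SEARCH clause.

```sql
WITH links (parent_id, child_id, sub_t) AS (
  SELECT 'A', 'B', 10 FROM DUAL UNION ALL
  SELECT 'A', 'C', 30 FROM DUAL UNION ALL
  SELECT 'C', 'D',  5 FROM DUAL UNION ALL
  SELECT 'C', 'E', 20 FROM DUAL
), tree (node_id, lev, sub_t) AS (
  -- Anchor branch: the single root node
  SELECT 'A', 0, 40 FROM DUAL
  UNION ALL
  -- Recursive branch: follow links from each node already reached
  SELECT lnk.child_id, tre.lev + 1, lnk.sub_t
    FROM tree tre
    JOIN links lnk
      ON lnk.parent_id = tre.node_id
) SEARCH DEPTH FIRST BY sub_t DESC SET rn
SELECT LPad ('.', lev * 2, '.') || node_id node, sub_t, rn
  FROM tree
 ORDER BY rn
```

Because siblings are ordered by descending sub_t, node C (30) is listed with its subtree before node B (10).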

Loops and Hierarchies
The first diagram shows two loops, where there are two routes between the loop start and end points, indicated by different colours. The second loop has two child nodes coming from the end point, and hierarchical queries (both CONNECT BY and recursive subquery factoring in Oracle) cause the links to be duplicated. Our query has filtered out the duplicates using analytic functions.

It's worth remembering this because it's a general feature of SQL for querying hierarchies, and judging by Oracle forums, not one that's widely understood. For larger hierarchies it can cause serious performance problems, and may justify a PL/SQL programmed solution that need not suffer the same problem.

Manual Instrumentation
Oracle's hierarchical profiler clearly provides extremely useful information on both performance and structure of PL/SQL programs with very little effort. However, it does have the limitation of only providing information down to the subprogram level (which includes embedded SQL statements in this context). It is also often considered good practice to implement timing and other instrumentation permanently in production code, sometimes in a switchable fashion. In the test program, one of the called procedures, A_Calls_B, makes two calls to the inlined procedure, Rest_a_While, the second doing about twice as much work as the first. The profiler reports total within-function times of 4,450µs and 13,663µs on first and second calls, respectively (the work is scaled by a call number parameter, equal to 1, then 3).

I created a second instance of the package and driver script (suffix _TS) to illustrate manual instrumentation. This uses an 'object-oriented' timing package that I wrote a couple of years ago, Code Timing and Object Orientation and Zombies (November, 2010), to instrument at procedure and section level. I multiplied the work in Rest_a_While by a factor of ten to get larger times. This produced the output:

Timer Set: HProf, Constructed at 05 Mar 2013 10:21:27, written at 10:21:30
==========================================================================
[Timer timed: Elapsed (per call): 0.04 (0.000044), CPU (per call): 0.05 (0.000050), calls: 1000, '***' denotes corrected line below]

Timer                       Elapsed          CPU          Calls        Ela/Call        CPU/Call
----------------------   ----------   ----------   ------------   -------------   -------------
A_Calls_B, section one         0.06         0.05              2         0.03150         0.02500
A_Calls_B, section two         0.12         0.12              2         0.06050         0.06000
B_Calls_A: 2                   0.15         0.16              1         0.15400         0.16000
B_Calls_A: 4                   0.31         0.30              1         0.30700         0.30000
DBFunc                         0.32         0.31              2         0.15950         0.15500
Open cursor                    0.69         0.69              1         0.68900         0.69000
Fetch from cursor              0.70         0.70              1         0.69600         0.70000
Close cursor                   0.00         0.00              1         0.00000         0.00000
Construct object               0.06         0.04              1         0.05500         0.04000
R_Calls_R                      0.14         0.14              2         0.07000         0.07000
(Other)                        0.00         0.00              1         0.00000         0.00000
----------------------   ----------   ----------   ------------   -------------   -------------
Total                          2.54         2.51             15         0.16960         0.16733
----------------------   ----------   ----------   ------------   -------------   -------------


Notes on Code Timing

• Calls, CPU and elapsed times have been captured at the section level for A_Calls_B
• Observe that, while R_Calls_R and A_Calls_B aggregate over all calls, B_Calls_A records values by call; this is implemented simply by including a value that changes with call in the timer name
• The timing set object is designed to be very low footprint; here 9 statements (calls to Increment_Time), plus a small global overhead, produced 10 result lines, plus associated information
• The 'object-oriented' approach allows multiple programs to be timed at multiple levels, without interference between timings
• There are Perl and Java implementations of this timing set object included in the Scribd article mentioned
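
The idea of named timers, including the trick of putting a changing value in the timer name, can be sketched with standard calls only. This is not the timing package referred to above: the helper Add_Time, the record and table types, and the dummy workload are all hypothetical, and DBMS_Utility.Get_Time (hundredths of a second) is used as a simple elapsed-time source.

```sql
DECLARE
  TYPE timer_rec IS RECORD (ela_cs NUMBER := 0, calls PLS_INTEGER := 0);
  TYPE timer_tab IS TABLE OF timer_rec INDEX BY VARCHAR2(100);
  g_timers  timer_tab;
  g_last    NUMBER := DBMS_Utility.Get_Time; -- hundredths of a second
  l_name    VARCHAR2(100);
  l_dummy   NUMBER := 0;

  -- Hypothetical helper: charge the time since the previous call to the named timer
  PROCEDURE Add_Time (p_name VARCHAR2) IS
    l_now NUMBER := DBMS_Utility.Get_Time;
    l_rec timer_rec;
  BEGIN
    IF g_timers.EXISTS (p_name) THEN
      l_rec := g_timers (p_name);
    END IF;
    l_rec.ela_cs := l_rec.ela_cs + (l_now - g_last);
    l_rec.calls  := l_rec.calls + 1;
    g_timers (p_name) := l_rec;
    g_last := DBMS_Utility.Get_Time;
  END Add_Time;

BEGIN
  FOR i IN 1..2 LOOP
    FOR j IN 1..100000 * i LOOP l_dummy := l_dummy + Sqrt (j); END LOOP;
    Add_Time ('Section one');         -- one timer aggregating over all iterations
    FOR j IN 1..200000 * i LOOP l_dummy := l_dummy + Sqrt (j); END LOOP;
    Add_Time ('Section two: ' || i);  -- value in the name gives one timer per iteration
  END LOOP;

  -- Report each timer (requires SERVEROUTPUT to be enabled)
  l_name := g_timers.FIRST;
  WHILE l_name IS NOT NULL LOOP
    DBMS_Output.Put_Line (RPad (l_name, 20) ||
                          To_Char (g_timers (l_name).ela_cs / 100, '9990.00') ||
                          To_Char (g_timers (l_name).calls, '9990'));
    l_name := g_timers.NEXT (l_name);
  END LOOP;
END;
/
```

A production version would normally also record CPU time and allow several independent timer sets, as the timing set object described above does.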

Oracle's Flat Profiler (DBMS_Profiler)
The hierarchical profiler was introduced in v11.1, while prior to this there was a non-hierarchical profiler, DBMS_Profiler. This package still exists in v11: it is omitted from the advanced application developer's guide for v11, but is described in the packages and types manual (Oracle® Database PL/SQL Packages and Types Reference, 11g Release 2 (11.2)); also, SQL*Developer appears to support only the newer hierarchical version (via right-click on a package). I thought it interesting to run the older version on the same test program (package Old_Prof_Test, driver script Test_Rep_p_Old.sql and reporting script Test_Rep_h_Old.sql). The output from the first three queries is:

Run header (PLSQL_PROFILER_RUNS)

     RUNID RUN_DATE        MICRO_S    SECONDS
---------- ------------ ---------- ----------
         3 11:03:13        2164000       2.16

Profiler data summary (PLSQL_PROFILER_DATA)

   MICRO_S SECONDS    CALLS
---------- ------- --------
   2126949    2.13       72

Profiler data by time (PLSQL_PROFILER_DATA)

   MICRO_S SECONDS    CALLS UNIT_NAME            UNIT_NUMBER  LINE#
---------- ------- -------- -------------------- ----------- ------
    729932    0.73        1                                5     13
    569563    0.57        2 OLD_PROF_TEST                  1     56
    377880    0.38        2 OLD_PROF_TEST                  1     82
    166019    0.17        2 OLD_PROF_TEST                  1     70
    150117    0.15        2 OLD_PROF_TEST                  1     43
     72742    0.07        2 OLD_PROF_TEST                  1     40
     56473    0.06        1 TABLE_COUNT_TYPE               6      6
      3338    0.00        1                                5      8
       258    0.00        1                                5     12
       109    0.00        1                                5     16
        68    0.00        2 OLD_PROF_TEST                  1     67
        66    0.00        2                                4      1
        60    0.00        2                                7      1
        60    0.00        2                                3      1
        44    0.00        1                                5     14
        42    0.00        0                                2      5
        31    0.00        1 OLD_PROF_TEST                  1     18
        26    0.00        1                                8      5
        13    0.00        0                                5      1
         9    0.00        1 TABLE_COUNT_TYPE               6     11
         9    0.00        2 OLD_PROF_TEST                  1     86
         8    0.00        0 OLD_PROF_TEST                  1     51
         8    0.00        1 TABLE_COUNT_TYPE               6     13
         7    0.00        1                                5     18
         6    0.00        0 OLD_PROF_TEST                  1     78
         6    0.00        0 OLD_PROF_TEST                  1     64
         6    0.00        0                                8      1
         6    0.00        1 TABLE_COUNT_TYPE               6      3
         5    0.00        0 OLD_PROF_TEST                  1     35
         5    0.00        0 OLD_PROF_TEST                  1     15
         4    0.00        1                                8      7
         4    0.00        1 OLD_PROF_TEST                  1     76
         3    0.00        1                                2      8
         2    0.00        1 OLD_PROF_TEST                  1     62
         2    0.00        1 OLD_PROF_TEST                  1     13
         2    0.00        1 TABLE_COUNT_TYPE               6      5
         2    0.00        2 OLD_PROF_TEST                  1     72
         2    0.00        2 OLD_PROF_TEST                  1     45
         2    0.00        2 OLD_PROF_TEST                  1     49
         2    0.00        2 OLD_PROF_TEST                  1     46
         2    0.00        2 OLD_PROF_TEST                  1     58
         1    0.00        1 OLD_PROF_TEST                  1     73
         1    0.00        1                                2      6
         1    0.00        1 OLD_PROF_TEST                  1     59
         1    0.00        1 OLD_PROF_TEST                  1     11
         1    0.00        2 OLD_PROF_TEST                  1     54
         1    0.00        2 OLD_PROF_TEST                  1     84
         0    0.00        0 OLD_PROF_TEST                  1      1
         0    0.00        0 OLD_PROF_TEST                  1     88
         0    0.00        0                                8      9
         0    0.00        0                                2      1
         0    0.00        0                                2      2
         0    0.00        0 OLD_PROF_TEST                  1      3
         0    0.00        0 OLD_PROF_TEST                  1      5
         0    0.00        0 OLD_PROF_TEST                  1      9
         0    0.00        0 OLD_PROF_TEST                  1     20
         0    0.00        1 TABLE_COUNT_TYPE               6      4
         0    0.00        1                                8      2
         0    0.00        2 OLD_PROF_TEST                  1     39
         0    0.00        2 OLD_PROF_TEST                  1     55
         0    0.00        2 OLD_PROF_TEST                  1     69
         0    0.00        2 OLD_PROF_TEST                  1     38
         0    0.00        2 OLD_PROF_TEST                  1     81
         0    0.00        2 OLD_PROF_TEST                  1     42
         0    0.00        2 OLD_PROF_TEST                  1     68

65 rows selected.



Referring to the package, type and anonymous blocks, I assigned labels to all the lines having more than 10µs, as follows:

   MICRO_S SECONDS    CALLS UNIT_NAME            UNIT_NUMBER  LINE#
---------- ------- -------- -------------------- ----------- ------
    729932    0.73        1                                5     13  B2: FETCH
    569563    0.57        2 OLD_PROF_TEST                  1     56  B_Calls_A (Rest_a_While)
    377880    0.38        2 OLD_PROF_TEST                  1     82  DBFunc (Rest_a_While)
    166019    0.17        2 OLD_PROF_TEST                  1     70  R_Calls_R (Rest_a_While)
    150117    0.15        2 OLD_PROF_TEST                  1     43  A_Calls_B (Rest_a_While, section 2)
     72742    0.07        2 OLD_PROF_TEST                  1     40  A_Calls_B (Rest_a_While, section 1)
     56473    0.06        1 TABLE_COUNT_TYPE               6      6  SELECT
      3338    0.00        1                                5      8  B2: SELECT DBFunc
       258    0.00        1                                5     12  B2: OPEN
       109    0.00        1                                5     16  B2: Assign Table_Count_Type
        68    0.00        2 OLD_PROF_TEST                  1     67  Put_Line
        66    0.00        2                                4      1  Auxiliary SERVEROUTPUT block for B2 (surmised)
        60    0.00        2                                7      1  Auxiliary SERVEROUTPUT block for B3 (surmised)
        60    0.00        2                                3      1  Auxiliary SERVEROUTPUT block for B1 (surmised)
        44    0.00        1                                5     14  B2: CLOSE
        42    0.00        0                                2      5  B1: Call to Start_Profiling
        31    0.00        1 OLD_PROF_TEST                  1     18  RETURN DBMS_Profiler.Stop_Profiler;
        26    0.00        1                                8      5  B3: Call R_Calls_R
        13    0.00        0                                5      1  B2: DECLARE


Notes on Output of Flat Profiler
There were six units with no linked information in PLSQL_PROFILER_UNITS. By examining the data, I was able to associate unit numbers 2, 5 and 8 with my anonymous blocks B1, B2 and B3. That left three unassigned (unit numbers 3, 4 and 7), and I have surmised that these correspond to the auxiliary blocks associated with processing server output that we earlier surmised when examining the output from the hierarchical profiler.

• The useful call tree structure is not present in the data from the old profiler
• However, the results are at a line level, which the hierarchical profiler does not provide; for example, the two sections of A_Calls_B are reported separately
• Deciphering the output requires significantly more manual effort than with the hierarchical profiler
• Both old and new profiler have their own advantages, and so both should be considered of value
• Manual code timing offers more flexibility in terms of aggregating lines and call instances, but requires more effort
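
For readers who have not used the flat profiler, a minimal sketch of collecting and querying line-level data follows. It assumes the profiler tables have already been created in the current schema (normally via Oracle's proftab.sql script), and the run comment and the placeholder for the profiled code are illustrative; it is not the reporting script used for the output above.

```sql
DECLARE
  l_status BINARY_INTEGER;
BEGIN
  l_status := DBMS_Profiler.Start_Profiler (run_comment => 'Flat profile of test program');
  NULL; -- placeholder: call the PL/SQL to be profiled here
  l_status := DBMS_Profiler.Stop_Profiler;
END;
/
-- Line-level times for the latest run; total_time is recorded in nanoseconds
SELECT d.unit_number, u.unit_name, d.line#,
       d.total_occur calls,
       Round (d.total_time / 1000) micro_s
  FROM plsql_profiler_data d
  JOIN plsql_profiler_units u
    ON u.runid = d.runid
   AND u.unit_number = d.unit_number
 WHERE d.runid = (SELECT Max (runid) FROM plsql_profiler_runs)
 ORDER BY d.total_time DESC
```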

Conclusions

• Running Oracle's hierarchical profiler would seem to be the default first step in tuning PL/SQL programs from v11.1
• Some care is needed in interpreting the output data; I've provided a query for displaying the hierarchies
• Performance is recorded only down to function level, so it will still often be worthwhile to use the old flat profiler in addition
• Manually timing code sections also still has a part to play, in terms of instrumentation and greater flexibility where necessary

• Database function called from SQL in SQL*Plus (DBFUNC)
• Database function called from SQL in PL/SQL (DBFUNC)
• Object constructor call (TABLE_COUNT_TYPE)

Call Structure Diagram

Raw Results
The attached script Test_Rep_h.sql was used to report on the results. The record produced in the run table, DBMSHP_RUNS, was:

     RUNID RUN_TIMESTAMP                   MICRO_S    SECONDS RUN_COMMENT
---------- ---------------------------- ---------- ---------- ------------------------------------------------------------
        11 04-MAR-13 07.07.36.803000        890719        .89 Profile for small test program with recursion
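
For orientation, a run record of this kind is created when a raw trace file is analysed. The sketch below shows the general sequence using the DBMS_HProf package; the directory object PROF_DIR and the trace file name are assumptions, and the test program in this article wraps the stop call in its own procedure rather than calling it directly as here.

```sql
DECLARE
  l_runid NUMBER;
BEGIN
  -- PROF_DIR is a hypothetical directory object on which the user has write access
  DBMS_HProf.Start_Profiling (location => 'PROF_DIR', filename => 'hprof_test.trc');
  NULL; -- placeholder: call the PL/SQL to be profiled here
  DBMS_HProf.Stop_Profiling;

  -- Load the raw trace into the DBMSHP_% tables and return the new run id
  l_runid := DBMS_HProf.Analyze (location    => 'PROF_DIR',
                                 filename    => 'hprof_test.trc',
                                 run_comment => 'Profile for small test program with recursion');
  DBMS_Output.Put_Line ('Run id = ' || l_runid);
END;
/
SELECT runid, run_timestamp, total_elapsed_time micro_s, run_comment
  FROM dbmshp_runs
 ORDER BY runid
```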

The records produced in the functions table, DBMSHP_FUNCTION_INFO, were:

OWNER MODULE               FUNCTION                         ID  LINE#      SUB_T      FUN_T  CALLS
----- -------------------- ------------------------------ ---- ------ ---------- ---------- ------
NET   HPROF_TEST           A_CALLS_B                         4     40      62340       4450      1
NET   HPROF_TEST           A_CALLS_B@1                       5     40      43729      13663      1
NET   HPROF_TEST           B_CALLS_A                         6     38      57890      14161      1
NET   HPROF_TEST           B_CALLS_A@1                       7     38      30066      30066      1
NET   HPROF_TEST           DBFUNC                            8     84      32629      32629      2
NET   HPROF_TEST           R_CALLS_R                         9     70      12823       4159      1
NET   HPROF_TEST           R_CALLS_R@1                      10     70       8633       8618      1
NET   HPROF_TEST           STOP_PROFILING                   11     16         21         21      1
NET   TABLE_COUNT_TYPE     TABLE_COUNT_TYPE                 12      3      55049         82      1
NET   TABLE_COUNT_TYPE     __static_sql_exec_line6          22      6      54967      54967      1
SYS   DBMS_HPROF           STOP_PROFILING                   13     59          0          0      1
SYS   DBMS_OUTPUT          GET_LINE                         14    129          8          8      3
SYS   DBMS_OUTPUT          GET_LINES                        15    160         68         60      3
SYS   DBMS_OUTPUT          NEW_LINE                         16    117          7          7      2
SYS   DBMS_OUTPUT          PUT                              17     77         28         28      2
SYS   DBMS_OUTPUT          PUT_LINE                         18    109         46         11      2
                           __anonymous_block                 1      0     809839        521      5
                           __dyn_sql_exec_line12            19     12        226        226      1
                           __plsql_vm                        2      0     828379         58      6
                           __plsql_vm@1                      3      0      14158         11      1
                           __sql_fetch_line13               20     13     726713     726713      1
                           __static_sql_exec_line8          21      8      14418        260      1

22 rows selected.

The SUB_T and FUN_T values are the total times in microseconds for the subtree including function, and function-only processing.

The records produced in the functions parent-child table, DBMSHP_PARENT_CHILD_INFO, were:

OWNER_P MODULE_P             FUNCTION_P                     OWNER_C MODULE_C             FUNCTION_C                          SUB_T      FUN_T  CALLS
------- -------------------- ------------------------------ ------- -------------------- ------------------------------ ---------- ---------- ------
NET     HPROF_TEST           STOP_PROFILING                 SYS     DBMS_HPROF           STOP_PROFILING                          0          0      1
NET     HPROF_TEST           R_CALLS_R@1                    SYS     DBMS_OUTPUT          PUT_LINE                               15          6      1
NET     HPROF_TEST           R_CALLS_R                      SYS     DBMS_OUTPUT          PUT_LINE                               31          5      1
NET     HPROF_TEST           R_CALLS_R                      NET     HPROF_TEST           R_CALLS_R@1                          8633       8618      1
NET     HPROF_TEST           B_CALLS_A                      NET     HPROF_TEST           A_CALLS_B@1                         43729      13663      1
NET     HPROF_TEST           A_CALLS_B@1                    NET     HPROF_TEST           B_CALLS_A@1                         30066      30066      1
NET     HPROF_TEST           A_CALLS_B                      NET     HPROF_TEST           B_CALLS_A                           57890      14161      1
NET     TABLE_COUNT_TYPE     TABLE_COUNT_TYPE               NET     TABLE_COUNT_TYPE     __static_sql_exec_line6             54967      54967      1
SYS     DBMS_OUTPUT          PUT_LINE                       SYS     DBMS_OUTPUT          NEW_LINE                                7          7      2
SYS     DBMS_OUTPUT          PUT_LINE                       SYS     DBMS_OUTPUT          PUT                                    28         28      2
SYS     DBMS_OUTPUT          GET_LINES                      SYS     DBMS_OUTPUT          GET_LINE                                8          8      3
                             __anonymous_block              NET     HPROF_TEST           STOP_PROFILING                         21         21      1
                             __anonymous_block              SYS     DBMS_OUTPUT          GET_LINES                              68         60      3
                             __anonymous_block                                           __dyn_sql_exec_line12                 226        226      1
                             __anonymous_block              NET     HPROF_TEST           R_CALLS_R                           12823       4159      1
                             __anonymous_block                                           __static_sql_exec_line8             14418        260      1
                             __plsql_vm                     NET     HPROF_TEST           DBFUNC                              18482      18482      1
                             __anonymous_block                                           __sql_fetch_line13                 726713     726713      1
                             __static_sql_exec_line8                                     __plsql_vm@1                        14158         11      1
                             __plsql_vm                                                  __anonymous_block                  809839        521      5
                             __plsql_vm@1                   NET     HPROF_TEST           DBFUNC                              14147      14147      1
                             __anonymous_block              NET     TABLE_COUNT_TYPE     TABLE_COUNT_TYPE                    55049         82      1

22 rows selected.

The SUB_T and FUN_T values are the total times in microseconds for the subtree including function, and function-only processing, respectively, for the child function while called from all instances of the parent.
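
The relationship between the two tables can be sanity-checked with a query along the following lines. This is a sketch rather than part of the attached scripts: for each function it compares the subtree time with the function-only time plus the subtree times of its child links, so the diff column should be zero wherever the figures reconcile. In the data above, for example, A_CALLS_B has 62,340 = 4,450 + 57,890.

```sql
SELECT fi.symbolid,
       fi.function,
       fi.subtree_elapsed_time                   sub_t,
       fi.function_elapsed_time                  fun_t,
       Nvl (Sum (pci.subtree_elapsed_time), 0)   child_sub_t,
       fi.subtree_elapsed_time
         - fi.function_elapsed_time
         - Nvl (Sum (pci.subtree_elapsed_time), 0) diff
  FROM dbmshp_function_info fi
  LEFT JOIN dbmshp_parent_child_info pci
    ON pci.runid = fi.runid
   AND pci.parentsymid = fi.symbolid
 WHERE fi.runid = :RUN_ID
 GROUP BY fi.symbolid, fi.function, fi.subtree_elapsed_time, fi.function_elapsed_time
 ORDER BY fi.symbolid
```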

Function Call Tree
The raw data above can be used to identify processing bottlenecks at a function level, but it's also useful to process the data in order to display the function hierarchies, both for performance tuning and also for understanding the program structure. This is not quite as trivial as it may seem. The oracle-base article provides an SQL statement that attempts to do this:

SELECT RPAD(' ', level*2, ' ') || fi.owner || '.' || fi.module AS name,
fi.function,
pci.subtree_elapsed_time,
pci.function_elapsed_time,
pci.calls
FROM   dbmshp_parent_child_info pci
JOIN dbmshp_function_info fi ON pci.runid = fi.runid AND pci.childsymid = fi.symbolid
WHERE  pci.runid = :RUN_ID
CONNECT BY PRIOR childsymid = parentsymid
START WITH pci.parentsymid = :START_ID

Here, bind variables replace the original hard-coded values. On running this query I often got the following result:

ERROR at line 1:
ORA-01436: CONNECT BY loop in user data

On the run used in this article, the query returned 157 records, which is obviously incorrect. There is of course a NOCYCLE keyword that can be used to return results in the case of loops. However, it is not worth adding in this case, because there are in fact no loops in the data (at least no cyclic loops - apparent loops are discussed later). Oracle avoids loops by treating a function call that is a descendant of itself as a call to a new function, identified by suffixes @1, @2 etc., as we can see from the recursive procedures above (e.g. R_CALLS_R@1 is the second call of R_CALLS_R, this one from itself). The problem here is that the query is incorrect in its handling of runid, with the result that the tree-walk traverses records from other runs as well as the intended one. A further problem is that there may be several roots, and it would be best to calculate these within a subquery. We can correct these problems by the following query:

SELECT RPAD(' ', level*2, ' ') || fi.owner || '.' || fi.module AS name,
fi.symbolid || ': ' || fi.function function,
pci.subtree_elapsed_time sub_t,
pci.function_elapsed_time fun_t,
pci.calls
FROM dbmshp_parent_child_info	pci
JOIN dbmshp_function_info		fi
ON pci.runid	              = fi.runid
AND pci.childsymid	       = fi.symbolid
WHERE pci.runid                   = :RUN_ID
CONNECT BY PRIOR pci.childsymid    = pci.parentsymid
AND pci.runid	              = :RUN_ID
START WITH pci.parentsymid         IN (SELECT f.symbolid FROM dbmshp_function_info f WHERE NOT EXISTS
(SELECT 1 FROM dbmshp_parent_child_info i WHERE i.childsymid = f.symbolid AND i.runid = :RUN_ID) AND f.runid = :RUN_ID)
AND pci.runid	              = :RUN_ID

This query returns the results:

NAME                           FUNCTION                            SUB_T      FUN_T  CALLS
------------------------------ ------------------------------ ---------- ---------- ------
  .                            1: __anonymous_block              809,839        521      5
    NET.HPROF_TEST             9: R_CALLS_R                       12,823      4,159      1
      NET.HPROF_TEST           10: R_CALLS_R@1                     8,633      8,618      1
        SYS.DBMS_OUTPUT        18: PUT_LINE                           15          6      1
          SYS.DBMS_OUTPUT      16: NEW_LINE                            7          7      2
          SYS.DBMS_OUTPUT      17: PUT                                28         28      2
      SYS.DBMS_OUTPUT          18: PUT_LINE                           31          5      1
        SYS.DBMS_OUTPUT        16: NEW_LINE                            7          7      2
        SYS.DBMS_OUTPUT        17: PUT                                28         28      2
    NET.HPROF_TEST             11: STOP_PROFILING                     21         21      1
      SYS.DBMS_HPROF           13: STOP_PROFILING                      0          0      1
    NET.TABLE_COUNT_TYPE       12: TABLE_COUNT_TYPE               55,049         82      1
      NET.TABLE_COUNT_TYPE     22: __static_sql_exec_line6        54,967     54,967      1
    SYS.DBMS_OUTPUT            15: GET_LINES                          68         60      3
      SYS.DBMS_OUTPUT          14: GET_LINE                            8          8      3
    .                          19: __dyn_sql_exec_line12             226        226      1
    .                          20: __sql_fetch_line13            726,713    726,713      1
    .                          21: __static_sql_exec_line8        14,418        260      1
      .                        3: __plsql_vm@1                    14,158         11      1
        NET.HPROF_TEST         8: DBFUNC                          14,147     14,147      1
  NET.HPROF_TEST               8: DBFUNC                          18,482     18,482      1
  NET.HPROF_TEST               6: B_CALLS_A                       57,890     14,161      1
    NET.HPROF_TEST             5: A_CALLS_B@1                     43,729     13,663      1
      NET.HPROF_TEST           7: B_CALLS_A@1                     30,066     30,066      1

24 rows selected.

This is better, but we can identify some further issues.

Missing Roots
The true root results are missing: For example, A_CALLS_B is missing. This arises because the query is traversing the link records (DBMSHP_PARENT_CHILD_INFO), while the root information is stored in the nodes (DBMSHP_FUNCTION_INFO). This suggests a change from the CONNECT BY syntax to Oracle's v11.2 recursive subquery factoring syntax, which allows you easily to start from the nodes, then traverse recursively via the links. (Incidentally, moving the start of profiling to its own block would result in A_CALLS_B appearing under __anonymous_block, but I prefer to retain the current structure in order to deal with the general case in which multiple roots are possible.)

Notice that function PUT_LINE is reported separately under R_CALLS_R and R_CALLS_R@1, and the timings differ. Also, its own child calls appear under each of its instances, but in those cases the timings are identical. The reason for this is that in the first case, there are separate records of the times used in each call, whereas in the second, the child calls have only a single record giving the total times across both instances of the parent call. The call from R_CALLS_R shows (31 - 5 = ) 26µs used in child calls, while the call from R_CALLS_R@1 shows (15 - 6 = ) 9µs. The child calls show totals of (7 + 28 = ) 35µs, equalling the sum across the parent calls.

At this point it is worth looking at this from the more general perspective of a hierarchical data structure where parents can have multiple children and children multiple parents, with one or more roots. If a network diagram were constructed there would be loops apparent indicating multiple routes between nodes. In these situations, Oracle's hierarchical queries effectively traverse all routes, and this is what causes the link duplication (in other scenarios this behaviour can cause big performance problems, but probably not here). Oracle's cycle detection mechanism does not trigger because the loops do not result in any node being a descendant of itself (as noted above, extra nodes are generated by the profiler to avoid this).
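
The behaviour is easy to reproduce on a toy structure. In the sketch below (made-up node names, held in a subquery factor called links), node D has two parents, B and C, and one child, E. There are five links, but the hierarchical query returns six rows because link D-E is reached by both routes, and no ORA-01436 error is raised since no node is its own descendant.

```sql
WITH links (parent_id, child_id) AS (
  SELECT 'A', 'B' FROM DUAL UNION ALL
  SELECT 'A', 'C' FROM DUAL UNION ALL
  SELECT 'B', 'D' FROM DUAL UNION ALL
  SELECT 'C', 'D' FROM DUAL UNION ALL
  SELECT 'D', 'E' FROM DUAL
)
SELECT LPad (' ', 2 * (LEVEL - 1)) || parent_id || '-' || child_id link,
       LEVEL lev
  FROM links
 START WITH parent_id = 'A'
CONNECT BY PRIOR child_id = parent_id
 ORDER SIBLINGS BY child_id
```

With more levels below a shared node, every descendant link is repeated once per route, which is where the performance concern for larger hierarchies comes from.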

It seems to me better to avoid this duplication, and also to signal those cases where times are not aggregated up the tree. We can achieve this by the use of analytic functions. Note that, although the query below refers to the specific tables and attributes for this problem, the proposed solution could be used for any member of this general class of problem. The new query, which orders sibling records by descending subtree elapsed time, is:

WITH last_run AS (
SELECT Max (runid) runid FROM dbmshp_runs
), full_tree (runid, lev, node_id, sub_t, fun_t, calls, link_id) AS (
SELECT fni.runid, 0, fni.symbolid, fni.subtree_elapsed_time, fni.function_elapsed_time, fni.calls, 'root' || ROWNUM
FROM dbmshp_function_info fni
JOIN last_run lrn
ON lrn.runid = fni.runid
WHERE NOT EXISTS (SELECT 1 FROM dbmshp_parent_child_info pci WHERE pci.childsymid = fni.symbolid AND pci.runid = fni.runid)
UNION ALL
SELECT ftr.runid,
ftr.lev + 1,
pci.childsymid,
pci.subtree_elapsed_time,
pci.function_elapsed_time,
pci.calls,
pci.parentsymid || '-' || pci.childsymid
FROM full_tree ftr
JOIN dbmshp_parent_child_info pci
ON pci.parentsymid = ftr.node_id
AND pci.runid = ftr.runid
) SEARCH DEPTH FIRST BY sub_t DESC, fun_t DESC, calls DESC, node_id SET rn
, tree_ranked AS (
SELECT runid, node_id, lev, rn,
sub_t, fun_t, calls,
Row_Number () OVER (PARTITION BY node_id ORDER BY rn) node_rn,
Count (*) OVER (PARTITION BY node_id) node_cnt,
Row_Number () OVER (PARTITION BY link_id ORDER BY rn) link_rn
FROM full_tree
)
SELECT RPad (' ', trr.lev*2, ' ') || fni.function "Function tree",
fni.symbolid sy, fni.owner, fni.module,
CASE WHEN trr.node_cnt > 1 THEN trr.node_rn || ' of ' || trr.node_cnt END "Inst.",
trr.sub_t, trr.fun_t, trr.calls,
trr.rn "Row"
FROM tree_ranked trr
JOIN dbmshp_function_info fni
ON fni.symbolid = trr.node_id
AND fni.runid = trr.runid
WHERE trr.link_rn = 1 -- keep only the first occurrence of each parent-child link
ORDER BY trr.rn

Query Structure Diagram

The results are then:

Function tree                        SY OWNER MODULE               Inst.         SUB_T      FUN_T  CALLS  Row
----------------------------------- --- ----- -------------------- -------- ---------- ---------- ------ ----
__plsql_vm                            2                                        828,379         58      6    1
  __anonymous_block                   1                                        809,839        521      5    2
    __sql_fetch_line13               20                                        726,713    726,713      1    3
    TABLE_COUNT_TYPE                 12 NET   TABLE_COUNT_TYPE                  55,049         82      1    4
      __static_sql_exec_line6        22 NET   TABLE_COUNT_TYPE                  54,967     54,967      1    5
    __static_sql_exec_line8          21                                         14,418        260      1    6
      __plsql_vm@1                    3                                         14,158         11      1    7
        DBFUNC                        8 NET   HPROF_TEST           1 of 2       14,147     14,147      1    8
    R_CALLS_R                         9 NET   HPROF_TEST                        12,823      4,159      1    9
      R_CALLS_R@1                    10 NET   HPROF_TEST                         8,633      8,618      1   10
        PUT_LINE                     18 SYS   DBMS_OUTPUT          1 of 2           15          6      1   11
          PUT                        17 SYS   DBMS_OUTPUT          1 of 2           28         28      2   12
          NEW_LINE                   16 SYS   DBMS_OUTPUT          1 of 2            7          7      2   13
      PUT_LINE                       18 SYS   DBMS_OUTPUT          2 of 2           31          5      1   14
    __dyn_sql_exec_line12            19                                            226        226      1   17
    GET_LINES                        15 SYS   DBMS_OUTPUT                           68         60      3   18
      GET_LINE                       14 SYS   DBMS_OUTPUT                            8          8      3   19
    STOP_PROFILING                   11 NET   HPROF_TEST                            21         21      1   20
      STOP_PROFILING                 13 SYS   DBMS_HPROF                             0          0      1   21
  DBFUNC                              8 NET   HPROF_TEST           2 of 2       18,482     18,482      1   22
A_CALLS_B                             4 NET   HPROF_TEST                        62,340      4,450      1   23
  B_CALLS_A                           6 NET   HPROF_TEST                        57,890     14,161      1   24
    A_CALLS_B@1                       5 NET   HPROF_TEST                        43,729     13,663      1   25
      B_CALLS_A@1                     7 NET   HPROF_TEST                        30,066     30,066      1   26

24 rows selected.

Notice that we now have a single record for each of the 22 links, plus the two root nodes. Also, the "Inst." column shows the instance number of any function having more than one instance, and the children of such a function are listed only once, with the gaps in the "Row" column indicating where duplicates have been suppressed.

Network Diagrams
It may be interesting to display the call tree in two diagrams, one for each root.
Root __plsql_vm

Root A_CALLS_B

Notes on Tree Output
Anonymous Block (__anonymous_block)
This function seems to correspond to invocations of anonymous blocks, obviously enough. However, there is an apparent anomaly in the number of calls listed, 6, because the driving program has only three such blocks, and there are none in the called PL/SQL code. I would surmise that the apparent discrepancy arises from the enabling of SERVEROUTPUT, which appears to result in a secondary block being associated with each explicit SQL*Plus block, that issues a call to GET_LINES to process buffered output.

PL/SQL Engine (__plsql_vm)
This function seems to correspond to external invocations of PL/SQL such as from a SQL*Plus session. There are 7 calls, 6 of them presumably being linked with the external anonymous blocks, and the seventh with DBFUNC, where a PL/SQL function is called from a SQL statement from SQL*Plus.

Notice that the SQL statement calling a database function from within PL/SQL generates the recursive call to the engine, __plsql_vm@1.

Second Root (A_CALLS_B)
The above function does not have the __plsql_vm/__anonymous_block ancestry that might be expected because profiling only started within the enclosing block.

Inlined Procedure (Rest_a_While)
I wrote a small procedure, Rest_a_While, to generate some elapsed time in the recursive procedures, but preceded it with the INLINE pragma, a new optimisation feature in 11g. This had the desired effect of removing the calls from the profiling output and including the times in the calling procedures. Rest_a_While does not make the obvious call to DBMS_Lock.Sleep because that procedure cannot be inlined. The article Subprogram Inlining in 11g provides some analysis of the inlining feature.

Sibling Ordering
We have ordered siblings by descending subtree elapsed time, using the SEARCH clause. It would be nice to have the option to order the siblings by initial invocation time, but Oracle does not provide the data to do this.

Loops and Hierarchies
The first diagram shows two loops, where there are two routes between the loop start and end points, indicated by different colours. The second loop has two child nodes coming from the end point, and hierarchical queries (both CONNECT BY and recursive subquery factors in Oracle) cause the links to be duplicated. Our query has filtered out the duplicates by analytic functions.

It's worth remembering this because it's a general feature of SQL for querying hierarchies, and judging by Oracle forums, not one that's widely understood. For larger hierarchies it can cause serious performance problems, and may justify a PL/SQL programmed solution that need not suffer the same problem.
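The duplication effect can be reproduced outside the database. The following is a minimal Python sketch, not the author's code: the call graph is hypothetical, and the `walk` function and its `expand_once` flag are names invented here for illustration. It walks a small graph in which node D is reachable by two routes, first as a plain tree walk (as a hierarchical query would) and then listing a node's children only under its first instance.

```python
# Hypothetical call graph: D is reachable from A by two routes (via B and via C).
links = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"), ("D", "F")]
children = {}
for parent, child in links:
    children.setdefault(parent, []).append(child)

def walk(node, expand_once, seen=None, depth=1):
    """Depth-first walk returning (depth, parent, child) rows.

    With expand_once=True a node's children are listed only under its first instance.
    """
    if seen is None:
        seen = set()
    rows = []
    for child in children.get(node, []):
        rows.append((depth, node, child))
        first_instance = child not in seen
        seen.add(child)
        if first_instance or not expand_once:
            rows.extend(walk(child, expand_once, seen, depth + 1))
    return rows

print(len(walk("A", expand_once=False)))  # 8 rows: the links D-E and D-F appear twice
print(len(walk("A", expand_once=True)))   # 6 rows: one per link
```

With only one loop the overhead is two extra rows, but each further loop upstream of a subtree multiplies the number of copies, which is why larger hierarchies can become expensive.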

Manual Instrumentation
Oracle's hierarchical profiler clearly provides extremely useful information on both performance and structure of PL/SQL programs with very little effort. However, it does have the limitation of only providing information down to the subprogram level (which includes embedded SQL statements in this context). It is also often considered good practice to implement timing and other instrumentation permanently in production code, sometimes in a switchable fashion. In the test program, one of the called procedures, A_Calls_B, makes two calls to the inlined procedure, Rest_a_While, the second doing about twice as much work as the first. The profiler reports total within-function times of 4,450µs and 13,663µs on first and second calls, respectively (the work is scaled by a call number parameter, equal to 1, then 3).

I created a second instance of the package and driver script (suffix _TS) to illustrate manual instrumentation. This uses an 'object-oriented' timing package that I wrote a couple of years ago, Code Timing and Object Orientation and Zombies (November, 2010), to instrument at procedure and section level. I multiplied the work in Rest_a_While by a factor of ten to get larger times. This produced the output:

Timer Set: HProf, Constructed at 05 Mar 2013 10:21:27, written at 10:21:30
==========================================================================
[Timer timed: Elapsed (per call): 0.04 (0.000044), CPU (per call): 0.05 (0.000050), calls: 1000, '***' denotes corrected line below]

Timer                       Elapsed          CPU          Calls        Ela/Call        CPU/Call
----------------------   ----------   ----------   ------------   -------------   -------------
A_Calls_B, section one         0.06         0.05              2         0.03150         0.02500
A_Calls_B, section two         0.12         0.12              2         0.06050         0.06000
B_Calls_A: 2                   0.15         0.16              1         0.15400         0.16000
B_Calls_A: 4                   0.31         0.30              1         0.30700         0.30000
DBFunc                         0.32         0.31              2         0.15950         0.15500
Open cursor                    0.69         0.69              1         0.68900         0.69000
Fetch from cursor              0.70         0.70              1         0.69600         0.70000
Close cursor                   0.00         0.00              1         0.00000         0.00000
Construct object               0.06         0.04              1         0.05500         0.04000
R_Calls_R                      0.14         0.14              2         0.07000         0.07000
(Other)                        0.00         0.00              1         0.00000         0.00000
----------------------   ----------   ----------   ------------   -------------   -------------
Total                          2.54         2.51             15         0.16960         0.16733
----------------------   ----------   ----------   ------------   -------------   -------------


Notes on Code Timing

• Calls, CPU and elapsed times have been captured at the section level for A_Calls_B
• Observe that, while R_Calls_R and A_Calls_B aggregate over all calls, B_Calls_A records values by call; this is implemented simply by including a value that changes with call in the timer name
• The timing set object is designed to be very low footprint; here 9 statements (calls to Increment_Time), plus a small global overhead, produced 10 result lines, plus associated information
• The 'object-oriented' approach allows multiple programs to be timed at multiple levels, without interference between timings
• There are Perl and Java implementations of this timing set object included in the Scribd article mentioned
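To make the notes above concrete, here is a minimal Python sketch of a timer set. It is not the author's package or its Perl or Java ports: the class name `TimerSet`, the injectable `clock` parameter and the rule that each call charges the time since the previous call are all assumptions made for illustration; only the method name echoes Increment_Time from the text. A fake clock is used so that the figures are reproducible.

```python
import time

class TimerSet:
    """Accumulates elapsed time and call counts under named timers (illustrative)."""

    def __init__(self, name, clock=time.perf_counter):
        self.name = name
        self._clock = clock
        self._timers = {}            # timer name -> [elapsed, calls]
        self._last = clock()         # start of the section currently being timed

    def increment_time(self, timer_name):
        """Charge the time since the previous call to timer_name."""
        now = self._clock()
        entry = self._timers.setdefault(timer_name, [0.0, 0])
        entry[0] += now - self._last
        entry[1] += 1
        self._last = now

    def results(self):
        return {name: tuple(entry) for name, entry in self._timers.items()}

# Deterministic fake clock, assumed here purely so the output is repeatable.
ticks = iter([0.0, 1.0, 3.0, 6.0, 10.0])
timer_set = TimerSet("HProf sketch", clock=lambda: next(ticks))

timer_set.increment_time("A_Calls_B, section one")       # aggregated over calls
for call_no in (2, 4):
    timer_set.increment_time(f"B_Calls_A: {call_no}")     # one timer per call
timer_set.increment_time("A_Calls_B, section one")

print(timer_set.results())
```

The fixed timer name aggregates over its two calls, while embedding the call number in the name yields a separate line per call, which is the mechanism described for B_Calls_A.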

Oracle's Flat Profiler (DBMS_Profiler)
The hierarchical profiler was introduced in v11.1; prior to this there was a non-hierarchical profiler, DBMS_Profiler. This package still exists in v11: it is omitted from the advanced application developer's guide for v11, but is described in the packages and types manual (Oracle® Database PL/SQL Packages and Types Reference, 11g Release 2 (11.2)); also, SQL Developer appears to support only the newer hierarchical version (via right-click on a package). I thought it interesting to run the older version on the same test program (package Old_Prof_Test, driver script Test_Rep_p_Old.sql and reporting script Test_Rep_h_Old.sql). The output from the first three queries is:

Run header (PLSQL_PROFILER_RUNS)

     RUNID RUN_DATE        MICRO_S    SECONDS
---------- ------------ ---------- ----------
         3 11:03:13        2164000       2.16

Profiler data summary (PLSQL_PROFILER_DATA)

   MICRO_S SECONDS    CALLS
---------- ------- --------
   2126949    2.13       72

Profiler data by time (PLSQL_PROFILER_DATA)

   MICRO_S SECONDS    CALLS UNIT_NAME            UNIT_NUMBER  LINE#
---------- ------- -------- -------------------- ----------- ------
    729932    0.73        1                                5     13
    569563    0.57        2 OLD_PROF_TEST                  1     56
    377880    0.38        2 OLD_PROF_TEST                  1     82
    166019    0.17        2 OLD_PROF_TEST                  1     70
    150117    0.15        2 OLD_PROF_TEST                  1     43
     72742    0.07        2 OLD_PROF_TEST                  1     40
     56473    0.06        1 TABLE_COUNT_TYPE               6      6
      3338    0.00        1                                5      8
       258    0.00        1                                5     12
       109    0.00        1                                5     16
        68    0.00        2 OLD_PROF_TEST                  1     67
        66    0.00        2                                4      1
        60    0.00        2                                7      1
        60    0.00        2                                3      1
        44    0.00        1                                5     14
        42    0.00        0                                2      5
        31    0.00        1 OLD_PROF_TEST                  1     18
        26    0.00        1                                8      5
        13    0.00        0                                5      1
         9    0.00        1 TABLE_COUNT_TYPE               6     11
         9    0.00        2 OLD_PROF_TEST                  1     86
         8    0.00        0 OLD_PROF_TEST                  1     51
         8    0.00        1 TABLE_COUNT_TYPE               6     13
         7    0.00        1                                5     18
         6    0.00        0 OLD_PROF_TEST                  1     78
         6    0.00        0 OLD_PROF_TEST                  1     64
         6    0.00        0                                8      1
         6    0.00        1 TABLE_COUNT_TYPE               6      3
         5    0.00        0 OLD_PROF_TEST                  1     35
         5    0.00        0 OLD_PROF_TEST                  1     15
         4    0.00        1                                8      7
         4    0.00        1 OLD_PROF_TEST                  1     76
         3    0.00        1                                2      8
         2    0.00        1 OLD_PROF_TEST                  1     62
         2    0.00        1 OLD_PROF_TEST                  1     13
         2    0.00        1 TABLE_COUNT_TYPE               6      5
         2    0.00        2 OLD_PROF_TEST                  1     72
         2    0.00        2 OLD_PROF_TEST                  1     45
         2    0.00        2 OLD_PROF_TEST                  1     49
         2    0.00        2 OLD_PROF_TEST                  1     46
         2    0.00        2 OLD_PROF_TEST                  1     58
         1    0.00        1 OLD_PROF_TEST                  1     73
         1    0.00        1                                2      6
         1    0.00        1 OLD_PROF_TEST                  1     59
         1    0.00        1 OLD_PROF_TEST                  1     11
         1    0.00        2 OLD_PROF_TEST                  1     54
         1    0.00        2 OLD_PROF_TEST                  1     84
         0    0.00        0 OLD_PROF_TEST                  1      1
         0    0.00        0 OLD_PROF_TEST                  1     88
         0    0.00        0                                8      9
         0    0.00        0                                2      1
         0    0.00        0                                2      2
         0    0.00        0 OLD_PROF_TEST                  1      3
         0    0.00        0 OLD_PROF_TEST                  1      5
         0    0.00        0 OLD_PROF_TEST                  1      9
         0    0.00        0 OLD_PROF_TEST                  1     20
         0    0.00        1 TABLE_COUNT_TYPE               6      4
         0    0.00        1                                8      2
         0    0.00        2 OLD_PROF_TEST                  1     39
         0    0.00        2 OLD_PROF_TEST                  1     55
         0    0.00        2 OLD_PROF_TEST                  1     69
         0    0.00        2 OLD_PROF_TEST                  1     38
         0    0.00        2 OLD_PROF_TEST                  1     81
         0    0.00        2 OLD_PROF_TEST                  1     42
         0    0.00        2 OLD_PROF_TEST                  1     68

65 rows selected.



Referring to the package, type and anonymous blocks, I assigned labels to all the lines having more than 10µs, as follows:

   MICRO_S SECONDS    CALLS UNIT_NAME            UNIT_NUMBER  LINE#
---------- ------- -------- -------------------- ----------- ------
    729932    0.73        1                                5     13  B2: FETCH
    569563    0.57        2 OLD_PROF_TEST                  1     56  B_Calls_A (Rest_a_While)
    377880    0.38        2 OLD_PROF_TEST                  1     82  DBFunc (Rest_a_While)
    166019    0.17        2 OLD_PROF_TEST                  1     70  R_Calls_R (Rest_a_While)
    150117    0.15        2 OLD_PROF_TEST                  1     43  A_Calls_B (Rest_a_While, section 2)
     72742    0.07        2 OLD_PROF_TEST                  1     40  A_Calls_B (Rest_a_While, section 1)
     56473    0.06        1 TABLE_COUNT_TYPE               6      6  SELECT
      3338    0.00        1                                5      8  B2: SELECT DBFunc
       258    0.00        1                                5     12  B2: OPEN
       109    0.00        1                                5     16  B2: Assign Table_Count_Type
        68    0.00        2 OLD_PROF_TEST                  1     67  Put_Line
        66    0.00        2                                4      1  Auxiliary SERVEROUTPUT block for B2 (surmised)
        60    0.00        2                                7      1  Auxiliary SERVEROUTPUT block for B3 (surmised)
        60    0.00        2                                3      1  Auxiliary SERVEROUTPUT block for B1 (surmised)
        44    0.00        1                                5     14  B2: CLOSE
        42    0.00        0                                2      5  B1: Call to Start_Profiling
        31    0.00        1 OLD_PROF_TEST                  1     18  RETURN DBMS_Profiler.Stop_Profiler;
        26    0.00        1                                8      5  B3: Call R_Calls_R
        13    0.00        0                                5      1  B2: DECLARE


Notes on Output of Flat Profiler
There were six units with no linked information in PLSQL_PROFILER_UNITS. By examining the data, I was able to associate unit numbers 2, 5 and 8 with my anonymous blocks B1, B2 and B3. That left three unassigned, and I have surmised that these correspond to the auxiliary blocks for processing server output that we posited earlier when examining the output from the hierarchical profiler.

• The useful call tree structure is not present in the data from the old profiler
• However, the results are at a line level, which the hierarchical profiler does not provide; for example, the two sections of A_Calls_B are reported separately
• Deciphering the output requires significantly more manual effort than with the hierarchical profiler
• Both old and new profiler have their own advantages, and so both should be considered of value
• Manual code timing offers more flexibility in terms of aggregating lines and call instances, but requires more effort...
• ...but not as much as I thought. As noted later in the second example, after reading another article on the profiler, I realised that I could join the system table ALL_SOURCE to see the text of the line (where available)

Second example: Flat profiler omits some detail timings

I later came upon another article on the flat profiler, Profiling PL/SQL with dbms_profiler, where the author has joined the system table ALL_SOURCE to get the text of the line profiled, which makes interpretation easier. I then updated the line-level query as follows:

PROMPT Profiler data by time (PLSQL_PROFILER_DATA)
SELECT Round (dat.total_time/1000, 0)       micro_s,
       Round (dat.total_time/1000000000, 2) seconds,
       dat.total_occur                      calls,
       unt.unit_name,
       dat.unit_number,
       dat.line#,
       Trim (src.text)                      text
  FROM plsql_profiler_data dat
  LEFT JOIN plsql_profiler_units unt
    ON unt.runid            = dat.runid
   AND unt.unit_number      = dat.unit_number
  LEFT JOIN all_source      src
    ON src.type             IN ('PACKAGE BODY','FUNCTION','PROCEDURE','TRIGGER')
   AND src.name             = unt.unit_name
   AND src.line             = dat.line#
   AND src.owner            = unt.unit_owner
   AND src.type             = unt.unit_type
 WHERE dat.runid            = :runid
   AND dat.total_time       > 0
 ORDER BY 1 DESC, 2, 3


Of course the text is only available for stored source, so excludes lines from anonymous blocks.

Flat Profiler

Run header (PLSQL_PROFILER_RUNS)

     RUNID RUN_DATE        MICRO_S    SECONDS
---------- ------------ ---------- ----------
         5 20:34:45        9220000       9.22

Profiler data by unit (PLSQL_PROFILER_DATA)

UNIT_NAME            UNIT_NUMBER    MICRO_S SECONDS    CALLS
-------------------- ----------- ---------- ------- --------
                               2        200    0.00        3
UTILS                          1         30    0.00        3

Profiler data by time (PLSQL_PROFILER_DATA)

   MICRO_S SECONDS    CALLS UNIT_NAME            UNIT_NUMBER  LINE# TEXT
---------- ------- -------- -------------------- ----------- ------ ------------------------------------------------
       136    0.00        1                                2      7
        21    0.00        1                                2     10
        19    0.00        1                                2     14
        14    0.00        0 UTILS                          1    343 FUNCTION Stop_D_Profiling RETURN PLS_INTEGER IS
        13    0.00        1 UTILS                          1    346 RETURN DBMS_Profiler.Stop_Profiler;
         6    0.00        1 UTILS                          1    341 END Start_D_Profiling;
         2    0.00        1 UTILS                          1    339 RETURN l_run_number;

7 rows selected.

Hierarchical Profiler

Run header (DBMSHP_RUNS)

     RUNID RUN_TIMESTAMP                   MICRO_S    SECONDS RUN_COMMENT
---------- ---------------------------- ---------- ---------- -----------------------------
        16 19-MAR-13 21.37.00.571000       9000292          9 Profile for DBMS_Lock.Sleep

Functions called (DBMSHP_FUNCTION_INFO)

OWNER      MODULE               FUNCTION                LINE#      SUB_T      FUN_T  CALLS
---------- -------------------- ---------------------- ------ ---------- ---------- ------
BRENDAN    UTILS                1: STOP_H_PROFILING       322          8          8      1
SYS        DBMS_HPROF           2: STOP_PROFILING          59          0          0      1
SYS        DBMS_LOCK            3: SLEEP                  197    9000279    9000279      2
SYS        DBMS_LOCK            4: __pkg_init               0          5          5      1

OWNER_P MODULE_P  FUNCTION_P            OWNER_C MODULE_C    FUNCTION_C         SUB_T FUN_T  CALLS
------- --------- --------------------- ------- ----------- ------------------ ----- ----- ------
BRENDAN UTILS     1: STOP_H_PROFILING   SYS     DBMS_HPROF  2: STOP_PROFILING      0     0      1

BPF Recursive Subquery Factor Tree Query

Function tree                              SUB_T      FUN_T  CALLS  Row
------------------------------------- ---------- ---------- ------ ----
3: SYS.DBMS_LOCK.SLEEP                   9000279    9000279      2    1
1: BRENDAN.UTILS.STOP_H_PROFILING              8          8      1    2
  2: SYS.DBMS_HPROF.STOP_PROFILING             0          0      1    3
4: SYS.DBMS_LOCK.__pkg_init                    5          5      1    4


Manual Profiler

Timer Set: Profiling DBMS_Lock.Sleep, Constructed at 19 Mar 2013 21:38:54, written at 21:39:03
==============================================================================================
[Timer timed: Elapsed (per call): 0.05 (0.000045), CPU (per call): 0.04 (0.000040), calls: 1000, '***' denotes corrected line below]

Timer               Elapsed          CPU          Calls        Ela/Call        CPU/Call
--------------   ----------   ----------   ------------   -------------   -------------
3 second sleep         3.00         0.00              1         3.00000         0.00000
6 second sleep         6.00         0.00              1         6.00100         0.00000
(Other)                0.00         0.00              1         0.00000         0.00000
--------------   ----------   ----------   ------------   -------------   -------------
Total                  9.00         0.00              3         3.00033         0.00000
--------------   ----------   ----------   ------------   -------------   -------------


Notes on Results for Second Example

• The flat profiler shows 9s at header level but only 230µs at detail level because DBMS_Lock.Sleep does not permit profiling by the user running the script
• The hierarchical profiler shows 9s at header level and a total of 9s in 2 calls to DBMS_Lock.Sleep
• Manual profiling shows the two calls to DBMS_Lock.Sleep taking 3 and 6 seconds

Conclusions

• Running Oracle's hierarchical profiler would seem to be the default first step in tuning PL/SQL programs from v11.1
• Some care is needed in interpreting the output data; I've provided a query for displaying the hierarchies
• Performance is recorded only down to function level, so it will still often be worthwhile to use the old flat profiler in addition
• Manually timing code sections also still has a part to play, in terms of instrumentation and greater flexibility where necessary

Brendan HProf Code
Example 2

# An SQL Solution for the Multiple Knapsack Problem (SKP-m)

In my last article, A Simple SQL Solution for the Knapsack Problem (SKP-1), I presented an SQL solution for the well-known knapsack problem in its simpler 1-knapsack form (and it is advisable to read the first article before this one). Here I present an SQL solution for the problem in its more difficult multiple-knapsack form. The solution is a modified version of one I posted on OTN, SQL Query for mapping a set of batches to a class rooms group, and I describe two versions of it, one in pure SQL, and another that includes a database function. The earlier article provided the solutions as comma-separated strings of item identifiers, and in this article too the solutions are first obtained as delimited strings. However, as there are now containers as well as items, we extend the SQL to provide solutions with item and container names in separate fields within records for each container-item pair. The solution is presented, as before, more for its theoretical interest than for practical applicability. Much research has been done on procedural algorithms for this important but computationally difficult class of problems.

We will consider the same simple example problem as in the earlier article, having four items, but now with two containers with individual weight limits of 8 and 10. As noted in the earlier article, the problem can be considered as that of assigning each item to one of the containers, or to none, leading directly to the expression $N(4, 3) = 3^4 = 81$ for the number of not necessarily feasible assignment sets for the example. We can again depict the $2^4$ possible item combinations in a diagram, with the container limits added.

We can see that there is one optimal solution in this case, in which items 1 and 3 are assigned to container 1, while items 2 and 4 are assigned to container 2, with a profit of 100. How to find it using SQL?
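For a problem this small, the counts and the optimum can be checked by brute force before turning to SQL. The following Python sketch is an independent cross-check rather than part of the SQL solution; the item weights and profits are taken from the example output in this article, and the variable names are chosen here for illustration.

```python
from itertools import product

# Example data: item id -> (weight, profit); container id -> weight limit.
items = {1: (3, 10), 2: (4, 20), 3: (5, 30), 4: (6, 40)}
limits = {1: 8, 2: 10}

# Each item is either unassigned (None) or placed in one of the containers.
choices = [None] + list(limits)
assignments = list(product(choices, repeat=len(items)))

def load(assignment, con_id):
    """Total weight of the items assigned to con_id."""
    return sum(w for (w, _), c in zip(items.values(), assignment) if c == con_id)

def profit(assignment):
    """Total profit of all items assigned to some container."""
    return sum(p for (_, p), c in zip(items.values(), assignment) if c is not None)

feasible = [a for a in assignments
            if all(load(a, con_id) <= limit for con_id, limit in limits.items())]
best_profit = max(profit(a) for a in feasible)
best = [a for a in feasible if profit(a) == best_profit]

print(len(assignments), len(feasible), best_profit, best)
# 81 43 100 [(1, 2, 1, 2)]
```

The single best assignment places items 1 and 3 in container 1 and items 2 and 4 in container 2. Enumeration of this kind grows as $3^n$ and is only useful as a check on small cases.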

SQL Solution
The solution to the single knapsack problem worked by joining items recursively in increasing order of item id, accumulating the total weights and profits, and terminating a sequence when no more items can be added within the weight limit. The item sequences were accumulated as comma-separated strings, and the optimal solutions obtained by analytic ranking of the profits.

For the multiple knapsack problem, it's not quite as simple, but a similar approach may be a good starting point. Previously our anchor branch in the recursion selected all items below the single maximum weight, but we now have containers with individual weight limits. If we now join the containers table we can find all items falling within the maximum weights by container. The recursion can then proceed to find all feasible item combinations by container. Here is the SQL for this:

WITH rsf_itm (con_id, max_weight, itm_id, lev, tot_weight, tot_profit, path) AS (
SELECT c.id,
       c.max_weight,
       i.id,
       0,
       i.item_weight,
       i.item_profit,
       ',' || i.id || ','
  FROM items i
  JOIN containers c
    ON i.item_weight <= c.max_weight
 UNION ALL
SELECT r.con_id,
       r.max_weight,
       i.id,
       r.lev + 1,
       r.tot_weight + i.item_weight,
       r.tot_profit + i.item_profit,
       r.path || i.id || ','
  FROM rsf_itm r
  JOIN items i
    ON i.id > r.itm_id
   AND r.tot_weight + i.item_weight <= r.max_weight
 ORDER BY 1, 2
) SEARCH DEPTH FIRST BY con_id, itm_id SET line_no
SELECT con_id,
       max_weight,
       LPad (To_Char(itm_id), 2*lev + 1, ' ') itm_id,
       path itm_path,
       tot_weight, tot_profit
  FROM rsf_itm
 ORDER BY line_no


and here is the resulting output:

CON_ID MAX_WEIGHT ITM_ID ITM_PATH    TOT_WEIGHT TOT_PROFIT
------ ---------- ------ ----------- ---------- ----------
     1          8 1      ,1,                  3         10
                    2    ,1,2,                7         30
                    3    ,1,3,                8         40
                  2      ,2,                  4         20
                  3      ,3,                  5         30
                  4      ,4,                  6         40
     2         10 1      ,1,                  3         10
                    2    ,1,2,                7         30
                    3    ,1,3,                8         40
                    4    ,1,4,                9         50
                  2      ,2,                  4         20
                    3    ,2,3,                9         50
                    4    ,2,4,               10         60
                  3      ,3,                  5         30
                  4      ,4,                  6         40

15 rows selected.
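The recursion above can be mirrored procedurally as a check on the output. This is a minimal Python sketch under the same example data, not a translation the author provided; the generator `extend` and its parameter names are invented for illustration. It starts from every item that fits a container and extends each sequence only with items of higher id that still fit, building the same comma-delimited path.

```python
# Example data: item id -> (weight, profit); container id -> weight limit.
items = {1: (3, 10), 2: (4, 20), 3: (5, 30), 4: (6, 40)}
limits = {1: 8, 2: 10}

def extend(con_id, last_id, weight, profit, path):
    """Yield this row, then every extension by a higher-id item that still fits."""
    yield (con_id, path, weight, profit)
    for itm_id, (w, p) in items.items():
        if itm_id > last_id and weight + w <= limits[con_id]:
            yield from extend(con_id, itm_id, weight + w, profit + p,
                              f"{path}{itm_id},")

# Anchor branch: every (container, item) pair where the item fits on its own.
rows = [row
        for con_id, limit in limits.items()
        for itm_id, (w, p) in items.items() if w <= limit
        for row in extend(con_id, itm_id, w, p, f",{itm_id},")]

for row in rows:
    print(row)
print(len(rows), "rows")   # 15 rows
```

Because the generator yields a row before its extensions, the rows come out in the same depth-first order as the SEARCH clause produces.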


Looking at this, we can see that the overall solution will comprise one feasible combination of items for each container, with the constraint that no item appears in more than one container. This suggests that we could perform a second recursion in a similar way to the first, but this time using the results of the first as input, and joining the feasible combinations of containers of higher id only. If we again accumulate the sequence in a delimited string, regular expression functionality could be used to avoid joining combinations with items already included. The following SQL does this recursion:

WITH rsf_itm (con_id, max_weight, itm_id, tot_weight, tot_profit, path) AS (
SELECT c.id,
       c.max_weight,
       i.id,
       i.item_weight,
       i.item_profit,
       ',' || i.id || ','
  FROM items i
  JOIN containers c
    ON i.item_weight <= c.max_weight
 UNION ALL
SELECT r.con_id,
       r.max_weight,
       i.id,
       r.tot_weight + i.item_weight,
       r.tot_profit + i.item_profit,
       r.path || i.id || ','
  FROM rsf_itm r
  JOIN items i
    ON i.id > r.itm_id
   AND r.tot_weight + i.item_weight <= r.max_weight
)
, rsf_con (con_id, con_itm_set, con_itm_path, lev, tot_weight, tot_profit) AS (
SELECT con_id,
       ':' || con_id || ':' || path,
       ':' || con_id || ':' || path,
       0,
       tot_weight,
       tot_profit
  FROM rsf_itm
 UNION ALL
SELECT r_i.con_id,
       ':' || r_i.con_id || ':' || r_i.path,
       r_c.con_itm_path ||  ':' || r_i.con_id || ':' || r_i.path,
       r_c.lev + 1,
       r_c.tot_weight + r_i.tot_weight,
       r_c.tot_profit + r_i.tot_profit
  FROM rsf_con r_c
  JOIN rsf_itm r_i
    ON r_i.con_id > r_c.con_id
 WHERE RegExp_Instr (r_c.con_itm_path || r_i.path, ',(\d+),.*?,\1,') = 0
) SEARCH DEPTH FIRST BY con_id SET line_no
SELECT
       LPad (' ', 2*lev, ' ') || con_itm_set con_itm_set,
       con_itm_path,
       tot_weight, tot_profit
  FROM rsf_con
 ORDER BY line_no


Notice the use of RegExp_Instr, which takes the current sequence with potential new combination appended as its source string, and looks for a match against the search string ',(\d+),.*?,\1,'. The function returns 0 if no match is found, meaning no duplicate item was found. The sequence includes the container id using a different delimiter, a colon, at the start of each combination. The search string can be explained as follows:

,(\d+), = a sequence of one or more digits with a comma either side, and the digit sequence saved for referencing
.*?,\1, = a sequence of any characters, followed by the saved digit sequence within commas. The ? specifies a non-greedy search, meaning stop searching as soon as a match is found
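The behaviour of the search string can be tried out on sample strings. The sketch below uses Python's `re` module as a stand-in for Oracle's RegExp_Instr, on the assumption that the two engines treat this particular pattern alike; the helper name `can_join` is invented here. A result of `None` from `search` corresponds to RegExp_Instr returning 0.

```python
import re

# Same search string as in the SQL: an item id between commas that recurs later.
DUPLICATE_ITEM = re.compile(r",(\d+),.*?,\1,")

def can_join(con_itm_path, new_path):
    """True when appending new_path repeats no item id already in con_itm_path."""
    return DUPLICATE_ITEM.search(con_itm_path + new_path) is None

print(can_join(":1:,1,3,", ",2,4,"))   # True:  items 2 and 4 are new
print(can_join(":1:,1,3,", ",3,4,"))   # False: item 3 is already in container 1
print(can_join(":1:,1,", ",11,"))      # True:  11 is not confused with 1
```

The commas on both sides of the captured digits are what prevent a partial match such as 1 within 11, and the colon delimiters keep container ids from being mistaken for item ids.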

The result of the query is:

CON_ITM_SET          CON_ITM_PATH         TOT_WEIGHT TOT_PROFIT
-------------------- -------------------- ---------- ----------
:1:,1,               :1:,1,                        3         10
  :2:,2,             :1:,1,:2:,2,                  7         30
  :2:,3,             :1:,1,:2:,3,                  8         40
  :2:,4,             :1:,1,:2:,4,                  9         50
  :2:,2,3,           :1:,1,:2:,2,3,               12         60
  :2:,2,4,           :1:,1,:2:,2,4,               13         70
:1:,2,               :1:,2,                        4         20
  :2:,1,             :1:,2,:2:,1,                  7         30
  :2:,3,             :1:,2,:2:,3,                  9         50
  :2:,1,3,           :1:,2,:2:,1,3,               12         60
  :2:,4,             :1:,2,:2:,4,                 10         60
  :2:,1,4,           :1:,2,:2:,1,4,               13         70
:1:,1,2,             :1:,1,2,                      7         30
  :2:,3,             :1:,1,2,:2:,3,               12         60
  :2:,4,             :1:,1,2,:2:,4,               13         70
:1:,3,               :1:,3,                        5         30
  :2:,1,             :1:,3,:2:,1,                  8         40
  :2:,2,             :1:,3,:2:,2,                  9         50
  :2:,1,2,           :1:,3,:2:,1,2,               12         60
  :2:,4,             :1:,3,:2:,4,                 11         70
  :2:,1,4,           :1:,3,:2:,1,4,               14         80
  :2:,2,4,           :1:,3,:2:,2,4,               15         90
:1:,1,3,             :1:,1,3,                      8         40
  :2:,2,             :1:,1,3,:2:,2,               12         60
  :2:,4,             :1:,1,3,:2:,4,               14         80
  :2:,2,4,           :1:,1,3,:2:,2,4,             18        100
:1:,4,               :1:,4,                        6         40
  :2:,1,             :1:,4,:2:,1,                  9         50
  :2:,2,             :1:,4,:2:,2,                 10         60
  :2:,1,2,           :1:,4,:2:,1,2,               13         70
  :2:,3,             :1:,4,:2:,3,                 11         70
  :2:,1,3,           :1:,4,:2:,1,3,               14         80
  :2:,2,3,           :1:,4,:2:,2,3,               15         90
:2:,1,               :2:,1,                        3         10
:2:,2,               :2:,2,                        4         20
:2:,1,2,             :2:,1,2,                      7         30
:2:,3,               :2:,3,                        5         30
:2:,1,3,             :2:,1,3,                      8         40
:2:,4,               :2:,4,                        6         40
:2:,1,4,             :2:,1,4,                      9         50
:2:,2,3,             :2:,2,3,                      9         50
:2:,2,4,             :2:,2,4,                     10         60

42 rows selected.


We can see that the optimal solutions can be obtained from the output again using analytic ranking by profit, and in this case the solution with a profit of 100 is the optimal one, with sequence ':1:,1,3,:2:,2,4,'. In the full solution, as well as selecting out the top-ranking solutions, we have extended the query to output the items and containers by name, in distinct fields with a record for every solution/container/item combination. For the example problem above, the output is:

    SOL_ID S_WT  S_PR  C_ID C_NAME          M_WT C_WT  I_ID I_NAME     I_WT I_PR
---------- ---- ----- ----- --------------- ---- ---- ----- ---------- ---- ----
         1   18   100     1 Item 1             8    8     1 Item 1        3   10
                                                          3 Item 3        5   30
                          2 Item 2            10   10     2 Item 2        4   20
                                                          4 Item 4        6   40
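As an independent cross-check of the result above, here is a minimal brute-force sketch in Python (not part of the original SQL solution). It assumes the item weights and profits and the container limits that can be read off the output columns (I_WT, I_PR, M_WT), and simply tries every assignment of each item to container 1, container 2, or neither.

```python
from itertools import product

# Example data inferred from the output above: item id -> (weight, profit)
ITEMS = {1: (3, 10), 2: (4, 20), 3: (5, 30), 4: (6, 40)}
# Container id -> maximum weight (the M_WT column)
CONTAINERS = {1: 8, 2: 10}

def best_assignments(items, containers):
    """Return (best_profit, assignments); an assignment maps item id -> container id."""
    ids = sorted(items)
    choices = [0] + sorted(containers)  # 0 means the item is left out
    best_profit, best = -1, []
    for combo in product(choices, repeat=len(ids)):
        load = {c: 0 for c in containers}
        profit = 0
        for item_id, con_id in zip(ids, combo):
            if con_id:
                load[con_id] += items[item_id][0]
                profit += items[item_id][1]
        if any(load[c] > containers[c] for c in containers):
            continue  # at least one container is over its weight limit
        assignment = {i: c for i, c in zip(ids, combo) if c}
        if profit > best_profit:
            best_profit, best = profit, [assignment]
        elif profit == best_profit:
            best.append(assignment)
    return best_profit, best

profit, solutions = best_assignments(ITEMS, CONTAINERS)
print(profit, solutions)
```

For this data the sketch reports a single best assignment with profit 100: items 1 and 3 in container 1, items 2 and 4 in container 2, which agrees with the path ':1:,1,3,:2:,2,4,'.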


SQL-Only Solution - XSQL
There are various techniques in SQL for splitting string columns into multiple rows and columns. We will take one of the more straightforward ones that uses the DUAL table with CONNECT BY to generate rows against which to anchor the string-parsing.

WITH rsf_itm (con_id, max_weight, itm_id, lev, tot_weight, tot_profit, path) AS (
SELECT c.id,
c.max_weight,
i.id,
0,
i.item_weight,
i.item_profit,
',' || i.id || ','
FROM items i
JOIN containers c
ON i.item_weight <= c.max_weight
UNION ALL
SELECT r.con_id,
r.max_weight,
i.id,
r.lev + 1,
r.tot_weight + i.item_weight,
r.tot_profit + i.item_profit,
r.path || i.id || ','
FROM rsf_itm r
JOIN items i
ON i.id > r.itm_id
AND r.tot_weight + i.item_weight <= r.max_weight
)
, rsf_con (con_id, con_path, itm_path, tot_weight, tot_profit, lev) AS (
SELECT con_id,
To_Char(con_id),
':' || con_id || '-' || (lev + 1) || ':' || path,
tot_weight,
tot_profit,
0
FROM rsf_itm
UNION ALL
SELECT r_i.con_id,
r_c.con_path || ',' || r_i.con_id,
r_c.itm_path ||  ':' || r_i.con_id || '-' || (r_i.lev + 1) || ':' || r_i.path,
r_c.tot_weight + r_i.tot_weight,
r_c.tot_profit + r_i.tot_profit,
r_c.lev + 1
FROM rsf_con r_c
JOIN rsf_itm r_i
ON r_i.con_id > r_c.con_id
AND RegExp_Instr (r_c.itm_path || r_i.path, ',(\d+),.*?,\1,') = 0
)
, paths_ranked AS (
SELECT itm_path || ':' itm_path, tot_weight, tot_profit, lev + 1 n_cons,
Rank () OVER (ORDER BY tot_profit DESC) rnk
FROM rsf_con
), best_paths AS (
SELECT itm_path, tot_weight, tot_profit, n_cons,
Row_Number () OVER (ORDER BY tot_weight DESC) sol_id
FROM paths_ranked
WHERE rnk = 1
), row_gen AS (
SELECT LEVEL lev
FROM DUAL
CONNECT BY LEVEL <= (SELECT Count(*) FROM items)
), con_v AS (
SELECT  r.lev con_ind, b.sol_id, b.tot_weight, b.tot_profit,
Substr (b.itm_path, Instr (b.itm_path, ':', 1, 2*r.lev - 1) + 1,
Instr (b.itm_path, ':', 1, 2*r.lev) - Instr (b.itm_path, ':', 1, 2*r.lev - 1) - 1)
con_nit_id,
Substr (b.itm_path, Instr (b.itm_path, ':', 1, 2*r.lev) + 1,
Instr (b.itm_path, ':', 1, 2*r.lev + 1) - Instr (b.itm_path, ':', 1, 2*r.lev) - 1)
itm_str
FROM best_paths b
JOIN row_gen r
ON r.lev <= b.n_cons
), con_split AS (
SELECT sol_id, tot_weight, tot_profit,
Substr (con_nit_id, 1, Instr (con_nit_id, '-', 1) - 1) con_id,
Substr (con_nit_id, Instr (con_nit_id, '-', 1) + 1) n_items,
itm_str
FROM con_v
), itm_v AS (
SELECT  c.sol_id, c.con_id, c.tot_weight, c.tot_profit,
Substr (c.itm_str, Instr (c.itm_str, ',', 1, r.lev) + 1,
Instr (c.itm_str, ',', 1, r.lev + 1) - Instr (c.itm_str, ',', 1, r.lev) - 1)
itm_id
FROM con_split c
JOIN row_gen r
ON r.lev <= c.n_items
)
SELECT
/* SEL */
v.sol_id sol_id,
v.tot_weight s_wt,
v.tot_profit s_pr,
c.id c_id,
c.name c_name,
c.max_weight m_wt,
Sum (i.item_weight) OVER (PARTITION BY v.sol_id, c.id) c_wt,
i.id i_id,
i.name i_name,
i.item_weight i_wt,
i.item_profit i_pr
/* SEL */
FROM itm_v v
JOIN containers c
ON c.id = To_Number (v.con_id)
JOIN items i
ON i.id = To_Number (v.itm_id)
ORDER BY sol_id, con_id, itm_id
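The RegExp_Instr condition in rsf_con is what stops an item being placed in more than one container: it looks for an item id, delimited by commas, that recurs later in the concatenated path. The following Python sketch applies the same pattern outside the database, purely to illustrate the behaviour; the function name has_repeated_item is hypothetical, and Python's regular expression engine is only assumed to treat this particular pattern the same way as Oracle's.

```python
import re

# Same pattern as in the RegExp_Instr call: a comma-delimited id that recurs later on
DUP_ITEM = re.compile(r",(\d+),.*?,\1,")

def has_repeated_item(itm_path: str, new_path: str) -> bool:
    """True if appending new_path to itm_path would reuse an item id."""
    return DUP_ITEM.search(itm_path + new_path) is not None

print(has_repeated_item(":1-2:,1,3,", ",2,4,"))  # no shared item
print(has_repeated_item(":1-2:,1,3,", ",3,4,"))  # item 3 appears twice
print(has_repeated_item(":1-1:,1,", ",11,"))     # 1 and 11 are different ids
```

Because each id is matched together with its surrounding commas, an id such as 1 is not confused with 11.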


SQL with Function Solution - XFUN
The SQL techniques for string-splitting are quite cumbersome, and a better approach may be the use of a pipelined function that allows the string-parsing to be done in PL/SQL, a procedural language that is better suited to the task.

WITH rsf_itm (con_id, max_weight, itm_id, tot_weight, tot_profit, path) AS (
SELECT c.id,
c.max_weight,
i.id,
i.item_weight,
i.item_profit,
',' || i.id || ','
FROM items i
JOIN containers c
ON i.item_weight <= c.max_weight
UNION ALL
SELECT r.con_id,
r.max_weight,
i.id,
r.tot_weight + i.item_weight,
r.tot_profit + i.item_profit,
r.path || i.id || ','
FROM rsf_itm r
JOIN items i
ON i.id > r.itm_id
AND r.tot_weight + i.item_weight <= r.max_weight
ORDER BY 1, 2
)
, rsf_con (con_id, itm_path, tot_weight, tot_profit) AS (
SELECT con_id,
':' || con_id || ':' || path,
tot_weight,
tot_profit
FROM rsf_itm
UNION ALL
SELECT r_i.con_id,
r_c.itm_path ||  ':' || r_i.con_id || ':' || r_i.path,
r_c.tot_weight + r_i.tot_weight,
r_c.tot_profit + r_i.tot_profit
FROM rsf_con r_c
JOIN rsf_itm r_i
ON r_i.con_id > r_c.con_id
AND RegExp_Instr (r_c.itm_path || r_i.path, ',(\d+),.*?,\1,') = 0
)
, paths_ranked AS (
SELECT itm_path || ':' itm_path, tot_weight, tot_profit, Rank () OVER (ORDER BY tot_profit DESC) rn,
Row_Number () OVER (ORDER BY tot_profit DESC, tot_weight DESC) sol_id
FROM rsf_con
), itm_v AS (
SELECT s.con_id, s.itm_id, p.itm_path, p.tot_weight, p.tot_profit, p.sol_id
FROM paths_ranked p
CROSS JOIN TABLE (Multi.Split_String (p.itm_path)) s
WHERE rn = 1
)
SELECT v.sol_id sol_id,
v.tot_weight s_wt,
v.tot_profit s_pr,
c.id c_id,
c.name c_name,
c.max_weight m_wt,
Sum (i.item_weight) OVER (PARTITION BY v.sol_id, c.id) c_wt,
i.id i_id,
i.name i_name,
i.item_weight i_wt,
i.item_profit i_pr
FROM itm_v v
JOIN containers c
ON c.id = To_Number (v.con_id)
JOIN items i
ON i.id = To_Number (v.itm_id)
ORDER BY sol_id, con_id, itm_id


Pipelined Database Function

CREATE OR REPLACE TYPE con_itm_type AS OBJECT (con_id NUMBER, itm_id NUMBER);
/
CREATE OR REPLACE TYPE con_itm_list_type AS VARRAY(100) OF con_itm_type;
/
CREATE OR REPLACE PACKAGE BODY Multi IS

FUNCTION Split_String (p_string VARCHAR2) RETURN con_itm_list_type PIPELINED IS

l_pos_colon_1           PLS_INTEGER := 1;
l_pos_colon_2           PLS_INTEGER;
l_pos_comma_1           PLS_INTEGER;
l_pos_comma_2           PLS_INTEGER;
l_con                   PLS_INTEGER;
l_itm                   PLS_INTEGER;

BEGIN

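-- Outer loop: one pass per container section, of the form ':con_id:,itm,itm,'
LOOP

-- Find the colon that closes the container id; if there is none, the trailing ':' was reached
l_pos_colon_2 := Instr (p_string, ':', l_pos_colon_1 + 1, 1);
EXIT WHEN l_pos_colon_2 = 0;

l_con := To_Number (Substr (p_string, l_pos_colon_1 + 1, l_pos_colon_2 - l_pos_colon_1 - 1));
-- Move on to the colon that opens the next section (or the trailing ':' that the query appends)
l_pos_colon_1 := Instr (p_string, ':', l_pos_colon_2 + 1, 1);
-- The item list starts with a comma immediately after the closing colon
l_pos_comma_1 := l_pos_colon_2 + 1;

-- Inner loop: one pass per item id, each delimited by a pair of commas
LOOP

l_pos_comma_2 := Instr (p_string, ',', l_pos_comma_1 + 1, 1);
-- Stop at the end of the string, or when the next comma belongs to the next section
EXIT WHEN l_pos_comma_2 = 0 OR l_pos_comma_2 > l_pos_colon_1;

l_itm := To_Number (Substr (p_string, l_pos_comma_1 + 1, l_pos_comma_2 - l_pos_comma_1 - 1));
PIPE ROW (con_itm_type (l_con, l_itm));
l_pos_comma_1 := l_pos_comma_2;

END LOOP;

END LOOP;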

END Split_String;

END Multi;
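For readers less familiar with pipelined functions, the parsing done by Split_String can be sketched as a Python generator, where each yield plays the role of PIPE ROW. This is an illustrative analogue only, using a regular expression rather than the position arithmetic of the PL/SQL; the name split_string is an assumption, and the input format is the XFUN path with its trailing colon.

```python
import re
from typing import Iterator, Tuple

def split_string(path: str) -> Iterator[Tuple[int, int]]:
    """Yield (container id, item id) pairs from a path such as ':1:,1,3,:2:,2,4,:'."""
    # Each section is ':con_id:' followed by a comma-delimited list of item ids
    for match in re.finditer(r":(\d+):([\d,]*)", path):
        con_id = int(match.group(1))
        for token in match.group(2).split(","):
            if token:  # skip the empty strings produced by the delimiting commas
                yield con_id, int(token)

print(list(split_string(":1:,1,3,:2:,2,4,:")))
```

As with the pipelined function, rows are produced one at a time, so a caller can consume them without the whole list being built first.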


Query Structure Diagram (embedded directly)
The QSD shows both queries in a single diagram as the early query blocks are almost the same (the main difference is that the strings contain a bit more information for XSQL to facilitate the later splitting). The directly-embedded version shows the whole query, but it may be hard to read the detail, so it is followed by a larger, scrollable version within Excel.

Query Structure Diagram (embedded via Excel)
This is the larger, scrollable version.

Performance Analysis
As in the previous article, we will see how the solution methods perform as problem size varies, using my own performance benchmarking framework.

Test Data Sets
Test data sets are generated as follows, in terms of two integer parameters, w and d:

• Insert w containers with sequential ids and random maximum weights between 1 and 100
• Insert d items with sequential ids and random weights and profits in the ranges 1-60 and 1-10000, respectively, via Oracle's function DBMS_Random.Value
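The generation scheme can be sketched as follows in Python. This is not the benchmarking framework's code: it assumes integer values drawn uniformly from the stated ranges (the article does not say how the DBMS_Random.Value results are rounded), and the function name make_test_data and the seed parameter are hypothetical.

```python
import random

def make_test_data(w: int, d: int, seed: int = 0):
    """Return (containers, items) for w containers and d items."""
    rng = random.Random(seed)  # seeded so that a data set can be reproduced
    containers = [
        {"id": i, "max_weight": rng.randint(1, 100)}
        for i in range(1, w + 1)
    ]
    items = [
        {"id": i, "item_weight": rng.randint(1, 60), "item_profit": rng.randint(1, 10000)}
        for i in range(1, d + 1)
    ]
    return containers, items

containers, items = make_test_data(w=3, d=8)
print(len(containers), len(items))
```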

Test Results
The embedded Excel file below summarises the results obtained over a grid of data points, with w in (1, 2, 3) and d in (8, 10, 12, 14, 16, 18).

The graphs tab below shows 3-d graphs of the number of rows processed and the CPU time for XFUN.

Notes

• There is not much difference in performance between the two query versions, no doubt because the number of solution records is generally small compared with rows processed in the recursions
• Notice that the timings correlate well with the rows processed, but not so well with the numbers of base records. The nature of the problem means that some of the randomised data sets turn out to be much harder to solve than others
• Notice the estimated rows on step 36 of the execution plan for the pipelined function solution. The value of 8168 is a fixed value that Oracle assumes since it has no statistics to go on. We could improve this by using the (undocumented) cardinality hint to provide a smaller estimate
• I extended my benchmarking framework for this article to report the intermediate numbers of rows processed, as well as the cardinality estimates and derived errors in these estimates (maximum for each plan). It is obvious from the nature of the problem that Oracle's Cost Based Optimiser (CBO) is not going to be able to make good cardinality estimates

Conclusions
Oracle's v11.2 implementation of the ANSI SQL feature recursive subquery factoring provides a means for solving the knapsack problem, in its multiple knapsack form, in SQL. The solution is not practical for large problems, for which procedural techniques that have been extensively researched should be considered. However, the techniques used may be of interest for combinatorial problems that are small enough to be handled in SQL, and for other types of problem in general.

# A Simple SQL Solution for the Knapsack Problem (SKP-1)

A poster on OTN (Combination using pl/sql) recently asked for an SQL solution to a problem that turned out to be an example of the well known Knapsack Problem, for the case of a single knapsack. I posted an SQL query as a solution, and also a solution in PL/SQL because the SQL solution uses a feature only available in Oracle v11.2. In this article I explain how the solutions work and provide the results of a performance analysis that involved randomised test problems of varying computational difficulty. I have taken a more general form of problem than the original poster described, and the solutions here have been improved.

Update, 14 July 2013: I used the technique in response to another OTN post here, SQL for the Fantasy Football Knapsack Problem. I have extended the idea there to allow for fast approximate solutions making it viable for larger problems, and have also used a similar idea here, SQL for the Travelling Salesman Problem.

Knapsack Problem (1-Knapsack)
The various forms of knapsack problem have been studied extensively. The problems are known to be computationally difficult and many algorithms have been proposed for both exact and approximate solutions (see reference above). The SQL solution in this article is quite simple and will not be competitive in performance for larger problems in the form described here, but may be interesting for being implemented in pure SQL (and without using Oracle's Model clause, or a purely brute force approach). However, I have later extended the approach to allow for search limiting and have shown this to be viable for larger problems (see links at the top).

The problem can be stated informally, as follows: Given a set of items, each having positive weight and profit attributes, and a weight limit, find the combinations of items that maximise profit within the weight limit. Variant versions include the addition of multiple constraints (easy to handle), and inclusion of multiple knapsacks (more difficult). I also have a solution for the multiple knapsacks version described here (An SQL Solution for the Multiple Knapsack Problem (SKP-m)).

The difficulty of the problem arises from the number of possible combinations increasing exponentially with problem size. The number of these (not necessarily feasible) combinations, N(n,1), can be expressed in terms of the number of items, n, in two ways. First, we can use the well known binomial expression for the number of combinations of r items, summed from $r=0$ to $r=n$:

N(n,1) = $\sum_{r=0}^n \binom{n}{r}$

where $\binom{n}{r} = \frac{n!}{r!(n-r)!}$

Second, and more simply, we can observe that including an item in the combination, or not, is a binary choice, leading to:

N(n,1) = $2^n$

This generalises easily to the expression for the multiple knapsack problem, with m knapsacks:

N(n,m) = $(1+m)^n$

This can also be expressed using a binomial series as

N(n,m) = $\sum_{r=0}^n \binom{n}{r} m^r$

Here, $\binom{n}{r}$ represents the number of combinations of r items from n, with $m^r$ being the number of assignments of the r items to m containers.
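The equivalence of the two expressions is just the binomial theorem applied to $(1+m)^n$. A small Python check, offered as a sketch rather than a proof, compares both forms with a direct enumeration in which each item is assigned to one of the m knapsacks or to none:

```python
from itertools import product
from math import comb

def n_binomial(n: int, m: int) -> int:
    """Sum over r of C(n, r) * m^r."""
    return sum(comb(n, r) * m**r for r in range(n + 1))

def n_enumerated(n: int, m: int) -> int:
    """Count assignments of n items to knapsacks 1..m, with 0 meaning 'not packed'."""
    return sum(1 for _ in product(range(m + 1), repeat=n))

for n, m in [(4, 1), (4, 2), (6, 3)]:
    assert n_binomial(n, m) == (1 + m) ** n == n_enumerated(n, m)

print(n_binomial(4, 1), n_binomial(4, 2))
```

For four items this gives 16 combinations with one knapsack and 81 with two.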

Let's look at a simple example problem having four items, with a weight limit of 9, as shown below:

There are 16 ($2^4$) possible combinations of these items, having from 0 to 4 items. These are depicted below:

We can see that there are two optimal solutions in this case. How to find them using SQL?
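Before turning to SQL, the example can be checked by brute force. The Python sketch below assumes the item weights and profits that can be inferred from the query output shown later in this article (items 1 to 4 with weights 3, 4, 5, 6 and profits 10, 20, 30, 40), and tests every non-empty subset against the weight limit of 9.

```python
from itertools import combinations

# Item id -> (weight, profit), inferred from the query output later in the article
ITEMS = {1: (3, 10), 2: (4, 20), 3: (5, 30), 4: (6, 40)}
WEIGHT_LIMIT = 9

def best_subsets(items, limit):
    """Return (best_profit, list of item-id tuples achieving it)."""
    best_profit, best = 0, [()]
    for r in range(1, len(items) + 1):
        for combo in combinations(sorted(items), r):
            weight = sum(items[i][0] for i in combo)
            profit = sum(items[i][1] for i in combo)
            if weight > limit:
                continue  # infeasible combination
            if profit > best_profit:
                best_profit, best = profit, [combo]
            elif profit == best_profit:
                best.append(combo)
    return best_profit, best

best_profit, best = best_subsets(ITEMS, WEIGHT_LIMIT)
print(best_profit, best)
```

This reports a best profit of 50, achieved by two combinations: items 1 and 4, and items 2 and 3.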

SQL Solution

Oracle's v11.2 implementation of the ANSI standard Recursive Subquery Factoring can be used as the basis for an SQL solution. This works as follows: starting from each item in turn, add items recursively while remaining within the weight limit, considering only items of id greater than the current id. The SQL looks like this, where a marker is added for leaf nodes, following an approach from the Amis technology blog:

WITH rsf (nxt_id, lev, tot_weight, tot_profit, path) AS (
SELECT id nxt_id, 0 lev, item_weight tot_weight, item_profit tot_profit, To_Char (id) path
FROM items
UNION ALL
SELECT n.id,
r.lev + 1,
r.tot_weight + n.item_weight,
r.tot_profit + n.item_profit,
r.path || ',' || To_Char (n.id)
FROM rsf r
JOIN items n
ON n.id > r.nxt_id
AND r.tot_weight + n.item_weight <= 9
) SEARCH DEPTH FIRST BY nxt_id SET line_no
SELECT LPad (To_Char(nxt_id), lev + 1, '*') node,tot_weight, tot_profit,
CASE WHEN lev >= Lead (lev, 1, lev) OVER (ORDER BY line_no) THEN 'Y' END is_leaf,
path
FROM rsf
ORDER BY line_no


and the solution like this:

NODE       TOT_WEIGHT TOT_PROFIT I PATH
---------- ---------- ---------- - ------------------------------
1                   3         10   1
*2                  7         30 Y 1,2
*3                  8         40 Y 1,3
*4                  9         50 Y 1,4
2                   4         20   2
*3                  9         50 Y 2,3
3                   5         30 Y 3
4                   6         40 Y 4

8 rows selected.

The output contains 8 records, as opposed to the total of 15 non-null combinations, because only feasible items are joined, and permutations are avoided by the constraint that item ids increase along the path. Given positivity of weight and profit, we know that all solutions must be leaves, and we can represent the tree structure above in the following diagram:
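The pruning described above can be mirrored in a few lines of Python. This is a sketch of the same idea, not a translation of the query: it assumes the four items inferred from the output, extends a path only with items of greater id that still fit within the weight limit, and emits rows depth first, as the SEARCH DEPTH FIRST clause does.

```python
# Item id -> (weight, profit), inferred from the query output above
ITEMS = {1: (3, 10), 2: (4, 20), 3: (5, 30), 4: (6, 40)}
WEIGHT_LIMIT = 9

def walk(items, limit, last_id=0, weight=0, profit=0, path=()):
    """Depth-first generator of (path, weight, profit, is_leaf) rows."""
    for item_id in sorted(items):
        w, p = items[item_id]
        if item_id > last_id and weight + w <= limit:
            new_path = path + (item_id,)
            children = list(walk(items, limit, item_id, weight + w, profit + p, new_path))
            yield new_path, weight + w, profit + p, not children
            yield from children

rows = list(walk(ITEMS, WEIGHT_LIMIT))
for path, weight, profit, is_leaf in rows:
    node = "*" * (len(path) - 1) + str(path[-1])
    print(node, weight, profit, "Y" if is_leaf else "", ",".join(map(str, path)))
```

The sketch produces the same 8 rows, in the same order, as the query output.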

We can now use the recursive subquery factor as an input to a main query that selects one of the most profitable solutions, or alternatively to a further subquery factor that ranks the solutions in order of descending profit. In the latter case, the main query can select all the most profitable solutions.

In the solution I posted on the OTN thread, I included a subquery factor to restrict the final query section to leaf nodes only. This was because we know that the solutions must be leaf nodes, and usually it is more efficient to filter out non-solution records as early as possible. However, I later realised that the work involved in the filtering might outweigh the saving for the final section, and this turned out to be the case here, as shown in the performance analysis section below. Here are the two queries, without the leaf node filtering:

Query - KEEP

WITH rsf (id, lev, tot_weight, tot_profit, path) AS (
SELECT id, 0, item_weight, item_profit, To_Char (id)
FROM items
UNION ALL
SELECT n.id,
r.lev + 1,
r.tot_weight + n.item_weight,
r.tot_profit + n.item_profit,
r.path || ',' || To_Char (n.id)
FROM rsf r
JOIN items n
ON n.id > r.id
AND r.tot_weight + n.item_weight <= 100
)
SELECT Max (tot_weight) KEEP (DENSE_RANK LAST ORDER BY tot_profit) tot_weight,
Max (tot_profit) KEEP (DENSE_RANK LAST ORDER BY tot_profit) tot_profit,
Max (path) KEEP (DENSE_RANK LAST ORDER BY tot_profit) path,
(Max (lev) KEEP (DENSE_RANK LAST ORDER BY tot_profit) + 1) n_items
FROM rsf

Query - RANK

WITH rsf (id, lev, tot_weight, tot_profit, path) AS (
SELECT id, 0, item_weight, item_profit, To_Char (id)
FROM items
UNION ALL
SELECT n.id,
r.lev + 1,
r.tot_weight + n.item_weight,
r.tot_profit + n.item_profit,
r.path || ',' || To_Char (n.id)
FROM rsf r
JOIN items n
ON n.id > r.id
AND r.tot_weight + n.item_weight <= 100
)
, paths_ranked AS (
SELECT tot_weight, tot_profit, path,
Dense_Rank () OVER (ORDER BY tot_profit DESC) rnk_profit,
lev
FROM rsf
)
SELECT tot_weight tot_weight,
tot_profit tot_profit,
path path,
(lev + 1) n_items
FROM paths_ranked
WHERE rnk_profit = 1
ORDER BY tot_weight DESC
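The difference between the two queries is easiest to see when there are ties for the best profit. The Python sketch below mimics both on the rows of the four-item example with weight limit 9 (the queries above use a limit of 100). One point worth noting, and an easy one to overlook, is that KEEP evaluates each aggregate independently over the top-ranked rows, so with ties the output columns are not guaranteed to come from the same row. The function names are hypothetical.

```python
# (tot_weight, tot_profit, path, lev) rows from the recursion for the four-item example
ROWS = [
    (3, 10, "1", 0), (7, 30, "1,2", 1), (8, 40, "1,3", 1), (9, 50, "1,4", 1),
    (4, 20, "2", 0), (9, 50, "2,3", 1), (5, 30, "3", 0), (6, 40, "4", 0),
]

def keep_last(rows):
    """Mimic Max (col) KEEP (DENSE_RANK LAST ORDER BY tot_profit): one output row,
    with each column maximised separately over the rows having the top profit."""
    top = max(r[1] for r in rows)
    tied = [r for r in rows if r[1] == top]
    return (max(r[0] for r in tied), top, max(r[2] for r in tied), max(r[3] for r in tied) + 1)

def rank_first(rows):
    """Mimic the Dense_Rank filter: every row with the top profit, heaviest first."""
    top = max(r[1] for r in rows)
    tied = [(r[0], r[1], r[2], r[3] + 1) for r in rows if r[1] == top]
    return sorted(tied, key=lambda r: -r[0])

print(keep_last(ROWS))
print(rank_first(ROWS))
```

Here the KEEP form returns a single row with path 2,3, while the ranking form returns both optimal paths, 1,4 and 2,3.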

Query Structure Diagram

It's worth noting that Oracle's proprietary recursive syntax, Connect By, cannot be used in this way because of the need to accumulate weights forward through the recursion. The new ANSI syntax is only available from v11.2 though, and I thought it might be interesting to implement a solution in PL/SQL that would work in earlier versions, following a similar algorithm, again with recursion.

PL/SQL Recursive Solution

This version takes the form of a pipelined function, so that it can be called from SQL and compared with the SQL solutions.
SQL

SELECT COLUMN_VALUE sol
FROM TABLE (Packing_PLF.Best_Fits (100))
ORDER BY COLUMN_VALUE

Package

CREATE OR REPLACE PACKAGE BODY Packing_PLF IS

PROCEDURE Write_Log (p_line VARCHAR2) IS
BEGIN
NULL;
END Write_Log;

FUNCTION Best_Fits (p_weight_limit NUMBER) RETURN SYS.ODCIVarchar2List PIPELINED IS

TYPE item_type IS RECORD (
item_id                 PLS_INTEGER,
item_index_parent       PLS_INTEGER,
weight_to_node          NUMBER);
TYPE item_tree_type IS        TABLE OF item_type;
g_solution_list               SYS.ODCINumberList;

g_timer                       PLS_INTEGER := Timer_Set.Construct ('Pipelined Recursion');

i                             PLS_INTEGER := 0;
j                             PLS_INTEGER := 0;
g_item_tree                   item_tree_type;
g_item                        item_type;
l_weight                      PLS_INTEGER;
l_weight_new                  PLS_INTEGER;
l_best_profit                 PLS_INTEGER := -1;
l_sol                         VARCHAR2(4000);
l_sol_cnt                     PLS_INTEGER := 0;

FUNCTION Add_Node (p_item_id               PLS_INTEGER,
p_item_index_parent     PLS_INTEGER,
p_weight_to_node        NUMBER) RETURN PLS_INTEGER IS
BEGIN

g_item.item_id := p_item_id;
g_item.item_index_parent := p_item_index_parent;
g_item.weight_to_node := p_weight_to_node;
IF g_item_tree IS NULL THEN

g_item_tree := item_tree_type (g_item);

ELSE

g_item_tree.Extend;
g_item_tree (g_item_tree.COUNT) := g_item;

END IF;
RETURN g_item_tree.COUNT;

END Add_Node;

PROCEDURE Do_One_Level (p_tree_index PLS_INTEGER, p_item_id PLS_INTEGER, p_tot_weight PLS_INTEGER, p_tot_profit PLS_INTEGER) IS

CURSOR c_nxt IS
SELECT id, item_weight, item_profit
FROM items
WHERE id > p_item_id
AND item_weight + p_tot_weight <= p_weight_limit;
l_is_leaf           BOOLEAN := TRUE;
l_index_list        SYS.ODCINumberList;

BEGIN

FOR r_nxt IN c_nxt LOOP
Timer_Set.Increment_Time (g_timer,  'Do_One_Level/r_nxt');

l_is_leaf := FALSE;
Do_One_Level (Add_Node (r_nxt.id, p_tree_index, r_nxt.item_weight + p_tot_weight), r_nxt.id, p_tot_weight + r_nxt.item_weight, p_tot_profit + r_nxt.item_profit);
Timer_Set.Increment_Time (g_timer,  'Do_One_Level/Do_One_Level');

END LOOP;

IF l_is_leaf THEN

IF p_tot_profit > l_best_profit THEN

g_solution_list := SYS.ODCINumberList (p_tree_index);
l_best_profit := p_tot_profit;

ELSIF p_tot_profit = l_best_profit THEN

g_solution_list.Extend;
g_solution_list (g_solution_list.COUNT) := p_tree_index;

END IF;

END IF;
Timer_Set.Increment_Time (g_timer,  'Do_One_Level/leaves');

END Do_One_Level;

BEGIN

FOR r_itm IN (SELECT id, item_weight, item_profit FROM items) LOOP

Timer_Set.Increment_Time (g_timer,  'Root fetches');
Do_One_Level (Add_Node (r_itm.id, 0, r_itm.item_weight), r_itm.id, r_itm.item_weight, r_itm.item_profit);

END LOOP;

FOR i IN 1..g_solution_list.COUNT LOOP

j := g_solution_list(i);
l_sol := NULL;
l_weight := g_item_tree (j).weight_to_node;
WHILE j != 0 LOOP

l_sol := l_sol || g_item_tree (j).item_id || ', ';
j :=  g_item_tree (j).item_index_parent;

END LOOP;
l_sol_cnt := l_sol_cnt + 1;
PIPE ROW ('Solution ' || l_sol_cnt || ' (profit ' || l_best_profit || ', weight ' || l_weight || ') : ' || RTrim (l_sol, ', '));

END LOOP;

Timer_Set.Increment_Time (g_timer,  'Write output');
Write_Log ('Profit ' || l_best_profit || ' has ' || l_sol_cnt || ' solutions...');
Timer_Set.Write_Times (g_timer);

EXCEPTION
WHEN OTHERS THEN
Timer_Set.Write_Times (g_timer);
RAISE;

END Best_Fits;

END Packing_PLF;
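The data structure used by the package may be easier to follow in a compact form. The Python sketch below follows the same scheme under stated assumptions: every node visited is appended to a list together with the index of its parent, the indexes of the best leaves found so far are remembered, and each solution is rebuilt at the end by walking the parent indexes back to the root. It uses the four-item example data inferred earlier and omits the timing calls; it is an illustration, not a port of the PL/SQL.

```python
# Item id -> (weight, profit), as in the four-item example
ITEMS = {1: (3, 10), 2: (4, 20), 3: (5, 30), 4: (6, 40)}

def best_fits(items, weight_limit):
    tree = [None]  # index 0 is reserved, so a parent index of 0 means "no parent"
    best = {"profit": -1, "leaves": []}

    def add_node(item_id, parent_index, weight_to_node):
        tree.append((item_id, parent_index, weight_to_node))
        return len(tree) - 1

    def do_one_level(tree_index, item_id, tot_weight, tot_profit):
        is_leaf = True
        for nxt_id in sorted(items):
            w, p = items[nxt_id]
            if nxt_id > item_id and tot_weight + w <= weight_limit:
                is_leaf = False
                child = add_node(nxt_id, tree_index, tot_weight + w)
                do_one_level(child, nxt_id, tot_weight + w, tot_profit + p)
        if is_leaf:
            if tot_profit > best["profit"]:
                best["profit"], best["leaves"] = tot_profit, [tree_index]
            elif tot_profit == best["profit"]:
                best["leaves"].append(tree_index)

    for item_id in sorted(items):
        w, p = items[item_id]
        do_one_level(add_node(item_id, 0, w), item_id, w, p)

    solutions = []
    for leaf in best["leaves"]:
        j, ids = leaf, []
        while j != 0:  # walk from the leaf back to the root
            ids.append(tree[j][0])
            j = tree[j][1]
        solutions.append((best["profit"], tree[leaf][2], ids))
    return solutions

print(best_fits(ITEMS, 9))
```

Because the walk runs from leaf to root, the item ids in each solution come out in descending order, as they do in the strings piped by the PL/SQL function.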


Performance Analysis

It will be interesting to see how the solution methods perform as problem size varies, and we will use my own performance benchmarking framework to do this. As the framework is designed to compare performance of SQL queries, I have converted the PL/SQL solution to operate as a pipelined function, and thus be callable from SQL, as noted above. I included a version of the SQL solution, with the leaf filtering mentioned above, XKPLV - this was based on XKEEP, with filtering as in the OTN thread.

Test Data Sets

Test data sets are generated as follows, in terms of two integer parameters, w and d:

Insert w items with sequential ids, and random weights and profits in the ranges 0-d and 0-1000, respectively, via Oracle's function DBMS_Random.Value. The maximum weight is fixed at 100.

Test Results

The embedded Excel file below summarises the results obtained over a grid of data points, with w in (12, 14, 16, 18, 20) and d in (16, 18, 20).

Notes

• The two versions of the non-leaf SQL solution take pretty much the same time to execute, and are faster than the others
• The leaf version of the SQL solution (XKPLV) is slower than the non-leaf versions, and becomes much worse in terms of elapsed time for the more difficult problems; the step-change in performance can be seen to be due to its greater memory usage, which spills to disk above a certain level
• The pipelined function solution is significantly slower than the other solutions in terms of CPU time, and elapsed time, except in the case of the leaf SQL solution when that solution's memory usage spills to disk. The pipelined function continues to use more memory as the problem difficulty rises, until all available memory is consumed, when it throws an error (but this case is not included in the result set above)

Conclusions

Oracle's v11.2 implementation of the ANSI SQL feature recursive subquery factoring provides a simple solution for the knapsack problem that cannot be achieved with Oracle's older Connect By syntax alone.

The method has been described here in its exact form that is viable only for small problems; however, I have later extended the approach to allow for search limiting and have shown this to be viable for larger problems.

# A Layered Approach To Processing XML Web Services

As explained in an earlier post, Data Modelling XML SOAP Documents, I have an approach to calling web services that involves the use of a generic layer that lies between the client applications and low-level APIs for HTTP calls and XML processing. The earlier post introduces the subject, and deals with the data modelling aspects. This post gives high-level, largely diagrammatic, design information for my PL/SQL implementation of the approach. I expect to post on examples of use and results at a later date.

Layer Diagram

External Call Structure

External Call Structure Diagram

External Call Procedures

Web Service Call

Web Service Call Structure Diagram

Web Service Call Structure Procedures

| Type | Procedure | Description |
|---|---|---|
| Custom Procedures | Call Web Service | Coordinating procedure for the web service call. Note that both request and response writing and reading calls are within loops as the messages can be more than the HTTP maximum chunk size of 32767 bytes |
| Custom Procedures | Expand Element (Request) | Recursive procedure to create the XML SOAP request from the XML Tree array and other inputs |
| Custom Procedures | Delim Field | Formats an XML element within its tags |
| Custom Procedures | Expand Element (Response) | Recursive procedure to convert the initial form of Group Structure Tree List by Parent into the nested form, Group Structure Tree List, used by later processing |
| Oracle Built-in Packages | UTL_HTTP | Oracle HTTP package used to make the HTTP request and read the response |
| Oracle Built-in Packages | DBMS_LOB | Oracle ‘large object’ package used for processing CLOB variables for the full request and response, passed in 32767-byte chunks in the HTTP calls |
| Oracle Built-in Packages | DBMS_XMLDOM | Oracle XML package used to create an XML document and node from the response |
| Oracle Built-in Packages | XMLTYPE | Oracle XML package used to create a variable of XML type for passing to the above package |
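To make the roles of three of these procedures more concrete, here is a small language-neutral sketch in Python rather than PL/SQL. Everything in it is an assumption made for illustration: the tree is represented as nested (tag, content) tuples, no XML escaping or namespaces are handled, and the chunking counts characters rather than bytes. It shows only the general shape of wrapping a value in its tags, expanding a tree recursively, and passing a long message in pieces of at most 32767.

```python
CHUNK_SIZE = 32767  # maximum chunk size quoted in the table above

def delim_field(tag: str, value: str) -> str:
    """Format one XML element within its tags (no escaping, for brevity)."""
    return f"<{tag}>{value}</{tag}>"

def expand_element(node) -> str:
    """Recursively expand a (tag, content) node; content is text or a list of child nodes."""
    tag, content = node
    if isinstance(content, list):
        return delim_field(tag, "".join(expand_element(child) for child in content))
    return delim_field(tag, content)

def chunks(message: str, size: int = CHUNK_SIZE):
    """Yield successive pieces of the message, each at most `size` characters long."""
    for start in range(0, len(message), size):
        yield message[start:start + size]

request = expand_element(("a", [("b", "1"), ("c", [("d", "2")])]))
print(request)
print([len(piece) for piece in chunks("x" * 70000)])
```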

Populate Tree Call

Populate Tree Call Structure Diagram

Populate Tree Procedures

| Type | Procedure | Description |
|---|---|---|
| Custom Procedures | Populate Tree | Main procedure for populating Data Tree List array. First an attempt is made to populate the output tree specified by the input Group Structure Tree List array; if this returns an error, then a second call is made to populate the output tree specified by the standard error group structure; sometimes this too can fail, if the HTTP response is not the expected SOAP message, and this will also be trapped and returned as an error message variable |
| Custom Procedures | Check Fault | Resets the input Group Structure Tree List array to match the standard SOAP error structure and calls the next procedure to populate the corresponding output tree |
| Custom Procedures | Populate Specific Tree | Populates the output tree specified by the current Group Structure Tree List array: this may be either that specified by the client application, or the standard error structure |
| Custom Procedures | Populate Tree Record | Recursive procedure to build the output Data Tree List array using Oracle’s XML APIs |
| Oracle Built-in Packages | DBMS_XMLDOM | ‘The DBMS_XMLDOM package is used to access XMLType objects, and implements the Document Object Model (DOM), an application programming interface for HTML and XML documents’ - Oracle® Database PL/SQL Packages and Types Reference, v11.2 |
| Oracle Built-in Packages | DBMS_XSLPROCESSOR | ‘The DBMS_XSLPROCESSOR package provides an interface to manage the contents and structure of XML documents’ - Oracle® Database PL/SQL Packages and Types Reference, v11.2 |
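The fallback behaviour described for Populate Tree can be sketched as follows, again in Python for illustration only. The sketch simplifies the flow (it parses once and then looks first for the expected group, then for a fault), and the function names, the dictionary result, and the matching of elements by local name are all assumptions rather than features of the PL/SQL implementation.

```python
import xml.etree.ElementTree as ET

def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]

def find_first(root, name):
    """Return the first element whose local name matches, or None."""
    for element in root.iter():
        if local_name(element.tag) == name:
            return element
    return None

def populate_tree(response_text: str, expected_group: str) -> dict:
    try:
        root = ET.fromstring(response_text)
    except ET.ParseError as err:
        # The response was not XML at all, so it cannot be the expected SOAP message
        return {"status": "error", "message": f"Response is not valid XML: {err}"}
    group = find_first(root, expected_group)
    if group is not None:
        return {"status": "ok", "data": {local_name(c.tag): c.text for c in group}}
    fault = find_first(root, "Fault")  # fall back to the standard error structure
    if fault is not None:
        return {"status": "fault", "data": {local_name(c.tag): c.text for c in fault}}
    return {"status": "error", "message": "Neither the expected group nor a SOAP fault was found"}

ok_xml = "<Envelope><Body><GetResp><name>x</name></GetResp></Body></Envelope>"
fault_xml = ("<Envelope><Body><Fault><faultcode>c</faultcode>"
             "<faultstring>s</faultstring></Fault></Body></Envelope>")
print(populate_tree(ok_xml, "GetResp"))
print(populate_tree(fault_xml, "GetResp"))
print(populate_tree("not xml", "GetResp")["status"])
```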

# Data Structure Diagramming

Like many SQL developers I have always used entity-relationship diagrams to help in writing queries, and would extract sections to document them. Some years ago, however, I realised that having a single static diagram was not sufficient for complex queries with large numbers of tables, structures such as inline views, and multiple table instances. I therefore developed a diagram-based design methodology that I published in May 2009 on scribd. Since then I have extended the ideas in that approach to develop diagrams to cover various additional structures in SQL and in other areas. These diagrams were developed as needed for particular scenarios and have been published in several documents on scribd. I thought it would be a good idea to bring them together in one place, namely here, with example diagrams and the scribd document embedded thereafter. [Incidentally, I wonder what readers make of this 8-dimensional document structure?]

I would categorise them under four headings:

• Entity-Relationship Diagrams
• Structured Design Methodology
• SQL Special Structures
• Object Structures

Entity-Relationship Diagrams
Oracle Spatial Schema
The embedded document below also includes an ERD of the much simpler HR schema, but this one is more interesting as it shows extensive use of subtypes. The document is concerned with networks and I superimposed tree and non-tree network links on the diagram.

Oracle Customer Model and Multi-Org
Here I used shading to distinguish between org-striped, org-linked (my term) and other entities.

Structured Design Methodology
The methodology involves a sequence of diagrams and tables, so I have not extracted a diagram in this case.

SQL Special Structures
Multiple Table Instances with Scalar Subqueries in Where Clause
Subquery Factor

Selecting Database Function

Selecting Scalar Subqueries

Nested Analytics Subqueries

Model Clause

Recursive Subquery Factor

Object Structures
I use a different type of diagram for object structures from those for SQL and ERDs, and it's intended to be very general, being independent of programming language and applicable to any object structure, allowing arbitrary nesting of array and record types.
Code Timer Object
This object was implemented in three languages: Oracle, Perl and Java.

Excel Array Object
This object was implemented in Perl.