Knapsacks and Networks in SQL

I opened a GitHub account, Brendan’s GitHub Page last year and have added a number of projects since then, in PL/SQL and other 3GL languages. Partly in response to a request for the code for one of my blog articles on an interesting SQL problem, I decided recently to create a new repo for the SQL behind a group of articles on solving difficult combinatorial optimisation problems via ‘advanced’ SQL techniques such as recursive subquery factoring and model clause, sql_demos – Brendan’s repo for interesting SQL. It includes installation scripts with object creation and data setup, and scripts to run the SQL on the included datasets. The idea is that anyone with the pre-requisites should be able to reproduce my results within a few minutes of downloading the repo.

[Left image from Knapsack problem; right image copied from Chapter 11 Dynamic Programming]

In this article I embed each of the earlier articles relevant to the GitHub repo with a brief preamble.

The first two articles are from January 2013 and use recursive subquery factoring to find exact solutions for the single and multiple knapsack problem, and also include PL/SQL solutions for comparison. They avoid the ‘brute force’ approach by truncating search paths as soon as limit constraints are exceeded. The cumulative paths are stored in string variables passed through the iterations (which would not be possible with the older Connect By hierarchical syntax).

In these articles I illustrate the nature of the problems using Visio diagrams, and include dimensional performance benchmarking results, using a technique that I presented on at last year’s Ireland OUG conference: Dimensional Performance Benchmarking of SQL – IOUG Presentation. I also illustrate the queries using my own method for diagramming SQL queries.

A Simple SQL Solution for the Knapsack Problem (SKP-1), January 2013

An SQL Solution for the Multiple Knapsack Problem (SKP-m), January 2013

The next article uses Model clause to find a more general solution to a problem posed on AskTom, as a ‘bin fitting’ problem. I also solved the problem by other methods including recursive subquery factoring. I illustrate the problem itself, as well as the Model iteration scheme using Visio diagrams, and again include dimensional performance benchmarking. The results show how quadratic performance variation can be turned into much faster linear variation by means of a temporary table in this kind of problem.

SQL for the Balanced Number Partitioning Problem, May 2013

This article arose from a question on OTN, and concerns a type of knapsack or bin-fitting problem that is quite tricky to solve in SQL, where the items fall into categories on which there are separate constraints. I introduced a new idea here, to filter out unpromising paths within recursive subquery factoring by means of analytic functions, in order to allow the technique to be used to generate solutions for larger problems without guaranteed optimality, but in shorter time. Two realistic datasets were used, one from the original poster, and another I got from a scraping website.

SQL for the Fantasy Football Knapsack Problem, June 2013

This article is on a classic ‘hard’ optimisation problem, and uses recursive subquery factoring with the same filtering technique as the previous article, and shows that it’s possible to solve a problem involving 312 American cities quite quickly in pure SQL using the approximation technique. It also uses a simple made-up example dataset to illustrate its working.

SQL for the Travelling Salesman Problem, July 2013

The following two articles concern finding shortest paths between given nodes in a network, and arose from a question on OTN. The first one again uses recursive subquery factoring with a filtering mechanism to exclude paths as early as possible, in a similar way to the approximative solutios methods in the earlier articles. In this case, however, reasoning about the nature of the problem shows that we are not in fact sacrificing optimality. The article has quite a lot of explanatory material on how the SQL works, and uses small dataset examples.

The second article considers how to improve performance further by obtaining a preliminary approximate solution that can be used as a bounding mechanism in a second step to find the exact solutions. This article uses two realistic networks as examples, including one having 428,156 links.

SQL for Shortest Path Problems, April 2015

SQL for Shortest Path Problems 2: A Branch and Bound Approach, May 2015

In the article above I cited results from a general network analysis package I had developed that obtains all the distinct connected subnetworks with their structures in an efficient manner using PL/SQL recursion. It is worth noting that for that kind of problem recursive SQL alone is very inefficient, and I wrote the following article to try to explain why that is so, and why the Connect By syntax is generally much worse than recursive subquery factoring.

Recursive SQL for Network Analysis, and Duality, September 2015

The PL/SQL package mentioned, which I think implements a ‘named’ algorithm although I didn’t know that when I wrote it (I don’t recall the name right now, sorry 🙁 ), is available on GitHub: Brendan’s network structural analysis Oracle package, with article:

PL/SQL Pipelined Function for Network Analysis, May 2015






A Generic Unix Script for Uploading eBusiness Concurrent Programs

I have posted a couple of articles recently on XML Publisher report development within Oracle eBusiness applications (A Design Pattern for Oracle eBusiness Audit Trail Reports with XML Publisher and Design Patterns for Database Reports with XML Publisher and Email Bursting). These reports are of one of several types of batch program, or concurrent program that can be defined within Oracle eBusiness.

Oracle eBusiness uses a number of metadata tables to store information on the programs, and in release 11.5 Oracle introduced a Unix utility called FNDLOAD to download the metadata to formatted text files to allow them to be uploaded via the same utility into downstream environments. At that time batch reports were generally developed in the Oracle Reports tool and typically there might only be two text files (known as LDT files after their extension), for the program and for associated value sets, and maybe one for request groups (which control access to the reports). The executable report file, the RDF, would just be copied to the target environment server directory. I wrote a set of wrapper scripts for the utility to streamline its use and to deal with a number of issues, including handling of audit fields, with a structure consisting of a separate pair of scripts for download and upload of each type of LDT file. I published these on Scribd in July 2009.

In working more recently with Oracle’s successor reporting tool, XML Publisher, I found that the number of objects involved in installation has increased substantially. As well as the LDT files there are also now XML configuration files and RTF (usually) templates, uploaded via a Java utility, XDOLoader. Installation also involves at least one PL/SQL package. For example the email version of my model reports (see the second link above) had 11 configuration files. For this reason, I decided to create a single script that would copy, upload and install all objects required for a concurrent program of any type, and I describe the script in this article.

Here is my original Scribd article, describing the issues mentioned, and with the original individual upload and download scripts.

Loading...

The new script is here XX_Install_XMLCP_ksh, and here is the MD120 from the model article that uses it.

Loading...

Directory Structure

The script operates on an input TAR file containing all the necessary files. The TAR file is called prog.tar after the program short name prog, and contains a directory also called prog with all the installation files. The installer uses relative paths and assumes that the TAR file, with the installer, is placed in a directory below the custom top directory, say $XX_TOP/rel. All loader files are copied to $XX_TOP/import and SQL files to $XX_TOP/install/sql.

After installation, a new directory will remain with all the installation files for reference, $XX_TOP/rel/prog.

Program Structure

The internal call structure of the Unix script is shown below.
Install CP - CSD

The script operates on an input TAR file containing all the necessary files.

After extracting the TAR file, the script has local subroutines that can be divided into four categories, as above:

  1. Preliminaries – parameter processing, file moving and validation
  2. FNDLOAD uploads – uploading the LDT files
  3. XDOLoader uploads – uploading the data and any bursting XML files, and all layout templates present
  4. SQL installations – installing any SQL script present, and the package spec and body files

The following table gives a few notes on the main program and preliminary subroutines.
Preliminaries Summary
Prelims Table
The following table gives a few notes on the upload and SQL subroutines.
Upload and SQL Summary
Install Uploads Table
The upload subroutines all have SQL queries for verification, and here is a sample log file from the script: XX_ERPXMLCP_EM_log

The remainder of the article lists the queries with diagrams and examples of output.

Upload Subroutines

upload_ag

Validation Query

SELECT app_g.application_short_name "App G", fag.group_name "Group",
      fag.description "Description",
      app_t.application_short_name "App T", ftb.table_name "Table",
      fcl.column_name "Column"
  FROM fnd_audit_groups fag
  JOIN fnd_application app_g
    ON app_g.application_id = fag.application_id
  JOIN fnd_audit_tables fat
    ON fat.audit_group_app_id = fag.application_id
   AND fat.audit_group_id = fag.audit_group_id
  JOIN fnd_application app_t
    ON app_t.application_id = fat.table_app_id
  JOIN fnd_tables ftb
    ON ftb.application_id = fat.table_app_id
   AND ftb.table_id = fat.table_id
  JOIN fnd_audit_columns fac
    ON fac.table_app_id = fat.table_app_id
   AND fac.table_id = fat.table_id
  JOIN fnd_columns fcl
    ON fcl.application_id = fac.table_app_id
   AND fcl.table_id = fac.table_id
   AND fcl.column_id = fac.column_id
 WHERE fag.last_update_date     = To_Date ('$sysdate', 'YYYY/MM/DD')
   AND fac.schema_id            = 900
ORDER BY app_g.application_short_name, fag.group_name, 
         app_t.application_short_name, ftb.table_name,
         fcl.column_name;

QSD

Install CP - AG - 1

Example Output
No example available here, but the headings are:
“App G”, “Group”, “Description”, “App T”, “Table”, “Column”

upload_ms

Validation Query

SELECT mes.message_name "Name", mes.message_text "Text"
  FROM fnd_new_messages mes
 WHERE mes.last_update_date     = To_Date ('$sysdate', 'YYYY/MM/DD')
 ORDER BY 1;

QSD

Install CP - Message

Example Output

Name                     Text
--------------------- ---------------------------------------------
XX_ERPXMLCP_EM_NONXML Non-XML Concurrent Program &PROGAPP

upload_vs

Validation Query

SELECT fvs.flex_value_set_name "Value Set", Count(fvl.flex_value_set_id) "Values"
  FROM fnd_flex_value_sets fvs, fnd_flex_values fvl
 WHERE fvs.last_update_date     = To_Date ('$sysdate', 'YYYY/MM/DD')
   AND fvl.flex_value_set_id(+) = fvs.flex_value_set_id
 GROUP BY fvs.flex_value_set_name;

QSD

Install CP - VS

Example Output

Value Set             Values
--------------------- ------
XX_PROGS                   0
XX_APPNAME_ID              0

upload_cp

Validation Query

SELECT prg.user_concurrent_program_name || ': ' || prg.concurrent_program_name "Program", fcu.column_seq_num || ': ' || fcu.end_user_column_name "Parameter"
  FROM fnd_concurrent_programs_vl               prg
  LEFT JOIN fnd_descr_flex_column_usages      fcu
    ON fcu.descriptive_flexfield_name         = '\$SRS\$.' || prg.concurrent_program_name
   AND fcu.descriptive_flex_context_code      = 'Global Data Elements'
 WHERE prg.concurrent_program_name              = '$cp_name'
 ORDER BY 1, 2;

QSD

Install CP - CP

Example Output

Program                                    Parameter
------------------------------------------ ------------------------
XX Example XML CP (Email): XX_ERPXMLCP_EM  100: Cc Email
                                           10: Application
                                           20: From Program
                                           30: To Program
                                           40: From Date
                                           50: To Date
                                           60: From Parameter Count
                                           70: To Parameter Count
                                           80: Override Email
                                           90: From Email

upload_rga

Validation Query

SELECT rgp.request_group_name "Request Group",
       app.application_short_name "App"
  FROM fnd_concurrent_programs          cpr
  JOIN fnd_request_group_units          rgu
    ON rgu.unit_application_id          = cpr.application_id
   AND rgu.request_unit_id              = cpr.concurrent_program_id
  JOIN fnd_request_groups               rgp
    ON rgp.application_id               = rgu.application_id
   AND rgp.request_group_id             = rgu.request_group_id
  JOIN fnd_application                  app
    ON app.application_id               = rgp.application_id
 WHERE cpr.concurrent_program_name      = '$cp_name'
 ORDER BY 1;

QSD

Install CP - RGA

Example Output

Request Group                  App
------------------------------ ----------
System Administrator Reports   FND

upload_dd

Validation Query

SELECT xdd.data_source_code "Code", xtm.default_language "Lang", xtm.default_territory "Terr"
  FROM xdo_ds_definitions_b xdd
  LEFT JOIN xdo_templates_b xtm
    ON xtm.application_short_name       = xdd.application_short_name
   AND xtm.data_source_code             = xdd.data_source_code
 WHERE xdd.data_source_code             = '$cp_name'
 ORDER BY 1, 2, 3;

QSD

Install CP - DD

Example Output

Code                 Lang Terr
-------------------- ---- ----
XX_ERPXMLCP_EM       en   US

upload_all_temps

Validation Query

SELECT xdd.data_source_code "Code", 
        xlb_d.file_name "Data Template",
        xlb_b.file_name "Bursting File",
        xtm.template_code "Template",
        xlb.language "Lang", 
        xlb.territory "Terr",
        xlb.file_name || 
               CASE
               WHEN xlb.language = xtm.default_language AND
                    xlb.territory = xtm.default_territory
               THEN '*' END "File"
  FROM xdo_ds_definitions_b             xdd
  LEFT JOIN xdo_lobs                    xlb_d
    ON xlb_d.application_short_name     = xdd.application_short_name
   AND xlb_d.lob_code                   = xdd.data_source_code
   AND xlb_d.lob_type                   = 'DATA_TEMPLATE'
  LEFT JOIN xdo_lobs                    xlb_b
    ON xlb_b.application_short_name     = xdd.application_short_name
   AND xlb_b.lob_code                   = xdd.data_source_code
   AND xlb_b.lob_type                   = 'BURSTING_FILE'
  LEFT JOIN xdo_templates_b             xtm
    ON xtm.application_short_name       = xdd.application_short_name
   AND xtm.data_source_code             = xdd.data_source_code
  LEFT JOIN xdo_lobs                    xlb
    ON xlb.application_short_name       = xtm.application_short_name
   AND xlb.lob_code                     = xtm.template_code
   AND xlb.lob_type                     LIKE 'TEMPLATE%'
 WHERE xdd.data_source_code             = '$cp_name'
   AND xdd.application_short_name       = '$app'
 ORDER BY 1, 2, 3, 4;

QSD

Install CP - Template

Example Output

Code            Data Template        Bursting File          Template             Lang Terr File
--------------- -------------------- ---------------------- -------------------- ---- ---- -------------------------
XX_ERPXMLCP_EM  XX_ERPXMLCP_EM.xml   XX_ERPXMLCP_EM_BUR.xml XX_ERPXMLCP_EM       en   US   XX_ERPXMLCP_EM.xsl*
                                                                                           XX_ERPXMLCP_EM.rtf*
                                                            XX_ERPXMLCP_EM_XML   en   US   XX_ERPXMLCP_EM_XML.xsl*
                                                                                           XX_ERPXMLCP_EM_XML.rtf*






Query Query Query

In my last post, A Design Pattern for Oracle eBusiness Audit Trail Reports with XML Publisher, I described a database report module developed in Oracle’s XML Publisher tool. Of the report structure I wrote:

It has a master entity with two independent detail entities, and therefore requires a minimum of two queries.

But why does such a structure require two queries? And can we determine the minimum number of queries for reports in general? To start with, let’s define a report in this context as being a hierarchy of record groups, where:

  • a record group is a set of records having the same columns with (possibly) differing values
  • each group is linked to a single record in its (single) parent group by values in the parent record, except the top level (or root) group

For example, in the earlier post the root group is a set of bank accounts, with the two detail (or child) groups being the set of owners of the bank account and the set of audit records for the bank account parent record. Corresponding to this group structure, each bank account record is the root of the data hierarchies, comprising two sets of records below the bank account record, one for the owners and one for the audit records linked to the root record by the bank account id.

A (relational) query always returns a flat record set, and it’s this fact that determines the minimum number of queries required for a given group structure. A master-detail group structure can be flattened in the query by simply copying master fields on to the child record sets. The set cardinality is then the cardinality of the child set. The report designer uses their chosen reporting tool to specify display of the queried data in either flat, or in master-detail format.

In fact this approach works for any number of child levels, with the query cardinality being the number of bottom level descendants (using null records for potential parents that are in fact childless). It’s clear though that the approach will not work for any second child at the same level because there would be two cardinalities and no meaningful single record for both child groups could be constructed within a flat query.

This reasoning leads to the conclusion that the minimum number of queries required in general is equal to the number of groups minus the number of parent groups.

Query Query Query - example

In the earlier post I also stated:

This minimum number of queries is usually the best choice…

There are two main reasons for this:

  • each child query fires for every record returned by its parent, with associated performance impact
  • maintenance tends to be more difficult with extra queries; this is much worse when the individual groups, which should almost always be implemented by a maximum of one query each, are split, and then need to be joined back together procedurally

On thinking about this, it occurred to me that if the group structure were defined in a metadata table we might be able to return minimum query structures using an SQL query. Just one, obviously 🙂 . To save effort we could use Oracle’s handy HR demo schema with the employee hierarchy representing groups.

The remainder of this article describes the query I came up with. As it’s about hierarchies, recursion is the technique to use, and this is one of those cases where Oracle’s old tree-walk syntax is too limited, so I am using the Oracle 11.2 recursive subquery factoring feature. This approach is reminiscent of strategies recommended by Invest Diva for insightful analysis in financial contexts.

The query isn’t going to be of practical value for report group structures since these are always quite small in size, but I expect there are different applications where this kind of Primogeniture Recursion would be useful.

Query Groups Query – Primogeniture Recursion

Query Structure Diagram
Query Query Query

SQL

WITH rsf (last_name, employee_id, lev, part_id, manager_id) AS (
SELECT last_name, employee_id, 0, employee_id, To_Number(NULL)
  FROM employees
 WHERE manager_id IS NULL
UNION ALL
SELECT e.last_name, e.employee_id, r.lev + 1, 
       CASE WHEN Row_Number() OVER (PARTITION BY r.employee_id ORDER BY e.last_name) = 1 THEN r.part_id ELSE e.employee_id END,
       e.manager_id
  FROM rsf r
  JOIN employees e
    ON e.manager_id = r.employee_id
)
SELECT part_id, LPad ('.', lev) || last_name last_name, employee_id, 
       Count(DISTINCT part_id) OVER () "#Partitions",
       Count(DISTINCT manager_id) OVER () "+ #Parents",
       Count(*) OVER () "= #Records"
  FROM rsf
 ORDER BY part_id, lev, last_name

Query Output

   PART_ID LAST_NAME            EMPLOYEE_ID #Partitions + #Parents = #Records
---------- -------------------- ----------- ----------- ---------- ----------
       100 King                         100          89         18        107
           .Cambrault                   148          89         18        107
            .Bates                      172          89         18        107
       101 .Kochhar                     101          89         18        107
            .Baer                       204          89         18        107
       102 .De Haan                     102          89         18        107
            .Hunold                     103          89         18        107
             .Austin                    105          89         18        107
       104   .Ernst                     104          89         18        107
       106   .Pataballa                 106          89         18        107
       107   .Lorentz                   107          89         18        107
       108  .Greenberg                  108          89         18        107
             .Chen                      110          89         18        107
       109   .Faviet                    109          89         18        107
       111   .Sciarra                   111          89         18        107
       112   .Urman                     112          89         18        107
       113   .Popp                      113          89         18        107
       114 .Raphaely                    114          89         18        107
            .Baida                      116          89         18        107
       115  .Khoo                       115          89         18        107
       117  .Tobias                     117          89         18        107
       118  .Himuro                     118          89         18        107
       119  .Colmenares                 119          89         18        107
       120 .Weiss                       120          89         18        107
            .Fleaur                     181          89         18        107
       121 .Fripp                       121          89         18        107
            .Atkinson                   130          89         18        107
       122 .Kaufling                    122          89         18        107
            .Chung                      188          89         18        107
       123 .Vollman                     123          89         18        107
            .Bell                       192          89         18        107
       124 .Mourgos                     124          89         18        107
            .Davies                     142          89         18        107
       125  .Nayer                      125          89         18        107
       126  .Mikkilineni                126          89         18        107
       127  .Landry                     127          89         18        107
       128  .Markle                     128          89         18        107
       129  .Bissot                     129          89         18        107
       131  .Marlow                     131          89         18        107
       132  .Olson                      132          89         18        107
       133  .Mallin                     133          89         18        107
       134  .Rogers                     134          89         18        107
       135  .Gee                        135          89         18        107
       136  .Philtanker                 136          89         18        107
       137  .Ladwig                     137          89         18        107
       138  .Stiles                     138          89         18        107
       139  .Seo                        139          89         18        107
       140  .Patel                      140          89         18        107
       141  .Rajs                       141          89         18        107
       143  .Matos                      143          89         18        107
       144  .Vargas                     144          89         18        107
       145 .Russell                     145          89         18        107
            .Bernstein                  151          89         18        107
       146 .Partners                    146          89         18        107
            .Doran                      160          89         18        107
       147 .Errazuriz                   147          89         18        107
            .Ande                       166          89         18        107
       149 .Zlotkey                     149          89         18        107
            .Abel                       174          89         18        107
       150  .Tucker                     150          89         18        107
       152  .Hall                       152          89         18        107
       153  .Olsen                      153          89         18        107
       154  .Cambrault                  154          89         18        107
       155  .Tuvault                    155          89         18        107
       156  .King                       156          89         18        107
       157  .Sully                      157          89         18        107
       158  .McEwen                     158          89         18        107
       159  .Smith                      159          89         18        107
       161  .Sewall                     161          89         18        107
       162  .Vishney                    162          89         18        107
       163  .Greene                     163          89         18        107
       164  .Marvins                    164          89         18        107
       165  .Lee                        165          89         18        107
       167  .Banda                      167          89         18        107
       168  .Ozer                       168          89         18        107
       169  .Bloom                      169          89         18        107
       170  .Fox                        170          89         18        107
       171  .Smith                      171          89         18        107
       173  .Kumar                      173          89         18        107
       175  .Hutton                     175          89         18        107
       176  .Taylor                     176          89         18        107
       177  .Livingston                 177          89         18        107
       178  .Grant                      178          89         18        107
       179  .Johnson                    179          89         18        107
       180  .Taylor                     180          89         18        107
       182  .Sullivan                   182          89         18        107
       183  .Geoni                      183          89         18        107
       184  .Sarchand                   184          89         18        107
       185  .Bull                       185          89         18        107
       186  .Dellinger                  186          89         18        107
       187  .Cabrio                     187          89         18        107
       189  .Dilly                      189          89         18        107
       190  .Gates                      190          89         18        107
       191  .Perkins                    191          89         18        107
       193  .Everett                    193          89         18        107
       194  .McCain                     194          89         18        107
       195  .Jones                      195          89         18        107
       196  .Walsh                      196          89         18        107
       197  .Feeney                     197          89         18        107
       198  .OConnell                   198          89         18        107
       199  .Grant                      199          89         18        107
       200  .Whalen                     200          89         18        107
       201 .Hartstein                   201          89         18        107
            .Fay                        202          89         18        107
       203  .Mavris                     203          89         18        107
       205  .Higgins                    205          89         18        107
             .Gietz                     206          89         18        107

107 rows selected.