I opened a GitHub account, Brendan’s GitHub Page last year and have added a number of projects since then, in PL/SQL and other 3GL languages. Partly in response to a request for the code for one of my blog articles on an interesting SQL problem, I decided recently to create a new repo for the SQL behind a group of articles on solving difficult combinatorial optimisation problems via ‘advanced’ SQL techniques such as recursive subquery factoring and model clause, sql_demos – Brendan’s repo for interesting SQL. It includes installation scripts with object creation and data setup, and scripts to run the SQL on the included datasets. The idea is that anyone with the pre-requisites should be able to reproduce my results within a few minutes of downloading the repo.
[Left image from Knapsack problem; right image copied from Chapter 11 Dynamic Programming]
In this article I embed each of the earlier articles relevant to the GitHub repo with a brief preamble.
The first two articles are from January 2013 and use recursive subquery factoring to find exact solutions for the single and multiple knapsack problem, and also include PL/SQL solutions for comparison. They avoid the ‘brute force’ approach by truncating search paths as soon as limit constraints are exceeded. The cumulative paths are stored in string variables passed through the iterations (which would not be possible with the older Connect By hierarchical syntax).
In these articles I illustrate the nature of the problems using Visio diagrams, and include dimensional performance benchmarking results, using a technique that I presented on at last year’s Ireland OUG conference: Dimensional Performance Benchmarking of SQL – IOUG Presentation. I also illustrate the queries using my own method for diagramming SQL queries.
A Simple SQL Solution for the Knapsack Problem (SKP-1), January 2013
An SQL Solution for the Multiple Knapsack Problem (SKP-m), January 2013
The next article uses Model clause to find a more general solution to a problem posed on AskTom, as a ‘bin fitting’ problem. I also solved the problem by other methods including recursive subquery factoring. I illustrate the problem itself, as well as the Model iteration scheme using Visio diagrams, and again include dimensional performance benchmarking. The results show how quadratic performance variation can be turned into much faster linear variation by means of a temporary table in this kind of problem.
This article arose from a question on OTN, and concerns a type of knapsack or bin-fitting problem that is quite tricky to solve in SQL, where the items fall into categories on which there are separate constraints. I introduced a new idea here, to filter out unpromising paths within recursive subquery factoring by means of analytic functions, in order to allow the technique to be used to generate solutions for larger problems without guaranteed optimality, but in shorter time. Two realistic datasets were used, one from the original poster, and another I got from a scraping website.
This article is on a classic ‘hard’ optimisation problem, and uses recursive subquery factoring with the same filtering technique as the previous article, and shows that it’s possible to solve a problem involving 312 American cities quite quickly in pure SQL using the approximation technique. It also uses a simple made-up example dataset to illustrate its working.
SQL for the Travelling Salesman Problem, July 2013
The following two articles concern finding shortest paths between given nodes in a network, and arose from a question on OTN. The first one again uses recursive subquery factoring with a filtering mechanism to exclude paths as early as possible, in a similar way to the approximative solutios methods in the earlier articles. In this case, however, reasoning about the nature of the problem shows that we are not in fact sacrificing optimality. The article has quite a lot of explanatory material on how the SQL works, and uses small dataset examples.
The second article considers how to improve performance further by obtaining a preliminary approximate solution that can be used as a bounding mechanism in a second step to find the exact solutions. This article uses two realistic networks as examples, including one having 428,156 links.
SQL for Shortest Path Problems, April 2015
In the article above I cited results from a general network analysis package I had developed that obtains all the distinct connected subnetworks with their structures in an efficient manner using PL/SQL recursion. It is worth noting that for that kind of problem recursive SQL alone is very inefficient, and I wrote the following article to try to explain why that is so, and why the Connect By syntax is generally much worse than recursive subquery factoring.
Recursive SQL for Network Analysis, and Duality, September 2015
The PL/SQL package mentioned, which I think implements a ‘named’ algorithm although I didn’t know that when I wrote it (I don’t recall the name right now, sorry 🙁 ), is available on GitHub: Brendan’s network structural analysis Oracle package, with article: