On RDBMS, SQL and the DRY Principle, and Query Networks

I saw a link a week ago or so on my Twitter feed to an article published by one Lance Gutteridge on 1 June 2018: What I’m Telling Business People About Why Relational Databases Are So Bad. The article is written in a inflammatory style, here’s a sample quote:

Relational databases have been the worst technology to ever poison a field of endeavor

He classifies the ‘badness’ in three main categories:

  • SQL Injection
  • SQL “is a total violation of the DRY principle”
  • Object-Relational Impedance Mismatch

In this article I want to briefly discuss his criticisms under each of these categories, and then move on to discuss some interesting features of SQL queries and joins arising from the fact that SQL plainly does NOT violate the DRY principle. I’ll also discuss how the concept of the network, initially applied to table relationships, can be a very useful design concept in both data modelling and query design.

Part I: Comments on the Lance Gutteridge article
SQL Injection
From Wikipedia, SQL injection:

SQL injection is a code injection technique, used to attack data-driven applications, in which nefarious SQL statements are inserted into an entry field for execution (e.g. to dump the database contents to the attacker).

SQL injection has indeed been a real vulnerability for database systems in the past, but it is an avoidable problem today. As the Wikipedia article puts it:

An SQL injection is a well known attack and easily prevented by simple measures.

SQL “is a total violation of the DRY principle”
Dr. Gutteridge notes that relationships are defined in an RDBMS by foreign keys and primary keys on the tables, and that having to make join relations explicitly in SQL is a repetition of information already known, and hence violates the “Don’t Repeat Yourself” principle.

This criticism is easily dealt with: In general the table relationships do not in fact fully determine the joins in a query. A simple, and very common, example arises in order entry systems. Consider the following simplified 3-table data model:

Here we have an order entity with a foreign key link to a customer, and two foreign key links to the address entity. A customer may have multiple addresses that can serve as shipping or billing addresses on any given order. A particular query may require one or other, or both, or neither of the addresses for the order. The primary key/foreign key relationships cannot determine which tables and links to include without the query specifying them.

The usual way to specify this information in ANSI-standard SQL is to use JOIN/ON-clauses like this:

JOIN addresses add_b ON add_b.address_id = ord.billing_address_id

There are also situations in which joins can be expressed more concisely, and we’ll look at some of them in part II, but it’s clear that these clauses do not in any meaningful way violate the DRY principle.

Object-Relational Impedance Mismatch
In one of the few views on which I am inclined to agree with Dr. Gutteridge, he regards the term as “technobabble”, but it does describe a real phenomenon. Dr. Gutteridge expresses it thus:

…the data in a relational database is stored in ways more in keeping with a 1980s programming language than with a modern, object-oriented language

Though this mismatch does exist, it’s unlikely that dropping the relational model is the answer, because it solves a more fundamental problem. An article from 29 November 2017, Important Papers: Codd and the Relational Model, includes the following:

…Codd motivates the search for a better model by arguing that we need “data independence,” which he defines as “the independence of application programs and terminal activities from growth in data types and changes in data representation.” The relational model, he argues, “appears to be superior in several respects to the graph or network model presently in vogue,” partly because, among other benefits, the relational model “provides a means of describing data with its natural structure only.” By this he meant that programs could safely ignore any artificial structures (like trees) imposed upon the data for storage and retrieval purposes only.

I remember when I started my programming career in 1984 most of the work on any application was spent in writing code simply to store and retrieve data in application-specific formats. Within a few years that effort became largely unnecessary with the introduction of the Oracle RDBMS and SQL. Although modern big data requirements mean other approaches to data storage are also needed, the relational model isn’t going away.

In one of the unwitting ironies in Dr. Gutteridge’s article, he states towards the end that:

there are programmers who have never really seen any other kind of database and believe that all databases are relational

while apparently believing that all modern programming language are object-oriented. They aren’t, and while OOP isn’t going away, it has real deficiencies in modelling the real world that have led to growing interest in other paradigms such as functional programming, as well as old fashioned imperative programming. Here’s an interesting review of some of those deficiencies from 23 July 2016:
Goodbye, Object Oriented Programming

Part II: On SQL and DRY – Joins via NATURAL/USING/ON
In this second part we’ll use two subsets of Oracle’s HR demo schema as examples, and we’ll ignore any links in the tables to tables other than those depicted in the ERDs. Let’s see how, in some cases, we can use ANSI join syntax to avoid explicitly listing all the join column names, but that there are drawbacks to doing so.

Tree Data Model – Department 110, Location, Country, Region – NATURAL JOIN
The ERD below shows a simple linear tree structure.

Let’s start by considering a situation where we don’t need to specify the full join clause with fields on both sides.

 DEPARTMENT_NAME                STREET_ADDRESS                           CITY                           COUNTRY_NAME                             REGION_NAME
------------------------------ ---------------------------------------- ------------------------------ ---------------------------------------- -------------------------
Accounting                     2004 Charade Rd                          Seattle                        United States of America                 Americas

  1  SELECT department_name, street_address, city, country_name, region_name
  2    FROM departments
  3      NATURAL JOIN locations
  4      NATURAL JOIN countries
  5      NATURAL JOIN regions
  6*  WHERE department_id = 110

Here in this simple (linear) tree-structured data model we were able to join the three subsequent tables to the driving table, departments, simply by adding the table names after NATURAL JOIN.

So is this a case of the SQL engine reading the data model and constructing the joins without the need for repetition? No, it isn’t. As the documentation tells you, NATURAL JOIN joins by matching fields with the same names on either side. This can be dangerous as the next example shows.

The second example has only two tables, but there is a loop in the structure.

[In the underlying HR schema from which this is extracted there is also a self-join on employees, which we are excluding]
Department 110 employees: NATURAL JOIN gives wrong answer
There are two employees in department 110:

  COUNT(*)
----------
         2

  1  SELECT COUNT(*)
  2    FROM employees
  3*  WHERE department_id = 110

Let’s try to get the employees using NATURAL JOIN, like this:

DEPARTMENT_NAME                LAST_NAME                 FIRST_NAME           MANAGER_ID
------------------------------ ------------------------- -------------------- ----------
Accounting                     Gietz                     William                     205

  1  SELECT department_name, last_name, first_name, manager_id
  2    FROM departments
  3     NATURAL JOIN employees
  4*  WHERE department_id = 110

This returns only one of the two employees because NATURAL JOIN is matching on both department_id and manager_id as they appear in both tables.

Department 110 employees: USING department_id gives right answer
We can get the right answer by joining with the USING keyword, which assumes the column name to join on is the same on both tables, and mentions it explicitly.

DEPARTMENT_NAME                LAST_NAME                 FIRST_NAME
------------------------------ ------------------------- --------------------
Accounting                     Higgins                   Shelley
Accounting                     Gietz                     William

  1  SELECT department_name, last_name, first_name
  2    FROM departments
  3     JOIN employees USING (department_id)
  4*  WHERE department_id = 110

This example shows how USING resolves the earlier NATURAL JOIN error by specifying the field names in common to be used. The next example shows how this does not always work.

Department 110 manager: USING manager_id gives wrong answer

DEPARTMENT_NAME                LAST_NAME                 FIRST_NAME           MANAGER_ID
------------------------------ ------------------------- -------------------- ----------
Accounting                     Gietz                     William                     205

  1  SELECT department_name, last_name, first_name, manager_id
  2    FROM departments dep
  3     JOIN employees USING (manager_id)
  4*  WHERE dep.department_id = 110

From the first query above we know that the manager of department 110 is Shelley Higgins. It’s reported here instead as William Gietz, because his manager is the same as the department’s manager, but Shirley’s is not.

Department 110 manager: ON mgr.employee_id = dep.manager_id gives right answer

DEPARTMENT_NAME                LAST_NAME                 FIRST_NAME
------------------------------ ------------------------- --------------------
Accounting                     Higgins                   Shelley

  1   SELECT department_name, last_name, first_name
  2     FROM departments dep
  3     JOIN employees mgr ON mgr.employee_id = dep.manager_id
  4*  WHERE dep.department_id = 110

Here we we specify the join with the ON-clause linking the columns explicitly on each side of the join. This is the most usual approach to ANSI joins.

Department 110 manager: NATURAL JOIN subqueries
In a recent article (A tribute to Natural Join, 20 August 2018) Frank Pachot suggested that NATURAL JOIN could be more widely used if tables were replaced by subqueries in which all the columns were aliased in such a way that the join columns only would have the same names in the joined tables. The query above, implemented in this way might be written:

DEPARTMENT_NAME                MGR_LAST_NAME             MGR_FIRST_NAME
------------------------------ ------------------------- --------------------
Accounting                     Higgins                   Shelley

  1  SELECT department_name, mgr_last_name, mgr_first_name
  2    FROM
  3  (SELECT department_id, department_name, manager_id
  4     FROM departments) dep
  5    NATURAL JOIN
  6  (SELECT employee_id manager_id, last_name mgr_last_name, first_name mgr_first_name
  7     FROM employees) mgr
  8*  WHERE dep.department_id = 110

This version is much more verbose and it’s much harder to see which are the join columns by scanning the select lists, compared with specifying them in ON clauses.

Conclusions on Joins via NATURAL/USING/ON

  • Very few people use NATURAL JOIN due to the limitation that the join column names, and only those, in each table or subquery have to be the same
  • USING tends to be used in simple ad hoc queries with small numbers of tables, and improves on NATURAL JOIN by listing the join columns explicitly, but again relies on the join column names being the same
  • The most commonly used join mechanism is the ON clause, with column names specified on each side. This avoids the possible pitfalls of the other mechanisms and for complex, real world queries generally results in more maintainable code

Regarding the DRY principle in SQL more generally, I wrote this,
Modularity in SQL: Patterns, Anti-Patterns and the Kitchen Sink, in September 2013 [tl;dr: Functions and complex views are fine as entry-points but using them as building blocks in SQL is usually a bad idea, and subquery factors (WITH clause) are a better approach to SQL modularity].

Part III: On Data Models and Queries Viewed as Networks
In the examples above we saw that when there are two ways of joining a pair of tables it’s no longer possible for the data model alone to determine the join. An entity relationship structure can be represented as a directed network, with entities as nodes and the relationships between them as links. The second example corresponds to a loop in the network, in which there are two ways of getting from the driving node, departments, to the employees node.

Where the relationships between tables are stored in constraints metadata we can use network analysis PL/SQL to show the network structure and then make diagrams to help in understanding schema structures, as I showed here in May 2015:
PL/SQL Pipelined Function for Network Analysis. This diagram, extracted from that article, shows the structure of Oracle’s demo schemas, with what’s known in graph theory as a spanning tree marked in red, and loop-closing links in blue.

Networks - PLSQL, v1.0 - HR

Queries as Networks
In 2009 I was asked to extend the functionality of an Oracle ERP invoice print report in order to support a move to a multi-org ERP structure. The report had a large number (I think around 30) of small queries in various places, such as format triggers and formula columns as well as in the main data model, and I started by combining most of them into a single, fairly complex query plus one smaller, global data query. The report ran much more quickly and I felt was more maintainable since almost all the logic was in one place, and the query could be tested through tools such as Toad. However, as the query was quite complex I was asked to produce some documentation on how it worked. This got me thinking about how ERDs are used to document data models, and whether we could extend those ideas to document queries too.

My initial thought was that a query can be thought of as a route through the data model network, with looping corresponding to repeated table instances in the query. However, it turns out to be much clearer to represent each table instance as its own node on a new network diagram. After I left the company I wrote my ideas up in a general form in a word document on Scribd in May 2009, A Structured Approach to SQL Query Design. Since then I have extended these ideas to include coverage of query constructs such as unions and subquery factors, and use of annotations for clarity. I wrote another article in August 2012 where I apply these extended ideas to some example queries taken from the OTN forum, Query Structure Diagramming. Here’s a diagram from that article:

You can also find examples in several of the articles on combinatorial SQL referenced in Knapsacks and Networks in SQL from December 2017.

How many tables is too many?
Have you ever heard the view expressed, usually by a DBA, that you should not put more than a small number of tables, say 10, in any query? The reasoning given is that the number of join orders for N tables is N!, which for N=10 is 3,628,800 and the query optimiser (CBO) won’t be able to handle that number of permutations. You will probably know from the discussion above why this reasoning is incorrect: The cost optimization problem is really a network path problem, rather than a permutation problem – you look to join (large) tables that are linked to the current rowset rather than than making cartesian joins, so most permutations are never considered.






SQL Developer: Importing Unit Test Repository via Data Modeler

Recently I started looking at the unit testing functionality within Oracle’s SQL Developer, as a possible alternative to ut/PLSQL. Oracle’s Jeff Smith has a useful starter page on this, Unit Testing Your PL/SQL with Oracle SQL Developer. As he notes, the first thing you need to do is set up a unit test repository, which is essentially a schema within an Oracle database that SQL Developer will create for you for the unit test metadata. The schema contains 24 tables, and when I set it up I thought it would be nice to see an entity-relationship diagram for it.

I often draw these myself after running an analysis query on the foreign key network, as described in a recent post, PL/SQL Pipelined Function for Network Analysis. However, in this case I remembered that SQL Developer also has a data modeler component, and I thought it would be a good opportunity to learn how to use this tool to reverse-engineer a diagram from the schema. It’s actually very easy, and in this article I will show the result (see another Jeff Smith post for learning resources on the data modeler, Data Modeling). For good measure, I will include outputs from my own metadata queries, and add similar for Oracle’s HR demo schema (and linked schemas, in version 12.1).

12 July 2015I added a section on the Oracle v11.2 Apex schema, APEX_040000, which has 425 tables, to see how the data modeler handles larger schemas.

Unit Test Repository Schema

I created the diagram by following the wizard from File/Import/Data Dictionary – see the link above for more information on how to use the Data Modeler.

Data Modeler Diagram

UT_REPOS

Network Analysis Output

Network                      #Links  #Nodes Lev  Node                                                     Link
---------------------------  ------  ------ ---  -------------------------------------------------------  --------------------------------------
UT_LIB_DYN_QUERIES|UT_REPOS      30      20   0  UT_LIB_DYN_QUERIES|UT_REPOS                              ROOT
                                              1  < UT_TEST_IMPL|UT_REPOS                                  ut_test_lib_dyn_queries_fk1|ut_repos
                                              2    < UT_TEST_COVERAGE_STATS|UT_REPOS                      ut_test_cov_stats_ut_t_fk1|ut_repos                                               3      > UT_TEST_IMPL_RESULTS|UT_REPOS                      ut_test_cov_stats_ut_t_fk2|ut_repos
                                              4        < UT_TEST_IMPL_VAL_RESULTS|UT_REPOS                ut_test_impl_val_res_fk3|ut_repos                                               5          > UT_TEST_IMPL|UT_REPOS*                         ut_test_impl_val_res_fk2|ut_repos
                                              5          > UT_VALIDATIONS|UT_REPOS                        ut_test_impl_val_res_fk1|ut_repos
                                              6            > UT_LIB_VALIDATIONS|UT_REPOS                  ut_validations_lib_validn_fk1|ut_repos
                                              6            > UT_TEST_IMPL|UT_REPOS*                       ut_validations_test_impl_fk1|ut_repos
                                              4        > UT_TEST_IMPL|UT_REPOS*                           ut_test_impl_results_ut_t_fk1|ut_repos
                                              4        > UT_TEST_RESULTS|UT_REPOS                         ut_test_impl_results_ut_t_fk2|ut_repos
                                              5          < UT_SUITE_ITEM_RESULTS|UT_REPOS                 ut_suite_item_results_fk2|ut_repos                                               6            > UT_SUITE_RESULTS|UT_REPOS                    ut_suite_item_results_fk1|ut_repos
                                              7              < UT_SUITE_ITEM_RESULTS|UT_REPOS*            ut_suite_item_results_fk3|ut_repos                                               7              > UT_SUITE|UT_REPOS                          ut_suite_results_fk1|ut_repos
                                              8                < UT_STARTUPS|UT_REPOS                     ut_startups_suite_fk1|ut_repos                                               9                  > UT_LIB_STARTUPS|UT_REPOS               ut_startups_lib_validn_fk1|ut_repos
                                              9                  > UT_TEST|UT_REPOS                       ut_startups_test_fk1|ut_repos
                                              0                    < UT_SUITE_ITEMS|UT_REPOS              ut_suite_items_fk2|ut_repos                                               1                      > UT_SUITE|UT_REPOS*                 ut_suite_items_fk1|ut_repos
                                              1                      > UT_SUITE|UT_REPOS*                 ut_suite_items_fk3|ut_repos
                                              0                    < UT_TEARDOWNS|UT_REPOS                ut_teardowns_test_fk1|ut_repos                                               1                      > UT_LIB_TEARDOWNS|UT_REPOS          ut_teardowns_lib_validn_fk1|ut_repos
                                              1                      > UT_SUITE|UT_REPOS*                 ut_teardowns_suite_fk1|ut_repos
                                              0                    < UT_TEST_ARGUMENTS|UT_REPOS           ut_test_arguments_fk|ut_repos
                                              1                      < UT_TEST_IMPL_ARGUMENTS|UT_REPOS    ut_test_impl_arguments_ut_fk1|ut_repos                                               2                        > UT_TEST_IMPL|UT_REPOS*           ut_test_impl_arguments_ut_fk2|ut_repos
                                              1                      < UT_TEST_IMPL_ARG_RESULTS|UT_REPOS  ut_test_impl_arg_results__fk2|ut_repos                                               2                        > UT_TEST_IMPL|UT_REPOS*           ut_test_impl_arg_results__fk1|ut_repos
                                              0                    < UT_TEST_IMPL|UT_REPOS*               ut_test_impl_ut_test_fk1|ut_repos
                                              0                    < UT_TEST_RESULTS|UT_REPOS*            ut_test_results_ut_test_fk1|ut_repos
UT_LOOKUP_CATEGORIES|UT_REPOS      2       3  0  UT_LOOKUP_CATEGORIES|UT_REPOS                            ROOT
                                              1  < UT_LOOKUP_DATATYPES|UT_REPOS                           ut_lookup_datatypes_ut_lo_fk1|ut_repos
                                              2    < UT_LOOKUP_VALUES|UT_REPOS                            ut_lookup_values_ut_looku_fk1|ut_repos

There is one table not shown as it has no links to any other table.

Schema Metadata Analysis Report

SQL> SELECT 'Start: '||dbs.name "Database", ses.sid "Session", ses.osuser "OS User", ses.machine "Machine", To_Char (SYSDATE,'DD-MON-YYYY HH24:MI:SS') "Time",
  2   Replace (Substr(ver.banner, 1, Instr(ver.banner, '64')-4), 'Enterprise Edition Release ', '') "Version"
  3    FROM v$database dbs, v$version ver, v$session ses
  4   WHERE ver.banner LIKE 'Oracle%'
  5     AND audsid = USERENV('sessionid');

Database              Session OS User    Machine              Time                 Version
-------------------- -------- ---------- -------------------- -------------------- ------------------------------
Start: XE                 137 HP-Brendan WORKGROUP\HP-BRENDAN 11-JUL-2015 13:21:25 Oracle Database 11g Express Ed
                              \Brend_000                                           ition Release 11.2.0.2.0


SQL> 
SQL> PROMPT '&owner' "Schema", '&tab' "Prefix"
'UT_REPOS' "Schema", '%' "Prefix"
SQL> PROMPT Tables Summary (excluding like '&nottab1', '&nottab2', '$' )
Tables Summary (excluding like '?', '?', '$' )
SQL> SELECT atc.table_name, Nvl2 (atc_w.column_name, 'Yes', NULL)    "Who?",
  2          CASE att.n_att WHEN 0 THEN To_Number(NULL) ELSE att.n_att END  "Attrs",
  3          ind_tot.n_ind "Indexes", Count(atc.column_name)      "Cols"
  4    FROM all_tab_columns   atc
  5      JOIN all_tables atb
  6        ON atb.table_name      = atc.table_name
  7       AND atb.owner     = atc.owner                             -- Join index count
  8        LEFT JOIN (SELECT i.table_owner, i.table_name, Count (i.index_name) n_ind
  9        FROM all_indexes     i
 10       GROUP BY i.table_owner, i.table_name)     ind_tot
 11          ON ind_tot.table_name    = atb.table_name
 12         AND ind_tot.table_owner   = atb.owner       -- Join attribute count
 13        LEFT JOIN (SELECT atc_att.table_name, atc_att.owner, Count(atc_att.column_name) n_att
 14              FROM all_tab_columns     atc_att
 15             WHERE atc_att.column_name   LIKE '%&notcol%'
 16             GROUP BY atc_att.table_name, atc_att.owner)   att
 17          ON att.table_name    = atb.table_name
 18         AND att.owner     = atb.owner
 19    LEFT JOIN all_tab_columns    atc_w         -- Join the Who column if it exists
 20           ON atc_w.table_name   = atb.table_name
 21          AND atc_w.owner      = atb.owner
 22          AND atc_w.column_name    = 'CREATED_BY'        -- Join index with its sequence for table
 23   WHERE atc.table_name      LIKE Upper('&tab'||'%')
 24     AND atc.table_name      NOT LIKE '%$%'
 25     AND atb.table_name      NOT LIKE '%&nottab1%'
 26     AND atb.table_name      NOT LIKE '%&nottab2%'
 27     AND atb.owner     LIKE Upper('&owner')
 28   GROUP BY atc.table_name,  Nvl2 (atc_w.column_name, 'Yes', NULL),
 29   CASE att.n_att WHEN 0 THEN To_Number(NULL) ELSE att.n_att END,
 30          ind_tot.n_ind
 31   ORDER BY 1;

TABLE_NAME                     Who?  Attrs Indexes Cols
------------------------------ ----- ----- ------- ----
UT_LIB_DYN_QUERIES             Yes               2    8
UT_LIB_STARTUPS                Yes               2    8
UT_LIB_TEARDOWNS               Yes               2    8
UT_LIB_VALIDATIONS             Yes               2    8
UT_LOOKUP_CATEGORIES           Yes               2    6
UT_LOOKUP_DATATYPES            Yes               3    8
UT_LOOKUP_VALUES               Yes               2    7
UT_METADATA                                      1    2
UT_STARTUPS                    Yes               4   10
UT_SUITE                       Yes               1    7
UT_SUITE_ITEMS                 Yes               4   10
UT_SUITE_ITEM_RESULTS          Yes               4    8
UT_SUITE_RESULTS               Yes               2   21
UT_TEARDOWNS                   Yes               4   10
UT_TEST                        Yes               2   12
UT_TEST_ARGUMENTS              Yes               2   30
UT_TEST_COVERAGE_STATS         Yes               3   13
UT_TEST_IMPL                   Yes               4   11
UT_TEST_IMPL_ARGUMENTS         Yes               3    9
UT_TEST_IMPL_ARG_RESULTS       Yes               4   12
UT_TEST_IMPL_RESULTS           Yes               3   29
UT_TEST_IMPL_VAL_RESULTS       Yes               4   15
UT_TEST_RESULTS                Yes               2   13
UT_VALIDATIONS                 Yes               3   10

24 rows selected.

SQL> 
SQL> PROMPT Triggers
Triggers
SQL> COLUMN "Trigger"  FORMAT A30
SQL> COLUMN "Type"   FORMAT A30
SQL> COLUMN "Event"    FORMAT A30
SQL> 
SQL> SELECT  trg.table_name,
  2   trg.trigger_name    "Trigger",
  3   trg.trigger_type    "Type",
  4   trg.triggering_event    "Event"
  5    FROM all_triggers trg
  6   WHERE trg.table_name      LIKE Upper('&tab'||'%')
  7     AND trg.table_name      NOT LIKE '%$%'
  8     AND trg.table_name      NOT LIKE '%&nottab1%'
  9     AND trg.table_name      NOT LIKE '%&nottab2%'
 10     AND trg.owner     LIKE Upper('&owner')
 11   ORDER BY 1, 2;

TABLE_NAME                     Trigger                        Type                           Event
------------------------------ ------------------------------ ------------------------------ ------------------------------
UT_LIB_DYN_QUERIES             UT_LIB_DYN_QUERIES             BEFORE EACH ROW                INSERT
                               UT_LIB_DYN_QUERIES_UP_TRG      BEFORE EACH ROW                UPDATE
UT_LIB_STARTUPS                UT_LIB_STARTUPS                BEFORE EACH ROW                INSERT
                               UT_LIB_STARTUPS_UP_TRG         BEFORE EACH ROW                UPDATE
UT_LIB_TEARDOWNS               UT_LIB_TEARDOWNS               BEFORE EACH ROW                INSERT
                               UT_LIB_TEARDOWNS_UP_TRG        BEFORE EACH ROW                UPDATE
UT_LIB_VALIDATIONS             UT_LIB_VALIDATIONS             BEFORE EACH ROW                INSERT
                               UT_LIB_VALIDATIONS_UP_TRG      BEFORE EACH ROW                UPDATE
UT_LOOKUP_CATEGORIES           UT_LOOKUP_CAT_TRG              BEFORE EACH ROW                INSERT
                               UT_LOOKUP_CAT_UP_TRG           BEFORE EACH ROW                UPDATE
UT_LOOKUP_DATATYPES            UT_LOOKUP_DATATYPES_UP_TRG     BEFORE EACH ROW                UPDATE
                               UT_LOOKUP_DATA_TRG             BEFORE EACH ROW                INSERT
UT_LOOKUP_VALUES               UT_LOOKUP_VALUES_UP_TRG        BEFORE EACH ROW                UPDATE
                               UT_LOOKUP_VALUE_TRG            BEFORE EACH ROW                INSERT
UT_STARTUPS                    UT_STARTUPS_TRG                BEFORE EACH ROW                INSERT
                               UT_STARTUPS_UP_TRG             BEFORE EACH ROW                UPDATE
UT_SUITE                       UT_SUITE_TRG                   BEFORE EACH ROW                INSERT
                               UT_SUITE_UP_TRG                BEFORE EACH ROW                UPDATE
UT_SUITE_ITEMS                 UT_SUITE_ITEMS_TRG             BEFORE EACH ROW                INSERT
                               UT_SUITE_ITEMS_UP_TRG          BEFORE EACH ROW                UPDATE
UT_SUITE_ITEM_RESULTS          UT_SUITE_ITEM_RESULTS_TRG      BEFORE EACH ROW                INSERT
                               UT_SUITE_ITEM_RESULTS_UP_TRG   BEFORE EACH ROW                UPDATE
UT_SUITE_RESULTS               UT_SUITE_RESULTS_TRG           BEFORE EACH ROW                INSERT
                               UT_SUITE_RESULTS_UP_TRG        BEFORE EACH ROW                UPDATE
UT_TEARDOWNS                   UT_TEARDOWNS_TRG               BEFORE EACH ROW                INSERT
                               UT_TEARDOWNS_UP_TRG            BEFORE EACH ROW                UPDATE
UT_TEST                        UT_TEST_TRG                    BEFORE EACH ROW                INSERT
                               UT_TEST_UP_TRG                 BEFORE EACH ROW                UPDATE
UT_TEST_ARGUMENTS              UT_TEST_ARGUMENTS_TRG          BEFORE EACH ROW                INSERT
                               UT_TEST_ARGUMENTS_UP_TRG       BEFORE EACH ROW                UPDATE
UT_TEST_COVERAGE_STATS         UT_TEST_COVERAGE_STATS_TRG     BEFORE EACH ROW                INSERT
                               UT_TEST_COVERAGE_STATS_UP_TRG  BEFORE EACH ROW                UPDATE
UT_TEST_IMPL                   UT_TEST_IMPL_TRG               BEFORE EACH ROW                INSERT
                               UT_TEST_IMPL_UP_TRG            BEFORE EACH ROW                UPDATE
UT_TEST_IMPL_ARGUMENTS         UT_TEST_IMPL_ARGUMENTS_TRG     BEFORE EACH ROW                INSERT
                               UT_TEST_IMPL_ARGUMENTS_UP_TRG  BEFORE EACH ROW                UPDATE
UT_TEST_IMPL_ARG_RESULTS       UT_TEST_IMPL_ARG_RESULTS_TRG   BEFORE EACH ROW                INSERT
                               UT_TEST_IMPL_ARG_RES_UP_TRG    BEFORE EACH ROW                UPDATE
UT_TEST_IMPL_RESULTS           UT_TEST_IMPL_RESULTS_TRG       BEFORE EACH ROW                INSERT
                               UT_TEST_IMPL_RESULTS_UP_TRG    BEFORE EACH ROW                UPDATE
UT_TEST_IMPL_VAL_RESULTS       UT_TEST_IMPL_VAL_RESULTS_TRG   BEFORE EACH ROW                INSERT
                               UT_TEST_IMPL_VAL_RES_UP_TRG    BEFORE EACH ROW                UPDATE
UT_TEST_RESULTS                UT_TEST_RESULTS_TRG            BEFORE EACH ROW                INSERT
                               UT_TEST_RESULTS_UP_TRG         BEFORE EACH ROW                UPDATE
UT_VALIDATIONS                 UT_VALIDATIONS_TRG             BEFORE EACH ROW                INSERT
                               UT_VALIDATIONS_UP_TRG          BEFORE EACH ROW                UPDATE

46 rows selected.

SQL> 
SQL> PROMPT Foreign Keys
Foreign Keys
SQL> COLUMN "From Table" FORMAT A30
SQL> COLUMN "To Table" FORMAT A30
SQL> COLUMN "Constraint" FORMAT A30
SQL> COLUMN "Column"   FORMAT A30
SQL> COLUMN "Seq"    FORMAT 990
SQL> 
SQL> BREAK ON "From Table" ON "To Table" ON "Constraint"
SQL> 
SQL> SELECT  con_f.table_name    "From Table",
  2   con_t.table_name    "To Table",
  3   con_f.constraint_name   "Constraint",
  4   col_f.position      "Seq",
  5   col_f.column_name   "Column"
  6    FROM all_constraints     con_f
  7    JOIN all_constraints     con_t
  8      ON con_t.constraint_name   = con_f.r_constraint_name
  9     AND con_t.owner     = con_f.r_owner
 10    JOIN all_cons_columns    col_f
 11      ON con_f.constraint_type   = 'R'
 12     AND col_f.constraint_name   = con_f.constraint_name
 13     AND col_f.owner     = con_f.owner
 14   WHERE
 15      (
 16        (con_f.table_name        LIKE Upper('&tab'||'%')
 17     AND con_f.table_name        NOT LIKE '%$%'
 18     AND con_f.table_name        NOT LIKE '%&nottab1%'
 19     AND con_f.table_name        NOT LIKE '%&nottab2%'
 20     AND con_f.owner       LIKE Upper('&owner')
 21         ) OR
 22        (con_t.table_name        LIKE Upper('&tab'||'%')
 23     AND con_t.table_name        NOT LIKE '%$%'
 24     AND con_t.table_name        NOT LIKE '%&nottab1%'
 25     AND con_t.table_name        NOT LIKE '%&nottab2%'
 26     AND con_t.owner       LIKE Upper('&owner')
 27        ))
 28   ORDER BY 1, 2, 3, 4;

From Table                     To Table                       Constraint                      Seq Column
------------------------------ ------------------------------ ------------------------------ ---- ------------------------------
UT_LOOKUP_DATATYPES            UT_LOOKUP_CATEGORIES           UT_LOOKUP_DATATYPES_UT_LO_FK1     1 CAT_ID
UT_LOOKUP_VALUES               UT_LOOKUP_DATATYPES            UT_LOOKUP_VALUES_UT_LOOKU_FK1     1 DATA_ID
UT_STARTUPS                    UT_LIB_STARTUPS                UT_STARTUPS_LIB_VALIDN_FK1        1 LIB_STARTUP_ID
                               UT_SUITE                       UT_STARTUPS_SUITE_FK1             1 UT_SID
                               UT_TEST                        UT_STARTUPS_TEST_FK1              1 UT_ID
UT_SUITE_ITEMS                 UT_SUITE                       UT_SUITE_ITEMS_FK1                1 UT_SID
                                                              UT_SUITE_ITEMS_FK3                1 UT_NSID
                               UT_TEST                        UT_SUITE_ITEMS_FK2                1 UT_ID
UT_SUITE_ITEM_RESULTS          UT_SUITE_RESULTS               UT_SUITE_ITEM_RESULTS_FK1         1 UTSR_ID
                                                              UT_SUITE_ITEM_RESULTS_FK3         1 UTR_NSID
                               UT_TEST_RESULTS                UT_SUITE_ITEM_RESULTS_FK2         1 UTR_ID
UT_SUITE_RESULTS               UT_SUITE                       UT_SUITE_RESULTS_FK1              1 UT_SID
UT_TEARDOWNS                   UT_LIB_TEARDOWNS               UT_TEARDOWNS_LIB_VALIDN_FK1       1 LIB_TEARDOWN_ID
                               UT_SUITE                       UT_TEARDOWNS_SUITE_FK1            1 UT_SID
                               UT_TEST                        UT_TEARDOWNS_TEST_FK1             1 UT_ID
UT_TEST_ARGUMENTS              UT_TEST                        UT_TEST_ARGUMENTS_FK              1 UT_ID
UT_TEST_COVERAGE_STATS         UT_TEST_IMPL                   UT_TEST_COV_STATS_UT_T_FK1        1 UTI_ID
                               UT_TEST_IMPL_RESULTS           UT_TEST_COV_STATS_UT_T_FK2        1 UTIR_ID
UT_TEST_IMPL                   UT_LIB_DYN_QUERIES             UT_TEST_LIB_DYN_QUERIES_FK1       1 LIB_DYN_QUERY_ID
                               UT_TEST                        UT_TEST_IMPL_UT_TEST_FK1          1 UT_ID
UT_TEST_IMPL_ARGUMENTS         UT_TEST_ARGUMENTS              UT_TEST_IMPL_ARGUMENTS_UT_FK1     1 ARG_ID
                               UT_TEST_IMPL                   UT_TEST_IMPL_ARGUMENTS_UT_FK2     1 UTI_ID
UT_TEST_IMPL_ARG_RESULTS       UT_TEST_ARGUMENTS              UT_TEST_IMPL_ARG_RESULTS__FK2     1 ARG_ID
                               UT_TEST_IMPL                   UT_TEST_IMPL_ARG_RESULTS__FK1     1 UTI_ID
UT_TEST_IMPL_RESULTS           UT_TEST_IMPL                   UT_TEST_IMPL_RESULTS_UT_T_FK1     1 UTI_ID
                               UT_TEST_RESULTS                UT_TEST_IMPL_RESULTS_UT_T_FK2     1 UTR_ID
UT_TEST_IMPL_VAL_RESULTS       UT_TEST_IMPL                   UT_TEST_IMPL_VAL_RES_FK2          1 UTI_ID
                               UT_TEST_IMPL_RESULTS           UT_TEST_IMPL_VAL_RES_FK3          1 UTIR_ID
                               UT_VALIDATIONS                 UT_TEST_IMPL_VAL_RES_FK1          1 VAL_ID
UT_TEST_RESULTS                UT_TEST                        UT_TEST_RESULTS_UT_TEST_FK1       1 UT_ID
UT_VALIDATIONS                 UT_LIB_VALIDATIONS             UT_VALIDATIONS_LIB_VALIDN_FK1     1 LIB_VALIDATION_ID
                               UT_TEST_IMPL                   UT_VALIDATIONS_TEST_IMPL_FK1      1 UTI_ID

32 rows selected.

SQL> 
SQL> BREAK ON table_name ON "Who?" ON "Attrs" ON "Indexes"
SQL> PROMPT Tables and Columns (omitting Who and %&notcol%)
Tables and Columns (omitting Who and %?%)
SQL> SELECT atc.table_name, Nvl2 (atc_w.column_name, 'Yes', NULL)      "Who?",
  2          CASE att.n_att WHEN 0 THEN To_Number(NULL) ELSE att.n_att END    "Attrs",
  3          ind_tot.n_ind                "Indexes",
  4          Lower (atc.column_name)|| CASE atc.nullable WHEN 'N' THEN '*' END  column_name,
  5   atc.data_type,
  6   atc.data_length               "Length",
  7                  Max(CASE ind.rn WHEN 1 THEN To_Char(aic.column_position) ELSE ' ' END) ||
  8                  Max(CASE ind.rn WHEN 2 THEN To_Char(aic.column_position) ELSE ' ' END) ||
  9                  Max(CASE ind.rn WHEN 3 THEN To_Char(aic.column_position) ELSE ' ' END) ||
 10                  Max(CASE ind.rn WHEN 4 THEN To_Char(aic.column_position) ELSE ' ' END) ||
 11                  Max(CASE ind.rn WHEN 5 THEN To_Char(aic.column_position) ELSE ' ' END) ||
 12                  Max(CASE ind.rn WHEN 6 THEN To_Char(aic.column_position) ELSE ' ' END) ||
 13                  Max(CASE ind.rn WHEN 7 THEN To_Char(aic.column_position) ELSE ' ' END) ||
 14                  Max(CASE ind.rn WHEN 8 THEN To_Char(aic.column_position) ELSE ' ' END) ||
 15                  Max(CASE ind.rn WHEN 9 THEN To_Char(aic.column_position) ELSE ' ' END) "Index Pos"
 16    FROM all_tab_columns     atc
 17      JOIN all_tables      atb
 18        ON atb.table_name      = atc.table_name
 19       AND atb.owner     = atc.owner                             -- Join index count
 20      LEFT JOIN (SELECT i.table_owner, i.table_name, Count (i.index_name) n_ind
 21              FROM all_indexes       i
 22           GROUP BY i.table_owner, i.table_name)   ind_tot
 23           ON ind_tot.table_name   = atb.table_name
 24          AND ind_tot.table_owner    = atb.owner       -- Join attribute count
 25    LEFT JOIN (SELECT atc_att.table_name, atc_att.owner, Count(atc_att.column_name) n_att
 26          FROM all_tab_columns       atc_att
 27         WHERE atc_att.column_name LIKE '%&notcol%'
 28         GROUP BY atc_att.table_name, atc_att.owner) att
 29           ON att.table_name   = atb.table_name
 30          AND att.owner      = atb.owner
 31    LEFT JOIN all_tab_columns atc_w            -- Join the Who column if it exists
 32           ON atc_w.table_name   = atb.table_name
 33          AND atc_w.owner      = atb.owner
 34          AND atc_w.column_name    = 'CREATED_BY'        -- Join index with its sequence for table
 35    LEFT JOIN (SELECT Row_Number () OVER (PARTITION BY i.table_name, i.table_owner
 36           ORDER BY i.table_name, i.table_owner, i.index_name) rn,
 37         i.table_name, i.table_owner, i.owner, i.index_name
 38          FROM all_indexes i)        ind
 39      ON ind.table_name      = atb.table_name
 40     AND ind.table_owner     = atb.owner
 41          LEFT JOIN all_ind_columns  aic         -- Join columns for the index
 42            ON aic.index_owner   = ind.owner
 43           AND aic.index_name    = ind.index_name
 44           AND aic.column_name   = atc.column_name
 45   WHERE atc.column_name     NOT LIKE '%&notcol%'
 46     AND atc.table_name      LIKE Upper('&tab'||'%')
 47     AND atc.table_name      NOT LIKE '%$%'
 48     AND atb.table_name      NOT LIKE '%&nottab1%'
 49     AND atb.table_name      NOT LIKE '%&nottab2%'
 50     AND atb.owner     LIKE Upper('&owner')
 51     AND atc.column_name     NOT IN ('CREATED_BY', 'CREATION_DATE', 'LAST_UPDATED_BY', 'LAST_UPDATE_DATE')
 52   GROUP BY atc.table_name, Nvl2 (atc_w.column_name, 'Yes', NULL), Lower (atc.column_name)||CASE atc.nullable WHEN 'N' THEN '*' END,
 53          CASE att.n_att WHEN 0 THEN To_Number(NULL) ELSE att.n_att END,
 54          ind_tot.n_ind, atc.data_type, atc.data_length
 55   ORDER BY 1, 2;

TABLE_NAME                     Who?  Attrs Indexes COLUMN_NAME                    DATA_TYPE     Length Index Pos
------------------------------ ----- ----- ------- ------------------------------ ------------- ------ ----------
UT_LIB_DYN_QUERIES             Yes               2 created_on*                    TIMESTAMP(6)      11
                                                   lib_dyn_query*                 CLOB            4000
                                                   lib_dyn_query_class*           VARCHAR2         120
                                                   lib_dyn_query_id*              VARCHAR2          40  1
                                                   lib_dyn_query_name*            VARCHAR2         120 1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
UT_LIB_STARTUPS                Yes               2 created_on*                    TIMESTAMP(6)      11
                                                   lib_startup*                   CLOB            4000
                                                   lib_startup_class*             VARCHAR2         120
                                                   lib_startup_id*                VARCHAR2          40  1
                                                   lib_startup_name*              VARCHAR2         120 1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
UT_LIB_TEARDOWNS               Yes               2 created_on*                    TIMESTAMP(6)      11
                                                   lib_teardown*                  CLOB            4000
                                                   lib_teardown_class*            VARCHAR2         120
                                                   lib_teardown_id*               VARCHAR2          40  1
                                                   lib_teardown_name*             VARCHAR2         120 1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
UT_LIB_VALIDATIONS             Yes               2 created_on*                    TIMESTAMP(6)      11
                                                   lib_validation*                CLOB            4000
                                                   lib_validation_class*          VARCHAR2         120
                                                   lib_validation_id*             VARCHAR2          40  1
                                                   lib_validation_name*           VARCHAR2         120 1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
UT_LOOKUP_CATEGORIES           Yes               2 created_on*                    TIMESTAMP(6)      11
                                                   id*                            VARCHAR2          40 1
                                                   name                           VARCHAR2         120  1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
UT_LOOKUP_DATATYPES            Yes               3 cat_id                         VARCHAR2          40  11
                                                   created_on*                    TIMESTAMP(6)      11
                                                   id*                            VARCHAR2          40 1
                                                   type_id                        NUMBER            22
                                                   type_string                    VARCHAR2         120  2
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
UT_LOOKUP_VALUES               Yes               2 created_on*                    TIMESTAMP(6)      11
                                                   data_id                        VARCHAR2          40  1
                                                   id*                            VARCHAR2          40 1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   value                          CLOB            4000
UT_METADATA                                      1 name*                          VARCHAR2         120 1
                                                   value*                         VARCHAR2        2000
UT_STARTUPS                    Yes               4 created_on*                    TIMESTAMP(6)      11
                                                   index_no*                      NUMBER            22
                                                   lib_startup_id                 VARCHAR2          40 1
                                                   startup                        CLOB            4000
                                                   startup_id*                    VARCHAR2          40  1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_id                          VARCHAR2          40    1
                                                   ut_sid                         VARCHAR2          40   1
UT_SUITE                       Yes               1 coverage                       NUMBER            22
                                                   created_on*                    TIMESTAMP(6)      11
                                                   name*                          VARCHAR2         120
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_sid*                        VARCHAR2          40 1
UT_SUITE_ITEMS                 Yes               4 created_on*                    TIMESTAMP(6)      11
                                                   run_start*                     VARCHAR2           1
                                                   run_tear*                      VARCHAR2           1
                                                   sequence*                      NUMBER            22
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_id                          VARCHAR2          40   12
                                                   ut_nsid                        VARCHAR2          40  1 3
                                                   ut_sid*                        VARCHAR2          40 1  1
UT_SUITE_ITEM_RESULTS          Yes               4 created_on*                    TIMESTAMP(6)      11
                                                   sequence*                      NUMBER            22
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   utr_id                         VARCHAR2          40  1 2
                                                   utr_nsid                       VARCHAR2          40   13
                                                   utsr_id*                       VARCHAR2          40 1  1
UT_SUITE_RESULTS               Yes               2 coverage*                      NUMBER            22
                                                   created_on*                    TIMESTAMP(6)      11
                                                   message                        VARCHAR2        2000
                                                   name*                          VARCHAR2         120
                                                   run_date*                      TIMESTAMP(6)      11
                                                   startup_duration               NUMBER            22
                                                   startup_end_time               VARCHAR2          20
                                                   startup_message                VARCHAR2        2000
                                                   startup_start_time             VARCHAR2          20
                                                   startup_status                 VARCHAR2          20
                                                   status*                        VARCHAR2          20
                                                   teardown_duration              NUMBER            22
                                                   teardown_end_time              VARCHAR2          20
                                                   teardown_message               VARCHAR2        2000
                                                   teardown_start_time            VARCHAR2          20
                                                   teardown_status                VARCHAR2          20
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_sid*                        VARCHAR2          40 1
                                                   utsr_id*                       VARCHAR2          40  1
UT_TEARDOWNS                   Yes               4 created_on*                    TIMESTAMP(6)      11
                                                   index_no*                      NUMBER            22
                                                   lib_teardown_id                VARCHAR2          40 1
                                                   teardown                       CLOB            4000
                                                   teardown_id*                   VARCHAR2          40  1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_id                          VARCHAR2          40    1
                                                   ut_sid                         VARCHAR2          40   1
UT_TEST                        Yes               2 connection_name                VARCHAR2         120
                                                   coverage                       NUMBER            22
                                                   created_on*                    TIMESTAMP(6)      11
                                                   name*                          VARCHAR2         120  1
                                                   object_call                    VARCHAR2         120
                                                   object_name*                   VARCHAR2         120
                                                   object_owner*                  VARCHAR2         120
                                                   object_type*                   VARCHAR2         120
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_id*                         VARCHAR2          40 1
UT_TEST_ARGUMENTS              Yes               2 arg_id*                        VARCHAR2          40  1
                                                   argument_name                  VARCHAR2          30
                                                   char_length                    NUMBER            22
                                                   char_used                      VARCHAR2           1
                                                   character_set_name             VARCHAR2          44
                                                   created_on*                    TIMESTAMP(6)      11
                                                   data_length                    NUMBER            22
                                                   data_level*                    NUMBER            22
                                                   data_precision                 NUMBER            22
                                                   data_scale                     NUMBER            22
                                                   data_type                      VARCHAR2          30
                                                   default_length                 NUMBER            22
                                                   in_out                         VARCHAR2           9
                                                   object_id*                     NUMBER            22
                                                   object_name                    VARCHAR2          30
                                                   overload                       VARCHAR2          40
                                                   owner*                         VARCHAR2          30
                                                   package_name                   VARCHAR2          30
                                                   pls_type                       VARCHAR2          30
                                                   position*                      NUMBER            22
                                                   radix                          NUMBER            22
                                                   sequence*                      NUMBER            22
                                                   type_link                      VARCHAR2         128
                                                   type_name                      VARCHAR2          30
                                                   type_owner                     VARCHAR2          30
                                                   type_subname                   VARCHAR2          30
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_id                          VARCHAR2          40 1
UT_TEST_COVERAGE_STATS         Yes               3 created_on*                    TIMESTAMP(6)      11
                                                   line                           NUMBER            22
                                                   text                           VARCHAR2        4000
                                                   total_occur                    NUMBER            22
                                                   total_time                     NUMBER            22
                                                   unit_name                      VARCHAR2          30
                                                   unit_owner                     VARCHAR2          30
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   utc_id*                        VARCHAR2          40 1
                                                   uti_id                         VARCHAR2          40  1
                                                   utir_id                        VARCHAR2          40   1
UT_TEST_IMPL                   Yes               4 created_on*                    TIMESTAMP(6)      11
                                                   dynamic_value_query            CLOB            4000
                                                   expected_return                VARCHAR2          20
                                                   expected_return_error          VARCHAR2         200
                                                   lib_dyn_query_id               VARCHAR2          40    1
                                                   name*                          VARCHAR2         120  2
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_id*                         VARCHAR2          40  11
                                                   uti_id*                        VARCHAR2          40 1
UT_TEST_IMPL_ARGUMENTS         Yes               3 arg_id*                        VARCHAR2          40 21
                                                   created_on*                    TIMESTAMP(6)      11
                                                   input_value                    CLOB            4000
                                                   output_value                   CLOB            4000
                                                   test_outval*                   NUMBER            22
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   uti_id*                        VARCHAR2          40 1 1
UT_TEST_IMPL_ARG_RESULTS       Yes               4 arg_id*                        VARCHAR2          40  1
                                                   created_on*                    TIMESTAMP(6)      11
                                                   message                        VARCHAR2        2000
                                                   name                           VARCHAR2         120
                                                   run_date*                      TIMESTAMP(6)      11
                                                   status*                        VARCHAR2          20
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   uti_id*                        VARCHAR2          40 1
                                                   utiar_id*                      VARCHAR2          40    1
                                                   utir_id                        VARCHAR2          40   1
UT_TEST_IMPL_RESULTS           Yes               3 created_on*                    TIMESTAMP(6)      11
                                                   duration                       NUMBER            22
                                                   end_time                       VARCHAR2          20
                                                   message                        VARCHAR2        2000
                                                   name*                          VARCHAR2         120
                                                   op_call_duration               NUMBER            22
                                                   op_call_end_time               VARCHAR2          20
                                                   op_call_message                VARCHAR2        2000
                                                   op_call_start_time             VARCHAR2          20
                                                   op_call_status                 VARCHAR2          20
                                                   run_date*                      TIMESTAMP(6)      11
                                                   start_time                     VARCHAR2          20
                                                   startup_duration               NUMBER            22
                                                   startup_end_time               VARCHAR2          20
                                                   startup_message                VARCHAR2        2000
                                                   startup_start_time             VARCHAR2          20
                                                   startup_status                 VARCHAR2          20
                                                   status*                        VARCHAR2          20
                                                   teardown_duration              NUMBER            22
                                                   teardown_end_time              VARCHAR2          20
                                                   teardown_message               VARCHAR2        2000
                                                   teardown_start_time            VARCHAR2          20
                                                   teardown_status                VARCHAR2          20
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   uti_id*                        VARCHAR2          40  1
                                                   utir_id*                       VARCHAR2          40 1
                                                   utr_id*                        VARCHAR2          40   1
UT_TEST_IMPL_VAL_RESULTS       Yes               4 created_on*                    TIMESTAMP(6)      11
                                                   message                        VARCHAR2        2000
                                                   run_date                       TIMESTAMP(6)      11
                                                   status*                        VARCHAR2          20
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   uti_id*                        VARCHAR2          40  1
                                                   utir_id                        VARCHAR2          40   1
                                                   utivr_id*                      VARCHAR2          40    1
                                                   val_duration                   NUMBER            22
                                                   val_end_time                   VARCHAR2          20
                                                   val_id*                        VARCHAR2          40 1
                                                   val_start_time                 VARCHAR2          20
                                                   val_type*                      VARCHAR2          40
UT_TEST_RESULTS                Yes               2 connection_name                VARCHAR2         120
                                                   coverage*                      NUMBER            22
                                                   created_on*                    TIMESTAMP(6)      11
                                                   message                        VARCHAR2        2000
                                                   name*                          VARCHAR2         120
                                                   run_date*                      TIMESTAMP(6)      11
                                                   status*                        VARCHAR2          20
                                                   test_user_name                 VARCHAR2         120
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   ut_id*                         VARCHAR2          40  1
                                                   utr_id*                        VARCHAR2          40 1
UT_VALIDATIONS                 Yes               3 apply_validation*              NUMBER            22
                                                   created_on*                    TIMESTAMP(6)      11
                                                   index_no*                      NUMBER            22
                                                   lib_validation_id              VARCHAR2          40 1
                                                   updated_by*                    VARCHAR2         120
                                                   updated_on*                    TIMESTAMP(6)      11
                                                   uti_id*                        VARCHAR2          40   1
                                                   validation                     CLOB            4000
                                                   validation_id*                 VARCHAR2          40  1

252 rows selected.

SQL> 
SQL> BREAK ON table_name ON index_name
SQL> PROMPT Indexes
Indexes
SQL> SELECT  atb.table_name, ind.index_name||CASE ind.uniqueness WHEN 'UNIQUE' THEN '*' END index_name,
  2   aic.column_position "Seq", Lower(aic.column_name) column_name
  3    FROM all_tables      atb
  4    LEFT JOIN all_indexes      ind
  5           ON ind.table_name   = atb.table_name
  6          AND ind.table_owner    = atb.owner
  7         LEFT JOIN all_ind_columns aic
  8                ON aic.index_name    = ind.index_name
  9               AND aic.index_owner = ind.owner
 10   WHERE atb.table_name      LIKE Upper('&tab'||'%')
 11     AND atb.table_name      NOT LIKE '%$%'
 12     AND atb.table_name      NOT LIKE '%&nottab1%'
 13     AND atb.table_name      NOT LIKE '%&nottab2%'
 14     AND atb.owner     LIKE Upper('&owner')
 15   ORDER BY 1, 2, 3;

TABLE_NAME                     INDEX_NAME                       Seq COLUMN_NAME
------------------------------ ------------------------------- ---- ------------------------------
UT_LIB_DYN_QUERIES             UT_LIB_DYN_QUERIES_NAME*           1 lib_dyn_query_name
                               UT_LIB_DYN_QUERIES_PK*             1 lib_dyn_query_id
UT_LIB_STARTUPS                UT_LIB_STARTUPS_NAME*              1 lib_startup_name
                               UT_LIB_STARTUPS_PK*                1 lib_startup_id
UT_LIB_TEARDOWNS               UT_LIB_TEARDOWNS_NAME*             1 lib_teardown_name
                               UT_LIB_TEARDOWNS_PK*               1 lib_teardown_id
UT_LIB_VALIDATIONS             UT_LIB_VALIDATIONS_NAME*           1 lib_validation_name
                               UT_LIB_VALIDATIONS_PK*             1 lib_validation_id
UT_LOOKUP_CATEGORIES           UT_LOOKUP_CATEGORIES_PK*           1 id
                               UT_LOOKUP_CATEGORIES_UK1*          1 name
UT_LOOKUP_DATATYPES            UT_LOOKUP_DATATYPES_PK*            1 id
                               UT_LOOKUP_DATATYPES_UK1*           1 cat_id
                                                                  2 type_string
                               UT_LOOKUP_DTS_CAT_ID_IX            1 cat_id
UT_LOOKUP_VALUES               UT_LOOKUP_VALUES_PK*               1 id
                               UT_LOOKUP_VALUES_UT_LOOKU_IX       1 data_id
UT_METADATA                    UT_METADATA_PK*                    1 name
UT_STARTUPS                    UT_STARTUPS_LIB_VALIDN_IX          1 lib_startup_id
                               UT_STARTUPS_PK*                    1 startup_id
                               UT_STARTUPS_SUITE_IX               1 ut_sid
                               UT_STARTUPS_TEST_IX                1 ut_id
UT_SUITE                       UT_SUITE_PK*                       1 ut_sid
UT_SUITE_ITEMS                 UT_SUITE_ITEMS_IX1                 1 ut_sid
                               UT_SUITE_ITEMS_IX2                 1 ut_nsid
                               UT_SUITE_ITEMS_IX3                 1 ut_id
                               UT_SUITE_ITEMS_UK*                 1 ut_sid
                                                                  2 ut_id
                                                                  3 ut_nsid
UT_SUITE_ITEM_RESULTS          UT_SUITE_ITEM_RESULTS_FK1          1 utsr_id
                               UT_SUITE_ITEM_RESULTS_FK2          1 utr_id
                               UT_SUITE_ITEM_RESULTS_FK3          1 utr_nsid
                               UT_SUITE_ITEM_RESULTS_UK*          1 utsr_id
                                                                  2 utr_id
                                                                  3 utr_nsid
UT_SUITE_RESULTS               UT_SUITE_RESULTS_IX                1 ut_sid
                               UT_SUITE_RESULTS_PK*               1 utsr_id
UT_TEARDOWNS                   UT_TEARDOWNS_LIB_VALIDN_IX         1 lib_teardown_id
                               UT_TEARDOWNS_PK*                   1 teardown_id
                               UT_TEARDOWNS_SUITE_IX              1 ut_sid
                               UT_TEARDOWNS_TEST_IX               1 ut_id
UT_TEST                        UT_TEST_PK*                        1 ut_id
                               UT_TEST_UK1*                       1 name
UT_TEST_ARGUMENTS              UT_TEST_ARGUMENTS_IX               1 ut_id
                               UT_TEST_ARGUMENTS_PK*              1 arg_id
UT_TEST_COVERAGE_STATS         UT_TEST_COVERAGE_STATS_PK*         1 utc_id
                               UT_TEST_COV_STATS_UT_T_IX1         1 uti_id
                               UT_TEST_COV_STATS_UT_T_IX2         1 utir_id
UT_TEST_IMPL                   UT_TEST_IMPL_PK*                   1 uti_id
                               UT_TEST_IMPL_UK1*                  1 ut_id
                                                                  2 name
                               UT_TEST_IMPL_UT_TEST_IX            1 ut_id
                               UT_TEST_LIB_DYN_QUERIES_IX         1 lib_dyn_query_id
UT_TEST_IMPL_ARGUMENTS         UT_TEST_IMPL_ARGUMENTS_PK*         1 uti_id
                                                                  2 arg_id
                               UT_TEST_IMPL_ARGUMENTS_UT_IX1      1 arg_id
                               UT_TEST_IMPL_ARGUMENTS_UT_IX2      1 uti_id
UT_TEST_IMPL_ARG_RESULTS       UT_TEST_IMPL_ARG_RESULTS_IX1       1 uti_id
                               UT_TEST_IMPL_ARG_RESULTS_IX2       1 arg_id
                               UT_TEST_IMPL_ARG_RESULTS_IX3       1 utir_id
                               UT_TEST_IMPL_ARG_RESULTS_PK*       1 utiar_id
UT_TEST_IMPL_RESULTS           UT_TEST_IMPL_RESULTS_PK*           1 utir_id
                               UT_TEST_IMPL_RESULTS_UT_T_IX1      1 uti_id
                               UT_TEST_IMPL_RESULTS_UT_T_IX2      1 utr_id
UT_TEST_IMPL_VAL_RESULTS       UT_TEST_IMPL_VAL_RES_IX1           1 val_id
                               UT_TEST_IMPL_VAL_RES_IX2           1 uti_id
                               UT_TEST_IMPL_VAL_RES_IX3           1 utir_id
                               UT_TEST_IMPL_VAL_RES_PK*           1 utivr_id
UT_TEST_RESULTS                UT_TEST_RESULTS_PK*                1 utr_id
                               UT_TEST_RESULTS_UT_TEST_IX1        1 ut_id
UT_VALIDATIONS                 UT_VALIDATIONS_LIB_VALIDN_IX       1 lib_validation_id
                               UT_VALIDATIONS_PK*                 1 validation_id
                               UT_VALIDATIONS_TEST_IMPL_IX        1 uti_id

72 rows selected.

SQL> 
SQL> PROMPT Tables with no Who columns / No unique indexes/ Id only unique indexes
Tables with no Who columns / No unique indexes/ Id only unique indexes
SQL> SELECT atb.table_name,  CASE WHEN atc_w.column_name IS NULL THEN 'X' END "Who?",
  2                          CASE WHEN uni.maxind IS NULL THEN 'X' END "No UID?",
  3                          CASE WHEN uni.maxind = 1 THEN 'X' END "Id UID?"
  4    FROM all_tables      atb
  5    LEFT JOIN all_tab_columns    atc_w
  6           ON atc_w.table_name   = atb.table_name
  7          AND atc_w.owner      = atb.owner
  8          AND atc_w.column_name    = 'CREATED_BY'
  9    LEFT JOIN (SELECT ind.table_name, ind.table_owner, Max(CASE WHEN Substr(aic.column_name, Length(aic.column_name)-2) IS NULL THEN 0
 10                     WHEN Substr(aic.column_name, Length(aic.column_name)-2) = '_ID' THEN 1
 11                     ELSE 2 END) maxind
 12                 FROM all_indexes ind
 13                 LEFT JOIN all_ind_columns aic
 14                        ON aic.index_name         = ind.index_name
 15                       AND aic.index_owner        = ind.owner
 16                       AND aic.column_position    = 1
 17                WHERE ind.uniqueness   = 'UNIQUE'
 18                GROUP BY ind.table_name, ind.table_owner) uni
 19           ON uni.table_name     = atb.table_name
 20          AND uni.table_owner      = atb.owner
 21   WHERE atb.table_name      LIKE Upper('&tab'||'%')
 22     AND atb.table_name      NOT LIKE '%$%'
 23     AND atb.table_name      NOT LIKE '%&nottab1%'
 24     AND atb.table_name      NOT LIKE '%&nottab2%'
 25     AND atb.owner     LIKE Upper('&owner')
 26     AND (atc_w.column_name    IS NULL OR uni.maxind IN (0, 1))
 27   ORDER BY 1;

TABLE_NAME                     Who?  No UID? Id UID?
------------------------------ ----- ------- -------
UT_METADATA                    X
UT_STARTUPS                                  X
UT_SUITE_ITEM_RESULTS                        X
UT_SUITE_RESULTS                             X
UT_TEARDOWNS                                 X
UT_TEST_ARGUMENTS                            X
UT_TEST_COVERAGE_STATS                       X
UT_TEST_IMPL                                 X
UT_TEST_IMPL_ARGUMENTS                       X
UT_TEST_IMPL_ARG_RESULTS                     X
UT_TEST_IMPL_RESULTS                         X
UT_TEST_IMPL_VAL_RESULTS                     X
UT_TEST_RESULTS                              X
UT_VALIDATIONS                               X

14 rows selected.

SQL> SELECT 'End: '||name "Database", '&owner' "Schema", '&tab' "Prefix", To_Char(SYSDATE,'DD-MON-YYYY HH24:MI:SS') FROM v$database;

Database             Schema          Prefix     TO_CHAR(SYSDATE,'DD-MON-YYYYH
-------------------- --------------- ---------- -----------------------------
End: XE              UT_REPOS        %          11-JUL-2015 13:21:35

HR Demo Schemas (v12)

I had to import the three schemas (HR, OE and PM) one at a time, merging with the previous one, am not sure if you can do it in one go, or if the diagram is affected by the order of import.

Data Modeler Diagram

HR

Manual Visio Diagram

For comparison, here is a manual diagram, deliberately omitting column and other detail. It was done as an illustration of the network analysis program output, and I therefore did not include useful information such as on relationship optionality.

Networks - HR, v1.0

Network Analysis Output

I copied this from my article above.

Network       #Links  #Nodes Lev  Node                                     Link
------------  ------  ------ ---  ---------------------------------------  -------------------------------
COUNTRIES|HR      21      16   0  COUNTRIES|HR                             ROOT
                               1  < LOCATIONS|HR                           loc_c_id_fk|hr
                               2    < DEPARTMENTS|HR                       dept_loc_fk|hr                                3      > EMPLOYEES|HR                       dept_mgr_fk|hr
                               4        < CUSTOMERS|OE                     customers_account_manager_fk|oe
                               5          < ORDERS|OE                      orders_customer_id_fk|oe                                6            > EMPLOYEES|HR*                orders_sales_rep_fk|oe
                               6            < ORDER_ITEMS|OE               order_items_order_id_fk|oe                                7              > PRODUCT_INFORMATION|OE     order_items_product_id_fk|oe
                               8                < INVENTORIES|OE           inventories_product_id_fk|oe                                9                  > WAREHOUSES|OE          inventories_warehouses_fk|oe
                              10                    > LOCATIONS|HR*        warehouses_location_fk|oe
                               8                < ONLINE_MEDIA|PM          loc_c_id_fk|pm
                               8                < PRINT_MEDIA|PM           printmedia_fk|pm
                               8                < PRODUCT_DESCRIPTIONS|OE  pd_product_id_fk|oe                                4        > DEPARTMENTS|HR*                  emp_dept_fk|hr
                               4        = EMPLOYEES|HR*                    emp_manager_fk|hr
                               4        > JOBS|HR                          emp_job_fk|hr
                               5          < JOB_HISTORY|HR                 jhist_job_fk|hr                                6            > DEPARTMENTS|HR*              jhist_dept_fk|hr
                               6            > EMPLOYEES|HR*                jhist_emp_fk|hr
                               1  > REGIONS|HR                             countr_reg_fk|hr

Schema Metadata Analysis Report – not included

Apex Schema – APEX_040000 (v11.22)
This imported quickly, with 425 tables. However, the diagram was not so useful. Trying to print it to .png or .jpg (via File/Print Diagram/To Image File) silently failed; printing to .pdf worked, but the reader opens at a zoom level of 1.41% and it’s not practical to navigate the links. Maybe text-based analysis reports are more useful for the larger schemas.

Data Modeler Diagram

Here is a screenshot of the modeler diagram:
APEX_040000_DM-Shot
Network Analysis Output

I include my standard summary listings in this case, showing that the schema splits into 21 sub-networks, with 298 tables having foreign key links, the remaining 127 being thereby excluded from this report.

SQL> @..\sql\R_Net
Network detail

Network                                     #Links  #Nodes    Lev Node                                                                   Link
------------------------------------------ ------- ------- ------ ---------------------------------------------------------------------- ------------------------------------------
APEX$_WS_FILES|APEX_040000                       4       5      0 APEX$_WS_FILES|APEX_040000                                             ROOT
                                                                1 > APEX$_WS_ROWS|APEX_040000                                            apex$_ws_files_fk|apex_040000
                                                                2   < APEX$_WS_LINKS|APEX_040000                                         apex$_ws_links_fk|apex_040000
                                                                2   < APEX$_WS_NOTES|APEX_040000                                         apex$_ws_notes_fk|apex_040000
                                                                2   < APEX$_WS_TAGS|APEX_040000                                          apex$_ws_tags_fk|apex_040000
WWV_FLOWS|APEX_040000                          286     236      0 WWV_FLOWS|APEX_040000                                                  ROOT
                                                                1 < WWV_FLOW_ALTERNATE_CONFIG|APEX_040000                                wwv_flow_alt_cfg_fk|apex_040000
                                                                2   < WWV_FLOW_ALT_CONFIG_DETAIL|APEX_040000                             wwv_flow_alt_cfg_d_fk|apex_040000
                                                                1 < WWV_FLOW_APP_COMMENTS|APEX_040000                                    wwv_flow_app_comments_fk|apex_040000
                                                                1 < WWV_FLOW_BANNER|APEX_040000                                          wwv_flow_banner_fk|apex_040000
                                                                1 < WWV_FLOW_BUTTON_TEMPLATES|APEX_040000                                wwv_flow_buttont_fk|apex_040000
                                                                1 < WWV_FLOW_CALS|APEX_040000                                            wwv_flow_cal_to_flow_fk|apex_040000
                                                                2   > WWV_FLOW_PAGE_PLUGS|APEX_040000                                    wwv_flow_plug_calendar_fk|apex_040000
                                                                3     > WWV_FLOWS|APEX_040000*                                           wwv_flow_plug_to_flow_fk|apex_040000
                                                                3     < WWV_FLOW_FLASH_CHARTS_5|APEX_040000                              wwv_flow_flash_charts_5_fk2|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_flash_charts_5_fk|apex_040000
                                                                4       < WWV_FLOW_FLASH_CHART5_SERIES|APEX_040000                       wwv_flow_flash_5_series_fk|apex_040000
                                                                4       < WWV_FLOW_FLASH_CHARTS_5_DASH|APEX_040000                       wwv_flow_flash_charts5_dash_fk|apex_040000
                                                                3     < WWV_FLOW_FLASH_CHARTS|APEX_040000                                wwv_flow_flash_charts_fk2|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_flash_charts_fk|apex_040000
                                                                4       < WWV_FLOW_FLASH_CHART_SERIES|APEX_040000                        wwv_flow_flash_chart_series_fk|apex_040000
                                                                3     < WWV_FLOW_PAGE_DA_ACTIONS|APEX_040000                             wwv_flow_page_da_a_ar_fk|apex_040000
                                                                4       > WWV_FLOW_PAGE_DA_EVENTS|APEX_040000                            wwv_flow_page_da_a_evnt_fk|apex_040000
                                                                5         > WWV_FLOWS|APEX_040000*                                       wwv_flow_page_da_e_flow_fk|apex_040000
                                                                5         > WWV_FLOW_PAGE_PLUGS|APEX_040000*                             wwv_flow_page_da_e_tr_fk|apex_040000
                                                                5         > WWV_FLOW_STEPS|APEX_040000                                   wwv_flow_page_da_e_page_fk|apex_040000
                                                                6           > WWV_FLOWS|APEX_040000*                                     wwv_flow_steps_fk|apex_040000
                                                                6           < WWV_FLOW_PAGE_DA_ACTIONS|APEX_040000*                      wwv_flow_page_da_a_page_fk|apex_040000
                                                                6           < WWV_FLOW_PAGE_PLUGS|APEX_040000*                           wwv_flow_plug_to_page_fk|apex_040000
                                                                6           < WWV_FLOW_STEP_BRANCHES|APEX_040000                         wwv_flow_step_branches_fk2|apex_040000
                                                                7             > WWV_FLOWS|APEX_040000*                                   wwv_flow_step_branches_fk|apex_040000
                                                                7             < WWV_FLOW_STEP_BRANCH_ARGS|APEX_040000                    wwv_flow_step_branch_args_fk|apex_040000
                                                                6           < WWV_FLOW_STEP_BUTTONS|APEX_040000                          wwv_flow_step_buttons_fk2|apex_040000
                                                                7             > WWV_FLOWS|APEX_040000*                                   wwv_flow_step_buttons_fk1|apex_040000
                                                                7             > WWV_FLOW_PAGE_PLUGS|APEX_040000*                         wwv_flow_step_buttons_plug_fk|apex_040000
                                                                6           < WWV_FLOW_STEP_COMPUTATIONS|APEX_040000                     wwv_flow_step_comp_fk2|apex_040000
                                                                7             > WWV_FLOWS|APEX_040000*                                   wwv_flow_step_comp_fk|apex_040000
                                                                6           < WWV_FLOW_STEP_ITEMS|APEX_040000                            wwv_flow_step_items_fk2|apex_040000
                                                                7             > WWV_FLOWS|APEX_040000*                                   wwv_flow_step_items_fk|apex_040000
                                                                7             > WWV_FLOW_PAGE_PLUGS|APEX_040000*                         wwv_flow_step_items_plug_fk|apex_040000
                                                                7             < WWV_FLOW_STEP_ITEM_HELP|APEX_040000                      wwv_flow_item_helptext_fk|apex_040000
                                                                8               > WWV_FLOWS|APEX_040000*                                 wwv_flow_page_helptext_fk|apex_040000
                                                                6           < WWV_FLOW_STEP_PROCESSING|APEX_040000                       wwv_flow_step_proc_fk2|apex_040000
                                                                7             > WWV_FLOWS|APEX_040000*                                   wwv_flow_step_proc_fk|apex_040000
                                                                7             < WWV_FLOW_WS_PROCESS_PARMS_MAP|APEX_040000                wwv_flow_ws_map_fk2|apex_040000
                                                                8               > WWV_FLOW_WS_PARAMETERS|APEX_040000                     wwv_flows_ws_map_fk1|apex_040000
                                                                9                 > WWV_FLOW_WS_OPERATIONS|APEX_040000                   wwv_flow_ws_parms_fk|apex_040000
                                                               10                   > WWV_FLOW_SHARED_WEB_SERVICES|APEX_040000           wwv_flow_ws_opers_fk|apex_040000
                                                               11                     > WWV_FLOWS|APEX_040000*                           wwv_flow_ws_fk|apex_040000
                                                                6           < WWV_FLOW_STEP_VALIDATIONS|APEX_040000                      wwv_flow_step_val_fk2|apex_040000
                                                                7             > WWV_FLOWS|APEX_040000*                                   wwv_flow_step_val_fk|apex_040000
                                                                7             > WWV_FLOW_PAGE_PLUGS|APEX_040000*                         wwv_flow_step_val_to_reg_fk|apex_040000
                                                                3     < WWV_FLOW_PAGE_GENERIC_ATTR|APEX_040000                           wwv_flow_genattr_to_region_fk|apex_040000
                                                                3     = WWV_FLOW_PAGE_PLUGS|APEX_040000*                                 wwv_flow_plug_parent_fk|apex_040000
                                                                3     < WWV_FLOW_QUERY_DEFINITION|APEX_040000                            query_def_to_region_fk|apex_040000
                                                                4       < WWV_FLOW_QUERY_COLUMN|APEX_040000                              query_column_to_query_fk|apex_040000
                                                                5         > WWV_FLOW_QUERY_OBJECT|APEX_040000                            query_column_to_qry_object_fk|apex_040000
                                                                6           > WWV_FLOW_QUERY_DEFINITION|APEX_040000*                     query_object_to_query_fk|apex_040000
                                                                4       < WWV_FLOW_QUERY_CONDITION|APEX_040000                           query_condition_to_query_fk|apex_040000
                                                                3     < WWV_FLOW_REGION_CHART_SER_ATTR|APEX_040000                       wwv_flow_seattr_to_region_fk|apex_040000
                                                                3     < WWV_FLOW_REGION_REPORT_COLUMN|APEX_040000                        report_column_to_region_fk|apex_040000
                                                                3     < WWV_FLOW_REGION_REPORT_FILTER|APEX_040000                        sys_c004963|apex_040000
                                                                3     < WWV_FLOW_REGION_UPD_RPT_COLS|APEX_040000                         wwv_flow_urc_to_plug_fk|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_urc_to_flow_fk|apex_040000
                                                                3     < WWV_FLOW_TREE_REGIONS|APEX_040000                                wwv_flow_treeregion_fk2|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_treeregion_fk|apex_040000
                                                                3     < WWV_FLOW_WORKSHEETS|APEX_040000                                  wwv_flow_worksheets_reg_fk|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_worksheets_flow_fk|apex_040000
                                                                4       < WWV_FLOW_WORKSHEET_COLUMNS|APEX_040000                         wwv_flow_worksheet_columns_fk|apex_040000
                                                                5         > WWV_FLOWS|APEX_040000*                                       wwv_flow_worksheet_col_fk|apex_040000
                                                                5         > WWV_FLOW_WORKSHEET_COL_GROUPS|APEX_040000                    wwv_flow_worksheet_col_grps_fk|apex_040000
                                                                6           > WWV_FLOWS|APEX_040000*                                     wwv_flow_worksheet_col_grp_fk|apex_040000
                                                                6           > WWV_FLOW_WORKSHEETS|APEX_040000*                           wwv_flow_worksheet_col_grws_fk|apex_040000
                                                                6           > WWV_FLOW_WS_WEBSHEET_ATTR|APEX_040000                      wwv_flow_worksheet_col_grp_fk2|apex_040000
                                                                7             > WWV_FLOW_WORKSHEETS|APEX_040000*                         wwwv_flow_ws_websheet_attr_fk|apex_040000
                                                                7             < WWV_FLOW_WORKSHEET_COLUMNS|APEX_040000*                  wwv_flow_worksheet_col_fk2|apex_040000
                                                                7             < WWV_FLOW_WORKSHEET_COMPUTATION|APEX_040000               wwv_flow_ws_computation_fk|apex_040000
                                                                8               > WWV_FLOW_WORKSHEET_RPTS|APEX_040000                    wwv_flow_ws_comp_cols_fk|apex_040000
                                                                9                 > WWV_FLOW_WORKSHEET_CATEGORIES|APEX_040000            wwv_flow_worksheet_rpts_fk|apex_040000
                                                                9                 < WWV_FLOW_WORKSHEET_CONDITIONS|APEX_040000            wwv_flow_worksheet_cond_fk|apex_040000
                                                               10                   > WWV_FLOW_WS_WEBSHEET_ATTR|APEX_040000*             wwv_flow_ws_condition_fk|apex_040000
                                                                9                 < WWV_FLOW_WORKSHEET_GROUP_BY|APEX_040000              wwv_flow_ws_groupby_fk2|apex_040000
                                                               10                   > WWV_FLOW_WS_WEBSHEET_ATTR|APEX_040000*             wwv_flow_ws_groupby_fk|apex_040000
                                                                9                 < WWV_FLOW_WORKSHEET_NOTIFY|APEX_040000                wwv_flow_worksheet_notify_fk2|apex_040000
                                                               10                   > WWV_FLOW_WORKSHEETS|APEX_040000*                   wwv_flow_worksheet_notify_fk|apex_040000
                                                               10                   > WWV_FLOW_WS_WEBSHEET_ATTR|APEX_040000*             wwv_flow_worksheet_notify_fk4|apex_040000
                                                                9                 > WWV_FLOW_WS_WEBSHEET_ATTR|APEX_040000*               wwv_flow_ws_rpt_fk|apex_040000
                                                                7             < WWV_FLOW_WORKSHEET_LOVS|APEX_040000                      wwv_flow_worksheet_lovs_fk2|apex_040000
                                                                8               > WWV_FLOW_WORKSHEETS|APEX_040000*                       wwv_flow_worksheet_lovs_fk|apex_040000
                                                                8               < WWV_FLOW_WORKSHEET_LOV_ENTRIES|APEX_040000             wwv_flow_worksheet_lov_ent_fk2|apex_040000
                                                                9                 > WWV_FLOW_WORKSHEETS|APEX_040000*                     wwv_flow_worksheet_lov_ent_fk|apex_040000
                                                                9                 > WWV_FLOW_WS_WEBSHEET_ATTR|APEX_040000*               wwv_flow_worksheet_lov_ent_fk3|apex_040000
                                                                7             > WWV_FLOW_WS_APPLICATIONS|APEX_040000                     wwv_flow_ws_websheet_attr_fk2|apex_040000
                                                                8               < WWV_FLOW_WS_APP_SUG_OBJECTS|APEX_040000                wwv_flow_ws_app_so_fk1|apex_040000
                                                                8               < WWV_FLOW_WS_COL_VALIDATIONS|APEX_040000                wwv_flow_ws_col_val_fk3|apex_040000
                                                                9                 > WWV_FLOW_WORKSHEETS|APEX_040000*                     wwv_flow_ws_col_val_fk|apex_040000
                                                                9                 > WWV_FLOW_WS_WEBSHEET_ATTR|APEX_040000*               wwv_flow_ws_col_val_fk2|apex_040000
                                                                8               < WWV_FLOW_WS_CUSTOM_AUTH_SETUPS|APEX_040000             wwv_flow_ws_auth_setups_fk|apex_040000
                                                                8               < WWV_FLOW_WS_WEBPAGES|APEX_040000                       wwv_flow_ws_webpages_fk|apex_040000
                                                                9                 = WWV_FLOW_WS_WEBPAGES|APEX_040000*                    wwv_flow_ws_webpages_fk2|apex_040000
                                                                1 < WWV_FLOW_CAL_TEMPLATES|APEX_040000                                   wwv_flow_cal_templ_to_flow_fk|apex_040000
                                                                1 < WWV_FLOW_COMPOUND_CONDITIONS|APEX_040000                             wwv_flow_comp_cond_fk|apex_040000
                                                                1 < WWV_FLOW_COMPUTATIONS|APEX_040000                                    wwv_flow_computations_fk|apex_040000
                                                                1 < WWV_FLOW_CUSTOM_AUTH_SETUPS|APEX_040000                              wwv_flow_auth_setups_fk|apex_040000
                                                                1 < WWV_FLOW_ENTRY_POINTS|APEX_040000                                    wwv_flow_entry_points_fk|apex_040000
                                                                2   < WWV_FLOW_ENTRY_POINT_ARGS|APEX_040000                              wwv_flow_entry_point_args_fk|apex_040000
                                                                1 < WWV_FLOW_FIELD_TEMPLATES|APEX_040000                                 wwv_flow_field_temp_f_fk|apex_040000
                                                                1 < WWV_FLOW_ICON_BAR_ATTRIBUTES|APEX_040000                             wwv_flow_iconbarattr_fk|apex_040000
                                                                1 < WWV_FLOW_ICON_BAR|APEX_040000                                        wwv_flow_icon_bar_fk|apex_040000
                                                                1 < WWV_FLOW_INSTALL_BUILD_OPT|APEX_040000                               wwv_flow_install_build_opt_fk|apex_040000
                                                                2   > WWV_FLOW_INSTALL|APEX_040000                                       wwv_flow_install_build_opt_fk3|apex_040000
                                                                3     > WWV_FLOWS|APEX_040000*                                           wwv_flow_install_fk|apex_040000
                                                                3     < WWV_FLOW_INSTALL_CHECKS|APEX_040000                              wwv_flow_install_checks_fk3|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_install_checks_fk|apex_040000
                                                                3     < WWV_FLOW_INSTALL_SCRIPTS|APEX_040000                             wwv_flow_install_scripts_fk3|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_install_scripts_fk|apex_040000
                                                                2   > WWV_FLOW_PATCHES|APEX_040000                                       wwv_flow_install_build_opt_fk4|apex_040000
                                                                3     > WWV_FLOWS|APEX_040000*                                           wwv_flow_patches_fk|apex_040000
                                                                1 < WWV_FLOW_ITEMS|APEX_040000                                           wwv_flow_items_fk|apex_040000
                                                                1 < WWV_FLOW_LANGUAGE_MAP|APEX_040000                                    wwv_flow_lang_flow_id_fk|apex_040000
                                                                1 < WWV_FLOW_LISTS_OF_VALUES$|APEX_040000                                wwv_flow_lov_fk|apex_040000
                                                                2   < WWV_FLOW_LIST_OF_VALUES_DATA|APEX_040000                           wwv_flow_lov_data_fk|apex_040000
                                                                1 < WWV_FLOW_LISTS|APEX_040000                                           wwv_flow_lists_flow_fk|apex_040000
                                                                2   < WWV_FLOW_LIST_ITEMS|APEX_040000                                    wwv_flow_list_items_fk|apex_040000
                                                                3     = WWV_FLOW_LIST_ITEMS|APEX_040000*                                 parent_list_item_fk|apex_040000
                                                                1 < WWV_FLOW_LIST_TEMPLATES|APEX_040000                                  wwv_flow_list_template_fk|apex_040000
                                                                1 < WWV_FLOW_LOCK_PAGE|APEX_040000                                       sys_c004810|apex_040000
                                                                1 < WWV_FLOW_MENUS|APEX_040000                                           wwv_flow_menus_flow_fk|apex_040000
                                                                2   < WWV_FLOW_MENU_OPTIONS|APEX_040000                                  wwv_flow_opt_menus_fk|apex_040000
                                                                1 < WWV_FLOW_MENU_TEMPLATES|APEX_040000                                  wwv_flow_menus_t_flow_fk|apex_040000
                                                                1 < WWV_FLOW_MESSAGES$|APEX_040000                                       wwv_flow_messages_fk|apex_040000
                                                                1 < WWV_FLOW_PAGES_RESERVED|APEX_040000                                  wwv_flow_pages_reserved_fk|apex_040000
                                                                1 < WWV_FLOW_PAGE_CACHE|APEX_040000                                      wwv_flow_page_cache_fk|apex_040000
                                                                2   < WWV_FLOW_PAGE_CODE_CACHE|APEX_040000                               wwv_flow_page_code_cache_fk|apex_040000
                                                                1 < WWV_FLOW_PAGE_GROUPS|APEX_040000                                     sys_c004993|apex_040000
                                                                1 < WWV_FLOW_PAGE_PLUG_TEMPLATES|APEX_040000                             wwv_flow_plug_temp_fk|apex_040000
                                                                1 < WWV_FLOW_PAGE_SUBMISSIONS|APEX_040000                                wwv_flow_page_sub_fk|apex_040000
                                                                2   > WWV_FLOW_SESSIONS$|APEX_040000                                     wwv_flow_page_sub_sess_fk|apex_040000
                                                                3     < WWV_FLOW_COLLECTIONS$|APEX_040000                                wwv_flow_collection_fk|apex_040000
                                                                4       < WWV_FLOW_COLLECTION_MEMBERS$|APEX_040000                       wwv_flow_collection_membes_fk|apex_040000
                                                                3     < WWV_FLOW_DATA|APEX_040000                                        wwv_flow_data_fk|apex_040000
                                                                3     < WWV_FLOW_REQUEST_VERIFICATIONS|APEX_040000                       wwv_flow_request_verif_fk|apex_040000
                                                                3     < WWV_FLOW_SC_TRANS|APEX_040000                                    wwv_flow_sc_trans_fk2|apex_040000
                                                                3     < WWV_FLOW_TREE_STATE|APEX_040000                                  wwv_flow_tree_state$fk|apex_040000
                                                                1 < WWV_FLOW_PLUGINS|APEX_040000                                         wwv_flow_plugin_flow_fk|apex_040000
                                                                2   < WWV_FLOW_PLUGIN_ATTRIBUTES|APEX_040000                             wwv_flow_plugin_attr_parent_fk|apex_040000
                                                                3     > WWV_FLOWS|APEX_040000*                                           wwv_flow_plugin_attr_flow_fk|apex_040000
                                                                3     = WWV_FLOW_PLUGIN_ATTRIBUTES|APEX_040000*                          wwv_flow_plugin_attr_depend_fk|apex_040000
                                                                3     < WWV_FLOW_PLUGIN_ATTR_VALUES|APEX_040000                          wwv_flow_plugin_attrv_attr_fk|apex_040000
                                                                4       > WWV_FLOWS|APEX_040000*                                         wwv_flow_plugin_attrv_flow_fk|apex_040000
                                                                2   < WWV_FLOW_PLUGIN_EVENTS|APEX_040000                                 wwv_flow_plugin_evnt_parent_fk|apex_040000
                                                                3     > WWV_FLOWS|APEX_040000*                                           wwv_flow_plugin_evnt_flow_fk|apex_040000
                                                                2   < WWV_FLOW_PLUGIN_FILES|APEX_040000                                  wwv_flow_plugin_file_parent_fk|apex_040000
                                                                3     > WWV_FLOWS|APEX_040000*                                           wwv_flow_plugin_file_flow_fk|apex_040000
                                                                1 < WWV_FLOW_POPUP_LOV_TEMPLATE|APEX_040000                              wwv_flow_fk_poplov_temp|apex_040000
                                                                1 < WWV_FLOW_PROCESSING|APEX_040000                                      wwv_flow_processing_fk|apex_040000
                                                                1 < WWV_FLOW_REPORT_LAYOUTS|APEX_040000                                  wwv_flow_report_layoutse_fk|apex_040000
                                                                1 < WWV_FLOW_REQUIRED_ROLES|APEX_040000                                  wwv_flow_req_roles_fk|apex_040000
                                                                1 < WWV_FLOW_ROW_TEMPLATES|APEX_040000                                   wwv_flow_row_template_fk|apex_040000
                                                                1 < WWV_FLOW_SECURITY_SCHEMES|APEX_040000                                wwv_flow_sec_schemes_fk|apex_040000
                                                                1 < WWV_FLOW_SHARED_QRY_SQL_STMTS|APEX_040000                            wwv_flow_sqry_sql_flow_fk|apex_040000
                                                                2   > WWV_FLOW_SHARED_QUERIES|APEX_040000                                wwv_flow_sqry_sql_sqry_fk|apex_040000
                                                                3     > WWV_FLOWS|APEX_040000*                                           wwv_flow_shdqry_flow_fk|apex_040000
                                                                1 < WWV_FLOW_SHORTCUTS|APEX_040000                                       wwv_flow_shortcuts_to_flow_fk|apex_040000
                                                                1 < WWV_FLOW_TABS|APEX_040000                                            wwv_flow_tabs_fk|apex_040000
                                                                1 < WWV_FLOW_TEMPLATES|APEX_040000                                       wwv_flow_templates_fk|apex_040000
                                                                1 < WWV_FLOW_TEMPLATE_PREFERENCES|APEX_040000                            wwv_flow_templ_pref_fk|apex_040000
                                                                1 < WWV_FLOW_THEMES|APEX_040000                                          wwv_flow_themes_2f_fk|apex_040000
                                                                1 < WWV_FLOW_TOPLEVEL_TABS|APEX_040000                                   wwv_flow_toplev_tab_fk|apex_040000
                                                                1 < WWV_FLOW_TRANSLATABLE_TEXT$|APEX_040000                              wwv_flow_trans_text_fk|apex_040000
                                                                1 < WWV_FLOW_TREES|APEX_040000                                           wwv_flow_tree_fk|apex_040000
                                                                1 < WWV_FLOW_VALIDATIONS|APEX_040000                                     wwv_flow_val_fk|apex_040000
                                                                1 < WWV_MIG_GENERATED_APPLICATIONS|APEX_040000                           wwv_mig_gen_app_flow_id_fk|apex_040000
                                                                2   > WWV_MIG_PROJECTS|APEX_040000                                       wwv_mig_gen_app_proj_id_fk|apex_040000
                                                                3     < WWV_MIG_ACCESS|APEX_040000                                       wwv_mig_acc_fk|apex_040000
                                                                3     < WWV_MIG_FORMS|APEX_040000                                        wwv_mig_forms_project_id_fk|apex_040000
                                                                4       < WWV_MIG_FRM_MODULES|APEX_040000                                wwv_mig_frm_modules_file_id_fk|apex_040000
                                                                5         < WWV_MIG_FRM_FORMMODULES|APEX_040000                          wwv_mig_frm_frmmdl_mdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_ALERTS|APEX_040000                             wwv_mig_frm_alrt_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_ATTACHEDLIBRARY|APEX_040000                    wwv_mig_frm_atlib_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_BLOCKS|APEX_040000                             wwv_mig_frm_blk_frmmdl_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_BLK_DSA|APEX_040000                          wwv_mig_frm_blk_dsa_blk_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_BLK_DSC|APEX_040000                          wwv_mig_frm_blk_dsc_blk_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_BLK_ITEMS|APEX_040000                        wwv_mig_frm_bi_blk_id_fk|apex_040000
                                                                8               < WWV_MIG_FRM_BLK_ITEM_LIE|APEX_040000                   wwv_mig_frm_bi_lie_item_id_fk|apex_040000
                                                                8               < WWV_MIG_FRM_BLK_ITEM_RADIO|APEX_040000                 wwv_mig_frm_bir_item_id_fk|apex_040000
                                                                8               < WWV_MIG_FRM_BLK_ITEM_TRIGGERS|APEX_040000              wwv_mig_frm_bi_trg_item_id_fk|apex_040000
                                                                8               < WWV_MIG_FRM_REV_BLK_ITEMS|APEX_040000                  wwv_mig_frm_rev_bi_item_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_BLK_RELATIONS|APEX_040000                    wwv_mig_frm_blk_rel_blk_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_BLK_TRIGGERS|APEX_040000                     wwv_mig_frm_blk_trg_blk_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_REV_BLOCKS|APEX_040000                       wwv_mig_frm_rev_blocks_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_CANVAS|APEX_040000                             wwv_mig_frm_canvs_frmmdl_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_CNVS_GRAPHICS|APEX_040000                    wwv_mig_frm_cg_cnvs_id_fk|apex_040000
                                                                8               < WWV_MIG_FRM_CNVG_COMPOUNDTEXT|APEX_040000              wwv_mig_frm_cpdtxt_grphs_id_fk|apex_040000
                                                                9                 < WWV_MIG_FRM_CPDTXT_TEXTSEGMENT|APEX_040000           wwv_mig_frm_txtsgmt_cpd_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_CNVS_TABPAGE|APEX_040000                     wwv_mig_frm_ctp_cnvs_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_COORDINATES|APEX_040000                        wwv_mig_frm_crdnt_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_EDITOR|APEX_040000                             wwv_mig_frm_edtr_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_FMB_MENU|APEX_040000                           wwv_mig_frm_menu_frmmdl_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_FMB_MENU_MENUITEM|APEX_040000                wwv_mig_fmb_menuitem_menuid_fk|apex_040000
                                                                8               < WWV_MIG_FRM_FMB_MENUITEM_ROLE|APEX_040000              wwv_mig_fmb_mnuitemrl_mitm_fk|apex_040000
                                                                6           < WWV_MIG_FRM_LOV|APEX_040000                                wwv_mig_frm_lov_frmmdl_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_LOVCOLUMNMAPPING|APEX_040000                 wwv_mig_frm_lvcm_frmmdl_id_fk|apex_040000
                                                                8               < WWV_MIG_FRM_REV_LOVCOLMAPS|APEX_040000                 wwv_mig_frm_rev_lcm_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_REV_LOV|APEX_040000                          wwv_mig_frm_rev_lov_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_MODULEPARAMETER|APEX_040000                    wwv_mig_frm_mdlpr_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_OBJECTGROUP|APEX_040000                        wwv_mig_frm_objgp_frmmdl_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_OBJECTGROUPCHILD|APEX_040000                 wwv_mig_frm_objgpc_objgp_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_PROGRAMUNIT|APEX_040000                        wwv_mig_frm_pgut_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_PROPERTYCLASS|APEX_040000                      wwv_mig_frm_ppcl_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_RECORDGROUPS|APEX_040000                       wwv_mig_frm_recgp_frmmdl_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_RECORDGROUPCOLUMN|APEX_040000                wwv_mig_frm_rgc_recgp_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_REPORT|APEX_040000                             wwv_mig_frm_rpt_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_REV_FORMMODULES|APEX_040000                    wwv_mig_frm_rev_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_TRIGGERS|APEX_040000                           wwv_mig_frm_trg_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_VISUALATTRIBUTES|APEX_040000                   wwv_mig_frm_visat_frmmdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_WINDOWS|APEX_040000                            wwv_mig_frm_wndow_frmmdl_id_fk|apex_040000
                                                                3     < WWV_MIG_FRM_MENUS|APEX_040000                                    wwv_mig_menus_project_id_fk|apex_040000
                                                                4       < WWV_MIG_FRM_MENUS_MODULES|APEX_040000                          wwv_mig_mnu_modules_file_id_fk|apex_040000
                                                                5         < WWV_MIG_FRM_MENUS_MENUMODULES|APEX_040000                    wwv_mig_mnu_mnumdl_mdl_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_MENUSMODULEROLES|APEX_040000                   wwv_mig_mmodrole_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_MENUS_PROGRAMUNIT|APEX_040000                  wwv_mig_mnu_progunit_id_fk|apex_040000
                                                                6           < WWV_MIG_FRM_MENU|APEX_040000                               wwv_mig_mnu_id_fk|apex_040000
                                                                7             < WWV_MIG_FRM_MENU_MENUITEM|APEX_040000                    wwv_mig_mnuitem_id_fk|apex_040000
                                                                8               < WWV_MIG_FRM_MENUITEM_ROLE|APEX_040000                  wwv_mig_mnuitemrole_id_fk|apex_040000
                                                                3     < WWV_MIG_FRM_REV_APEX_APP|APEX_040000                             wwv_mig_frm_rev_apex_app_fk|apex_040000
                                                                3     < WWV_MIG_OLB|APEX_040000                                          wwv_mig_olb_project_id_fk|apex_040000
                                                                4       < WWV_MIG_OLB_MODULES|APEX_040000                                wwv_mig_olb_modules_file_id_fk|apex_040000
                                                                5         < WWV_MIG_OLB_OBJECTLIBRARY|APEX_040000                        wwv_mig_olb_objlib_mdl_id_fk|apex_040000
                                                                6           < WWV_MIG_OLB_BLOCK|APEX_040000                              wwv_mig_olb_block_objlib_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_BLK_DATASOURCECOL|APEX_040000                wwv_mig_olb_blk_dsc_blk_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_BLK_ITEM|APEX_040000                         wwv_mig_olb_blk_item_blk_id_fk|apex_040000
                                                                8               < WWV_MIG_OLB_BLK_ITEM_LIE|APEX_040000                   wwv_mig_olb_bil_item_id_fk|apex_040000
                                                                8               < WWV_MIG_OLB_BLK_ITEM_TRIGGER|APEX_040000               wwv_mig_olb_bit_item_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_BLK_TRIGGER|APEX_040000                      wwv_mig_olb_blk_trgr_blk_id_fk|apex_040000
                                                                6           < WWV_MIG_OLB_CANVAS|APEX_040000                             wwv_mig_olb_canvs_objlib_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_CNVS_GRAPHICS|APEX_040000                    wwv_mig_olb_cg_cnvs_id_fk|apex_040000
                                                                8               < WWV_MIG_OLB_CG_COMPOUNDTEXT|APEX_040000                wwv_mig_olb_cg_ct_grphs_id_fk|apex_040000
                                                                9                 < WWV_MIG_OLB_CG_CT_TEXTSEGMENT|APEX_040000            wwv_mig_olb_cg_ct_ts_ct_id_fk|apex_040000
                                                                6           < WWV_MIG_OLB_OBJECTLIBRARYTAB|APEX_040000                   wwv_mig_olb_olt_objlib_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_ALERT|APEX_040000                        wwv_mig_olb_olt_alrt_olt_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_BLOCK|APEX_040000                        wwv_mig_olb_t_block_olt_id_fk|apex_040000
                                                                8               < WWV_MIG_OLB_OLT_BLK_ITEM|APEX_040000                   wwv_mig_olb_olt_bi_blk_id_fk|apex_040000
                                                                9                 < WWV_MIG_OLB_OLT_BLK_ITEM_TRIGR|APEX_040000           wwv_mig_olb_olt_bit_item_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_CANVAS|APEX_040000                       wwv_mig_olb_t_canvas_olt_id_fk|apex_040000
                                                                8               < WWV_MIG_OLB_OLT_CNVS_GRAPHICS|APEX_040000              wwv_mig_olb_olt_cg_cnvs_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_GRAPHICS|APEX_040000                     wwv_mig_olb_t_grphcs_olt_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_ITEM|APEX_040000                         wwv_mig_olb_olt_item_olt_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_MENU|APEX_040000                         wwv_mig_olb_olt_menu_olt_id_fk|apex_040000
                                                                8               < WWV_MIG_OLB_OLT_MENU_MENUITEM|APEX_040000              wwv_mig_olb_olt_mmi_menu_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_OBJECTGROUP|APEX_040000                  wwv_mig_olb_t_objgrp_olt_id_fk|apex_040000
                                                                8               < WWV_MIG_OLB_OLT_OB_OBJGRPCHILD|APEX_040000             wwv_mig_olb_olt_ob_ogc_obid_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_REPORT|APEX_040000                       wwv_mig_olb_t_report_olt_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_TABPAGE|APEX_040000                      wwv_mig_olb_t_tabpage_oltid_fk|apex_040000
                                                                8               < WWV_MIG_OLB_OLT_TABPG_GRAPHICS|APEX_040000             wwv_mig_olb_olt_tpg_tp_id_fk|apex_040000
                                                                9                 < WWV_MIG_OLB_T_TP_G_GRAPHICS|APEX_040000              wwv_mig_olb_t_tp_gg_g_id_fk|apex_040000
                                                               10                   < WWV_MIG_OLB_T_TP_GG_CPDTXT|APEX_040000             wwv_mig_olb_t_tp_gg_ct_g_id_fk|apex_040000
                                                               11                     < WWV_MIG_OLB_T_TP_GG_CT_TXTSGT|APEX_040000        wwv_mig_olb_ttpggctts_ctid_fk|apex_040000
                                                               10                   < WWV_MIG_OLB_T_TP_GG_GRAPHICS|APEX_040000           wwv_mig_olb_t_tp_ggg_g_id_fk|apex_040000
                                                               11                     < WWV_MIG_OLB_T_TP_GGG_CPDTXT|APEX_040000          wwv_mig_olb_ttp_ggg_ct_gid_fk|apex_040000
                                                               12                       < WWV_MIG_OLB_T_TP_GGG_CT_TXTSGT|APEX_040000     wwv_mig_olb_ttpgggctts_ctid_fk|apex_040000
                                                               11                     < WWV_MIG_OLB_T_TP_GGG_GRAPHICS|APEX_040000        wwv_mig_olb_t_tp_gggg_g_id_fk|apex_040000
                                                               12                       < WWV_MIG_OLB_T_TP_GGGG_CPDTXT|APEX_040000       wwv_mig_olb_ttpggggct_g_id_fk|apex_040000
                                                               13                         < WWV_MIG_OLB_T_TP_GGGG_CT_TXSGT|APEX_040000   wwv_mig_olb_ttpggggcts_ctid_fk|apex_040000
                                                               12                       < WWV_MIG_OLB_T_TP_GGGG_GRAPHICS|APEX_040000     wwv_mig_olb_ttpggggg_g_id_fk|apex_040000
                                                               13                         < WWV_MIG_OLB_T_TP_GGGGG_CPDTXT|APEX_040000    wwv_mig_olb_ttpgggggct_g_id_fk|apex_040000
                                                               14                           < WWV_MIG_OLB_T_TP_GGGGG_CT_TXST|APEX_040000 wwv_mig_olb_ttp5gcts_ct_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_VISUALATTRBUTE|APEX_040000               wwv_mig_olb_olt_va_olt_id_fk|apex_040000
                                                                7             < WWV_MIG_OLB_OLT_WINDOW|APEX_040000                       wwv_mig_olb_olt_wndow_oltid_fk|apex_040000
                                                                6           < WWV_MIG_OLB_PROGRAMUNIT|APEX_040000                        wwv_mig_olb_pu_objlib_id_fk|apex_040000
                                                                6           < WWV_MIG_OLB_PROPERTYCLASS|APEX_040000                      wwv_mig_olb_pc_objlib_id_fk|apex_040000
                                                                6           < WWV_MIG_OLB_VISUALATTRIBUTE|APEX_040000                    wwv_mig_olb_va_objlib_id_fk|apex_040000
                                                                6           < WWV_MIG_OLB_WINDOW|APEX_040000                             wwv_mig_olb_wndow_objlib_id_fk|apex_040000
                                                                3     < WWV_MIG_PLSQL_LIBS|APEX_040000                                   wwv_mig_plls_project_id_fk|apex_040000
                                                                3     < WWV_MIG_PROJECT_COMPONENTS|APEX_040000                           wwv_mig_proj_comp_fk|apex_040000
                                                                3     < WWV_MIG_PROJECT_TRIGGERS|APEX_040000                             wwv_mig_proj_trig_fk|apex_040000
                                                                3     < WWV_MIG_RPTS|APEX_040000                                         wwv_mig_rpts_project_id_fk|apex_040000
                                                                4       < WWV_MIG_REPORT|APEX_040000                                     wwv_mig_rep_file_id_fk|apex_040000
                                                                5         < WWV_MIG_RPT_DATA|APEX_040000                                 wwv_mig_repdata_id_fk|apex_040000
                                                                6           < WWV_MIG_RPT_DATASRC|APEX_040000                            wwv_mig_repsrc_id_fk|apex_040000
                                                                7             < WWV_MIG_RPT_DATASRC_GRP|APEX_040000                      wwv_mig_grp_id_fk|apex_040000
                                                                8               < WWV_MIG_RPT_GRP_DATAITEM|APEX_040000                   wwv_mig_grp_dataitem_id_fk|apex_040000
                                                                9                 < WWV_MIG_RPT_GRP_DATAITEM_DESC|APEX_040000            wwv_mig_grp_itemdesc_id_fk|apex_040000
                                                                9                 < WWV_MIG_RPT_GRP_DATAITEM_PRIV|APEX_040000            wwv_mig_grp_itempriv_id_fk|apex_040000
                                                                8               < WWV_MIG_RPT_GRP_FIELD|APEX_040000                      wwv_mig_grp_fld_id_fk|apex_040000
                                                                8               < WWV_MIG_RPT_GRP_FILTER|APEX_040000                     wwv_mig_grp_fltr_id_fk|apex_040000
                                                                8               < WWV_MIG_RPT_GRP_FORMULA|APEX_040000                    wwv_mig_grp_form_id_fk|apex_040000
                                                                8               < WWV_MIG_RPT_GRP_ROWDELIM|APEX_040000                   wwv_mig_grp_row_id_fk|apex_040000
                                                                8               < WWV_MIG_RPT_GRP_SUMMARY|APEX_040000                    wwv_mig_grp_sum_id_fk|apex_040000
                                                                7             < WWV_MIG_RPT_DATASRC_SELECT|APEX_040000                   wwv_mig_select_id_fk|apex_040000
                                                                6           < WWV_MIG_RPT_DATA_SUMMARY|APEX_040000                       wwv_mig_repsum_id_fk|apex_040000
                                                                5         < WWV_MIG_RPT_REPORTPRIVATE|APEX_040000                        wwv_mig_rptpriv_id_fk|apex_040000
WWV_FLOW_ADVISOR_CATEGORIES|APEX_040000          2       3      0 WWV_FLOW_ADVISOR_CATEGORIES|APEX_040000                                ROOT
                                                                1 < WWV_FLOW_ADVISOR_CHECKS|APEX_040000                                  wwv_flow_adv_chk_cat_fk|apex_040000
                                                                2   < WWV_FLOW_ADVISOR_CHECK_MSGS|APEX_040000                            wwv_flow_adv_chk_msg_check_fk|apex_040000
WWV_FLOW_BUGS|APEX_040000                       13      11      0 WWV_FLOW_BUGS|APEX_040000                                              ROOT
                                                                1 < WWV_FLOW_TEAMDEV_TAG_CLOUD|APEX_040000                               wwv_flow_teamdev_tc_b|apex_040000
                                                                2   > WWV_FLOW_FEATURES|APEX_040000                                      wwv_flow_teamdev_tc_f|apex_040000
                                                                3     = WWV_FLOW_FEATURES|APEX_040000*                                   wwv_flow_features_par_feat_fk|apex_040000
                                                                3     < WWV_FLOW_FEATURE_HISTORY|APEX_040000                             wwv_flow_feature_hist_fk|apex_040000
                                                                3     < WWV_FLOW_FEATURE_PROGRESS|APEX_040000                            wwv_flow_feature_prog_fk|apex_040000
                                                                3     < WWV_FLOW_TEAM_FILES|APEX_040000                                  wwv_flow_team_files_fk1|apex_040000
                                                                4       > WWV_FLOW_EVENTS|APEX_040000                                    wwv_flow_team_files_fk3|apex_040000
                                                                4       > WWV_FLOW_FEEDBACK|APEX_040000                                  wwv_flow_team_files_fk4|apex_040000
                                                                5         < WWV_FLOW_FEEDBACK_FOLLOWUP|APEX_040000                       wwv_flow_feedback_fup_fk|apex_040000
                                                                5         < WWV_FLOW_TEAM_FILES|APEX_040000*                             wwv_flow_team_files_fk5|apex_040000
                                                                4       > WWV_FLOW_TASKS|APEX_040000                                     wwv_flow_team_files_fk2|apex_040000
                                                                5         < WWV_FLOW_TASK_PROGRESS|APEX_040000                           wwv_flow_task_prog_fk|apex_040000
                                                                5         < WWV_FLOW_TEAMDEV_TAG_CLOUD|APEX_040000*                      wwv_flow_teamdev_tc_t|apex_040000
WWV_FLOW_DATA_LOAD_BAD_LOG|APEX_040000           1       2      0 WWV_FLOW_DATA_LOAD_BAD_LOG|APEX_040000                                 ROOT
                                                                1 > WWV_FLOW_DATA_LOAD_UNLOAD|APEX_040000                                wwv_flow_data_load_bad_log_fk1|apex_040000
WWV_FLOW_DICTIONARY_VIEWS|APEX_040000            1       1      0 WWV_FLOW_DICTIONARY_VIEWS|APEX_040000                                  ROOT
                                                                1 = WWV_FLOW_DICTIONARY_VIEWS|APEX_040000*                               wwv_flow_dict_view_parent_fk|apex_040000
WWV_FLOW_FILE_OBJECTS$|FLOWS_FILES               5       6      0 WWV_FLOW_FILE_OBJECTS$|FLOWS_FILES                                     ROOT
                                                                1 < WWV_FLOW_IMPORT_EXPORT|APEX_040000                                   wwv_flow_import_export_fk|apex_040000
                                                                1 < WWV_FLOW_SW_BINDS|APEX_040000                                        wwv_flow_sw_bind_fk|apex_040000
                                                                1 < WWV_FLOW_SW_RESULTS|APEX_040000                                      wwv_flow_sw_result_fk|apex_040000
                                                                2   < WWV_FLOW_SW_DETAIL_RESULTS|APEX_040000                             wwv_flow_sw_d_result_fk|apex_040000
                                                                1 < WWV_FLOW_SW_STMTS|APEX_040000                                        wwv_flow_sw_stmts_fk|apex_040000
WWV_FLOW_FLASH_MAP_FILES|APEX_040000             2       3      0 WWV_FLOW_FLASH_MAP_FILES|APEX_040000                                   ROOT
                                                                1 > WWV_FLOW_FLASH_MAP_FOLDERS|APEX_040000                               wwv_flow_flash_map_files_fk|apex_040000
                                                                1 < WWV_FLOW_FLASH_MAP_REGIONS|APEX_040000                               wwv_flow_flash_map_reg_fk|apex_040000
WWV_FLOW_FND_GROUP_USERS|APEX_040000             1       2      0 WWV_FLOW_FND_GROUP_USERS|APEX_040000                                   ROOT
                                                                1 > WWV_FLOW_FND_USER_GROUPS|APEX_040000                                 wwv_flow_fnd_gu_int_g_fk|apex_040000
WWV_FLOW_HNT_ARGUMENT_INFO|APEX_040000           1       2      0 WWV_FLOW_HNT_ARGUMENT_INFO|APEX_040000                                 ROOT
                                                                1 > WWV_FLOW_HNT_PROCEDURE_INFO|APEX_040000                              wwv_flow_hnt_arg_info_proc_fk|apex_040000
WWV_FLOW_HNT_COLUMN_DICT|APEX_040000             1       2      0 WWV_FLOW_HNT_COLUMN_DICT|APEX_040000                                   ROOT
                                                                1 < WWV_FLOW_HNT_COL_DICT_SYN|APEX_040000                                wwv_flow_hnt_col_dict_syn_fk|apex_040000
WWV_FLOW_HNT_COLUMN_INFO|APEX_040000             4       4      0 WWV_FLOW_HNT_COLUMN_INFO|APEX_040000                                   ROOT
                                                                1 > WWV_FLOW_HNT_GROUPS|APEX_040000                                      wwv_flow_hnt_col_info_grp_fk|apex_040000
                                                                2   > WWV_FLOW_HNT_TABLE_INFO|APEX_040000                                wwv_flow_hnt_groups_tab_fk|apex_040000
                                                                3     < WWV_FLOW_HNT_COLUMN_INFO|APEX_040000*                            wwv_flow_hnt_col_info_tab_fk|apex_040000
                                                                1 < WWV_FLOW_HNT_LOV_DATA|APEX_040000                                    wwv_flow_hnt_lov_data_col_fk|apex_040000
WWV_FLOW_MAIL_ATTACHMENTS|APEX_040000            1       2      0 WWV_FLOW_MAIL_ATTACHMENTS|APEX_040000                                  ROOT
                                                                1 > WWV_FLOW_MAIL_QUEUE|APEX_040000                                      wwv_flow_mail_attachments_fk1|apex_040000
WWV_FLOW_MODELS|APEX_040000                      4       4      0 WWV_FLOW_MODELS|APEX_040000                                            ROOT
                                                                1 < WWV_FLOW_MODEL_PAGES|APEX_040000                                     wwv_flow_model_pages_fk|apex_040000
                                                                2   = WWV_FLOW_MODEL_PAGES|APEX_040000*                                  wwv_flow_model_pages_fk2|apex_040000
                                                                2   < WWV_FLOW_MODEL_PAGE_REGIONS|APEX_040000                            wwv_flow_mpr_fk|apex_040000
                                                                3     < WWV_FLOW_MODEL_PAGE_COLS|APEX_040000                             wwv_flow_model_page_cols_fk|apex_040000
WWV_FLOW_QB_SAVED_COND|APEX_040000               3       4      0 WWV_FLOW_QB_SAVED_COND|APEX_040000                                     ROOT
                                                                1 > WWV_FLOW_QB_SAVED_QUERY|APEX_040000                                  sys_c005031|apex_040000
                                                                2   < WWV_FLOW_QB_SAVED_JOIN|APEX_040000                                 sys_c005038|apex_040000
                                                                2   < WWV_FLOW_QB_SAVED_TABS|APEX_040000                                 sys_c005045|apex_040000
WWV_FLOW_RESTRICTED_SCHEMAS|APEX_040000          1       2      0 WWV_FLOW_RESTRICTED_SCHEMAS|APEX_040000                                ROOT
                                                                1 < WWV_FLOW_RSCHEMA_EXCEPTIONS|APEX_040000                              wwv_flow_rschema_exceptions_fk|apex_040000
WWV_MIG_FRM_OLB_XMLTAGTABLEMAP|APEX_040000       1       1      0 WWV_MIG_FRM_OLB_XMLTAGTABLEMAP|APEX_040000                             ROOT
                                                                1 = WWV_MIG_FRM_OLB_XMLTAGTABLEMAP|APEX_040000*                          wwv_mig_olb_xmltagtablemap_fk|apex_040000
WWV_MIG_FRM_XMLTAGTABLEMAP|APEX_040000           1       1      0 WWV_MIG_FRM_XMLTAGTABLEMAP|APEX_040000                                 ROOT
                                                                1 = WWV_MIG_FRM_XMLTAGTABLEMAP|APEX_040000*                              wwv_mig_frm_xmltagtablemap_fk|apex_040000
WWV_MIG_MENU_XMLTAGTABLEMAP|APEX_040000          1       1      0 WWV_MIG_MENU_XMLTAGTABLEMAP|APEX_040000                                ROOT
                                                                1 = WWV_MIG_MENU_XMLTAGTABLEMAP|APEX_040000*                             wwv_mig_mnu_xmltagtablemap_fk|apex_040000
WWV_MIG_RPT_XMLTAGTABLEMAP|APEX_040000           1       1      0 WWV_MIG_RPT_XMLTAGTABLEMAP|APEX_040000                                 ROOT
                                                                1 = WWV_MIG_RPT_XMLTAGTABLEMAP|APEX_040000*                              wwv_mig_rpt_xmltagtablemap_fk|apex_040000
WWV_PURGE_DATAFILES|APEX_040000                  4       5      0 WWV_PURGE_DATAFILES|APEX_040000                                        ROOT
                                                                1 > WWV_PURGE_WORKSPACES|APEX_040000                                     wwv_purge_datafiles_fk1|apex_040000
                                                                2   < WWV_PURGE_EMAILS|APEX_040000                                       wwv_purge_emails_fk1|apex_040000
                                                                3     < WWV_PURGE_WORKSPACE_RESPONSES|APEX_040000                        wwv_purge_workspace_resp_fk1|apex_040000
                                                                2   < WWV_PURGE_SCHEMAS|APEX_040000                                      wwv_purge_schemas_fk1|apex_040000

359 rows selected.

Elapsed: 00:00:01.81
Network summary 1 - by network

Network                                     #Links  #Nodes    Max Lev
------------------------------------------ ------- ------- ----------
WWV_FLOW_HNT_COLUMN_DICT|APEX_040000             2       2          1
WWV_MIG_RPT_XMLTAGTABLEMAP|APEX_040000           2       1          1
WWV_MIG_MENU_XMLTAGTABLEMAP|APEX_040000          2       1          1
WWV_MIG_FRM_XMLTAGTABLEMAP|APEX_040000           2       1          1
WWV_MIG_FRM_OLB_XMLTAGTABLEMAP|APEX_040000       2       1          1
WWV_FLOW_RESTRICTED_SCHEMAS|APEX_040000          2       2          1
WWV_FLOW_MAIL_ATTACHMENTS|APEX_040000            2       2          1
WWV_FLOW_HNT_ARGUMENT_INFO|APEX_040000           2       2          1
WWV_FLOW_FND_GROUP_USERS|APEX_040000             2       2          1
WWV_FLOW_DICTIONARY_VIEWS|APEX_040000            2       1          1
WWV_FLOW_DATA_LOAD_BAD_LOG|APEX_040000           2       2          1
WWV_FLOW_FLASH_MAP_FILES|APEX_040000             3       3          1
WWV_FLOW_ADVISOR_CATEGORIES|APEX_040000          3       3          2
WWV_FLOW_QB_SAVED_COND|APEX_040000               4       4          2
WWV_PURGE_DATAFILES|APEX_040000                  5       5          3
APEX$_WS_FILES|APEX_040000                       5       5          2
WWV_FLOW_MODELS|APEX_040000                      5       4          3
WWV_FLOW_HNT_COLUMN_INFO|APEX_040000             5       4          3
WWV_FLOW_FILE_OBJECTS$|FLOWS_FILES               6       6          2
WWV_FLOW_BUGS|APEX_040000                       14      11          5
WWV_FLOWS|APEX_040000                          287     236         14

21 rows selected.

Elapsed: 00:00:00.01
Network summary 2 - grouped by numbers of nodes

 #Nodes  #Networks
------- ----------
      1          5
      2          6
      3          2
      4          3
      5          2
      6          1
     11          1
    236          1

8 rows selected.

Elapsed: 00:00:00.01

Schema Metadata Analysis Report

This is too long to embed so is included as an attachment.
L_APEX_040000






Master-Detail Transaction Matching in SQL (MDTM1)

This article is the first in a sequence of three dealing with a very general class of problems in SQL, and exploring various techniques to find efficient solutions. In this first article, the problem is outlined and is divided into two subproblems, of which the first is solved here in several variant SQL statements with performance analysis. The second article, Holographic Set Matching in SQL, takes the most efficient method and applies two new techniques to further improve performance. The third article, Master-Detail Transaction Reconciliation in SQL (MDTM3), adds a sequence of subquery factors to the best solution for the first subproblem to achieve an efficient solution to the overall problem within a single SQL statement.

The General Problem

We consider a transaction with a two-level structure consisting of a header (or master) and lines (or details) linked to the header, and the problem is to pair off transactions by matching, or contra-matching, subsets of the fields at both header and line level. This kind of problem can arise in the context of reconciliation of debit and credit transactions where both transactions are in the same system but are entered separately by two different groups and have no explicit linkage. Typically, in order for the transactions to match at header level, several fields such as transaction class have to be equal, while others such as credit and debit amounts have to be inverse, while others again such as unique identifiers will not match. At the line level, the same applies and matched lines also need to pair off against each other for the transaction to be considered a match. The first part of the problem is to identify all matching (or contra-matching) pairs, and this article will focus on that, while the second (and optional) part, that of pairing off, will be the subject of a later article.

To see why performance might be an important issue for this type of problem, consider the number of possible comparisons for an example with 10,000 headers each having 100 lines. In this case there would be 100,000,000 pairs of transactions, counting both ways, and 50,000,000 counting one way. A similar calculation gives 10,000 (not 5,000!) line comparisons per header-pair, and hence 500,000,000,000 line comparisons in total. The key of course is to minimise the number of comparisons made explicitly.

We will solve the problem through a single SQL query which will be developed through several versions, using a test example problem based on Oracle standard tables. The queries will be tested using my own SQL benchmarking framework, mentioned in earlier articles, and performance characteristics analysed. This will illustrate some performance aspects of the use of subquery factors and temporary tables, among other things.

Matching and Contra-Matching Sets

In the ERD above, each transaction falls into a logical Match Status Set, where the sets are of four distinct types:

  • Unmatched – a single set for transactions having no matching or contra-matching transaction
  • Matched – a set for each group of mutually matching transactions
  • Contra-Matched A -?a set for each group of transactions that all contra-match to a corresponding B-set
  • Contra-Matched B -?a set for each group of transactions that all contra-match to a corresponding A-set

We may define our problem without contra-matching fields, in which case only the first two types of set will be present; we may also have the case where only contra-matching is possible (likely the most common); and a special case may arise where both matching and contra-matching fields are present but where all contra-matching fields may have self-inverse values (for example amounts of zero) and those records having only self-inverse values might be best regarded as falling into one of the first two types of set.

The Sample Problem – Tables and Foreign Key Constraints

We will use two of the Oracle database system views as the basis for our sample problem. The master entity will be the Oracle table defined in the view all_tables, and the detail entity will be the foreign key constraint contained as a subentity in the view all_constraints. The views themselves are very complicated and it is better for our purposes to copy their records into new tables, and for performance testing we’ll copy them multiple times according to the value of a dimensional parameter, using the parameter as a suffix on the owner and table name fields. The sample problem will involve matching only, and tables are defined to match if they have the same set of foreign key references, where the references are defined by the referenced owners and constraint names. As tables without foreign keys all match trivially, we’ll filter these out in the queries.

The table and constraint entities can be represented by the following ERD:

The tables are, with * marking primary keys:
tab_cp

  • owner*
  • table_name*
  • description

con_cp

  • owner*
  • constraint_name*
  • table_name
  • constraint_type
  • r_owner
  • r_constraint_name
  • description

Indexes are defined on the two foreign keys on con_cp:
con_tab_fk_N1

  • owner
  • table_name

con_con_fk_N2

  • r_owner
  • r_constraint_name

The embedded Excel file below gives the solution for my 11g XE database, for the first problem, of identifying all matches.

Solution Methods
This problem might be considered to divide into two subproblems. The first is to identify all the matching pairs, while the second is to take those matching pairs and eliminate duplicate instances, so that each master record matches against at most one other record. This may reduce the number of master records that have matches; for example, if a matching set has three master records, then only two of them will be matched, against each other, in the final solution. We will consider the first subproblem in this article and the second in a later article.

To find the solution to the first subproblem in SQL, the obvious approach is simply to join the master table to itself to form the set of possible matching pairs, then to apply criteria to filter out any pairs that don’t match. Obviously, we can immediately apply a constraint to avoid selecting the same pair twice by requiring that the rowid of the first record be higher than that of the second. This will halve the number of pairs considered, reducing the initial set of pairs from n! to n!/2 (where ! denotes the mathematical factorial function), and also halving the number after applying any other conditions.

Matching Detail Sets with MINUS Operator
The master-level criteria may be easy enough to apply, using conditions in the join clause, but the detail criteria are more difficult because we have to match two sets of records for any given pair of master records. This leads us to think of Oracle’s set operators, specifically the MINUS operator that subtracts one set from another. Consider the matching pair on line 4028 of the Excel file above, with he solution for our example problem. This shows a match between the two tables OEHR_EMPLOYEES and OEHR_JOB_HISTORY in the TWODAYPLUS_0 schema, each of which has three foreign keys. The three constraints on each of these tables reference the same keys in the same schema, namely DEPT_ID_PK, JOB_ID_PK, EMP_EMP_ID_PK. The following query returns no records:

SELECT r_owner,
       r_constraint_name
  FROM con_cp
 WHERE constraint_type = 'R'
   AND table_name = 'OEHR_EMPLOYEES'
   AND owner = 'TWODAYPLUS_0'
 MINUS
SELECT r_owner,
       r_constraint_name
  FROM con_cp
 WHERE constraint_type = 'R'
   AND table_name = 'OEHR_JOB_HISTORY'
   AND owner = 'TWODAYPLUS_0'

Perhaps then the detail set matching could be effected by a NOT EXISTS clause on the above query with the hard-coded owner and table_name replaced by correlation columns from the main query? There are two problems with this arising from the way Oracle’s set operators work. First, if there were any extra foreign keys in the second table the query would still return no records, as it returns only records that are in the first query section and not in the second, thus showing a false match. Second, Oracle views a set in this context as being the set of distinct records, so if some records are duplicated in either table, but differently from the other one then again a false match is shown. These two tables also exist in Oracle’s HR demo schema, without the OEHR_ prefix. In order to show the problem I added an extra field in each table with a foreign key matching one already present, as follows:

  • EMPLOYEES.REL_EMP_ID -> EMP_EMP_ID_PK
  • JOB_HISTORY.REL_JOB_ID -> JOB_ID_PK

Now the query above with new schema and table names still returns no records although in our terms the detail record sets are different: EMPLOYEES has set (DEPT_ID_PK, JOB_ID_PK, EMP_EMP_ID_PK, EMP_EMP_ID_PK), while JOB_HISTORY has set (DEPT_ID_PK, JOB_ID_PK, JOB_ID_PK, EMP_EMP_ID_PK). The solution to this problem is of course that we need to group the detail records by the matching fields and add a count, as follows, using our copied schema HR_0:

SELECT r_owner,
       r_constraint_name,
       Count(*)
  FROM con_cp
 WHERE constraint_type = 'R'
   AND table_name = 'EMPLOYEES'
   AND owner = 'HR_0'
 GROUP BY r_owner,
       r_constraint_name
 MINUS
SELECT r_owner,
       r_constraint_name,
       Count(*)
  FROM con_cp
 WHERE constraint_type = 'R'
   AND table_name = 'JOB_HISTORY'
   AND owner = 'HR_0'
 GROUP BY r_owner,
       r_constraint_name

This returns two records:

R_OWNER  R_CONSTRAINT_NAME    COUNT(*)
=======  =================    ========
HR_0     EMP_EMP_ID_PK        2
HR_0     JOB_ID_PK            1

As for the first problem, this can be solved in two ways, either by repeating the NOT EXISTS clause with the two sections reversed, or by ensuring separately that the two record sets have the same numbers of records – if they don’t they can’t match, and if they do then the MINUS operator works. Obviously the first solution is going to double the work involved, while the second incurs a cost associated with the counting process but that’s offset by avoidance of the NOT EXISTS execution.

Matching Detail Sets with nested NOT EXISTS Operator
If we consider the MINUS query above before we added grouping, it seems likely that Oracle would evaluate the outer NOT EXISTS by obtaining both record sets, then applying the MINUS opersator, before checking that no records are returned. This would seem inefficient since the outer condition fails if any single record is in the first set but not in the second, so one would want to truncate processing on finding a first such record. This suggests an alternative that might be more effficient, that uses another NOT EXISTS nested within the outer one, which would apply to the following subquery:

SELECT 1
  FROM con_cp c1
 WHERE c1.constraint_type = 'R'
   AND c1.table_name = 'OEHR_EMPLOYEES'
   AND c1.owner = 'TWODAYPLUS_0'
   AND NOT EXISTS (
SELECT 1
  FROM con_cp c2
 WHERE c2.constraint_type = 'R'
   AND c2.table_name = 'OEHR_JOB_HISTORY'
   AND c2.owner = 'TWODAYPLUS_0'
   AND c2.r_owner = c1.r_owner
   AND c2.r_constraint_name = c1.r_constraint_name
)

Here we have not included the grouping solution because it is complicated within this structure, but if the detail table were replaced by either a subquery factor or a temporary table where the grouping were already done, then (as we’ll see) this would work just by adding in an equality condition on the count fields. Again, if we know that the record counts are the same the reverse clause is unnecessary.

Pre-Matching Detail Sets by Aggregates
We noted above that the detail sets can only match if they have the same numbers of records, and that this could be used to avoid doing the set matching twice in opposite orders. We also noted that the work done in counting would be offset by the avoidance of expensive set matching for those pairs that don’t have matching counts. In fact, we can extend this idea to all possible aggregates on the detail record set matching fields, and this will likely result in fewer set matchings in the overall query execution. In our simple test problem we will add minimum and maximum aggregates on the r_constraint_name field, giving the following join conditions, prior to the set matching clause, and where tab represents a subquery factor that computes the aggregates:

  FROM tab                      t1
  JOIN tab                      t2
    ON t2.n_det                 = t1.n_det
   AND t2.min_det               = t1.min_det
   AND t2.max_det               = t1.max_det
   AND t2.row_id                > t1.row_id

Subquery Factors and Temporary Tables
Owing to the importance of aggregation at table level, as explained in the last section above, all query variations considered will include a subquery factor, tab, that does this aggregation. However, we have also noted the need to group and count at the level of detail records, and as this grouped record set needs to be used twice, for each member of a potential matching master pair, it would also seem an obvious candidate for a subquery factor. When we try this though, we’ll see that the query structure now precludes the use of indexes within the detail matching subquery and so we’ll also implement a query that uses a temporary table where the grouping and counting is done in advance.

Query Variations
We will test five query variations, as shown below, where MI and NE denote, respectively, the MINUS and NOT EXISTS methods of detail set matching.

  • INL_MI – Detail grouping directly
  • SQF_NE – Detail grouping in subquery factor
  • GTT_NE – Detail grouping in temporary table
  • GTT_NE_X – As GRP_GTT_NE but table-level count aggregation only
  • GTT_MI – As GRP_GTT_NE but with MINUS
************
INL_MI
************
  WITH tab AS (
SELECT t.owner,
       t.table_name,
       t.ROWID                   row_id,
       Count(c.ROWID)            n_det,
       Min (c.r_constraint_name) min_det,
       Max (c.r_constraint_name) max_det
  FROM tab_cp                    t
  JOIN con_cp                    c
    ON c.owner                   = t.owner
   AND c.table_name              = t.table_name
   AND c.constraint_type         = 'R'
 GROUP BY
        t.owner,
        t.table_name,
        t.ROWID
)
SELECT
       t1.owner                  owner_1,
       t1.table_name             table_name_1,
       t2.owner                  owner_2,
       t2.table_name             table_name_2,
       t1.n_det                  n_det
  FROM tab                       t1
  JOIN tab                       t2
    ON t2.n_det                  = t1.n_det
   AND t2.min_det                = t1.min_det
   AND t2.max_det                = t1.max_det
   AND t2.row_id                 > t1.row_id
 WHERE NOT EXISTS (
SELECT c2.r_owner,
       c2.r_constraint_name,
       Count(*)
  FROM con_cp                    c2
 WHERE c2.owner                  = t2.owner
   AND c2.table_name             = t2.table_name
   AND c2.constraint_type        = 'R'
 GROUP BY c2.r_owner,
       c2.r_constraint_name
MINUS
SELECT c1.r_owner,
       c1.r_constraint_name,
       Count(*)
  FROM con_cp                    c1
 WHERE c1.owner                  = t1.owner
   AND c1.table_name             = t1.table_name
   AND c1.constraint_type        = 'R'
 GROUP BY c1.r_owner,
       c1.r_constraint_name
)
 ORDER BY t1.owner,
       t1.table_name,
       t2.owner,
       t2.table_name

************
SQF_NE
************
  WITH det AS (
SELECT owner,
       table_name,
       r_owner,
       r_constraint_name,
       Count(*)                  n_dup
  FROM con_cp
 WHERE constraint_type           = 'R'
 GROUP BY owner,
       table_name,
       r_owner,
       r_constraint_name
), tab AS (
SELECT t.owner,
       t.table_name,
       t.ROWID                   row_id,
       Sum (c.n_dup)             n_det,
       Min (c.r_constraint_name) min_det,
       Max (c.r_constraint_name) max_det
  FROM tab_cp                    t
  JOIN det                       c
    ON c.owner                   = t.owner
   AND c.table_name              = t.table_name
 GROUP BY
        t.owner,
        t.table_name,
        t.ROWID
)
SELECT
       t1.owner                  owner_1,
       t1.table_name             table_name_1,
       t2.owner                  owner_2,
       t2.table_name             table_name_2,
       t1.n_det                  n_det
  FROM tab                       t1
  JOIN tab                       t2
    ON t2.n_det                  = t1.n_det
   AND t2.min_det                = t1.min_det
   AND t2.max_det                = t1.max_det
   AND t2.row_id                 > t1.row_id
 WHERE NOT EXISTS (
SELECT 1
  FROM det                       d1
 WHERE d1.owner                  = t1.owner
   AND d1.table_name             = t1.table_name
   AND (
   NOT EXISTS (
SELECT 1
  FROM det                       d2
 WHERE d2.owner                  = t2.owner
   AND d2.table_name             = t2.table_name
   AND d2.r_owner                = d1.r_owner
   AND d2.r_constraint_name      = d1.r_constraint_name
   AND d2.n_dup                  = d1.n_dup
)))
 ORDER BY t1.owner,
       t1.table_name,
       t2.owner,
       t2.table_name

************
GTT_NE
************
  WITH tab AS (
SELECT t.owner,
       t.table_name,
       t.ROWID                   row_id,
       Sum (c.n_con)             n_det,
       Min (c.r_constraint_name) min_det,
       Max (c.r_constraint_name) max_det
  FROM tab_cp                    t
  JOIN grp_gtt                   c
    ON c.owner                   = t.owner
   AND c.table_name              = t.table_name
 GROUP BY
        t.owner,
        t.table_name,
        t.ROWID
)
SELECT
       t1.owner                  owner_1,
       t1.table_name             table_name_1,
       t2.owner                  owner_2,
       t2.table_name             table_name_2,
       t1.n_det                  n_det
  FROM tab                       t1
  JOIN tab                       t2
    ON t2.n_det                  = t1.n_det
   AND t2.min_det                = t1.min_det
   AND t2.max_det                = t1.max_det
   AND t2.row_id                 > t1.row_id
 WHERE NOT EXISTS (
SELECT 1
  FROM grp_gtt                   d1
 WHERE d1.owner                  = t1.owner
   AND d1.table_name             = t1.table_name
   AND (
   NOT EXISTS (
SELECT 1
  FROM grp_gtt                   d2
 WHERE d2.owner                  = t2.owner
   AND d2.table_name             = t2.table_name
   AND d2.r_owner                = d1.r_owner
   AND d2.r_constraint_name      = d1.r_constraint_name
   AND d2.n_con                  = d1.n_con
)))
 ORDER BY t1.owner,
       t1.table_name,
       t2.owner,
       t2.table_name

************
GTT_NE_X
************
  WITH tab AS (
SELECT t.owner,
       t.table_name,
       t.ROWID                   row_id,
       Sum (c.n_con)             n_det
  FROM tab_cp                    t
  JOIN grp_gtt                   c
    ON c.owner                   = t.owner
   AND c.table_name              = t.table_name
 GROUP BY
        t.owner,
        t.table_name,
        t.ROWID
)
SELECT
       t1.owner                  owner_1,
       t1.table_name             table_name_1,
       t2.owner                  owner_2,
       t2.table_name             table_name_2,
       t1.n_det                  n_det
  FROM tab                       t1
  JOIN tab                       t2
    ON t2.n_det                  = t1.n_det
   AND t2.row_id                 > t1.row_id
 WHERE NOT EXISTS (
SELECT 1
  FROM grp_gtt                   d1
 WHERE d1.owner                  = t1.owner
   AND d1.table_name             = t1.table_name
   AND (
   NOT EXISTS (
SELECT 1
  FROM grp_gtt                   d2
 WHERE d2.owner                  = t2.owner
   AND d2.table_name             = t2.table_name
   AND d2.r_owner                = d1.r_owner
   AND d2.r_constraint_name      = d1.r_constraint_name
   AND d2.n_con                  = d1.n_con
)))
 ORDER BY t1.owner,
       t1.table_name,
       t2.owner,
       t2.table_name

************
GTT_MI
************
  WITH tab AS (
SELECT t.owner,
       t.table_name,
       t.ROWID                   row_id,
       Sum (c.n_con) n_det,
       Min (c.r_constraint_name) min_det,
       Max (c.r_constraint_name) max_det
  FROM tab_cp                    t
  JOIN grp_gtt                   c
    ON c.owner                   = t.owner
   AND c.table_name              = t.table_name
 GROUP BY
        t.owner,
        t.table_name,
        t.ROWID
)
SELECT
       t1.owner                  owner_1,
       t1.table_name             table_name_1,
       t2.owner                  owner_2,
       t2.table_name             table_name_2,
       t1.n_det                  n_det
  FROM tab                       t1
  JOIN tab                       t2
    ON t2.n_det                  = t1.n_det
   AND t2.min_det                = t1.min_det
   AND t2.max_det                = t1.max_det
   AND t2.row_id                 > t1.row_id
 WHERE NOT EXISTS (
SELECT d2.r_owner,
       d2.r_constraint_name,
       d2.n_con
  FROM grp_gtt                   d2
 WHERE d2.owner                  = t2.owner
   AND d2.table_name             = t2.table_name
MINUS
SELECT d1.r_owner,
       d1.r_constraint_name,
       d1.n_con
  FROM grp_gtt                   d1
 WHERE d1.owner                  = t1.owner
   AND d1.table_name             = t1.table_name
)
 ORDER BY t1.owner,
       t1.table_name,
       t2.owner,
       t2.table_name

The query structure diagrams (QSDs) are in the embedded Excel file below:

Performance Analysis

We presented five query variations above, and in this section give the results of benchmarking these queries across a 1-dimensional data domain obtained by copying the system views 1, 2 and 4 times into my test tables described above. The problem sizes are as follows:
Record Counts

Timings and Statistics
Click on the query name in the file below to jump to the execution plan for the largest data point.

Comparison

The timings above are only for the main queries, so we need also to consider the time to populate and delete the temporary table, for the three GTT queries. This is performed as part of the data point setup, and the framework prints out timings for this part separately. For the last data point, the output was:

5144 records inserted in grp_gtt
8956 tables, 44780 constraints, 5 c/t

Timer Set: Setup, Constructed at 09 Oct 2012 22:28:49, written at 22:29:04
==========================================================================
[Timer timed: Elapsed (per call): 0.05 (0.000047), CPU (per call): 0.05 (0.000050), calls: 1000, '***' denotes corrected line below]

Timer                  Elapsed          CPU          Calls        Ela/Call        CPU/Call
-----------------   ----------   ----------   ------------   -------------   -------------
Delete test data          3.33         2.78              1         3.33200         2.78000
Delete GTT                0.18         0.17              1         0.18000         0.17000
Insert tab                0.55         0.44              1         0.54500         0.44000
Insert con                7.89         5.94              1         7.89000         5.94000
Insert grp_gtt            0.14         0.14              1         0.14000         0.14000
Count records             0.02         0.01              1         0.01600         0.01000
Gather statistics         2.59         2.06              1         2.58900         2.06000
(Other)                   0.00         0.00              1         0.00000         0.00000
-----------------   ----------   ----------   ------------   -------------   -------------
Total                    14.69        11.54              8         1.83650         1.44250
-----------------   ----------   ----------   ------------   -------------   -------------

The elapsed times for deleting from, then inserting into the temporary table are given by the ‘Delete GTT’ and ‘Insert grp_gtt’ timers and add up to 0.32s, so do not make much difference (about 5% on the best and less on the others). The following points can be made:

  • Doing the detail grouping and counting directly gives the worst performance
  • Moving the detail grouping and counting into a subquery factor improves performance by a factor of about 4
  • Moving the detail grouping into a temporary table improves performance the most
  • Using NOT EXISTS instead of MINUS for detail matching improves performance, as expected, by a factor of about 2
  • Using the minimum and maximum aggregates for pre-filtering, in addition to the counts, improves performance by a factor of about 20

Subquery Factors and Temporary Tables
If you look at the execution plan for INL_MI, most of the work is done in the HASH GROUP BY steps, 16 and 20, on totals of 127K records each. Moving this grouping into a subquery factor in SQF_NE means that the operation is done only once rather than many times (110K), and the execution plan for SUBQ_NE shows that it takes very little time (line 3).

However, the execution plan for SUBQ_NE shows (lines 14-22) that the factors are read using full scans, because indexes are not possible. This observation led to the improvement of moving the grouping out of the query altogether and into a separate stage that populates a temporary table, on which indexes can be defined. Lines 15-18 in the plan for GTT_NE show the access is now by index on the detail records.

Memory Usage in INL_MI Query with Grouping
Originally, I tried to have a larger range of data points, but doubling the size again always resulted in an Oracle error on INL_MI, ORA-04030: out of process memory when trying to allocate 123404 bytes (QERGH hash-agg,kllcqas:kllsltba). This is surprising because the execution plan statistics include memory statistics that appear to indicate that all queries use the same maximum amount of memory, which is just over 3MB, incurred in the SORT ORDER BY step (e.g. line 7 below).

My framework also collects statistics from the system view v$mystat, and prints out those showing large variations in ‘after minus before’ differences across the queries. The framework printed the statistic ‘session pga memory’ and this tells a different story (the values are in the embedded Excel files under Statistics above). The INL_MI query shows increases of 14MB, then 170MB, then 768MB approx. while the other queries all show no increases. It’s hard to understand what’s going on here, but one guess is that the query is revealing an Oracle bug that causes memory not to be released after use and then re-used, but for new executions of the relevant operation to request new memory, and that the execution plans are not reporting this. However, as discussed later, the variation in execution time with problem size is also difficult to understand and suggests that the HASH GROUP BY operations are really being performed on the entire record sets, which would also greatly increase the memory usage. The version, running under Windows 7 is: Oracle Database 11g Express Edition Release 11.2.0.2.0 – Beta

Execution Plan Hash Values
In my last article, I was able to easily identify the distinct plans by looking at the matrix of plan hash values, with values the same indicating the same plan. This time though, that doesn’t work: all hash values are different, but in fact, inspection of the plans shows that for each query there was essentially only one plan. I believe this may be due to the subquery factors, which result in a system-generated view name, which differs each time. For example, here are the last two plans for the INL_MI, where the only difference appears to be in the view names (SYS_TEMP_0FD9D6681_1B0CFB4 for W2 and SYS_TEMP_0FD9D6687_1B0CFB4 for W4) (note that different statistics don’t make the plans different):

Point W2:

Plan hash value: 89269728

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                       | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                            |      1 |        |   8054 |00:01:40.96 |     191K|     21 |     21 |       |       |          |
|   1 |  TEMP TABLE TRANSFORMATION        |                            |      1 |        |   8054 |00:01:40.96 |     191K|     21 |     21 |       |       |          |
|   2 |   LOAD AS SELECT                  |                            |      1 |        |      0 |00:00:00.05 |    3182 |      0 |     21 |   264K|   264K|  264K (0)|
|   3 |    HASH GROUP BY                  |                            |      1 |   2606 |   1628 |00:00:00.05 |    3158 |      0 |      0 |   766K|   766K| 1273K (0)|
|   4 |     NESTED LOOPS                  |                            |      1 |   2606 |   2606 |00:00:00.05 |    3158 |      0 |      0 |       |       |          |
|*  5 |      TABLE ACCESS FULL            | CON_CP                     |      1 |   2606 |   2606 |00:00:00.01 |     625 |      0 |      0 |       |       |          |
|*  6 |      INDEX UNIQUE SCAN            | TCP_PK                     |   2606 |      1 |   2606 |00:00:00.02 |    2533 |      0 |      0 |       |       |          |
|   7 |   SORT ORDER BY                   |                            |      1 |      1 |   8054 |00:01:40.89 |     188K|     21 |      0 |  1824K|   650K| 1621K (0)|
|*  8 |    FILTER                         |                            |      1 |        |   8054 |00:01:24.17 |     188K|     21 |      0 |       |       |          |
|*  9 |     HASH JOIN                     |                            |      1 |      1 |  27242 |00:00:00.15 |      47 |     21 |      0 |   720K|   720K| 1282K (0)|
|  10 |      VIEW                         |                            |      1 |   2606 |   1628 |00:00:00.01 |      25 |     21 |      0 |       |       |          |
|  11 |       TABLE ACCESS FULL           | SYS_TEMP_0FD9D6681_1B0CFB4 |      1 |   2606 |   1628 |00:00:00.01 |      25 |     21 |      0 |       |       |          |
|  12 |      VIEW                         |                            |      1 |   2606 |   1628 |00:00:00.01 |      22 |      0 |      0 |       |       |          |
|  13 |       TABLE ACCESS FULL           | SYS_TEMP_0FD9D6681_1B0CFB4 |      1 |   2606 |   1628 |00:00:00.01 |      22 |      0 |      0 |       |       |          |
|  14 |     MINUS                         |                            |  27242 |        |  19188 |00:01:40.05 |     188K|      0 |      0 |       |       |          |
|  15 |      SORT UNIQUE                  |                            |  27242 |      1 |  28722 |00:00:53.68 |   73872 |      0 |      0 |  2048 |  2048 | 2048  (0)|
|  16 |       HASH GROUP BY               |                            |  27242 |      1 |  31314 |00:00:45.36 |   73872 |      0 |      0 |   750K|   750K|  610K (0)|
|* 17 |        TABLE ACCESS BY INDEX ROWID| CON_CP                     |  27242 |      1 |  31335 |00:00:01.57 |   73872 |      0 |      0 |       |       |          |
|* 18 |         INDEX RANGE SCAN          | CON_TAB_FK_N1              |  27242 |      1 |    175K|00:00:01.26 |   42504 |      0 |      0 |       |       |          |
|  19 |      SORT UNIQUE                  |                            |  27242 |      1 |  30502 |00:00:45.94 |     114K|      0 |      0 |  2048 |  2048 | 2048  (0)|
|  20 |       HASH GROUP BY               |                            |  27242 |      1 |  31314 |00:00:37.86 |     114K|      0 |      0 |   750K|   750K|  910K (0)|
|* 21 |        TABLE ACCESS BY INDEX ROWID| CON_CP                     |  27242 |      1 |  31335 |00:00:01.64 |     114K|      0 |      0 |       |       |          |
|* 22 |         INDEX RANGE SCAN          | CON_TAB_FK_N1              |  27242 |      1 |    183K|00:00:01.29 |   83068 |      0 |      0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("C"."CONSTRAINT_TYPE"='R')
   6 - access("C"."OWNER"="T"."OWNER" AND "C"."TABLE_NAME"="T"."TABLE_NAME")
   8 - filter( IS NULL)
   9 - access("T2"."N_DET"="T1"."N_DET" AND "T2"."MIN_DET"="T1"."MIN_DET" AND "T2"."MAX_DET"="T1"."MAX_DET")
       filter("T2"."ROW_ID">"T1"."ROW_ID")
  17 - filter("C2"."CONSTRAINT_TYPE"='R')
  18 - access("C2"."OWNER"=:B1 AND "C2"."TABLE_NAME"=:B2)
  21 - filter("C1"."CONSTRAINT_TYPE"='R')
  22 - access("C1"."OWNER"=:B1 AND "C1"."TABLE_NAME"=:B2)

Point W4:

Plan hash value: 892071883

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                       | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                            |      1 |        |  16108 |00:25:33.98 |     788K|     42 |     42 |       |       |          |
|   1 |  TEMP TABLE TRANSFORMATION        |                            |      1 |        |  16108 |00:25:33.98 |     788K|     42 |     42 |       |       |          |
|   2 |   LOAD AS SELECT                  |                            |      1 |        |      0 |00:00:00.10 |    5802 |      0 |     42 |   521K|   521K|  521K (0)|
|   3 |    HASH GROUP BY                  |                            |      1 |   5007 |   3256 |00:00:00.09 |    5757 |      0 |      0 |  1001K|   943K| 1273K (0)|
|   4 |     NESTED LOOPS                  |                            |      1 |   5007 |   5212 |00:00:00.09 |    5757 |      0 |      0 |       |       |          |
|*  5 |      TABLE ACCESS FULL            | CON_CP                     |      1 |   4980 |   5212 |00:00:00.01 |     625 |      0 |      0 |       |       |          |
|*  6 |      INDEX UNIQUE SCAN            | TCP_PK                     |   5212 |      1 |   5212 |00:00:00.04 |    5132 |      0 |      0 |       |       |          |
|   7 |   SORT ORDER BY                   |                            |      1 |      1 |  16108 |00:25:33.84 |     782K|     42 |      0 |  3596K|   822K| 3196K (0)|
|*  8 |    FILTER                         |                            |      1 |        |  16108 |00:22:30.61 |     782K|     42 |      0 |       |       |          |
|*  9 |     HASH JOIN                     |                            |      1 |      1 |    110K|00:00:00.62 |      89 |     42 |      0 |   900K|   900K| 1328K (0)|
|  10 |      VIEW                         |                            |      1 |   5007 |   3256 |00:00:00.02 |      46 |     42 |      0 |       |       |          |
|  11 |       TABLE ACCESS FULL           | SYS_TEMP_0FD9D6687_1B0CFB4 |      1 |   5007 |   3256 |00:00:00.01 |      46 |     42 |      0 |       |       |          |
|  12 |      VIEW                         |                            |      1 |   5007 |   3256 |00:00:00.03 |      43 |      0 |      0 |       |       |          |
|  13 |       TABLE ACCESS FULL           | SYS_TEMP_0FD9D6687_1B0CFB4 |      1 |   5007 |   3256 |00:00:00.02 |      43 |      0 |      0 |       |       |          |
|  14 |     MINUS                         |                            |    110K|        |  94488 |00:25:29.91 |     782K|      0 |      0 |       |       |          |
|  15 |      SORT UNIQUE                  |                            |    110K|      1 |    113K|00:14:20.41 |     300K|      0 |      0 |  2048 |  2048 | 2048  (0)|
|  16 |       HASH GROUP BY               |                            |    110K|      1 |    127K|00:11:47.70 |     300K|      0 |      0 |   789K|   789K|  527K (0)|
|* 17 |        TABLE ACCESS BY INDEX ROWID| CON_CP                     |    110K|      1 |    127K|00:00:08.15 |     300K|      0 |      0 |       |       |          |
|* 18 |         INDEX RANGE SCAN          | CON_TAB_FK_N1              |    110K|      1 |    722K|00:00:06.55 |     156K|      0 |      0 |       |       |          |
|  19 |      SORT UNIQUE                  |                            |    110K|      1 |    123K|00:11:07.57 |     481K|      0 |      0 |  2048 |  2048 | 2048  (0)|
|  20 |       HASH GROUP BY               |                            |    110K|      1 |    127K|00:09:52.37 |     481K|      0 |      0 |   789K|   789K|  907K (0)|
|* 21 |        TABLE ACCESS BY INDEX ROWID| CON_CP                     |    110K|      1 |    127K|00:00:08.37 |     481K|      0 |      0 |       |       |          |
|* 22 |         INDEX RANGE SCAN          | CON_TAB_FK_N1              |    110K|      1 |    735K|00:00:06.31 |     337K|      0 |      0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - filter("C"."CONSTRAINT_TYPE"='R')
   6 - access("C"."OWNER"="T"."OWNER" AND "C"."TABLE_NAME"="T"."TABLE_NAME")
   8 - filter( IS NULL)
   9 - access("T2"."N_DET"="T1"."N_DET" AND "T2"."MIN_DET"="T1"."MIN_DET" AND "T2"."MAX_DET"="T1"."MAX_DET")
       filter("T2"."ROW_ID">"T1"."ROW_ID")
  17 - filter("C2"."CONSTRAINT_TYPE"='R')
  18 - access("C2"."OWNER"=:B1 AND "C2"."TABLE_NAME"=:B2)
  21 - filter("C1"."CONSTRAINT_TYPE"='R')
  22 - access("C1"."OWNER"=:B1 AND "C1"."TABLE_NAME"=:B2)

Performance Variation Polynomials
The timings above show that CPU and elapsed times increased by different powers of the problem size increases, according to query.

The inline grouping query INL_MI shows a variation close to the fourth power, which like its memory usage, is very hard to understand. Most of the time is used in the HASH GROUP BY operations at lines 16 and 20, and it rises about 16 times betwen W2 and W4. The numbers of starts rise by 4 times, as expected, but the number of rows per start remains constant at about 1.15, so the work done should rise by about 4 times. It’s almost as though the SQL engine is really processing the entire record set in the HASH GROUP BY, rather than just the subset for the correlated tables, contrary to what the plan says. Again, this looks buggy.

The double subquery factor query SUBQ_NE has about a cubic variation, which is plausible because the table pairing introduces a quadratic term, with the third power coming from the full scans of the detail subquery factor.

All three of the temporary table queries show quadratic variation, which is likely the best obtainable while matching sets directly (but see my next article Holographic Set Matching in SQL for linear solutions bypassing set matching), and arises from the table pairing, but with the details for each pair being constant in size and accessed via indexes. It’s worth noting that the query GTT_NE_X is actually slower than INL_MI and SUB_NE on the smallest data point, but much quicker on the largest, showing the importance of polynomial order for scalability.

Conclusions

  • We have shown how to solve master-detail transaction matching problems efficiently, using an example problem, but emphasising the generality of the techniques
  • Appropriate use of subquery factors and temporary tables have been demonstrated, with performance analysis
  • It’s worth highlighting the technique of pre-filtering on aggregates before comparing sets in detail
  • The importance for scalability of performance variation powers has been illustrated, being revealed by dimensional benchmarking
  • On finishing this article it occurred to me to wonder whether it might not be possible to use aggregate matching to replace detail set matching altogether, and at least in some cases it is, with linear performance resulting, described in my next article, Holographic Set Matching in SQL (MDTM2)






List Aggregation in Oracle – Comparing Three Methods

In my last article, Grouping by Unique Subsequences in Oracle, I compared three solutions to a querying problem in Oracle. I found that a solution using a pipelined function was fastest across a range of test data sets, while another using Oracle’s Model clause turned out to be extremely inefficient, and very unscaleable owing to quadratic variation in execution times.

I mentioned in the article the issue of possible poor cardinality estimates by Oracle’s Cost-Based Optimiser (CBO) for pipelined functions, referencing an article by Adrian Billington, setting cardinality for pipelined and table functions, that considers four techniques for improving these estimates.

In this article, I want to use another, very common, querying problem, to see in more detail how one of these techniques works, namely the dynamic sampling hint, and to compare the performance again of pipelined functions against the Model clause on a second example amenable to both methods.

In previous articles I have generally focussed on elapsed and CPU times to measure performance, but Oracle provides a range of metrics in instrumenting query execution. My benchmarking framework captures many of these, and we’ll use them to try to understand the performance variation, again using a 2-dimensional domain of test data. We also capture the execution plans used across the domain and display visually the changes across the domain.

The problem is that of list aggregation, and various solution methods are available depending on one’s Oracle version. In Oracle v11.2 a specific built-in function has been provided, Listagg, and Adrian Billington has compared it with earlier SQL techniques (listagg function in 11g release 2), including a Model solution. He looks only at pure SQL solutions, but we will take both the Model and Listagg solutions, and add in a pipelined function solution. We’ll take a simple test problem using Oracle’s demo HR schema, which will be: Return, for each department, its id, manager name, and a comma-separated, ordered list of its employee names.

Test Data
The HR tables employees and departments were copied structurally to test versions with _t suffixes and records were inserted programmatically by the performance testing packages.

ERD

Listagg Solution
How It Works
The first solution for this problem uses the aggregation function ListAgg, which is new in Oracle v11.2. The query groups employees by department, joins departments to get the name, and employees again to get the manager’s name.

Query Diagram

SQL

SELECT e.department_id,
       m.last_name manager,
       ListAgg (e.last_name, ',') WITHIN GROUP (ORDER BY e.last_name) emp_names
  FROM employees_t e
  JOIN departments_t d
    ON d.department_id = e.department_id
  JOIN employees_t m
    ON m.employee_id = d.manager_id
 GROUP BY 
       e.department_id,
       m.last_name
 ORDER BY e.department_id

Model Solution
How It Works

  1. Within an inline view, form the basic Select, with the department_id column, and append placeholders for the employee list and row number
  2. Add the Model keyword, partitioning by department_id, dimensioning by analytic function Row_Number, ordering by name within department, with name and name list as measures
  3. Define the only rule to prepend the list from the next element with the current name, going backwards (so the first record will be the one you want)
  4. Join the other tables to the inline view in the main query, strip off the last ‘,’, and filter out all except the first record for the department

Query Diagram

SQL

SELECT v.department_id, 
       e.last_name manager,
       RTrim (v.emp_names, ',') emp_names  
  FROM (
SELECT department_id,
       emp_names,
       rn
  FROM employees_t 
    MODEL  
      PARTITION BY (department_id)  
      DIMENSION BY (Row_Number() OVER 
                       (PARTITION BY department_id ORDER BY last_name) rn)
      MEASURES (last_name, CAST(NULL AS VARCHAR2(4000)) emp_names) 
      RULES (
        emp_names[ANY] ORDER BY rn DESC = last_name[CV()] || ',' || emp_names[CV()+1] 
      ) 
) v
  JOIN departments_t d
    ON d.department_id = v.department_id
  JOIN employees_t e
    ON e.employee_id = d.manager_id
 WHERE v.rn = 1 
 ORDER BY v.department_id

Pipelined Function Solution
How It Works
This approach is based on pipelined database functions, which are specified to return array types. Pipelining means that Oracle transparently returns the records in batches while processing continues, thus avoiding memory problems, and returning initial rows more quickly. Within the function there is a simple cursor loop over the employees, joining departments to get the manager id. A string variable accumulates the list of employees, until the department changes, when the record is piped out, and the string reset to the new employee. The last record has to be piped after exiting the loop.

Types
Two database types are specified, the first being an object with fields for the department and manager ids and the employee name list; the second is an array of the nested table form with elements of the first type.

Function Pseudocode

Loop over a cursor selecting the records in order 
    If the department changes or first record then
        If the department changes then
            Pipe the row out
        End if
        Reset variables to current record values
    Else
        Append the current employee name to the name list
    End if
End loop
If the last department is not null then
    Pipe the row out using saved values
End if

SQL
Just select the fields named in the record from the function wrapped in the TABLE keyword, and join employees to get the manager name.

Query Diagram

Function Definition (within package)

FUNCTION Dep_Emps RETURN dep_emps_list_type PIPELINED IS

  l_emp_names	        VARCHAR2(4000);
  old_manager_id        PLS_INTEGER;
  old_department_id     PLS_INTEGER;

BEGIN

  FOR r_val IN (SELECT e.department_id, d.manager_id, e.last_name 
                  FROM employees_t e
                  JOIN departments_t d
                    ON d.department_id = e.department_id
                 ORDER BY e.department_id, e.last_name) LOOP

    IF r_val.department_id != old_department_id OR old_department_id IS NULL THEN

      IF r_val.department_id != old_department_id THEN

        PIPE ROW (dep_emps_type (r_val.department_id, r_val.manager_id, l_emp_names));

      END IF;
      old_department_id := r_val.department_id;
      old_manager_id := r_val.manager_id;
      l_emp_names := r_val.last_name;

    ELSE

      l_emp_names := l_emp_names || ',' || r_val.last_name;

    END IF;

  END LOOP;

  IF old_department_id IS NOT NULL THEN
    PIPE ROW (dep_emps_type (old_department_id, old_manager_id, l_emp_names));
  END IF;

END Dep_Emps;

SQL

SELECT d.department_id,
       e.last_name manager,
       d.emp_names
  FROM TABLE (Stragg.Dep_Emps) d
  JOIN employees_t e
    ON e.employee_id = d.manager_id
 ORDER BY d.department_id

Performance Analysis
As in the previous article, I have benchmarked across a 2-dimensional domain, in this case width being the number of employees per department, and depth the number of departments. For simplicity, a single department, ‘Accounting’, and employee, ‘John Chen’, were used as templates and inserted repeatedly with suffixes on names and new ids.

The three queries above were run on all data points, and in addition the function query was run with a hint: DYNAMIC_SAMPLING (d 5).

Record Counts (total employees and employees per department)

Timings

In the embedded Excel file above, the four solutions are labelled as follows:

  • F = Pipelined Function solution
  • D = Pipelined Function with Dynamic Sampling hint solution
  • L = Listagg solution
  • M = Model solution

For the data points, font colour and fill colour signify:

  • Fill colours correspond to distinct execution plans whose hash values can be found later in the same tab. The formatted outputs can be found in the Plans tab for all distinct plans for selected data points (with hyperlinks)
  • White font signifies that the smallest elapsed time for the given data point occurred for the given solution, for all the tables except CPU time
  • For the CPU time table higher in the tab, white font signifies that the smallest CPU time for the given data point occurred for the given solution
  • Red font indicates that the solution incurred more than 500 disk reads (see the disk reads table later in the tab)

Graphs

Comparison
The CPU time for all four queries increases approximately in proportion with either width or depth dimension when the other is fixed, which is not surprising. The elapsed times are very similar to the CPU times for all except the larger problems using Model, which we’ll discuss in a later section. For the largest data point, the times rank in the following order:

The following points can be made:

  • The function query without the dynamic sampling hint was fastest for the largest data point, and by a significant margin over the next best, Listagg
  • The earlier detailed tables show that this was also true for the triangle of data points starting three points back in each dimension, indicating that this is the case once the problem size gets large enough
  • Similarly, the function query with dynamic sampling was in third place for all these larger problems
  • Model was always the slowest query, generally by a factor of about 8 over the fastest in terms of CPU time, but very much worse in elapsed time for problems above a certain size, and we’ll look at this in more detail next.

Model Performance Discontinuity
In Adrian Billington’s article mentioned above (listagg function in 11g release 2), he took a single large-ish data point for his problem and found that his Model query took 308 seconds compared with 6 seconds for Listagg. He puts the Model performance down to ‘an enormous number of direct path reads/writes to/from the temporary tablespace‘. In my last article, I also quoted the anonymous author of MODEL Performance Tuning: ‘In some cases MODEL query performance can even be so poor that it renders the query unusable. One of the keys to writing efficient and scalable MODEL queries seems to be keeping session memory use to a minimum’.

My benchmarking framework records the metrics from the view v$sql_plan_stats_all, which is used by DBMS_XPlan.Display_Cursor to write the execution plan, and prints to a CSV file aggregates over the plan of several of them, for example: Max (last_disk_reads). These are are shown for the Model solution in the tab displayed below of the embedded Excel file (the next tab has 3-d graphs but they may not display in a browser).

The tables show that memory increases with the problem size in each direction up to a maximum, at which points the number of disk reads jumps from a very low level and then rises with problem size. The maximum memory points have red font. These transitions represent discontinuities where there is a jump in the elapsed times, although CPU times continue to rise smoothly. So while Model is always slower than the other solutions, its much greater use of memory causes the discrepancy to increase dramatically when the processing spills to disk, which is consistent with, and extends, the observations of the authors mentioned.

Model and Listagg Cardinality Estimates
Here is the execution plan for the last data point:

----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem | Used-Tmp|
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |               |      1 |        |   1024 |00:01:25.46 |    5798 |    164K|    164K|       |       |          |         |
|   1 |  SORT ORDER BY          |               |      1 |    260K|   1024 |00:01:25.46 |    5798 |    164K|    164K|  2320K|   704K| 2062K (0)|         |
|*  2 |   HASH JOIN             |               |      1 |    260K|   1024 |00:01:20.86 |    5798 |    164K|    164K|   909K|   909K| 1230K (0)|         |
|*  3 |    HASH JOIN            |               |      1 |   1024 |   1024 |00:00:00.11 |    2901 |      0 |      0 |   935K|   935K| 1228K (0)|         |
|   4 |     TABLE ACCESS FULL   | DEPARTMENTS_T |      1 |   1024 |   1024 |00:00:00.01 |       6 |      0 |      0 |       |       |          |         |
|   5 |     TABLE ACCESS FULL   | EMPLOYEES_T   |      1 |    260K|    262K|00:00:00.40 |    2895 |      0 |      0 |       |       |          |         |
|*  6 |    VIEW                 |               |      1 |    260K|   1024 |00:01:20.68 |    2897 |    164K|    164K|       |       |          |         |
|   7 |     BUFFER SORT         |               |      1 |    260K|    262K|00:01:19.56 |    2897 |    164K|    164K|   297M|  5726K|   45M (0)|     265K|
|   8 |      SQL MODEL ORDERED  |               |      1 |    260K|    262K|01:28:30.86 |    2895 |    131K|    131K|  1044M|    28M|   51M (1)|         |
|   9 |       WINDOW SORT       |               |      1 |    260K|    262K|00:00:01.01 |    2895 |      0 |      0 |  7140K|  1067K| 6346K (0)|         |
|  10 |        TABLE ACCESS FULL| EMPLOYEES_T   |      1 |    260K|    262K|00:00:00.50 |    2895 |      0 |      0 |       |       |          |         |
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="V"."DEPARTMENT_ID")
   3 - access("E"."EMPLOYEE_ID"="D"."MANAGER_ID")
   6 - filter("V"."RN"=1)

Notice that at line 6 the cardinality estimate is 260K while the actual rows returned was 1024: The CBO has simply ignored the filtering down to department level by row number, and assumed the number of employees! We may ask whether this mis-estimate has affected the subsequent plan. Well the result set at line 6 makes the second step in a hash join to another row set of actual cardinality 1024, so probably it has not made a significant difference in this case. It’s worth comparing the execution plan for Listagg:

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |               |      1 |        |   1024 |00:00:02.11 |    5796 |       |       |          |
|   1 |  SORT GROUP BY       |               |      1 |    185K|   1024 |00:00:02.11 |    5796 |    15M|  1999K|   14M (0)|
|*  2 |   HASH JOIN          |               |      1 |    260K|    262K|00:00:02.17 |    5796 |   921K|   921K| 1195K (0)|
|*  3 |    HASH JOIN         |               |      1 |   1024 |   1024 |00:00:00.11 |    2901 |   935K|   935K| 1229K (0)|
|   4 |     TABLE ACCESS FULL| DEPARTMENTS_T |      1 |   1024 |   1024 |00:00:00.01 |       6 |       |       |          |
|   5 |     TABLE ACCESS FULL| EMPLOYEES_T   |      1 |    260K|    262K|00:00:00.41 |    2895 |       |       |          |
|   6 |    TABLE ACCESS FULL | EMPLOYEES_T   |      1 |    260K|    262K|00:00:00.46 |    2895 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("D"."DEPARTMENT_ID"="E"."DEPARTMENT_ID")
   3 - access("M"."EMPLOYEE_ID"="D"."MANAGER_ID")

Notice that the cardinality estimates are accurate apart from at line 1 for the Sort Group By, where a similar error has been made in not allowing for the reduction in rows caused by the grouping. Again it doesn’t seem to have affected the plan adversely.

Memory Usage and Buffers
A few observations can be made about these statistics:

  • Buffers is sometimes seen as a good, fundamental measure of performance, but we can see that both function solutions have figures of about 9K, while the other two have very similar figures of about 6K, and these do not correlate well with actual time performance here
  • The buffers figure for Model at the minimum data point is more than half that for the maximum, unlike the other solutions where it is 1-3%. The ratio of records is only 0.2%, so this is hard to understand
  • Similarly, the memory usage for Model starts very high, 3.1M, before rising to its maximum of 54M. For pipelined functions the figures go from 6K to 2.8M, and for Listagg from 29K to 15M

Dynamic Sampling Effects
Here is the execution plan for the pipelined function solution at point (W256-D512), with my own timing output from ‘Timer Set…’ on:

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |             |      1 |        |    512 |00:00:00.65 |    6097 |       |       |          |
|   1 |  SORT ORDER BY                      |             |      1 |   8168 |    512 |00:00:00.65 |    6097 |  1186K|   567K| 1054K (0)|
|*  2 |   HASH JOIN                         |             |      1 |   8168 |    512 |00:00:00.62 |    6097 |  1520K|   901K| 1706K (0)|
|   3 |    COLLECTION ITERATOR PICKLER FETCH| DEP_EMPS    |      1 |   8168 |    512 |00:00:00.54 |    3202 |       |       |          |
|   4 |    TABLE ACCESS FULL                | EMPLOYEES_T |      1 |    130K|    131K|00:00:00.20 |    2895 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("E"."EMPLOYEE_ID"=VALUE(KOKBF$))

Timer Set: Cursor, Constructed at 23 Sep 2012 07:49:43, written at 07:49:45
===========================================================================
[Timer timed: Elapsed (per call): 0.05 (0.000045), CPU (per call): 0.04 (0.000040), calls: 1000, '***' denotes corrected line below]

Timer                  Elapsed          CPU          Calls        Ela/Call        CPU/Call
-----------------   ----------   ----------   ------------   -------------   -------------
Open cursor               0.01         0.00              1         0.01000         0.00000
First fetch               0.65         0.63              1         0.64700         0.63000
Write to file             0.06         0.06              2         0.03050         0.03000
Remaining fetches         0.00         0.00              1         0.00000         0.00000
Write plan                0.64         0.61              1         0.64200         0.61000
(Other)                   0.11         0.05              1         0.11400         0.05000
-----------------   ----------   ----------   ------------   -------------   -------------
Total                     1.47         1.35              7         0.21057         0.19286
-----------------   ----------   ----------   ------------   -------------   -------------

Here is the output for the pipelined function solution with dynamic sampling at the same point (W256-D512):

-----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |             |      1 |        |    512 |00:00:00.59 |    4228 |       |       |          |
|   1 |  SORT ORDER BY                       |             |      1 |    512 |    512 |00:00:00.59 |    4228 |  1186K|   567K| 1054K (0)|
|   2 |   NESTED LOOPS                       |             |      1 |        |    512 |00:00:00.59 |    4228 |       |       |          |
|   3 |    NESTED LOOPS                      |             |      1 |    512 |    512 |00:00:00.58 |    3716 |       |       |          |
|   4 |     COLLECTION ITERATOR PICKLER FETCH| DEP_EMPS    |      1 |    512 |    512 |00:00:00.56 |    3202 |       |       |          |
|*  5 |     INDEX UNIQUE SCAN                | EMP_PK      |    512 |      1 |    512 |00:00:00.01 |     514 |       |       |          |
|   6 |    TABLE ACCESS BY INDEX ROWID       | EMPLOYEES_T |    512 |      1 |    512 |00:00:00.01 |     512 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("E"."EMPLOYEE_ID"=VALUE(KOKBF$))

Note
-----
   - dynamic sampling used for this statement (level=2)

Timer Set: Cursor, Constructed at 23 Sep 2012 07:49:45, written at 07:49:47
===========================================================================
[Timer timed: Elapsed (per call): 0.05 (0.000045), CPU (per call): 0.05 (0.000050), calls: 1000, '***' denotes corrected line below]

Timer                  Elapsed          CPU          Calls        Ela/Call        CPU/Call
-----------------   ----------   ----------   ------------   -------------   -------------
Open cursor               0.55         0.55              1         0.54500         0.55000
First fetch               0.59         0.57              1         0.59300         0.57000
Write to file             0.06         0.06              2         0.03100         0.03000
Remaining fetches         0.00         0.00              1         0.00000         0.00000
Write plan                0.59         0.58              1         0.59300         0.58000
(Other)                   0.06         0.03              1         0.05800         0.03000
-----------------   ----------   ----------   ------------   -------------   -------------
Total                     1.85         1.79              7         0.26443         0.25571
-----------------   ----------   ----------   ------------   -------------   -------------

Notice that in the plan without dynamic sampling the cardinality estimate for the row set returned from the function is 8168, nearly 8 times the actual rows. When dynamic sampling is added the cardinality estimate is exactly right. Also the times reported in the plan are similar but show dynamic sampling to be faster. How can that be, when the table of results earlier showed the query with dynamic sampling taking nearly twice as long?

My framework breaks down the times for the steps in running the query and writing the results out. From these we see that the difference is largely accounted for by the function query taking only 0.01 seconds to open the cursor while with dynamic sampling it takes .55 seconds. Evidently the dynamic sampling hint is causing a call to the function at the query parsing stage which is not accounted for in the execution plan statistics. (Notice incidentally that Oracle is apparaently performing a deferred open in both cases, where the actual cursor opening is performed at the time of first fetching.)

If one looks at the colour-coded table of elapsed times above, it can be seen that dynamic sampling causes a single plan to be chosen for all data points with depth below 1024, while without the hint two plans are chosen that with dynamic sampling are chosen only at depth 1024. It can also be seen that the dynamic sampling version is faster overall in most cases, but at the highest depth is slower because the non-hinted plan is the same. The default cardinality estimate (8168) was too high until that point.

Conclusions
We have applied three techniques for list aggregation to one specific problem, and for that problem the following concluding remarks can be made:

  • The Model solution is much inferior in performance to both the new Listagg native function, and custom pipelined function solutions
  • The variation in performance has been analysed and for the model solution shown to become dramatically worse when problem size causes earlier in-memory processing to spill to disk
  • The dynamic sampling hint has been applied to the pipelined function solution and shown to give correct cardinality estimates. In our example, the performance effect was negative for the largest problem sizes owing to the overhead involved, but in other cases it was positive
  • We have oberved that both Model and Listagg solutions also have cardinality estimation problems, but have not analysed these further
  • The native Listagg solution is significantly, and surprisingly, slower than the custom pipelined function solution at the largest data points considered

We may say more generally:

  • Performance analysis across a 2-dimensional domain of data sets can provide more insight than just looking at one large zero-dimensional case
  • Microsoft Excel graphs and other functionality have proved very useful in helping to visualise what is happening in terms of performance
  • Our findings tend to bear out a common suspicion of Model clause on performance grounds, and further support the view that pipelined functions can be very performance-effective, used appropriately






Grouping by Unique Subsequences in SQL

Recently a deceptively simple question was asked on Oracle’s OTN PL/SQL forum that most posters initially misunderstood, myself included (Grouping Non Consecutive Rows). The question requested a solution for the problem of assigning group numbers to a sequence of records such that each group held no duplicate values of a particular field, and each group was as long as possible. One of the posters produced a solution involving the Model clause and I tried to find a solution without using that clause, just to see if it could be done. Having found such a solution, not without some difficulty, I then solved the problem via a pipelined PL/SQL function having felt that that would be more efficient than straight SQL.

I think all three solutions have their own interest, especially since forum posters frequently assert, incorrectly, that SQL solutions are always faster than those using PL/SQL. I’ll explain how each solution works in this article and will run them through my benchmarking framework. We’ll see that the PL/SQL solution time increases linearly with sequence length, while the others increase quadratically, and PL/SQL is orders of magnitude faster for even small problem sizes. I used Oracle Database v11.2 XE running on a Samsung X120 dedicated PC.

I use my Query Structure Diagramming technique to illustrate the SQL solutions.

Functional Test Data
The problem data structure is quite simple, and I have defined my own version to emphasise generality, and to include partitioning, which was not part of the initial posting. There is one table with a two-column primary key, and I added an index on the partition field plus the value field.

Table Value_Subseqs

Column Type
part_key* Char(10)
ind* Number
val Char(10)

Index VSB_N1

  • part_key
  • val

Test Cases
Functional testing was done on a number of simple sequences, including the sequence of single characters, ABABB, which is used to illustrate the first solution query.

SQL Pairs Solution

How It Works
The first solution for this problem starts by obtaining the possible start and end points of the groups: all pairs that contain no duplicate. In the table below, the rows represent the possible pairs with X marking each value in the ranges. It’s easy to see the solution for this small test problem and the correct pairs are highlighted in yellow.

From the table it’s possible to work out a solution in SQL, given the pair set. The first correct pair is the longest starting from 1, while each subsequent correct pair is the longest beginning one element after the previous ends. This is of course a kind of tree-walk, which leads to the following solution steps (where everything is implicitly partitioned):

  1. Within a subquery factor, self-join the table with the second record having a higher ind than the first
  2. Add a NOT EXISTS subquery that checks for duplicate val in any pair between the outer potantial start and end point pair, using another self-join
  3. Add in the single element pairs via a UNION
  4. Within another subquery factor, obtain the rank of each pair in terms of its length descending, as well as the starting ind value (if it might not always be 1)
  5. In another subquery factor perform the tree-walk as mentioned, using the condition ‘rank = 1’ to include only the longest ranges from any point
  6. Include the the analytic function Row_Number in the Select list, which will be the group number
  7. The main query selects from the last subquery factor, then joins the original table

Notes
The query is of course likely to be performance-intensive owing to the initial subquery factor with its nested self-joins across ranges.

Query Diagram

Notes
The diagram notation follows and extends notation developed earlier, including my previous blog article, and is intended to be largely self-explanatory. In this diagram, I have added a new notation for joins that are not simple foreign key joins, in which I label the join and use a note to explain it.
Oracle v11.2 introduced Recursive Subquery Factors that extend tree-walk functionality. I used the older Connect By syntax in the query since it works on older versions, but found it easier to represent in the diagram as though it were the newer implementation – the diagram shows what’s happening logically.

SQL

WITH sqf AS (
SELECT p1.part_key, p1.ind ind_1, p2.ind ind_2
  FROM value_subseqs p1
  JOIN value_subseqs p2
    ON p2.ind           > p1.ind
   AND p2.part_key      = p1.part_key
   AND NOT EXISTS (SELECT 1
                     FROM value_subseqs p3
                     JOIN value_subseqs p4
                       ON p4.val        = p3.val
                      AND p4.part_key   = p3.part_key
                    WHERE p3.ind        BETWEEN p1.ind AND p2.ind
                      AND p4.ind        BETWEEN p3.ind + 1 AND p2.ind
                      AND p3.part_key   = p1.part_key
                      AND p4.part_key   = p2.part_key)
 UNION ALL
SELECT part_key, ind, ind
  FROM value_subseqs p1
), rnk AS (
SELECT part_key, ind_1, ind_2,
        Row_Number() OVER (PARTITION BY part_key, ind_1 ORDER BY ind_2 - ind_1 DESC) rn,
        Min(ind_1) OVER (PARTITION BY part_key) ind_1_beg
  FROM sqf
), grp AS (
SELECT part_key, ind_1, ind_2, Row_Number() OVER
       (PARTITION BY part_key ORDER BY ind_1, ind_2) grp_no
  FROM rnk
CONNECT BY ind_1 = PRIOR ind_2 + 1
   AND part_key = PRIOR part_key
   AND rn = 1
 START WITH ind_1 = ind_1_beg AND rn = 1
)
SELECT
    p.part_key,
    p.ind,
    p.val,
    g.grp_no
  FROM grp g
  JOIN value_subseqs p
    ON p.ind            BETWEEN g.ind_1 AND g.ind_2
   AND p.part_key       = g.part_key
 ORDER BY p.part_key, p.ind

Model Solution

How It Works
In the OTN thread several solutions were proposed that used Oracle’s Model clause or recursive subquery factoring, but I think only one was a general solution, since the others used string variables to successively concatenate strings over arbitrary numbers of rows and would break when the 4000 character SQL limit is hit.
The general Model solution (which was not by me, but I’ve reformatted it and applied it to my table) worked by defining two measures, for group start indices and group number. The rules specified two passes for the two measures: The first pass counts the distinct values and compares with the count of all values, between the previous group start and the current index; the second uses the group starts to set the group numbers.

  1. Form the basic Select, with all the table columns required, and append a group number placeholder 
  2. Add the Model keyword, partitioning by part_key, dimensioning by analytic function Row_Number, ordering by ind within part_key, with val, group start and group number as measures
  3. Initialise group start and group number to 1 in the measures clause
  4. Define the first rule to obtain the group start date for all rows after the first as the previous group start, unless there is a difference between the two counts, in which case take the new index.
  5. Define the second rule to obtain the group number for all rows as the previous group number, unless the group start has changed, in which case take the previous group number + 1.

Query Diagram

Notes
Queries with the Model clause have a structure that is rather different from other queries, and the diagram attempts to reflect that structure for these problems. The main query feeds its output into an array processing component with a set of rules that specify how any additional data items (called measures) are to be calculated, in a mostly declarative fashion.
The model box above contains 4 specification types:

  • Partition – processing is to be performed separately by one or more columns; the same meaning as in analytic functions
  • Dimension – columns by which the array is dimensioned; can included analytic functions, as here
  • Measures – remaining columns that may be calculated or updated by the rules, possibly including placeholders from the main query
  • Rules – a set of rules that specify measure calculation; rules are processed sequentially, unless otherwise specified; in the diagram:
    • n – the current dimension value
    • F(n-1,n) – denotes that the value depends on values from previous and current rows (and so on, ‘..’ denotes a range)
    • ^ – denotes that the calculation progresses in ascending order by dimension; this is the default so does not have to be coded

SQL

SELECT
    part_key,
    rn,
    val,
    g grp_no
  FROM value_subseqs
 MODEL
   PARTITION BY (part_key)
   DIMENSION BY (Row_number() OVER (PARTITION BY part_key ORDER BY ind) rn)
   MEASURES (val, 1 g, 1 s)
   RULES (
     s[rn > 1] = CASE COUNT (DISTINCT val)[rn BETWEEN s[CV() - 1] AND CV()]
                   WHEN COUNT (val)[rn BETWEEN s[CV() - 1] AND cv()] THEN s[cv() - 1]
                   ELSE CV(rn)
                 END,
     g[rn > 1] = CASE s[CV()]
                    WHEN s[CV() - 1] THEN g[CV() - 1]
                    ELSE g[CV() - 1] + 1
                 END
   )
 ORDER BY part_key, rn

Pipelined Function Hash Solution

How It Works
This approach is based on pipelined database functions, which are specified to return array types. Pipelining means that Oracle transparently returns the records in batches while processing continues, thus avoiding memory problems, and returning initial rows more quickly. See Pipelined Functions (AskTom) for some other examples of use.
Within the function there is a simple cursor loop over the ordered sequence. A PL/SQL index-by array stores values for the current group, and allows duplicate checking to take place without any additional searching or sorting. The array is reset whenever the group changes.

Types
Two database types are specified, the first being an object with fields for the table columns and an extra field for the group to be derived; the second is an array of the nested table form with elements of the first type.

Function Pseudocode

Loop over a cursor selecting the records in order
	If the partition field value changes then
		Reset group and index-by array
	Else if the current value is already in the index-by array then
		Increment group number and reset index-by array
	End if
	Add the value to the index-by array
	Pipe the row out
End loop

Function Definition (within package)

FUNCTION Hash_Array RETURN value_subseq_list_type PIPELINED IS

  TYPE value_list_type      IS TABLE OF PLS_INTEGER INDEX BY VARCHAR2(4000);
  l_value_list_NULL         value_list_type;
  l_value_list              value_list_type;
  l_group_no                PLS_INTEGER;
  old_part_key              VARCHAR2(10);

BEGIN

  FOR r_val IN (SELECT part_key, ind, val FROM value_subseqs ORDER BY part_key, ind) LOOP

    IF r_val.part_key != old_part_key OR old_part_key IS NULL THEN

      old_part_key := r_val.part_key;
      l_group_no := 1;
      l_value_list := l_value_list_NULL;

    ELSIF l_value_list.Exists (r_val.val) THEN

      l_group_no := l_group_no + 1;
      l_value_list := l_value_list_NULL;

    END IF;

    l_value_list (r_val.val) := 1;
    PIPE ROW (value_subseq_type (r_val.part_key, r_val.ind, r_val.val, l_group_no));

  END LOOP;

END Hash_Array;

SQL
Just select the fields named in the record from the function wrapped in the TABLE keyword.

SELECT
    part_key,
    ind,
    val,
    grp_no
  FROM TABLE (Subseq_Groups.Hash_Array)
 ORDER BY part_key, ind

Pipelined Function Sim Solution
This solution was added after performance testing on the the first data set, below. It is a sort of hybrid between the Model and pipelined function approach created to try to understand the apparent quadratic variation of Model with sequence length noted in testing.

How It Works
This function uses the same database types as the Hash solution, but with the same counting idea as the Model solution within an old-fashioned nested cursor structure that one would not expect to perform efficiently.

Function Pseudocode

Loop over a cursor selecting the records in order
	If the partition field value changes then
		Reset group and group starting index
	Else
		Select counts of val and distinct val from the table between the group starting and current indices
		If difference in counts then
			Increment group number and reset group starting index
		End if
	End if
	Pipe the row out
End loop

Function Definition (within package)

FUNCTION Sim_Model RETURN value_subseq_list_type PIPELINED IS

  l_group_no                PLS_INTEGER;
  l_ind_beg                 PLS_INTEGER;
  l_is_dup                  PLS_INTEGER;
  old_part_key              VARCHAR2(10);

BEGIN

  FOR r_val IN (SELECT part_key, ind, val FROM value_subseqs ORDER BY part_key, ind) LOOP

    IF r_val.part_key != old_part_key OR old_part_key IS NULL THEN

      old_part_key := r_val.part_key;
      l_group_no := 1;
      l_ind_beg := r_val.ind;

    ELSE

      SELECT Count(val) - Count(DISTINCT val)
        INTO l_is_dup
        FROM value_subseqs
       WHERE part_key = r_val.part_key
         AND ind      BETWEEN l_ind_beg AND r_val.ind;

      IF l_is_dup > 0 THEN

        l_group_no := l_group_no + 1;
        l_ind_beg := r_val.ind;

      END IF;

    END IF;

    PIPE ROW (value_subseq_type (r_val.part_key, r_val.ind, r_val.val, l_group_no));

  END LOOP;

END Sim_Model;

SQL
Just select the fields named in the record from the function wrapped in the TABLE keyword.

SELECT
    part_key,
    ind,
    val,
    grp_no
  FROM TABLE (Subseq_Groups.Sim_Model)
 ORDER BY part_key, ind

Performance Analysis
In SQL Pivot and Prune Queries – Keeping an Eye on Performance, in May 2011, I first applied an approach to performance testing of SQL queries whereby the queries are tested across a 2-dimensional domain, using a testing framework developed for that work. The same approach has been followed here, although the framework has been substantially improved and extended since then. In this case it may be of interest to observe performance across the following two dimensions:

  • Number of records per partition key
  • Average group size

However, as I prefer to randomise the test data, the group size is known only after querying the data set, so I will use a proxy dimension instead. The value field will be a string of 20 random capital letters (A-T) of length equal to the proxy dimension. As this length increases so will the average group size. Generally execution time would be expected to be proportional to number of partition keys when the other dimensions are fixed, and I will use 2 partition key values throughout.

Data Set 1
The first data set is chosen to keep the CPU times within reason for all queries, which limits the possible ranges (we’ll eliminate one query later and extend the ranges).

Output Record Counts

Depth/Width

W1

W2

W4

Records/Part>

100

200

400

D1

5

6

5

D2

25

24

24

D3

100

80

89

D4

100

200

267

CPU Times (Seconds)

Query

W1

W2

W4

W/Prior W Average

SQL Pairs

D1

2.11

8.49

33.99

4

D2

4.88

13.66

42.15

2.9

D3

10.3

46.71

255.64

5

D4

10.33

78.25

595.58

7.6

Model

D1

0.32

1.08

4.23

3.6

D2

0.31

1.14

4.23

3.7

D3

0.34

1.16

4.65

3.7

D4

0.36

1.36

4.86

3.7

Hash

D1

0.03

0.03

0.03

1

D2

0

0.01

0.03

2

D3

0.01

0.01

0.02

1.5

D4

0.02

0.01

0.02

1.3


Discussion

We can see immediately that on all data points there is the same performance ranking of the three queries and the differences are extreme. On W4-D4, SQL Pairs takes 123 times as long as Model, which in turn takes 243 times as long as Hash.

SQL Pairs
The Average Ratio figure in the table above is the average ratio between successive columns across the width dimension. On D1 this is 4, meaning that the CPU time has risen by the square of the width factor increase. On D8 it’s 7.6, being almost the cube of the factor. It appears that the performance varies with the square of the sequence length, except when the group size reaches the sequence length, when it becomes the cube. We would not consider this query for real problems when faster alternatives exist.

Model
The Average Ratio figure on all depths is about 3.7, meaning that the CPU time has risen by almost the square of the width factor increase. This is very surprising because, considering the algorithm one would assume to be effected by our query, if the group size remains constant then the work done in computing each group ought to be constant too, and the total work ought to rise by the same factor as the sequence length. We’ll look at this further in our second, larger data set, where we’ll also consider an apparently similar algorithm implemented in PL/SQL.

Hash
The Average Ratio figure varies between 1 and 2, but the CPU times are really too small for the figure to be considered reliable (note that the zero figure for W1-D2 was replaced by 0.005 in order to allow the log graph to include it). We can safely say, though, that this is faster by orders of magnitude even on very small problems, and will again look at a larger data set to see whether more can be said.

Data Set 2 (Second Query Set)
The second data set is chosen to give wider ranges, after excluding the Pairs query. A second function was added to replace it, labelled ‘Sim’ below and described above.

Output Record Counts

Depth/Width

W1

W2

W4

W8

W16

W32

Records/Part>

100

200

400

800

1600

3200

D1

5

5

5

5

5

5

D2

18

25

23

24

25

23

D3

50

80

89

84

119

110

D4

100

200

200

267

356

582

D5

100

200

400

533

1600

2133

D6

100

200

400

800

1600

3200

CPU Times (Seconds)

Query

W1

W2

W4

W8

W16

W32

W/Prior W Average

Hash

D1

0.02

0.01

0.01

0.03

0.06

0.10

1.6

D2

0.02

0.01

0.02

0.04

0.06

0.09

1.5

D3

0.02

0.02

0.02

0.04

0.06

0.09

1.4

D4

0.01

0.02

0.01

0.03

0.06

0.11

1.9

D5

0.01

0.03

0.02

0.05

0.06

0.09

1.8

D6

0.01

0.02

0.01

0.05

0.06

0.09

2.0

Model

D1

0.28

1.01

4.05

16.28

63.34

257.46

3.9

D2

0.27

1.06

4.00

15.97

66.27

249.99

3.9

D3

0.29

1.14

4.24

16.28

62.91

250.50

3.9

D4

0.33

1.24

4.74

17.66

69.00

263.34

3.8

D5

0.33

1.26

5.10

19.17

81.59

314.39

3.9

D6

0.33

1.28

4.99

20.55

80.80

331.38

4.0

Sim

D1

0.25

0.37

0.74

1.70

3.14

6.18

1.9

D2

0.28

0.59

1.21

2.42

4.73

9.28

2.0

D3

0.33

0.67

1.43

2.79

5.86

11.28

2.0

D4

0.34

0.77

1.58

3.15

6.97

14.71

2.1

D5

0.36

0.73

1.69

3.61

9.64

22.58

2.3

D6

0.36

0.73

1.71

3.79

9.43

26.45

2.4


Discussion

We can see immediately that on all data points there is the same performance ranking of the three queries and the differences are extreme. On W32-D6, Model takes 12 times as long as Sim, which in turn takes 294 times as long as Hash, Model being 3,678 slower than Hash. (I have included the elapsed times, which are very close to the CPU times, in the shared XL file above. I think the table data are being read from buffer cache throughout).

Model
The Average Ratio figure on all depths is about 3.9, meaning that the CPU time has risen by almost the square of the width factor increase. It seems that the internal algorithm applied from the Model query here is doing something different from what we would expect, and with dire consequences for performance. For what it’s worth, here is the last Execution Plan:

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |               |      1 |        |   6400 |00:05:33.45 |      30 |       |       |          |
|   1 |  SORT ORDER BY       |               |      1 |   6437 |   6400 |00:05:33.45 |      30 |   478K|   448K|  424K (0)|
|   2 |   SQL MODEL ORDERED  |               |      1 |   6437 |   6400 |05:57:01.43 |      30 |  1156K|   974K| 1002K (0)|
|   3 |    WINDOW SORT       |               |      1 |   6437 |   6400 |00:00:00.03 |      30 |   337K|   337K|  299K (0)|
|   4 |     TABLE ACCESS FULL| VALUE_SUBSEQS |      1 |   6437 |   6400 |00:00:00.01 |      30 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------

Hash
The Average Ratio figure varies between 1.4 and 2, but the CPU times are probaby still too small for the figure to be considered reliable. We can again say, though, that this is faster than the Model solution by orders of magnitude even on very small problems and that the CPU time does not appear to be rising faster than linearly, so that the performance advantage will increase with problem size.

Sim
The Average Ratio figure varies between 1.9 and 2.4, and up to D3 is not more than 2, which indicates a pretty exact proportionality between sequence length and CPU time. This is consistent with our expectation of linearity so long as the group sizes are smaller than sequence lengths. For D6, where the group sizes are the same as the sequence lengths, we would expect to see a quadratic term (number of counts doubles, and work done in sorting/counting also doubles, if that is linear), and time in fact trebles between W16 and W32.
The results from this solution support the view that there is some internal implementation problem with Model for this example that is causing its quadratic CPU time variation. In my August 2011 revision of Forming Range-Based Break Groups with Advanced SQL, I noted significant under-performance in a query using analytic functions, that also seemed to due to an internal implementation problem, and found a work-around that made the query perform as expected. I believe both examples illustrate well the power of this kind of dimensional performance analysis.

Model Clause vs Pipelined Functions
I googled model performance problems and the two top-ranked articles are briefly discussed in the first two subsections below.
MODEL Performance Tuning
The author states: ‘In some cases MODEL query performance can even be so poor that it renders the query unusable. One of the keys to writing efficient and scalable MODEL queries seems to be keeping session memory use to a minimum’. Our first case would seem to fall into one of those cases, Model being 3,678 times slower than the function, at 332 seconds, on a problem of only 3,200 records (times two partition keys) and rising quadratically. The article considers the effects of changing various features of a test Model problem on the memory usage, assuming that that is one cause of poor performance.
From Pipelined Function to Model 
This article is interesting in that it takes pretty much the opposite line to my own. For a different problem, the author started with a pipelined function solution, and then went to Model on performance grounds. He says: ‘Since we were returning the rows in a pipelined (streaming) fashion, the performance was fine initially. It was when the function was called constantly and then joined with other tables that we ran into trouble’. The problem he identifies seems to be principally the fact that Oracle’s Cost Based Optimiser (CBO) cannot accurately predict the cardinality of the function, and assigns a default, on his system (as on mine) of 8168. This can cause poor execution plans when joined with other tables. His initial solution was to use the CARDINALITY hint, which worked but is undocumented and inflexible. He also notes possible performance issues caused by Oracle’s translating the PL/SQL table into an SQL result set and goes on to propose a Model solution to avoid this problem. Unfortunately, the author does not provide any results on significant data sets to demonstrate the performance differences. The following article looks specifically at the cardinality issue.
setting cardinality for pipelined and table functions
The author (Adrian Billington) considers four techniques that he labels thus:

  • CARDINALITY hint (9i+) undocumented
  • OPT_ESTIMATE hint (10g+) undocumented
  • DYNAMIC_SAMPLING hint (11.1.0.7+)
  • Extensible Optimiser (10g+)

The first two are not recommended on the grounds of being undocumented. The third option appears quite a lot simpler than the fourth and I will look at that approach in a second example problem, in my next article List Aggregation in Oracle – Comparing Three Methods.

Discussion
The pipelined function approach is plainly much faster than the SQL solutions in this example, but one has to be aware of possible issues such as the cardinality issue mentioned. One also needs to be aware that pure SQL statements are ‘read-consistent’ in Oracle, but this is not the case when functions are called that themselves do SQL.
Context switches between the SQL and PL/SQL engines, as well as the work done in translating between collections and SQL record sets, are often cited as performance reasons for preferring SQL-only solutions. As we have seen though, these issues, while real, can be dwarfed by algorithmic differences.

Conclusions

  • For the subsequence grouping problem addressed, using a pipelined function is faster by far than the SQL-only solutions identified
  • Although widely asserted, the notion that any query processing will be executed more efficiently in pure SQL than in PL/SQL is a myth
  • The Model solution using embedded aggregate Counts is much slower than expected and its quadratic CPU variation suggests performance problems within Oracle’s internal implementation of the Model clause
  • Dimensional performance analysis is very powerful although its use appears to be extremely rare in the Oracle community
  • It is suggested that diagrammatic techniques, such as my Query Structure Diagramming, although also very rarely used, offer important advantages for query documentation and design






Query Structure Diagramming

Last bank holiday Monday I posted a solution to an SQL problem on OTN, and I later thought that the SQL would make a nice example to illustrate my Query Structure Diagramming (QSD) technique. I published my first example of this in May 2009 on scribd,

Loading...

and have continued to develop it in subsequent articles. I use the technique here to illustrate the SQL structure for the OTN example mentioned and also for a second OTN example that I posted shortly after. Both examples structure the queries using subquery factors.

SQL Subquery Factors

Subquery factors were introduced in Oracle Database v9.2 and have since become a key technique in developing queries of any complexity. They generalise the v8 inline view technique, allowing subqueries now to be declared with an alias (using the ‘WITH’ clause), then referenced as often as desired later in the query. When referenced multiple times, Oracle’s Cost Based Optimiser (CBO) normally executes the subquery once and writes the results to temporary space to aid performance (this can be seen in the Explain Plan as a LOAD AS SELECT action). Using subquery factors can make queries much easier to read, even when they are referenced only once, in which case CBO normally restructures them internally to incorporate them within another subquery or the main query. It is important to note, though, that subquery factors, when retained by CBO, will be joined by full scans when referenced later in the query and in some cases it is more efficient to retain the table references to allow indexed joins (my next blog post will include an example of this).

Subquery factors, along with inline views, are the building blocks of modern SQL, as subroutines are of other languages. My QSDs are intended to show how they allow a procedural flow at the structural level, while retaining the set-based logic within the subqueries.

Leave and Attendance Query

The problem here is that the poster has a daily attendance table and a leave table, where leave is stored as date ranges, but wants a single query that outputs data in daily form. The keys to this are:

  • Realising that you cannot drive from the event tables but must generate a continuous set of days to drive from, for each employee (assuming you don’t have them in a separate reference table)
  • Converting the leave ranges to leave days by joining to the generated days rowset
  • Joining the leave daily rowset with the attendance table by a union

Read more here (I’m BrendanP on OTN): Attendance and Leave table Join

ERD


Note that tables were not provided for Employee and Day in the OTN post, but it is useful to include them as entities nonetheless.

SQL

Note that in the query below, both the date range and the employee set are generated from the transactional data, which is obviously an artificial feature arising from this being for a problem on a forum, but it’s no harm in terms of the purpose of this article.

WITH ext AS (
SELECT Min (att_date) min_date, Max (att_date) - Min (att_date) + 1 n_days
  FROM attendance
), dys AS (
SELECT min_date + LEVEL - 1 day
  FROM ext
CONNECT BY LEVEL < n_days + 1
), ems AS (
SELECT emp_id
  FROM attendance
 UNION
SELECT employee_number
  FROM leave
), edy AS (
SELECT
    dys.day,
    ems.emp_id
  FROM dys
 CROSS JOIN ems
), ldy AS (
SELECT
    edy.emp_id,
    edy.day,
    lve.leave_reason
  FROM edy
  JOIN leave                lve
    ON edy.day              BETWEEN lve.date_start AND lve.date_end
   AND lve.employee_number  = edy.emp_id
), uni AS (
 SELECT
    emp_id,
    att_date,
    timein,
    timeout,
    late_in,
    early_out,
    reason
  FROM attendance
UNION
SELECT
    emp_id,
    day,
    NULL,
    NULL,
    NULL,
    NULL,
    leave_reason
  FROM ldy
)
 SELECT
    edy.emp_id,
    edy.day,
    uni.timein,
    uni.timeout,
    uni.late_in,
    uni.early_out,
    uni.reason
  FROM edy
  LEFT JOIN uni
    ON uni.att_date     = edy.day
   AND uni.emp_id       = edy.emp_id
 ORDER BY 1, 2{code}

QSD

Counting Flight Statistics Query

The problem here is that the poster has three tables with data on events for frequent fliers and wants to show aggregate counts by year, but wants all years within a range to be included in the output, including years with no events. The key to this is realising that you cannot drive from the event tables but must generate a continuous set of years to drive from (assuming you don’t have them in a separate reference table). Read more here: Multiple Count aggregates from different sources grouped by Year

ERD


Note that a table was not provided for Year in the OTN post, but it is useful to include it as an entity nonetheless.

SQL

WITH yrs AS (
SELECT Add_Months (To_Date ('01011989', 'ddmmyyyy'), 12*LEVEL) YEAR
  FROM DUAL
CONNECT BY LEVEL < 24
), ffe AS (
SELECT Trunc (start_date, 'YEAR') year, Count(*) n_enr
  FROM freq_flyer_enrollment
 GROUP BY Trunc (start_date, 'YEAR')
), ffs AS (
SELECT Trunc (survey_date, 'YEAR') year, Count(*) n_sur
  FROM freq_flyer_survey
 GROUP BY Trunc (survey_date, 'YEAR')
), fff AS (
SELECT Trunc (flt_date, 'YEAR') year, Count(*) n_fly
  FROM freq_flyer_flights
 GROUP BY Trunc (flt_date, 'YEAR')
)
SELECT  yrs.year,
        Nvl (ffe.n_enr, 0) n_enr,
        Nvl (ffs.n_sur, 0) n_sur,
        Nvl (fff.n_fly, 0) n_fly
  FROM yrs
  LEFT JOIN ffe
    ON ffe.year = yrs.year
  LEFT JOIN ffs
    ON ffs.year = yrs.year
  LEFT JOIN fff
    ON fff.year = yrs.year
 ORDER BY 1

QSD






Data Structure Diagramming

Like many SQL developers I have always used entity-relationship diagrams to help in writing queries, and would extract sections to document them. Some years ago, however, I realised that having a single static diagram was not sufficient for complex queries with large numbers of tables, structures such as inline views, and multiple table instances. I therefore developed a diagram-based design methodology that I published in May 2009 on scribd. Since then I have extended the ideas in that approach to develop diagrams to cover various additional structures in SQL and in other areas. These diagrams were developed as needed for particular scenarios and have been published in several documents on scribd. I thought it would be a good idea to bring them together in one place, namely here, with example diagrams and the scribd document embedded thereafter. [Incidentally, I wonder what readers make of this 8-dimensional document structure?]

I would categorise them under four headings:

  • Entity-Relationship Diagrams
  • Structured Design Methodology
  • SQL Special Structures
  • Object Structures

Entity-Relationship Diagrams
Oracle Spatial Schema
The embedded document below also includes an ERD of the much simpler HR schema, but this one is more interesting as it shows extensive use of subtypes. The document is concerned with networks and I superimposed tree and non-tree network links on the diagram.

Oracle Customer Model and Multi-Org
Here I used shading to distinguish between org-striped, org-linked (my term) and other entities.

Structured Design Methodology
The methodology involves a sequence of diagrams and tables, so I have not extracted a diagram in this case.

SQL Special Structures
Multiple Table Instances with Scalar Subqueries in Where Clause
Subquery Factor

Selecting Database Function

Selecting Scalar Subqueries


Nested Analytics Subqueries

Model Clause

Recursive Subquery Factor

Object Structures
I use a different type of diagram for object structures from those for SQL and ERDs, and it’s intended to be very general, being independent of programming language and applicable to any object structure, allowing arbitrary nesting of array and record types.
Code Timer Object
This object was implemented in three languages: Oracle, Perl and Java.


Excel Array Object
This object was implemented in Perl.