A Note on Oracle Join Orders and Hints

I read an interesting article this week on a company internal blog (thanks, Deepak), which pointed out that many people seem to think that the hint USE_HASH needs two parameters to specify the hint, when in fact only the right table alias is required. Where more than one alias is given the hint is effectively treated as separate one-alias hints. Here is what the manual says, Oracle SQL Manual 12.1 - Hints - USE_HASH:

use_hash_hint

The USE_HASH hint instructs the optimizer to join each specified table with another row source using a hash join. For example:

SELECT /*+ USE_HASH(l h) */ *
FROM orders h, order_items l
WHERE l.order_id = h.order_id
AND l.order_id > 2400;

Observe that in the example given, with two tables, there can be only one join in the chosen plan but both aliases are given in the hint. Only one of the hints can take effect here, but which one? It depends on the join order chosen by the the Cost Based Optimizer (CBO). In the sections on USE_MERGE and USE_NL, the manual says:

Use of the USE_NL and USE_MERGE hints is recommended with the LEADING and ORDERED hints. The optimizer uses those hints when the referenced table is forced to be the inner table of a join. The hints are ignored if the referenced table is the outer table.

My colleague linked to an article by Jonathan Lewis that considers join order in relation to hash joins in some detail, Quiz Night.

You have NOT defined a hash join completely until you have specified which rowsource should be used as the build table and which as the probe table – so every time you supply the use_hash() hint for a table, you should also supply the swap_join_inputs() hint or the no_swap_join_inputs() hint.

In the article, JL asks how many execution plans are possible for a four-table query where he says he specifies both join order (leading hint) and join method for each of the three joins (use_hash). The answer is 8, which might be a surprise if you accept the specification at face value, as that would imply that a single plan has been fully specified. The explanation of the apparent paradox is of course given in the quote above. JL goes on to enumerate each possible plan, obtained by different combinations of the swap_join_inputs and no_swap_join_inputs hints. He then asserts:

Note the extreme change in shape and apparent order of tables in the plan. Despite this the join order really is t1 -> t2 -> t3 -> t4 in every case. I’ll give a quick description of the first and last plans to explain this.

The explanation seems plausible, but I can see a possible problem with it. It is possible to generate exactly the same plans using the hint leading (t2 t1 t3 t4) and different combinations of the swap hints. I could then equally plausibly assert that the join order really is t2 -> t1 -> t3 -> t4 in every case. Both assertions can't be right can they? The explanation is that we need to consider two levels of ordering, and not just for hash joins. First, I will demonstrate how to get identical plans with different leading hints. I will use Oracle's HR schema, and show the hints necessary to get all hash join plans (with current statistics - I am not fully specifying the plans in general):

SELECT *
  FROM employees e
  JOIN departments d
 USING (department_id)
  JOIN locations l
 USING (location_id)

You can see that hash joins are only possible between e and d, and between d and l (or rowsets that include the relevant keys).

We can use a notation x.y to mean hash-join y as the right table, using x to form the hash-table, and (x.y) to denote the resulting rowset. Then I believe there are exactly 8 possible hash-join permutations, each of which I was able to hint using leading, use_hash, swap_join_inputs and no_swap_join_input hints as follows:

1. leading (l) use_hash (d)
===========================
Combo              SJI
-------            ----
(l.d).e  
e.(l.d)            e
(d.l).e            d
e.(d.l)            d, e

2. leading (e)
==============
Combo              SJI    NSJI
-------            ----   ----
l.(e.d)
(e.d).l                   d
l.(d.e)            d
(d.e).l            d      l

Now, look at the following output where I have got exactly the same plan (the first combo above) using two different leading hints (putting in the first two aliases to be sure). The first has:

/*+ leading (l d)  use_hash (d) */

The second has:

/*+ leading (d l) use_hash (l e) swap_join_inputs (l) */

The plans that result, showing identical plan hash value of 2684174912 are:

SQL_ID  ds98jqmdbn9ym, child number 0
-------------------------------------
SELECT  /*+ leading (l d)  use_hash (d) GATHER_PLAN_STATISTICS
Lead_l_Hash_d */      *   FROM employees e   JOIN departments d  USING
(department_id)   JOIN locations l  USING (location_id)

Plan hash value: 2684174912

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |      1 |        |    106 |00:00:00.01 |      25 |       |       |          |
|*  1 |  HASH JOIN          |             |      1 |    106 |    106 |00:00:00.01 |      25 |   796K|   796K| 1241K (0)|
|*  2 |   HASH JOIN         |             |      1 |     27 |     27 |00:00:00.01 |      12 |   835K|   835K| 1122K (0)|
|   3 |    TABLE ACCESS FULL| LOCATIONS   |      1 |     23 |     23 |00:00:00.01 |       6 |       |       |          |
|   4 |    TABLE ACCESS FULL| DEPARTMENTS |      1 |     27 |     27 |00:00:00.01 |       6 |       |       |          |
|   5 |   TABLE ACCESS FULL | EMPLOYEES   |      1 |    107 |    107 |00:00:00.01 |      13 |       |       |          |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
   2 - access("D"."LOCATION_ID"="L"."LOCATION_ID")

SQL_ID  fcbh8httjtpj3, child number 0
-------------------------------------
SELECT  /*+ leading (d l) use_hash (l e) swap_join_inputs (l)
GATHER_PLAN_STATISTICS Lead_d_Hash_l_e_SJI_l */      *   FROM employees
e   JOIN departments d  USING (department_id)   JOIN locations l  USING
(location_id)

Plan hash value: 2684174912

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |      1 |        |    106 |00:00:00.01 |      25 |       |       |          |
|*  1 |  HASH JOIN          |             |      1 |    106 |    106 |00:00:00.01 |      25 |   796K|   796K| 1240K (0)|
|*  2 |   HASH JOIN         |             |      1 |     27 |     27 |00:00:00.01 |      12 |   835K|   835K| 1109K (0)|
|   3 |    TABLE ACCESS FULL| LOCATIONS   |      1 |     23 |     23 |00:00:00.01 |       6 |       |       |          |
|   4 |    TABLE ACCESS FULL| DEPARTMENTS |      1 |     27 |     27 |00:00:00.01 |       6 |       |       |          |
|   5 |   TABLE ACCESS FULL | EMPLOYEES   |      1 |    107 |    107 |00:00:00.01 |      13 |       |       |          |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
   2 - access("D"."LOCATION_ID"="L"."LOCATION_ID")

So, here we seem to have shown that join order l -> d -> e = join order d -> l -> e, since they have the same plan. I think that to understand this we need to separate out two levels of join order. In the two-table example I took from the manual I stated there could be only one join order in a given plan as there is only one join, meaning by join order which table appears on the left and which on the right.

Outer-level join order

This can be specified by a number assigned to each table matching the sequence number of the join in which it features. The first two tables therefore both have a sequence number of 1. Another way of looking at it can be found by considering my bracketing notation above. In general, the outer-level join sequence number for an n-table query, OS(i) = n - number of brackets enclosing table i - 1

For (l.d).e, l and d have OS = 1 and e has OS = 2. The same is true for (d.l).e.

It is in this sense of join order that JL's queries all have the same join order, and that a single plan can arise from different outer-level join orders, once inner-join order is factored in.

Inner-level join order

This is simply the side on which a table is joined to a rowset in a given join, say 1 for the left side, and 2 for the right side in a hash join.

Note that the leading hint treats both levels of join order, but behaves differently between hash joins and other types of join.

Leading hint in hash join

  • Determines the outer-level join order
  • Defaults the inner-level join order
  • Swap_join_inputs operation overrides the inner-level default join order

Leading hint in other types of join

  • Determines the outer-level join order
  • Determines the inner-level join order on the first two tables
  • Inner-level order not applicable after first two tables

See also a later article I wrote: Benchmarking of Hash Join Options in SQL for Fixed-Depth Hierarchies






 

4 thoughts on “A Note on Oracle Join Orders and Hints

  1. Brendan,

    I see several problems with your article.

    The most significant, perhaps, is inventing terminology that isn't needed and collides with terminology that Oracle already uses. Specifically you introduce the terms "outer-level join order" and "inner-level join order" - but Oracle already uses the term "join order" with a very specific meaning - here's the quote from my article that makes this clear:

    "... 8 possibilities in total for the execution plan – even though only one join order is examined. (If you check the 10053 trace file for this query you will find all the computation for these execution plans under one line that reads: Join order [1], there will not be a Join order[2])"

    Roughly speaking your "outer-level join order" corresponds to "the join order", and your concept of "inner-level join order" is simply giving a different name to Oracle's "swap join inputs" option.

    Having invented "outer-level" and "inner-level", you then have to invent some explanation for why the "inner-level" appears to be irrelevant for nested loop and merge joins - so all you've done is add complexity. More significantly, at that point, you don't explain what you mean by "default" when you say the leading() hint "defaults the inner-level join order". In your notation if I hint leading (a b c) the optimizer could produce a plan (a.b).c, or c.(a.b) -- in what way does the leading hint "default" the inner-level join order to c ?

    • Hi Jonathan,

      I agree with you that I have added some complexity by introducing two levels of join order, although I prefer to think of it as "extending" the Oracle term rather than "colliding" with it. And this can only make sense if it corresponds to some real, underlying greater complexity than the single term implies.

      Your statement that "the join order really is t1 -> t2 -> t3 -> t4" when I can easily see how "t2 -> t1 -> t3 -> t4" could also be described as the join order suggested to me that there might be some value in extending the terminology, to try to make things less confusing.

      I agree that my "outer-level join order" corresponds to "the join order" as usually used. However, I don't see "inner-level join order" as simply giving a different name to Oracle's "swap join inputs" option - they're basically different things - the latter is an operation that gives rise to a difference in the former.

      Also, I had in mind, although I did not include this in the article, the apparent paradox that arises when you think about the "join order" for two tables only - they don't really have a join order in the sense above since both tables are joined at once. But everyone thinks they do, and that the hint "leading (a) use_nl (b)" mandates "a -> b".

      As you imply, "leading (a)" does not mandate any order (in the sense of my "inner-level join order"). If a hash join is used it could be either a.b or b.a, and the reason I say that the hint defaults the (inner) join order is that, if you look at the XML plan outline in SQL Developer, you will see the swap_join_inputs hint is needed to override the order implied by the leading hint.

      I thought that these were interesting and previously under-remarked aspects of Oracle's join behaviour rather than merely adding complexity for its own sake. However, it's always hard to know how far others find one's approach helpful, and feedback is always welcome, so thank you for yours! 🙂

    • Hi Amit,

      Your wish is my command 🙂 I have added WP Subscribe to the sidebar. Let me know if there are any issues.

      p.s. Check out my latest post: A Note on Dependencies and Database Unit Testing

Leave a Reply

Your email address will not be published. Required fields are marked *