Tony’s Oracle Tips

February 22, 2009

Parentheses in ANSI Join Syntax

Filed under: SQL Fun — tonyhasler @ 3:45 pm

I was discussing the concept of bushy joins with some colleagues the other day and I was asked if the use of parentheses in ANSI join syntax could be used to force a bushy join. For an introduction to bushy joins see my other blog entry here. Let us try it:


create table t1 (col1) as select 1 from dual union all select 2 from dual ;
create table t2 (col2) as select 1 from dual union all select 3 from dual ;
create table t3 (col3) as select 1 from dual union all select 4 from dual ;
create table t4 (col4) as select 1 from dual union all select 5 from dual ;

exec dbms_stats.gather_table_stats(user,'T1') ;
exec dbms_stats.gather_table_stats(user,'T2') ;
exec dbms_stats.gather_table_stats(user,'T3') ;
exec dbms_stats.gather_table_stats(user,'T4') ;

We have created four tables each with one column and two rows. One row in each table shares a column value with the others. The second row has a unique column value. Let us see what the execution plan is for the following statement:


select * from (t1 join t2 on col1=col2) join (t3 join t4 on col3=col4) on col1+col2=col3+col4;

------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |     1 |    12 |    13   (8)| 00:00:01 |
|*  1 |  HASH JOIN            |      |     1 |    12 |    13   (8)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN|      |     4 |    36 |    10  (10)| 00:00:01 |
|*  3 |    HASH JOIN          |      |     2 |    12 |     7  (15)| 00:00:01 |
|   4 |     TABLE ACCESS FULL | T1   |     2 |     6 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL | T2   |     2 |     6 |     3   (0)| 00:00:01 |
|   6 |    BUFFER SORT        |      |     2 |     6 |     7  (15)| 00:00:01 |
|   7 |     TABLE ACCESS FULL | T3   |     2 |     6 |     2   (0)| 00:00:01 |
|   8 |   TABLE ACCESS FULL   | T4   |     2 |     6 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("COL3"="COL4")
       filter("T1"."COL1"+"T2"."COL2"="T3"."COL3"+"T4"."COL4")
   3 - access("COL1"="COL2")

We can see that the parentheses have not affected the execution plan at all and no bushy join was used.

In the above example the parentheses made the query easier to read but were in fact unnecessary. Parentheses were actually introduced in ANSI join syntax to resolve ambiguity rather than either to influence execution plans or to ease readability; it just happens that in the above case the parentheses made the query much easier to read.

Let us look at another example:


select * from t1 left join t2 join t3 on col2=col3 join t4 on col3=col4 on col1=col2

      COL1       COL2       COL3       COL4
---------- ---------- ---------- ----------
         1          1          1          1
         2

If you look closely you will see that the only interpretation of the above query is that the outer join with t1 is forced to occur after the inner joins of t2 with t3 and t3 with t4 thus preserving both rows of t1 in the output. However, the CBO is free to reorder the joins of t2, t3, and t4 as in any way it sees fit as the result is unaffected.

Notice how the above query is different from this one:


select * from t1 left join t2 on col1=col2 join t3 on col2=col3 join t4 on col3=col4;

      COL1       COL2       COL3       COL4
---------- ---------- ---------- ----------
         1          1          1          1

Here the change in position of the ON clause results in a different join order and different results. The use of a USING clause will similarly result in a clear join order being defined. But consider these two queries:


select * from ((t1 left join t2 on col1=col2) cross join t3) right join t4 on col3=col4;

      COL1       COL2       COL3       COL4
---------- ---------- ---------- ----------
         1          1          1          1
         2                     1          1
                                          5

select * from (t1 left join t2 on col1=col2) cross join (t3 right join t4 on col3=col4);

      COL1       COL2       COL3       COL4
---------- ---------- ---------- ----------
         2                     1          1
         1          1          1          1
         2                                5
         1          1                     5

Because there is no qualifying clause on a CROSS JOIN the semantics of the query are ambiguous without the presence of parentheses. In reality, parentheses are only actually required in the latter case as without them left associativity applies and the former of the above queries is assumed. NATURAL JOIN syntax also lacks any qualifying clause and suffers from the same ambiguity as CROSS JOIN syntax.

So to recap, a hint is the only SQL syntax used to influence execution plans. All other constructs, including parentheses, are only used to clarify or enforce the semantics of the statement.

About these ads

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 43 other followers

%d bloggers like this: