Tony’s Oracle Tips

June 13, 2015

Interview question: second largest value

Filed under: Interview questions,SQL Fun — tonyhasler @ 8:32 pm

This is my first blog post for a very, very long time.  I stopped blogging a couple of years ago when I started working on my book, and after I finally finished it I felt I needed a break to recuperate.

However, I have finally decided to put my toe back in the water.  This isn’t a very serious post but I hope you find it thought provoking and a bit of fun.

The thought behind this post came from a colleague who had just asked a candidate for employment the following question in order to help assess their command of the SQL syntax.

“How would you select the second largest value from a column in a table?”

A simple enough question. Or so it seems.  You may be able to come up with several answers yourself.  But what is the “best” answer?

There are others that have opined on this subject before. Here is one example. This post is just my thoughts on the topic.

Let me begin by creating a test table with the values 1 to 10 as the basis for some examples:

CREATE TABLE t AS SELECT ROWNUM n1 FROM DUAL CONNECT BY LEVEL <= 10; 

Here is one possible way to find the second highest value:

WITH q1
     AS (SELECT NTH_VALUE (
                   n1,
                   2)
                OVER (
                   ORDER BY n1 DESC
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
                   n1
           FROM t)
SELECT *
  FROM q1
 WHERE ROWNUM < 2;

Solution 1

Solution 1 shows your interviewer that you know about new 11g Oracle features.  Solution 1 also has the benefit that it is readily changeable to obtain the third, fourth, or 100th top value from a table.  Nevertheless, I wouldn’t recommend offering this solution as it uses relatively obscure Oracle-specific syntax and may be viewed as overcomplicated.  This solution has a second disadvantage:  it isn’t correct!

Let me add an 11th row to the table:

INSERT INTO t
VALUES (10);

After adding this 11th row, “solution” 1 now returns 10.  But the second largest value is still 9, despite the fact that there are two rows with a value of 10.

You may try to impress your interviewer with your mastery of analytic functions.  How about this solution:

WITH q1 AS (SELECT n1, DENSE_RANK () OVER (ORDER BY n1 DESC) rnk FROM t)
SELECT n1
FROM q1
WHERE rnk = 2;

Solution 2

Solution 2 returns a value of 9 despite the presence of two rows with a value of 10. You might explain that the runtime engine will not, in practice, sort all of the rows in the table; it will just maintain the two largest values as it scans the table.  You might add that were the table to hold 11 million rows rather than 11 this may be an important consideration.  I don’t commend this solution to you either, as it is also incorrect!

Let us add a twelfth row to the table to demonstrate the first of two flaws:

INSERT INTO t
VALUES (9);

Now our query returns two rows with a value of 9.  We need to limit the rows to one.

WITH q1 AS (SELECT n1, DENSE_RANK () OVER (ORDER BY n1 DESC) rnk FROM t)
SELECT n1
FROM q1
WHERE rnk = 2 AND ROWNUM = 1;

Solution 3

But even now the query is still incorrect.  Let me add a lucky 13th row to our table.

INSERT INTO t
VALUES (NULL);

Now solution 3 returns a value of 10!  If there is no NOT NULL constraint on our column we need to clarify the problem statement.  When a column is null, its value is unknown or unspecified; neither equality predicates nor inequality predicates will evaluate to true since we cannot assert that the column value is equal to anything (not even NULL) nor can we assert that it is not equal to anything (even NULL).  So how can we assert that the value (were it specified) would not be the largest or the second largest value?  We cannot.  If we are to “respect nulls” we have to return NULL as our second largest value, since the second largest value would be unknown.
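
You can see this three-valued logic in action with a couple of trivial queries (a quick sketch of my own against the same table):

SELECT COUNT (*) FROM t WHERE n1 = 10;    -- the NULL row is not counted
SELECT COUNT (*) FROM t WHERE n1 <> 10;   -- the NULL row is not counted here either

The two counts add up to 12, not 13: the row with the NULL value satisfies neither predicate.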

However, let us assume (as do most aggregate functions in SQL) that we are to “ignore nulls” and return the second largest value from the subset of rows that are non-null.  We can now recode our solution to function correctly in the presence of duplicate values and NULLs.

WITH q1
AS (SELECT n1, DENSE_RANK () OVER (ORDER BY n1 DESC NULLS LAST) rnk
FROM t)
SELECT n1
FROM q1
WHERE rnk = 2 AND ROWNUM = 1;

UPDATE

The above is actually incorrect as no row is returned when there are fewer than two distinct values. One way to correct this is as follows:

WITH q1
AS (SELECT n1, DENSE_RANK () OVER (ORDER BY n1 DESC NULLS LAST) rnk
FROM t)
SELECT max(n1)
FROM q1
WHERE rnk = 2 AND ROWNUM = 1;
 

Solution 4

As far as I can see, solution 4 is semantically correct.

But is solution 4 the “best” solution?  How about this:

SELECT MAX (n1) n1
FROM t
WHERE n1 < (SELECT MAX (n1) FROM t);

Solution 5

Solution 5 handles duplicate values and nulls correctly.  It also avoids any sorts, uses no Oracle-specific SQL extensions and is very easy to read.

Although solution 5 avoids sorting large numbers of rows just like solution 4, will it be as fast?  In some cases, solution 5 will involve two full table scans and as a consequence might be a lot slower on a large table.
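
Whether or not you get the two-scan plan on your own system is easy enough to check (a generic sketch, not output from my test):

EXPLAIN PLAN FOR
SELECT MAX (n1) n1
  FROM t
 WHERE n1 < (SELECT MAX (n1) FROM t);

SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY);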

But this is not necessarily the case.  Let us add an index to our table:

CREATE INDEX i1
ON t (n1);

Once the index has been added the execution plan for solution 5 becomes this:

-------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Cost (%CPU)|
-------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     1 |     1   (0)|
|   1 |  SORT AGGREGATE               |      |     1 |            |
|   2 |   FIRST ROW                   |      |     1 |     1   (0)|
|*  3 |    INDEX RANGE SCAN (MIN/MAX) | I1   |     1 |     1   (0)|
|   4 |     SORT AGGREGATE            |      |     1 |            |
|   5 |      INDEX FULL SCAN (MIN/MAX)| I1   |     1 |     1   (0)|
-------------------------------------------------------------------

What this execution plan shows is that when the runtime engine executes solution 5 on an indexed column it will pick the maximum value of the column from the right edge of the index and then pick the second largest value by looking at the entry immediately preceding that index entry (or entries).  No full scan of the index or table is required.  This means that when the column is indexed, solution 5 is likely to return the second largest value in milliseconds.  So in the case of an indexed column solution 5 will likely outperform solution 4 on large tables and in the case of an unindexed column solution 4 will likely outperform solution 5.

Of course, it would also be possible to devise solutions to our problem statement based on MATCH_RECOGNIZE, the MODEL clause, or even user-written aggregate functions. I would be very surprised if any such solutions were simpler than those shown in this post.  Nor is it likely that they would outperform solutions 4 and 5 posted here.
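
For the curious, a MATCH_RECOGNIZE attempt might look something like the sketch below. This is my own untested illustration rather than a recommendation, and note that, unlike solution 4, it returns no row at all (rather than NULL) when there are fewer than two distinct non-null values.

WITH distinct_vals AS (SELECT DISTINCT n1 FROM t WHERE n1 IS NOT NULL)
SELECT n1
  FROM distinct_vals MATCH_RECOGNIZE (
          ORDER BY n1 DESC
          MEASURES b.n1 AS n1
          ONE ROW PER MATCH
          PATTERN (^ a b)        -- the two rows at the top of the descending order
          DEFINE b AS (1 = 1));  -- a and b are both unconditionally true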

If you enjoyed this, feel free to suggest other topics in a similar vein in a reply.


December 2, 2014

UKOUG TECH14 SQL Quiz

Filed under: SQL Fun — tonyhasler @ 8:03 pm

If you are an attendee of UKOUG Tech14 you have a chance to win a free copy of Troubleshooting Oracle Performance (2nd edition) by Christian Antognini.


Troubleshooting Oracle Performance, 2nd Edition provides a systematic approach to addressing the underlying causes of poor database application performance.


 

AND you can also win a free copy of Expert Oracle SQL by Tony Hasler.


Expert Oracle SQL provides a systematic approach for tuning individual SQL statements and shows how to stabilize production performance with a radical approach to managing object statistics.


 

To get a chance to win a copy of both of these books just leave a comment on this blog with your answers to these questions:

For each of the following pairs of SQL statements indicate whether:

A) They are both illegal
B) The first is legal and the second illegal
C) The second is legal and the first illegal
D) They are both legal but may provide different results
E) They are both legal statements guaranteed to return identical results (ignoring order)

You should assume an Oracle 12cR1 database containing tables T1, T2 and T3 in the current schema. Table T1 has a column C1, table T2 has a column C2, and table T3 has a column C3.  All tables have a column called ID.

Question 1:

SELECT * FROM T1 LEFT NATURAL JOIN T2;

SELECT * FROM T1 NATURAL LEFT JOIN T2;

Question 2:

SELECT * from T1 INNER JOIN T2 ON C1=C2;

SELECT * FROM T1 OUTER JOIN T2 ON C1=C2;

Question 3:

SELECT *
FROM t1, t2, t3
WHERE t1.c1 = t2.c2(+) AND t2.c2(+) = t3.c3;

SELECT *
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c2
RIGHT JOIN t3 ON t2.c2 = t3.c3;

Question 4:

SELECT c1, COUNT (*)
FROM t1
GROUP BY c1
HAVING COUNT (*) = 1;

SELECT c1, COUNT (*)
FROM t1
HAVING COUNT (*) = 1
GROUP BY c1;

Tie breaker 1:

Consider this puzzle.

All the numbers from 2 to 99 are placed into a hat. John draws two numbers from the hat and whispers the product of the two numbers to Paul. John whispers the sum of the two numbers to Sam. Paul and Sam then have the following conversation:

Paul: I don’t know what the two numbers are.
Sam: I knew you didn’t. Neither did I.
Paul: I do now.
Sam: So do I.

Write an SQL query that runs on an Oracle database to identify the only possible pair of numbers that could have been extracted from the hat.

  • The only database object you should reference is DUAL.
  • Use the keyword SELECT as infrequently as possible.
  • No PL/SQL. For the avoidance of doubt, PL/SQL used in part of an SQL statement in 12cR1 is not allowed.
  • You cannot work out all or part of the answer yourself. So, for example, you may work out that if Sam knows the answer immediately then so does Paul. Your SQL cannot make this assumption.

Tie breaker 2:

The first three names drawn out of a hat containing the names of everyone who answers correctly will win a copy of the two books.

Rules

  • This competition is only available to attendees of the UKOUG Tech14 conference
  • To stand a chance of winning you must post your answers to the quiz questions by 11:00 a.m. on Tuesday 9th December 2014
  • You must be present at the Optimizer Round Table session at 12:00 on Tuesday 9th December to collect your prize. Otherwise another name will be drawn from the hat.

October 24, 2012

Model clause use cases

Filed under: SQL Fun — tonyhasler @ 9:22 pm

This note describes a couple of real-life use cases for the SQL MODEL (or SPREADSHEET) clause that appeared in Oracle 10g. This is mainly targeted at people attending my UKOUG presentation on the model clause (materials TBS) but anybody who knows the syntax of the model clause may find this of some interest.

We already have SQL (non-procedural) and PL/SQL (procedural) languages and it is easy to call SQL from PL/SQL and vice-versa. So you might be asking yourself why on earth we need the model clause. Doesn’t it seem like just a procedural extension to SQL? If so why wouldn’t we just use PL/SQL?

Well, over the last year or so I have found a number of cases where the model clause has been useful. The two key areas are:

  • Implementing analytics not available directly from an SQL function
  • Implementing calculations in parallel

Before we discuss my use cases let us create some test data:

CREATE TABLE model_test
AS
   WITH q1
        AS (    SELECT DATE '2000-01-01' + LEVEL mydate, DBMS_RANDOM.VALUE ( 90, 100) VALUE
                  FROM DUAL
                 WHERE MOD ( LEVEL, 11) <> 0
            CONNECT BY LEVEL <= 4000
            UNION ALL
                SELECT DATE '2000-01-01' + LEVEL mydate, DBMS_RANDOM.VALUE ( 100, 200) VALUE
                  FROM DUAL
                 WHERE MOD ( LEVEL, 11) = 0
            CONNECT BY LEVEL <= 4000)
       ,q2
        AS (    SELECT DBMS_RANDOM.string ( 'p', 10) account
                  FROM DUAL
            CONNECT BY LEVEL <= 100)
   SELECT *
     FROM q1, q2;

EXEC DBMS_STATS.gather_table_stats(USER,'MODEL_TEST');

This table contains 40,000 rows with 4,000 days’ worth of data for each of 100 accounts. The data is just a single value of some kind. The average value is approximately 100 but for each account there are 3637 closely packed values between 90 and 100 and 363 values between 100 and 200.
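
If you want to satisfy yourself that the data looks like this, a quick sanity check might be (my own sketch, not part of the original test):

SELECT account
      ,COUNT (CASE WHEN VALUE <= 100 THEN 1 END) low_values   -- expect roughly 3637 per account
      ,COUNT (CASE WHEN VALUE >  100 THEN 1 END) high_values  -- expect roughly 363 per account
  FROM model_test
 GROUP BY account;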

A genuine requirement from my investment bank client was to implement a moving median over a one year window. The first problem I ran into related to the definition of a one year window. I discussed this issue in my last post. The second issue is, of course, that the median function does not support windowing. This is a solution based on the model clause:

SELECT /*+ parallel(10) */ * FROM model_test
  MODEL
  PARTITION BY (account)
  DIMENSION BY (mydate)
  MEASURES (VALUE, 0 mov_median)
  RULES(
    mov_median[ANY] = MEDIAN(VALUE)[mydate BETWEEN ADD_MONTHS(CV()+1,-12) AND CV()]);

This is a relatively concise way to express the requirement and it doesn’t look particularly procedural.

Trying to code this in PL/SQL would be pretty tricky because of the need to constantly manipulate sets of data. Furthermore, a moving median can’t be calculated in any efficient way, which is why there is no native support. The fact that we can use parallel query slaves to reduce the amount of time taken to perform the calculations further adds to the benefit of using the model clause rather than a PL/SQL function of some kind. Each slave performs the calculations for some subset of accounts.

In another case I was asked to calculate a zscore based on semi-standard deviations. A semi-standard deviation is required when the distribution of data above the mean is substantially different from that below the mean as in my test data. Essentially we have one standard deviation for the values above the mean and a separate one for those less than or equal to the mean. Zscore is simply a term for the number of standard deviations (in this case semi) that a particular value is above or below the mean. The zscores were also to be calculated over a moving one year window.

This is much more complicated than a moving median. Firstly, we have no built-in aggregate function that we can use for semi-standard deviations. Secondly, as we shall shortly see, we need an iterative model to implement our calculations.
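
To make the concept concrete before worrying about windows, a semi-standard deviation over an account's entire history could be written with ordinary aggregates (an illustrative sketch of my own, not part of the final solution):

WITH stats
     AS (SELECT account, VALUE, AVG (VALUE) OVER (PARTITION BY account) avg_value
           FROM model_test)
SELECT account
      ,SQRT (AVG (CASE WHEN VALUE >  avg_value THEN POWER (VALUE - avg_value, 2) END)) upper_semi_sd
      ,SQRT (AVG (CASE WHEN VALUE <= avg_value THEN POWER (VALUE - avg_value, 2) END)) lower_semi_sd
  FROM stats
 GROUP BY account;

The CASE expressions return NULL for rows on the wrong side of the mean, and AVG ignores NULLs, so each semi-standard deviation is averaged only over its own contributing rows. The difficulty is that no such shortcut exists once a moving window is involved.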

To help explain my approach I need to invent some terms. I will say that:

  • The target date is the date for which we need our semi-standard deviation and zscore.
  • A contributing date is a date that contributes to the semi-standard deviation. Contributing
    dates are those on the correct side of the mean that are within the window for a target date.

A particular date will be part of the window for itself and all of the target dates in the following year. However, it will only contribute to some subset because its value will only sometimes be on the same side of the mean as the value of the target date. As a consequence we need a separate iteration for each target date.

To avoid unnecessary complications I will make two assumptions:

  • The data is dense. In other words, data is present for every account for every day. This wasn’t the case in real life.
  • A window of 365 days, not a calendar year, was adequate. This was the case in real life.

This is roughly how I did it:

   WITH q1
        AS (SELECT m.*
                  ,  ROW_NUMBER ()
                        OVER (PARTITION BY account ORDER BY mydate)
                   - 1
                      rn
                  ,0 AS mov_stdd
                  ,0 AS mov_avg
                  ,0 AS temp
                  ,0 AS zscore
              FROM model_test m)
   SELECT /*+ parallel(10) */
         account
         ,mydate
         ,VALUE
         ,mov_avg
         ,mov_stdd
         ,zscore
     FROM q1
   MODEL
      PARTITION BY (account)
      DIMENSION BY (rn)
      MEASURES (mydate, VALUE, mov_avg, mov_stdd, zscore, temp)
      RULES UPDATE
         ITERATE (100000) UNTIL VALUE[ITERATION_NUMBER] IS NULL
         (mov_avg [ITERATION_NUMBER] =
               AVG (VALUE)[rn BETWEEN ITERATION_NUMBER - 364
                                  AND ITERATION_NUMBER],
         temp [rn BETWEEN ITERATION_NUMBER - 364 AND ITERATION_NUMBER] =
               CASE
                  WHEN    (    VALUE[CV ()] <= mov_avg[ITERATION_NUMBER]
                           AND VALUE[ITERATION_NUMBER] <=
                                  mov_avg[ITERATION_NUMBER])
                       OR (    VALUE[CV ()] > mov_avg[ITERATION_NUMBER]
                           AND VALUE[ITERATION_NUMBER] >
                                  mov_avg[ITERATION_NUMBER])
                  THEN
                     POWER ( VALUE[CV ()] - mov_avg[ITERATION_NUMBER], 2)
               END,
         mov_stdd [ITERATION_NUMBER] =
               SQRT (
                  AVG (temp)[rn BETWEEN ITERATION_NUMBER - 364
                                    AND ITERATION_NUMBER]),
         zscore [ITERATION_NUMBER] =
               DECODE (mov_stdd[CV ()]
                      ,0, 0
                      , (VALUE[CV ()] - mov_avg[CV ()]) / mov_stdd[CV ()]));

We have associated a number with each target date so that we could use ITERATION_NUMBER to find our target values. Note that the measure TEMP for a particular date is typically recalculated for each of 365 target dates and holds the contribution to the semi-standard deviation. The variance is the average contribution and the standard deviation (or, more specifically, the population standard deviation) is the square root of the variance. The zscore is then a simple calculation, as you can see.
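
Incidentally, you can convince yourself that the square root of the mean squared deviation really is the population standard deviation with a query like this (a sketch; the two columns should agree to within rounding):

SELECT STDDEV_POP (VALUE) stddev_pop
      ,SQRT (AVG (POWER (VALUE - avg_value, 2))) sqrt_mean_sq_dev
  FROM (SELECT VALUE, AVG (VALUE) OVER () avg_value FROM model_test);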

Even though iteration was required here, PL/SQL wasn’t the right tool; even the limited use of aggregate functions in the above example has substantially simplified the logic and once again we can use parallel query slaves for our computations.

Update in reply to Dushyant

Dushyant has posted an interesting question in his comment. He wants to know how to implement a moving RANK function in a similar way to the moving MEDIAN function above. This was my first attempt:

WITH a
     AS (SELECT 'a' sector, TRUNC (SYSDATE) dt, 64 v FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 1 dt, 2 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 2 dt, 4 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 3 dt, 128 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 4 dt, 8 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 5 dt, 16 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 6 dt, 32 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 7 dt, 256 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 8 dt, 1 v FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 9 dt, 512 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) dt, 3 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) - 1 dt, 27 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) - 2 dt, 9 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) - 3 dt, 81 FROM DUAL),
     b
     AS (SELECT a.*,
                RANK () OVER (PARTITION BY sector ORDER BY dt) dt_rnk,
                COUNT (DISTINCT dt) OVER (PARTITION BY sector) dt_cnt
           FROM a)
SELECT sector,
       dt,
       v,
       mov_rank
  FROM b
MODEL
   PARTITION BY (sector)
   DIMENSION BY (dt)
   MEASURES (v, 0 mov_rank)
   RULES UPDATE
 (     mov_rank [ANY] =
            RANK(v[CV(dt)]) WITHIN GROUP (ORDER BY v)[dt BETWEEN CV()-3 AND CV()]);

One problem with the code above is that the nested cell reference v[CV(dt)] is illegal. I have only been able to solve the problem using an iterative model clause.

WITH a
     AS (SELECT 'a' sector, TRUNC (SYSDATE) dt, 64 v FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 1 dt, 2 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 2 dt, 4 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 3 dt, 128 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 4 dt, 8 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 5 dt, 16 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 6 dt, 32 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 7 dt, 256 FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 8 dt, 1 v FROM DUAL
         UNION ALL
         SELECT 'a' sector, TRUNC (SYSDATE) - 9 dt, 512 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) dt, 3 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) - 1 dt, 27 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) - 2 dt, 9 FROM DUAL
         UNION ALL
         SELECT 'b' sector, TRUNC (SYSDATE) - 3 dt, 81 FROM DUAL),
     b
     AS (SELECT a.*,
                RANK () OVER (PARTITION BY sector ORDER BY dt) dt_rnk,
                COUNT (DISTINCT dt) OVER (PARTITION BY sector) dt_cnt
           FROM a)
SELECT sector,
       dt,
       v,
       mov_rank
  FROM b
MODEL
   PARTITION BY (sector)
   DIMENSION BY (dt_rnk)
   MEASURES (dt, v, dt_cnt, 0 mov_rank, 0 temp)
   RULES UPDATE
      ITERATE (100000) UNTIL dt_cnt[1] = ITERATION_NUMBER
      (temp [dt_rnk BETWEEN ITERATION_NUMBER - 2 AND ITERATION_NUMBER + 1] =
            RANK () OVER (ORDER BY v),
      mov_rank [ITERATION_NUMBER + 1] =
            temp[CV ()]);

This provides the desired effect!  Here is the output:

SECTOR DT         V          MOV_RANK
a      27/08/2014        512         1
a      28/08/2014          1         1
a      29/08/2014        256         2
a      30/08/2014         32         2
a      31/08/2014         16         2
a      01/09/2014          8         1
a      02/09/2014        128         4
a      03/09/2014          4         1
a      04/09/2014          2         1
a      05/09/2014         64         3
b      02/09/2014         81         1
b      03/09/2014          9         1
b      04/09/2014         27         2
b      05/09/2014          3         1

October 23, 2012

Non-inclusive windows for yearly or monthly intervals

Filed under: SQL Fun — tonyhasler @ 1:01 pm

About six months ago I came across the need to use analytic functions with a moving one year window. This is supposedly supported in Oracle 10g and beyond but the behaviour was not what I expected. Consider this test query:

WITH q1
     AS (    SELECT DATE '2012-01-01' + ROWNUM mydate
               FROM DUAL
         CONNECT BY LEVEL <= 400)
    ,q2
     AS (SELECT mydate
               ,COUNT (
                   *)
                OVER (
                   ORDER BY mydate
                   RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW)
                   cnt
               ,MIN (
                   mydate)
                OVER (
                   ORDER BY mydate
                   RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW)
                   mindate
               ,MAX (
                   mydate)
                OVER (
                   ORDER BY mydate
                   RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW)
                   maxdate
           FROM q1)
SELECT *
  FROM q2
 WHERE mydate = DATE '2013-01-10';

This is the result:

MYDATE      CNT MINDATE     MAXDATE
10-Jan-2013 367 10-Jan-2012 10-Jan-2013

The count of days in the year ending 10th January 2013 is 367. Now we know we have included 29th February 2012, so we do expect 366 days, but not 367! The explanation comes when we look at the minimum and maximum values in the range: we have included both 10th January 2012 and 10th January 2013 in our range. In reality, the fact that Oracle included an extra day was not an issue on this occasion, but it has been bothering me for the last six months that I couldn’t see a way to solve this problem without the use of the model clause (more of this in my next post). However, as is often the case when one muses on something like this, the answer came to me in my sleep and I awoke with a start!

I may have been inspired by Stephen Hawking’s concept of imaginary time when I finally came up with this solution:

WITH q1
     AS (    SELECT DATE '2012-01-01' + ROWNUM mydate
               FROM DUAL
         CONNECT BY LEVEL <= 400)
    ,q2
     AS (SELECT mydate
               ,COUNT (
                   *)
                OVER (
                   ORDER BY
                        ( (mydate - DATE '1950-01-01') * (86401 / 86400))
                      + DATE '1950-01-01'
                   RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW)
                   cnt
               ,MIN (
                   mydate)
                OVER (
                   ORDER BY
                        ( (mydate - DATE '1950-01-01') * (86401 / 86400))
                      + DATE '1950-01-01'
                   RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW)
                   mindate
               ,MAX (
                   mydate)
                OVER (
                   ORDER BY
                        ( (mydate - DATE '1950-01-01') * (86401 / 86400))
                      + DATE '1950-01-01'
                   RANGE BETWEEN INTERVAL '1' YEAR PRECEDING AND CURRENT ROW)
                   maxdate
           FROM q1)
SELECT *
  FROM q2
 WHERE mydate = DATE '2013-01-10';

This window function works using a somewhat simpler notion of virtual time than the imaginary time of the famous physicist: each day is translated into 1 day and one second. This is accomplished by subtracting an arbitrary date earlier than the first date in your data and no more than about 236 years (86,400 days) earlier than the last date in your data. You then multiply the resultant number of days by 86401/86400 (there are 86400 seconds in one day) and then add the arbitrary date back in again. Now, when converted into virtual time, 10th January 2012 is 1 year and 366 seconds prior to 10th January 2013 rather than 1 year, and so is excluded from the window. As long as your window is less than around 236 years and your real dates do not include a time of day, this technique allows you to implement windows of months and/or years that exclude the start date of the window.
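
You can check the arithmetic for the two boundary dates directly (a sketch):

SELECT (  ( (DATE '2013-01-10' - DATE '1950-01-01') * (86401 / 86400)) + DATE '1950-01-01')
     - (  ( (DATE '2012-01-10' - DATE '1950-01-01') * (86401 / 86400)) + DATE '1950-01-01')
          virtual_gap_days
  FROM DUAL;

The gap comes out at 366 days plus 366 seconds (about 366.0042 days), just over the one-year interval, which is why 10th January 2012 now falls outside the window.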

June 11, 2012

Ordering Non-sorting aggregate functions

Filed under: SQL Fun — tonyhasler @ 3:49 pm

Aggregate functions fall into two categories. The first is what I call “Sorting aggregate functions” that need their data sorted in some way to operate. Examples of sorting aggregate functions are MEDIAN and DENSE_RANK. The second category is non-sorting aggregate functions that do not require a sort to operate. These include COUNT, AVG, and RANK.

In Oracle 10g the concept of HASH AGGREGATION was introduced as an optimisation of the GROUP BY operation. The idea is that non-sorting aggregate functions (let me call them NSAFs from now on) do not need to be kept in a sorted list; you just keep a hash table of the various accumulating counters that are needed for that function’s operation. Since a hash table is quicker to access than a sorted list this offers some performance benefit, more so when the list of groups is large.

Hash aggregation obviously only works for NSAFs. If you include MEDIAN(X) in your select list or HAVING clauses then the input data will all need to be sorted. One big list is created, sorted first by the GROUP BY columns and then by, in this case, column X.

For some reason, there are five NSAFs that do not take advantage of hash aggregation. These are FIRST, LAST, RANK, SYS_XMLAGG and XMLAGG. I can’t imagine why not but there it is.

Even if an NSAF supports hash aggregation the CBO doesn’t always use it when the data needs to be sorted anyway, for example as a result of an ORDER BY operation.

Guy Harrison has suggested in his blog that the logic is “seriously flawed” because if you have 1,000,000 rows and 100 groups it would be best to defer the sort until after the aggregation because then you have only 100 rows to sort and not 1,000,000. He has even done some experiments to show that the CPU overhead is much less if you use the USE_HASH_AGGREGATION hint.

However, it turns out that Oracle does not sort 1,000,000 rows after all. Just the 100. Let us create a test case.

CREATE TABLE sort_test
AS
   WITH q1
        AS (    SELECT ROWNUM rn
                  FROM DUAL
            CONNECT BY LEVEL <= 100)
   SELECT MOD (a.rn, 2) grp, a.rn + b.rn + c.rn myint1, b.rn myint2
     FROM q1 a, q1 b, q1 c;

This table has 1,000,000 rows but there are only two distinct values for the GRP column. Let us set a 1K SORT_AREA_SIZE as follows:

ALTER SESSION SET workarea_size_policy=manual;
ALTER SESSION SET sort_area_size=1024;

Now let us see what the execution plan looks like for a statement that truly requires a sort of all 1,000,000 rows:

  SELECT grp, MEDIAN (myint1) myavg
    FROM sort_test
GROUP BY grp
ORDER BY grp;

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |   966K|    23M|   603   (9)| 00:00:08 |
|   1 |  SORT GROUP BY     |           |   966K|    23M|   603   (9)| 00:00:08 |
|   2 |   TABLE ACCESS FULL| SORT_TEST |   966K|    23M|   559   (2)| 00:00:07 |
--------------------------------------------------------------------------------

The 23M of space that we require is much greater than the 1K we have allocated and so when we run it we get a multi-pass sort that on my laptop took 51 minutes.

Let us change the function to a NSAF:

  SELECT grp, AVG (myint1) myavg
    FROM sort_test
GROUP BY grp
ORDER BY grp;

The execution plan reported by the CBO is identical. However, the query ran in under 1 second and the session statistics report a memory sort!

What is clearly happening is that only the accumulating counters are being maintained not the values from each row. I did the experiment on 10.2.0.4 and 11.2.0.1 so I don’t think it is a new feature.

Incidentally, the same mechanism is used for NSAFs that don’t support hash aggregation:

  SELECT grp, MIN (myint1) KEEP (DENSE_RANK FIRST ORDER BY myint2) myvalue
    FROM sort_test
GROUP BY grp
ORDER BY grp;

The above query also finishes very quickly.

So why was Guy able to demonstrate a performance improvement with the USE_HASH_AGGREGATION hint? The answer is that wandering up and down a tree of sorted items can be much more expensive than performing a direct hash lookup, particularly as Guy had 200,000 groups. However, this benefit may be small, or even negative, if the number of groups is very small, as in my artificial case.
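
For reference, this is the sort of hinted query that forces hash aggregation (a sketch against my test table; with only two groups the benefit here would be negligible or even negative):

SELECT /*+ use_hash_aggregation */
       grp, AVG (myint1) myavg
  FROM sort_test
GROUP BY grp
ORDER BY grp;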

So Guy is right about the CBO being “flawed” because a) the execution plan costs the operation incorrectly and b) the sorted tree is going to be expensive to walk up and down a huge number of times. However, things aren’t quite as bad as they first appear.

Pushing parameters into views a.k.a predicate pushing

Filed under: SQL Fun — tonyhasler @ 12:18 am

It is a very common and oft-discussed issue. You have a complex view definition, possibly containing an analytic function, and when you SELECT from the view you can’t get any advantage from a WHERE clause. For an old, and lengthy, discussion of why Oracle “just can’t do it” you can see this article on Ask Tom. Well, it turns out that there is something you can do after all. First, let us set the scene. We need a table and associated index with some values:

CREATE TABLE t1
AS
   WITH q1
        AS (    SELECT ROWNUM rn
                  FROM DUAL
            CONNECT BY LEVEL <= 100)
   SELECT MOD (a.rn, 100) grp, a.rn+b.rn fact_int, RPAD ('X', 100) vpad
     FROM q1 a, q1 b;

CREATE INDEX i1
   ON t1 (grp);

We have a table with 10,000 rows with 100 values of GRP. Now let us build our view with the awkward analytic query:

CREATE OR REPLACE VIEW v1
AS
     SELECT grp
           ,AVG (SUM (fact_int)) OVER (ORDER BY grp RANGE 2 PRECEDING) mv_avg
       FROM t1
   GROUP BY grp;

When we query this view with a specific value for GRP we get no index usage.

SELECT *
  FROM v1
 WHERE grp BETWEEN 50 AND 51;
 
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |  9485 |   240K|    49   (3)| 00:00:01 |
|*  1 |  VIEW                | V1   |  9485 |   240K|    49   (3)| 00:00:01 |
|   2 |   WINDOW BUFFER      |      |  9485 |   240K|    49   (3)| 00:00:01 |
|   3 |    SORT GROUP BY     |      |  9485 |   240K|    49   (3)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T1   |  9485 |   240K|    48   (0)| 00:00:01 |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("GRP">=50 AND "GRP"<=51)

Note how our predicate is applied at id 1, after all 10,000 rows have been aggregated.
Wouldn’t it be nice if somehow we could convince the optimizer to transform our query into this:

WITH q1
     AS (  SELECT grp
                 ,AVG (SUM (fact_int)) OVER (ORDER BY grp RANGE 2 PRECEDING) mv_avg
             FROM t1
            WHERE grp BETWEEN 48 AND 51
         GROUP BY grp)
SELECT *
  FROM q1
 WHERE grp BETWEEN 50 AND 51;

For the moving average calculation we actually only need to look at the rows with values within the range being aggregated. Well, it turns out that you can rewrite your view to allow this providing you have a convenient way to identify the complete set of legal values for the predicated column (grp in our case). If the table being queried is a fact table in a data warehouse you may already have a suitable dimension table you can use. In our case we will build one:

CREATE TABLE grp_table (grp PRIMARY KEY)
ORGANIZATION INDEX
AS
   SELECT DISTINCT grp FROM t1;

I have made this an IOT given that (in my example) it has only one column and we plan to use it exclusively for unique primary key lookups. A heap dimension table with several columns is fine if it is reasonably small.
We also need to define a couple of types to reflect the result of our original query:

CREATE OR REPLACE TYPE pair_num AS OBJECT (n1 NUMBER, n2 NUMBER);
CREATE OR REPLACE TYPE pair_num_t AS TABLE OF pair_num;

In our case, our desired query returns two numeric values. We define our types generically so that we can reuse them for other views with similar result sets. Now redefine our view as follows:

CREATE OR REPLACE VIEW v1 (startgrp, endgrp, grp, mv_avg)
AS
   WITH q1 AS (SELECT grp startgrp FROM grp_table)
       ,q2 AS (SELECT grp endgrp FROM grp_table)
     SELECT startgrp, endgrp, n1 AS grp, n2 AS mv_avg
     FROM q1
         ,q2
         ,TABLE (CAST (MULTISET (  SELECT grp
                                         ,AVG (SUM (fact_int)) OVER (ORDER BY grp RANGE 2 PRECEDING) mv_avg
                                     FROM t1
                                    WHERE grp BETWEEN startgrp-2 AND endgrp
                                 GROUP BY grp) AS pair_num_t)) pairs
    WHERE n1 >= startgrp;

This view now contains two extra columns STARTGRP and ENDGRP that can be used in our WHERE clause to identify the subset of rows that we want. The new query and associated execution plan are as follows:

SELECT grp,mv_avg
  FROM v1
 WHERE startgrp = 50 AND endgrp = 51;
 
---------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |    20 |   560 |    30   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                       |                   |    20 |   560 |    30   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                      |                   |     1 |    26 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN                | SYS_IOT_TOP_84132 |     1 |    13 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN                | SYS_IOT_TOP_84132 |     1 |    13 |     0   (0)| 00:00:01 |
|*  5 |   COLLECTION ITERATOR SUBQUERY FETCH|                   |    20 |    40 |    29   (0)| 00:00:01 |
|   6 |    VIEW                             |                   |    24 |   624 |     3   (0)| 00:00:01 |
|   7 |     WINDOW BUFFER                   |                   |    24 |   624 |     3   (0)| 00:00:01 |
|   8 |      SORT GROUP BY NOSORT           |                   |    24 |   624 |     3   (0)| 00:00:01 |
|*  9 |       FILTER                        |                   |       |       |            |          |
|  10 |        TABLE ACCESS BY INDEX ROWID  | T1                |    24 |   624 |     3   (0)| 00:00:01 |
|* 11 |         INDEX RANGE SCAN            | I1                |    43 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("GRP"=51)
   4 - access("GRP"=50)
   5 - filter(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)>=50 AND "GRP"<=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))
   9 - filter(:B1-2<=:B2)
  11 - access("GRP">=:B1-2 AND "GRP"<=:B2)

Our new query has a small unnecessary overhead where we "look up" STARTGRP and ENDGRP in our dimension table, but on the other hand we have been able to use information about our predicate to restrict the rows aggregated to the point that we can take advantage of our index.

This technique of passing information from one row source in a join to another is known as a lateral join and is available with the TABLE and XMLTABLE operators.

Note that we will get very undesirable results if a user does not specify a WHERE clause, as we will then get results matching every combination of STARTGRP and ENDGRP where ENDGRP >= STARTGRP. If you want to protect against this you can write a pipelined function.

CREATE OR REPLACE VIEW v1 (startgrp, endgrp, grp, mv_avg)
AS
   WITH q1
        AS (  SELECT /*+ no_merge no_eliminate_oby */
                    grp startgrp
                FROM grp_table
            ORDER BY startgrp DESC)
       ,q2
        AS (  SELECT /*+ no_merge no_eliminate_oby */
                    grp endgrp
                FROM grp_table
            ORDER BY endgrp)
   SELECT /*+ leading(q1) use_nl(q2) */
         startgrp, endgrp, n1 AS grp, n2 AS mv_avg
     FROM q1, q2
         ,TABLE (my_package.my_pipelined_function (startgrp, endgrp)) pairs;

Note that the ORDER BY clauses with their associated hints will cause the pipelined function to be invoked first with a value of ENDGRP less than STARTGRP if no predicate is supplied. The package containing the pipelined function might look like this:

CREATE OR REPLACE PACKAGE my_package
AS
   FUNCTION my_pipelined_function (startgrp NUMBER, endgrp NUMBER)
      RETURN pair_num_t
      PIPELINED;
END my_package;
/

CREATE OR REPLACE PACKAGE BODY my_package
AS
   FUNCTION my_pipelined_function (startgrp NUMBER, endgrp NUMBER)
      RETURN pair_num_t
      PIPELINED
   IS
      CURSOR c1
      IS
         WITH q1
              AS (  SELECT grp
                          ,AVG (SUM (fact_int)) OVER (ORDER BY grp RANGE 2 PRECEDING) mv_avg
                      FROM t1
                     WHERE grp BETWEEN startgrp - 2 AND endgrp
                  GROUP BY grp)
         SELECT *
           FROM q1
          WHERE grp BETWEEN startgrp AND endgrp;
   BEGIN
      IF startgrp > endgrp
      THEN
         raise_application_error (-20000, 'STARTGRP must be specified and must not be greater than ENDGRP');
      END IF;

      FOR r IN c1
      LOOP
         PIPE ROW (pair_num (r.grp, r.mv_avg));
      END LOOP;
   END my_pipelined_function;
END my_package;
/

March 27, 2012

FIRST/LAST versus FIRST_VALUE/LAST_VALUE

Filed under: SQL Fun — tonyhasler @ 2:28 pm

As I promised in my last post, I want to discuss the difference between FIRST/LAST and FIRST_VALUE/LAST_VALUE.

Previously I discussed FIRST and LAST purely as aggregate functions, but they can be used as analytic functions as well. FIRST_VALUE and LAST_VALUE can only be used as analytic functions and in that respect at least they are somewhat less powerful.

Most of us take a while to get to grips with what analytic functions are and how they differ from aggregate functions when we first learn about them, but there are many articles and discussions on the web about the general topic. Here is one you might like.

Now that you know about analytic functions in general, let us return to look at an example of FIRST using the same table as last time:

SELECT c1, MIN (c3) KEEP (DENSE_RANK FIRST ORDER BY c1) OVER () "Analytic"
  FROM t1
 WHERE c1 <= 50;
 
 QUERY 1

What this query says is: After filtering out rows where C1 is NULL or has a value greater than 50, find the row or rows that have the minimum value of C1 and amongst those rows find the minimum value of C3. This value is used for the “Analytic” column; all rows in the final result set have the same value for the “Analytic” column.

Now let us look at a similar query using FIRST_VALUE:

SELECT c1, FIRST_VALUE (c3) OVER (ORDER BY c1) "Analytic"
  FROM t1
 WHERE c1 <= 50;
 
 QUERY 2
 

What this query says is: After filtering out rows where C1 is NULL or has a value greater than 50, find the row or rows that have the minimum value of C1 and amongst those rows pick an arbitrary row and then select the value of C3. This value is used for the “Analytic” column; all rows in the final result set have the same value for the “Analytic” column.

Let us first assume that the expression used in the ORDER BY clause is unique (as recommended in the documentation of FIRST_VALUE). In this case, the results of QUERY 2 are deterministic and absolutely the same as QUERY 1. Of course, in this case the aggregate function used in conjunction with FIRST could be MIN, MAX, or even AVG. The result is the same as there is only one row from which to pick.

However, if the ORDER BY clause does not represent a unique key then I would suggest you use the FIRST function in preference to FIRST_VALUE as it provides a way to deal explicitly with any duplicates.

The FIRST function may also provide better performance. Here is the execution plan for QUERY 1:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    50 |  1300 |     6   (0)| 00:00:01 |
|   1 |  WINDOW BUFFER     |      |    50 |  1300 |     6   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   |    50 |  1300 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

As you can see, we don’t need a sort as explained in my last post but we do need to buffer the rows as we don’t know what to put into the “Analytic” column for any row until we process the last row in our result set.

Now let us look at the execution plan for QUERY 2:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    50 |  1300 |     7  (15)| 00:00:01 |
|   1 |  WINDOW SORT       |      |    50 |  1300 |     7  (15)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   |    50 |  1300 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

As you can see, with FIRST_VALUE we have a superfluous sort and as a consequence a higher cost.

You can’t always avoid a sort with FIRST. Consider this example:

  SELECT c1
        ,MIN (c3) KEEP (DENSE_RANK FIRST ORDER BY c1) OVER (PARTITION BY c2)
            "Analytic"
    FROM t1
   WHERE c1 <= 50
ORDER BY c2, c1 DESC;
 
 QUERY 3
 

This query is similar to QUERY 1 but this time there is a separate value of "Analytic" for each value of C2. The PARTITION clause means we have to sort the data by C2. However, the ORDER BY clause requires that at some stage we order the rows by "C2, C1 DESC". This single sort is sufficient for both purposes, as we can see in the execution plan:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    50 |  1950 |     6   (0)| 00:00:01 |
|   1 |  WINDOW SORT       |      |    50 |  1950 |     6   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   |    50 |  1950 |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Let us look at the equivalent statement using FIRST_VALUE:

SELECT c1, FIRST_VALUE (c3) OVER (PARTITION BY c2 ORDER BY c1) "Analytic"
    FROM t1
   WHERE c1 <= 50
ORDER BY c2, c1 DESC;

QUERY 4


Here is the execution plan for QUERY 4:

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    50 |  1950 |     8  (25)| 00:00:01 |
|   1 |  SORT ORDER BY      |      |    50 |  1950 |     8  (25)| 00:00:01 |
|   2 |   WINDOW SORT       |      |    50 |  1950 |     8  (25)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |    50 |  1950 |     6   (0)| 00:00:01 |
----------------------------------------------------------------------------

Now we see an extra sort. The reason is that the FIRST_VALUE function requires the data sorted by the composite sort key "c2,c1" and so needs to be reordered by "c2,c1 DESC" for the final result.

Of course, all the above observations can be applied to LAST and LAST_VALUE.

So, is there ever a need for FIRST_VALUE and LAST_VALUE? Well consider this query:

SELECT c1
      ,FIRST_VALUE (
          c3)
       OVER (PARTITION BY c2
             ORDER BY c1
             ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
          "Analytic"
  FROM t1
 WHERE c1 <= 50;
 
QUERY 5

This time FIRST_VALUE is being used almost like LAG(C3,3) in that it is picking up the value of C3 "three rows up", so to speak. The difference is that LAG(c3,3) returns NULL for the first three rows and FIRST_VALUE doesn't. There isn't a way to code this expression using FIRST because its OVER clause supports neither an ORDER BY clause nor a windowing clause.

Whilst we are on the topic of windowing clauses, the implicit and unchangeable window clause for the FIRST and LAST functions is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, in other words all rows in our partition. For FIRST_VALUE and LAST_VALUE the default but changeable windowing clause is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, in other words we exclude rows after the current one. Dropping rows off the bottom of a list makes no difference when we are looking for the first row in the list (FIRST_VALUE) but it does make a difference when we are looking for the last row in the list (LAST_VALUE) so you will usually need either to specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING explicitly when using LAST_VALUE or just use FIRST_VALUE and reverse the sort order.
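
For example, the following two expressions should return the same thing for every row (a sketch against the same table, relying on C1 being unique):

SELECT c1
      ,LAST_VALUE (c3) OVER (PARTITION BY c2 ORDER BY c1
                             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) explicit_window
      ,FIRST_VALUE (c3) OVER (PARTITION BY c2 ORDER BY c1 DESC) reversed_order
  FROM t1
 WHERE c1 <= 50;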

In conclusion, I would recommend that you only ever use FIRST_VALUE or LAST_VALUE when you need a windowing function other than ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. When FIRST or LAST are available the performance is never worse and frequently better than the equivalent FIRST_VALUE or LAST_VALUE construct. In addition, FIRST and LAST avoid the possibility of non-deterministic results when non-unique ORDER BY clauses are supplied.

March 26, 2012

Aggregate Functions and Sorts

Filed under: SQL Fun,Uncategorized — tonyhasler @ 3:46 pm

I have been prompted to write one of my rare blogs because I have just discovered that the topic of aggregate functions is not quite as straightforward as I previously thought.

Before we start, I want you to look long and hard at the following query and tell me (figuratively speaking) how many sorts are involved in it:

SELECT AVG (c1)
      ,MIN (c1)
      ,MAX (c1)
      ,STDDEV (c2)
      ,RANK (2) WITHIN GROUP (ORDER BY c1 DESC)
      ,STDDEV (c1) KEEP (DENSE_RANK FIRST ORDER BY c2)
      ,VARIANCE (c2) KEEP (DENSE_RANK LAST ORDER BY c3)
  FROM t1;

How are you doing? Let us have a look at the execution plan and see if that helps.

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    39 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    39 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |  6000 |   228K|     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Has that changed your opinion?

So the correct answer is……zero!

If you are in any doubt, you can create the table I used for the query and try it yourself:

CREATE TABLE t1
AS
       SELECT ROWNUM c1, MOD (ROWNUM, 2) c2, MOD (ROWNUM, 3) c3
         FROM DUAL
   CONNECT BY LEVEL <= 6000;

You can use the following query before and after the query under test to check the sort statistics:

SELECT name, VALUE
  FROM v$mystat NATURAL JOIN v$statname
 WHERE name LIKE 'sort%';

WARNING: Do not use autotrace when checking the stats. as autotrace can perform a sort.

Let us see how it is possible for this statement to complete without any sorts.

The AVG function calculates the mean of a set of values. AVG(c1) can be calculated as the rows are scanned by simply keeping a running total of c1 and a count of the number of rows where c1 is not null and then performing a division at the end. No sort is required. It should be clear that no sort is required for MIN or MAX either. It may be less clear (until you Google it) that STDDEV and VARIANCE aggregations can be performed in a single pass with no sort. In my query, the RANK function simply counts the number of rows that have a value greater than or equal to 2.

This is all very straightforward but what about the last two calculations? Those ORDER BY clauses surely imply some kind of sort don’t they?

Well no. In my query, the FIRST function returns the standard deviation of all values of c1 amongst the rows that have the lowest value of c2.

In general, the FIRST function almost certainly operates by keeping track of the lowest value of c2 encountered so far and on encountering a new row does the following:

  1. If the value of c2 is greater than the stored minimum value of c2, ignore the row for the purposes of this aggregate function.
  2. If the value of c2 equals the minimum encountered so far then perform the aggregation calculations on c1 defined on the left as if no FIRST function was specified.
  3. If the value of c2 is less than the minimum stored so far, replace the stored minimum, delete any stored aggregate values of c1 calculated so far and then process the row as per step 2.
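
If you want to convince yourself of what the FIRST function returns here, it can be cross-checked with an explicit filter (a sketch):

SELECT STDDEV (c1) first_check
  FROM t1
 WHERE c2 = (SELECT MIN (c2) FROM t1);

This returns the same value as STDDEV (c1) KEEP (DENSE_RANK FIRST ORDER BY c2), but of course it needs a second pass over the table to find the minimum value of c2 first.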

Now, so far I have been very careful in my selection of aggregate functions. Some aggregate functions do require a sort. Let us make a tiny change to the select statement:

SELECT AVG (c1)
      ,MIN (c1)
      ,MAX (c1)
      ,STDDEV (c2)
      ,DENSE_RANK (2) WITHIN GROUP (ORDER BY c1 DESC)
      ,STDDEV (c1) KEEP (DENSE_RANK FIRST ORDER BY c2)
      ,VARIANCE (c2) KEEP (DENSE_RANK LAST ORDER BY c3)
  FROM t1;

The only difference is that I have replaced the RANK function call with a DENSE_RANK function call. The DENSE_RANK function requires a sort as confirmed by the statistics. This is the new execution plan:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    39 |     6   (0)| 00:00:01 |
|   1 |  SORT GROUP BY     |      |     1 |    39 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |  6000 |   228K|     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Note that the SORT AGGREGATE operation has been replaced by a SORT GROUP BY. So we have discovered something interesting: SORT AGGREGATE does not perform a sort! Ever!

What happens if we need more than one sort?

SELECT DENSE_RANK (3000) WITHIN GROUP (ORDER BY c1 DESC), MEDIAN (c2) FROM t1;

The execution plan remains the same and the statistic “sorts (memory)” seems to indicate that there has been only one sort but this is impossible. In fact, if you look at “sorts (rows)” you will see that it has been incremented by 9000. 6000 of these rows are for the MEDIAN function but DENSE_RANK only needs to sort the 3000 rows where c1 has a value of less than or equal to 3000. So SORT GROUP BY just counts one sort in the statistics, no matter how many are actually performed.

Let us introduce a GROUP BY clause:

SELECT DENSE_RANK (3000) WITHIN GROUP (ORDER BY c1 DESC) from t1 GROUP BY c2;

The execution plan remains unchanged and the statistics once again show only one sort. In this case, however, there may be several sorts: the first sort is by c2, the others will be sorts to evaluate DENSE_RANK within each group. The “sorts (rows)” statistic seems to support this theory.

I think that is almost enough for this blog. I will post another soon on the difference between the FIRST and LAST functions and the FIRST_VALUE and LAST_VALUE functions. But let me close this note with a comment about the HASH GROUP BY operation introduced in Oracle 10g.

HASH GROUP BY doesn’t perform a sort (as you might expect) and so can only be used for (sub)queries that do not involve aggregates that require sorts. Consider this query:

SELECT MAX(c1) FROM t1 GROUP BY c2;

Here is the execution plan:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  6000 |   152K|     7  (15)| 00:00:01 |
|   1 |  HASH GROUP BY     |      |  6000 |   152K|     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |  6000 |   152K|     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

In this case we simply keep an array of maximums (one for each distinct value of c2) as the table is scanned.

March 10, 2010

Parallel query distribution methods

Filed under: SQL Fun — tonyhasler @ 4:38 pm

A while ago I started to investigate the possibility of using parallel query on some large tables. These large tables were partitioned by date and joined together. It struck me that by subpartitioning on the join column I could get good results.

As I had learned from Christian Antognini’s excellent book, this would allow a “Full Partition-wise Join”. Let me describe how this should have helped in my case.

Let us assume that we are joining two tables T1 and T2 both partitioned by range on a date column D1 and sub-partitioned by hash on the join column J1. It should be possible for a parallel query slave to join one sub-partition from T1 to its corresponding subpartition from T2 without any need to communicate with any other parallel query slaves.

So I wrote a test script:


set autotrace off
DROP TABLE T1;

CREATE TABLE T1(D1 DATE, J1 CHAR(2000), C1 CHAR(2000))
PCTFREE 99 -- To make the table a reasonable size
PARTITION BY RANGE (D1)
   SUBPARTITION BY HASH (J1)
      SUBPARTITIONS 8 (PARTITION P1
                          VALUES LESS THAN (DATE '2010-01-02')
                      ,PARTITION P2
                          VALUES LESS THAN (DATE '2010-01-03')
                      ,PARTITION P3
                          VALUES LESS THAN (DATE '2010-01-04')
                      ,PARTITION P4
                          VALUES LESS THAN (DATE '2010-01-05')
                      ,PARTITION P5
                           VALUES LESS THAN (DATE '2010-01-06'))
PARALLEL(DEGREE 4);

DROP TABLE T2;

CREATE TABLE T2(D1 DATE, J1 CHAR(2000), C1 CHAR(2000))
PCTFREE 99 -- To make the table a reasonable size
PARTITION BY RANGE (D1)
   SUBPARTITION BY HASH (J1)
      SUBPARTITIONS 8 (PARTITION P1
                          VALUES LESS THAN (DATE '2010-01-02')
                      ,PARTITION P2
                          VALUES LESS THAN (DATE '2010-01-03')
                      ,PARTITION P3
                          VALUES LESS THAN (DATE '2010-01-04')
                      ,PARTITION P4
                          VALUES LESS THAN (DATE '2010-01-05')
                      ,PARTITION P5
                           VALUES LESS THAN (DATE '2010-01-06'))
PARALLEL(DEGREE 4);

INSERT INTO T1(D1, J1, C1)
       SELECT   DATE '2010-01-03', ROWNUM, ROWNUM
         FROM   DUAL
   CONNECT BY   LEVEL <= 30000;

INSERT INTO T2(D1, J1, C1)
       SELECT   DATE '2010-01-03', ROWNUM, ROWNUM
         FROM   DUAL
   CONNECT BY   LEVEL <= 30000;

BEGIN
   DBMS_STATS.GATHER_TABLE_STATS(USER
                                ,'T1'
                                ,estimate_percent => 100);
   DBMS_STATS.GATHER_TABLE_STATS(USER
                                ,'T2'
                                ,estimate_percent => 100);
END;
/

set autotrace traceonly
set timing on
ALTER SESSION SET EVENTS '10053 trace name context forever';
--
-- First let us try without the hint
--

SELECT   COUNT( * )
  FROM   T1, T2
 WHERE       t1.d1 = DATE '2010-01-03'
         AND t2.d1 = DATE '2010-01-03'
         AND t1.j1 = t2.j1;

--
-- And now hinted
--

SELECT /*+ leading(t1, t2) pq_distribute(t2 none none) */
      COUNT( * )
  FROM   T1, T2
 WHERE       t1.d1 = DATE '2010-01-03'
         AND t2.d1 = DATE '2010-01-03'
         AND t1.j1 = t2.j1;

ALTER SESSION SET EVENTS '10053 trace name context off';
set autotrace off

Although parallel query was used, I got a sub-optimal distribution method that not only took longer but also used twice as many parallel query slaves as necessary, unless I added hints. First, the execution plan unhinted:


----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| P
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |     1 |  4018 |  2090   (1)| 00:00:11 |       |
|   1 |  SORT AGGREGATE               |          |     1 |  4018 |            |          |       |
|   2 |   PX COORDINATOR              |          |       |       |            |          |       |
|   3 |    PX SEND QC (RANDOM)        | :TQ10001 |     1 |  4018 |            |          |       |
|   4 |     SORT AGGREGATE            |          |     1 |  4018 |            |          |       |
|*  5 |      HASH JOIN                |          | 30000 |   114M|  2090   (1)| 00:00:11 |       |
|   6 |       PX RECEIVE              |          | 30000 |    57M|  1044   (0)| 00:00:06 |       |
|   7 |        PX SEND BROADCAST LOCAL| :TQ10000 | 30000 |    57M|  1044   (0)| 00:00:06 |       |
|   8 |         PX BLOCK ITERATOR     |          | 30000 |    57M|  1044   (0)| 00:00:06 |     1 |
|*  9 |          TABLE ACCESS FULL    | T2       | 30000 |    57M|  1044   (0)| 00:00:06 |    17 |
|  10 |       PX BLOCK ITERATOR       |          | 30000 |    57M|  1044   (0)| 00:00:06 |     1 |
|* 11 |        TABLE ACCESS FULL      | T1       | 30000 |    57M|  1044   (0)| 00:00:06 |    17 |
----------------------------------------------------------------------------------------------------

and now hinted:


----------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart|
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |          |     1 |  4018 |  2090   (1)| 00:00:11 |       |
|   1 |  SORT AGGREGATE                 |          |     1 |  4018 |            |          |       |
|   2 |   PX COORDINATOR                |          |       |       |            |          |       |
|   3 |    PX SEND QC (RANDOM)          | :TQ10000 |     1 |  4018 |            |          |       |
|   4 |     SORT AGGREGATE              |          |     1 |  4018 |            |          |       |
|   5 |      PX PARTITION HASH ALL      |          | 30000 |   114M|  2090   (1)| 00:00:11 |     1 |
|*  6 |       HASH JOIN                 |          | 30000 |   114M|  2090   (1)| 00:00:11 |       |
|   7 |        PX PARTITION RANGE SINGLE|          | 30000 |    57M|  1044   (0)| 00:00:06 |     3 |
|*  8 |         TABLE ACCESS FULL       | T1       | 30000 |    57M|  1044   (0)| 00:00:06 |    17 |
|   9 |        PX PARTITION RANGE SINGLE|          | 30000 |    57M|  1044   (0)| 00:00:06 |     3 |
|* 10 |         TABLE ACCESS FULL       | T2       | 30000 |    57M|  1044   (0)| 00:00:06 |    17 |
----------------------------------------------------------------------------------------------------

The curious thing was that no matter what distribution mechanism I picked, the cost was the same!
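
For reference, the alternative distribution methods can be requested explicitly with other forms of the PQ_DISTRIBUTE hint. Here are two of the standard variations as a sketch (not an exhaustive list):

-- Redistribute both row sources by hashing on the join key.
SELECT /*+ leading(t1, t2) pq_distribute(t2 hash hash) */
       COUNT( * )
  FROM   T1, T2
 WHERE       t1.d1 = DATE '2010-01-03'
         AND t2.d1 = DATE '2010-01-03'
         AND t1.j1 = t2.j1;

-- Broadcast the second row source to all the slaves scanning the first.
SELECT /*+ leading(t1, t2) pq_distribute(t2 broadcast none) */
       COUNT( * )
  FROM   T1, T2
 WHERE       t1.d1 = DATE '2010-01-03'
         AND t2.d1 = DATE '2010-01-03'
         AND t1.j1 = t2.j1;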

I asked Christian about this and his full reply came in this blog. Christian was able to explain why the estimated costs in the execution plan were misleading and further suggested that my problem with the wrong distribution mechanism was due to partition elimination. I changed my query to remove the date predicate but this didn’t help.
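
In other words, the date-free test reduces to a straight join on j1, along these lines:

SELECT   COUNT( * )
  FROM   T1, T2
 WHERE   t1.j1 = t2.j1;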

So I took Christian’s lead and looked at the 10053 trace file. The following is an extract of the relevant section of the trace file for the un-hinted query:


-- Enumerating distribution methods for #Hash Join:
---- cost NONE = 0.00  Outer table:
    resc: 3759.85  card 30000.00  bytes: 2009  deg: 4  resp: 1044.40
  Inner table: T2  Alias: T2
    resc: 3759.85  card: 30000.00  bytes: 2009  deg: 4  resp: 1044.40
    using dmeth: 129  #groups: 1
    Cost per ptn: 0.62  #ptns: 8
    hash_area: 16430 (max=82150)       buildfrag: 926  probefrag: 926  passes: 1
  Hash join: Resc: 7524.65  Resp: 2090.04  [multiMatchCost=0.00]
---- cost(Hash Join) = 2090.04 (w/o dist), 2090.04 (w/ dist)
---- cost VALUE = 16.16
---- cost with slave mapping = 6.07
  Outer table:
    resc: 3759.85  card 30000.00  bytes: 2009  deg: 4  resp: 1044.40
  Inner table: T2  Alias: T2
    resc: 3759.85  card: 30000.00  bytes: 2009  deg: 4  resp: 1044.40
    using dmeth: 2  #groups: 1
    Cost per ptn: 0.74  #ptns: 4
    hash_area: 16430 (max=82150)       buildfrag: 1851  probefrag: 1851  passes: 1
  Hash join: Resc: 7522.65  Resp: 2089.54  [multiMatchCost=0.00]
---- cost(Hash Join) = 2089.54 (w/o dist), 2095.61 (w/ dist)
---- cost PARTITION-RIGHT = 4.04
  Outer table:
    resc: 3759.85  card 30000.00  bytes: 2009  deg: 4  resp: 1044.40
  Inner table: T2  Alias: T2
    resc: 3759.85  card: 30000.00  bytes: 2009  deg: 4  resp: 1044.40
    using dmeth: 192  #groups: 1
    Cost per ptn: 0.62  #ptns: 8
    hash_area: 16430 (max=82150)       buildfrag: 926  probefrag: 926  passes: 1
  Hash join: Resc: 7524.65  Resp: 2090.04  [multiMatchCost=0.00]
---- cost(Hash Join) = 2090.04 (w/o dist), 2094.08 (w/ dist)
---- cost PARTITION-LEFT = 4.04
  Outer table:
    resc: 3759.85  card 30000.00  bytes: 2009  deg: 4  resp: 1044.40
  Inner table: T2  Alias: T2
    resc: 3759.85  card: 30000.00  bytes: 2009  deg: 4  resp: 1044.40
    using dmeth: 160  #groups: 1
    Cost per ptn: 0.62  #ptns: 8
    hash_area: 16430 (max=82150)       buildfrag: 926  probefrag: 926  passes: 1
  Hash join: Resc: 7524.65  Resp: 2090.04  [multiMatchCost=0.00]
---- cost(Hash Join) = 2090.04 (w/o dist), 2094.08 (w/ dist)
---- cost BROADCAST-RIGHT = 31.94
---- cost with slave mapping = 0.00
  Outer table:
    resc: 3759.85  card 30000.00  bytes: 2009  deg: 4  resp: 1044.40
  Inner table: T2  Alias: T2
    resc: 3759.85  card: 30000.00  bytes: 2009  deg: 4  resp: 1044.40
    using dmeth: 8  #groups: 8
    Cost per ptn: 0.69  #ptns: 4
    hash_area: 16430 (max=82150)       buildfrag: 1851  probefrag: 926  passes: 1
  Hash join: Resc: 7522.46  Resp: 2089.50  [multiMatchCost=0.00]
  Outer table:
    resc: 3759.85  card 30000.00  bytes: 2009  deg: 4  resp: 1044.40
  Inner table: T1  Alias: T1
    resc: 3759.85  card: 30000.00  bytes: 2009  deg: 4  resp: 1044.40
    using dmeth: 16  #groups: 8
    Cost per ptn: 0.67  #ptns: 4
    hash_area: 16430 (max=82150)       buildfrag: 926  probefrag: 1851  passes: 1
  Hash join: Resc: 7522.36  Resp: 2089.47  [multiMatchCost=0.00]
---- cost(Hash Join) = 2089.47 (w/o dist), 2089.47 (w/ dist)
---- cost BROADCAST-LEFT = 31.94
---- cost with slave mapping = 0.00
  Outer table:
    resc: 3759.85  card 30000.00  bytes: 2009  deg: 4  resp: 1044.40
  Inner table: T2  Alias: T2
    resc: 3759.85  card: 30000.00  bytes: 2009  deg: 4  resp: 1044.40
    using dmeth: 16  #groups: 8
    Cost per ptn: 0.67  #ptns: 4
    hash_area: 16430 (max=82150)       buildfrag: 926  probefrag: 1851  passes: 1
  Hash join: Resc: 7522.36  Resp: 2089.47  [multiMatchCost=0.00]
---- cost(Hash Join) = 2089.47 (w/o dist), 2089.47 (w/ dist)
(newjo-save)    [1 0 ]
Final - All Rows Plan:  Best join order: 1
  Cost: 2089.5431  Degree: 4  Card: 30000.0000  Bytes: 120540000
  Resc: 7522.6511  Resc_io: 7518.0000  Resc_cpu: 36316296
  Resp: 2089.5431  Resp_io: 2088.3333  Resc_cpu: 9445741

You can see that the difference between the cost without distribution (2090.04) and the cost with distribution (2090.04) is zero for the desired plan (shown first), and that the difference for the other distribution methods is either also zero or positive. However, the total cost for some of the other distribution methods is lower! This is because the cost without the distribution is not constant. That, in turn, seems to be because for some distribution methods the number of partitions (subpartitions in our case) has been correctly calculated as 8, while in other cases it has been set to 4, the degree of parallelism! I confirmed that when I altered the degree of parallelism this number changed accordingly.

One other oddity: although the selected distribution mechanism has a cost (2089.5431) that is lower than 2090.04, it is not the lowest possible according to the trace (2089.47). But that is a problem for another day.

It struck me that a good workaround for what seems to be a bug is to ensure that the number of sub-partitions and the degree of parallelism are the same. Let us try it:

ALTER TABLE t1 PARALLEL(DEGREE 8);
ALTER TABLE t2 PARALLEL(DEGREE 8);

Voila! The correct plan is produced unhinted.

----------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart|
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |          |     1 |  4018 |  1045   (1)| 00:00:06 |       |
|   1 |  SORT AGGREGATE                 |          |     1 |  4018 |            |          |       |
|   2 |   PX COORDINATOR                |          |       |       |            |          |       |
|   3 |    PX SEND QC (RANDOM)          | :TQ10000 |     1 |  4018 |            |          |       |
|   4 |     SORT AGGREGATE              |          |     1 |  4018 |            |          |       |
|   5 |      PX PARTITION HASH ALL      |          | 30000 |   114M|  1045   (1)| 00:00:06 |     1 |
|*  6 |       HASH JOIN                 |          | 30000 |   114M|  1045   (1)| 00:00:06 |       |
|   7 |        PX PARTITION RANGE SINGLE|          | 30000 |    57M|   522   (0)| 00:00:03 |     3 |
|*  8 |         TABLE ACCESS FULL       | T1       | 30000 |    57M|   522   (0)| 00:00:03 |    17 |
|   9 |        PX PARTITION RANGE SINGLE|          | 30000 |    57M|   522   (0)| 00:00:03 |     3 |
|* 10 |         TABLE ACCESS FULL       | T2       | 30000 |    57M|   522   (0)| 00:00:03 |    17 |
----------------------------------------------------------------------------------------------------

So here is the tip of the day:

Set the number of partitions or sub-partitions and the degree of parallelism to be the same when performing full partition-wise joins.
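
A quick sanity check that the two really do match can be run against the data dictionary; this is just a sketch using USER_TABLES and USER_TAB_SUBPARTITIONS (DEGREE is stored as a padded string, hence the TRIM):

-- Compare the declared degree of parallelism with the number of hash
-- subpartitions in one of the range partitions (P1 here) for each table.
SELECT t.table_name,
       TRIM (t.degree) AS degree,
       (SELECT COUNT (*)
          FROM user_tab_subpartitions s
         WHERE s.table_name = t.table_name
           AND s.partition_name = 'P1') AS subpartitions
  FROM user_tables t
 WHERE t.table_name IN ('T1', 'T2');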

I have reproduced the results in this blog entry on both 10.2.0.4 and 11.1.0.6.

September 8, 2009

INDEX FULL SCAN (MIN/MAX)

Filed under: SQL Fun — tonyhasler @ 8:39 am

I came across some interesting behaviour with the little-discussed “INDEX FULL SCAN (MIN/MAX)” operation.
First, let us set up an example:

create table t1 (n1 number,n2 number,c1 char(100));

insert into t1 select rownum,1000-rownum,rownum from dual connect by level <= 1000;

create index t1_i on t1(n1,n2,c1) pctfree 99;

exec dbms_stats.gather_table_stats(null,'T1',estimate_percent=>100,no_invalidate=>false,cascade=>true);

The index has been set up with pctfree 99 to ensure that each entry uses a whole block.
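
If you want to confirm just how sparse that makes the index, the statistics gathered above are enough. With 1000 entries at pctfree 99 one would expect LEAF_BLOCKS to be close to the number of rows and BLEVEL to be 2, i.e. a three-level index:

-- Sanity check: roughly one entry per leaf block, with two branch levels
-- above the leaf level.
select blevel, leaf_blocks, num_rows from user_indexes where index_name = 'T1_I';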

Let us autotrace a query. This is the second run to get rid of parsing overhead:

SQL> select max(n1) from t1 ;

   MAX(N1)
----------
      1000

Execution Plan
----------------------------------------------------------
Plan hash value: 1582015564

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     4 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     4 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I |  1000 |  4000 |            |          |
-----------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

The 3 consistent gets show that we have traversed the three levels of the index to identify the maximum value. It is clear that the term “FULL SCAN” is a bit misleading, but other than that, so far so good. Now let us add a predicate on the same column:

SQL> select max(n1) from t1 where n1 < 1000;

   MAX(N1)
----------
       999

Execution Plan
----------------------------------------------------------
Plan hash value: 3815976388

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |      |     1 |     4 |            |          |
|   2 |   FIRST ROW                  |      |   999 |  3996 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| T1_I |   999 |  3996 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("N1"<1000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        335  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

There are a few things to note at this point. Firstly, we see that the operation has changed from a “FULL SCAN” to a “RANGE SCAN” and that cardinality and cost estimates have appeared for the scan.

It looks like the cardinality estimate is based on the number of rows that satisfy the access predicate. However, what seems to have happened is that we have identified the point in the index where n1 = 1000 and then performed a descending scan from there. A new operation, “FIRST ROW”, has also appeared in the plan; perhaps its purpose is to halt the scan as soon as we hit the first suitable row.

Perhaps the most interesting observation is that although the presence of the access predicate means we have to do more work, the cost has actually decreased.

Incidentally, we could have used the MIN function and an ascending scan just as easily. However, this operation will not work at all, for either MIN or MAX, when the leading column of the index is based on a function; this includes descending indexes.
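
Both halves of that claim are easy to test. The first query below is the MIN equivalent; the rest is a sketch of the negative case using a descending index (the index name t1_i_desc is made up for the test), where one would expect the MIN/MAX operation to disappear from the plan:

-- MIN with a lower bound: an ascending scan that stops at the first
-- qualifying entry.
select min(n1) from t1 where n1 > 1;

-- A descending index stores SYS_OP_DESCEND(n1) as its leading column,
-- i.e. a function of n1, so the MIN/MAX optimisation should not be used.
create index t1_i_desc on t1(n1 desc);

select /*+ index(t1 t1_i_desc) */ max(n1) from t1;

-- Drop it again before continuing with the examples below.
drop index t1_i_desc;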

Now let us look at the case where we supply a predicate on a column other than n1:

SQL> select max(n1) from t1 where n2 = 900;

   MAX(N1)
----------
       100

Execution Plan
----------------------------------------------------------
Plan hash value: 1357677816

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |     8 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |      |     1 |     8 |            |          |
|   2 |   FIRST ROW                 |      |     1 |     8 |     3   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| T1_I |     1 |     8 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("N2"=900)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        903  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Once again we seem to be performing a descending scan of the index until we find a row that matches. On this occasion a full table scan would have been better, but once again our filter predicate has reduced rather than increased the estimated cost.
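
You can check that claim by forcing the full table scan with a hint and comparing the consistent gets reported by autotrace. The table is only 1000 rows of a little over 100 bytes each, so the scan should need far fewer than the 903 buffer gets seen above:

-- Force the full table scan that the optimizer rejected and compare the
-- consistent gets with the indexed version.
select /*+ full(t1) */ max(n1) from t1 where n2 = 900;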

We can always combine predicates on the non-leading columns and still use this operation. Sometimes we can also combine a predicate on the leading column with predicates on other columns, but only under specific circumstances:

SQL> select max(n1) from t1 where n1 < 900 and n2 >= 900;

   MAX(N1)
----------
       100

Execution Plan
----------------------------------------------------------
Plan hash value: 3815976388

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |     8 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |      |     1 |     8 |            |          |
|   2 |   FIRST ROW                  |      |    90 |   720 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| T1_I |    90 |   720 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("N1"<900)
       filter("N2">=900)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        803  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Here we have traversed down to the entry where n1 = 900 and then performed a descending scan until we reached a row where n2 >= 900. However, if we change the predicate on n2 to n2 = 900 then, for some reason, the operation suddenly becomes illegal:

SQL> select /*+ index(t1) */ max(n1) from t1 where n1 < 900 and n2 = 900;

   MAX(N1)
----------
       100

Execution Plan
----------------------------------------------------------
Plan hash value: 3963813734

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     8 |   902   (0)| 00:00:11 |
|   1 |  SORT AGGREGATE   |      |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_I |     1 |     8 |   902   (0)| 00:00:11 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N2"=900 AND "N1"<900)
       filter("N2"=900)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        902  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Quite a bit to this, isn’t there?

One last thing: there do not appear to be specific hints for these operations. There are hints like INDEX_FFS (fast full scan), INDEX_RS_ASC (range scan) and so on for most index operations. However, if we try to identify the operation from the stored outline we just get the “old-fashioned” INDEX hint:

SQL> select * from table(dbms_xplan.display(null,null,'outline')) ;

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

Plan hash value: 1582015564

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     4 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     4 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I |  1000 |  4000 |            |          |
-----------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX(@"SEL$1" "T1"@"SEL$1" ("T1"."N1" "T1"."N2" "T1"."C1"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */
