Tony’s Oracle Tips

September 8, 2009

INDEX FULL SCAN (MIN/MAX)

Filed under: SQL Fun — tonyhasler @ 8:39 am

I came accross some interesting behaviour with the little-discussed “INDEX FULL SCAN (MIN/MAX)” operation.
First let us setup an example:

create table t1 (n1 number,n2 number,c1 char(100));

insert into t1 select rownum,1000-rownum,rownum from dual connect by level <= 1000;

create index t1_i on t1(n1,n2,c1) pctfree 99;

exec dbms_stats.gather_table_stats(null,'T1',
estimate_percent=>100,no_invalidate=>false,cascade=>true) ;

The index has been setup with pctfree 99 to ensure that each entry uses a whole block.

Let us autotrace a query. This is the second run to get rid of parsing overhead:

SQL> select max(n1) from t1 ;

   MAX(N1)
----------
      1000

Execution Plan
----------------------------------------------------------
Plan hash value: 1582015564

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     4 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     4 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I |  1000 |  4000 |            |          |
-----------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

The 3 consistent gets show that we have traversed the three levels of the index to identify the maximum value. It is clear that the term “FULL SCAN” is a bit misleading but other than that so far so good. Now let us add a predicate on the same column:

SQL> select max(n1) from t1 where n1 < 1000;

   MAX(N1)
----------
       999

Execution Plan
----------------------------------------------------------
Plan hash value: 3815976388

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |     4 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |      |     1 |     4 |            |          |
|   2 |   FIRST ROW                  |      |   999 |  3996 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| T1_I |   999 |  3996 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("N1"<1000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        335  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

There are a few things to note at this point. Firstly, we see that the operation has changed from a “FULL SCAN” to a “RANGE SCAN” and that cardinality and cost estimates have apeared for the scan.

It looks like the cardinality estimate is based on the number of rows that satisfy the access predicate. However, what seems to have happened is that we have identifed a row where n1 = 1000 and then we have done a descending scan of the index from that point. A new operation “FIRST ROW” has also appeared in the plan and perhaps its purpose is to halt the scan after we hit the first suitable row.

Perhaps the most interesting observation is that although the presence of the access predicate means we have to do more work the cost has actually reduced.

Incidentally, we could have used the MIN function and an ascending scan just as easily.  However, this operation will not work at all, for MIN or MAX, for indexes based on a function of the leading column of the index.  This includes descending indexes.

Now let us look at the case where we supply a predicate on a column other than n1:

SQL> select max(n1) from t1 where n2 = 900;

   MAX(N1)
----------
       100

Execution Plan
----------------------------------------------------------
Plan hash value: 1357677816

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |     8 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |      |     1 |     8 |            |          |
|   2 |   FIRST ROW                 |      |     1 |     8 |     3   (0)| 00:00:01 |
|*  3 |    INDEX FULL SCAN (MIN/MAX)| T1_I |     1 |     8 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("N2"=900)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        903  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Once again we seem to performing a descending scan of the index until we get a row that matches. On this occasion a full table scan would have been better but once again our filter predicate has reduced rather than increased the estimated cost.

We can always combine predicates on the non-leading columns and still use this operation.   Sometimes we can also combine predicates from the leading column and other columns but only under specific circumstances:

SQL> select max(n1) from t1 where n1 < 900 and n2 >= 900;

   MAX(N1)
----------
       100

Execution Plan
----------------------------------------------------------
Plan hash value: 3815976388

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |     1 |     8 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE              |      |     1 |     8 |            |          |
|   2 |   FIRST ROW                  |      |    90 |   720 |     3   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN (MIN/MAX)| T1_I |    90 |   720 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("N1"<900)
       filter("N2">=900)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        803  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Here we have traversed down to the row where n1 = 900 and then performed a descending scan until we reached a row where n2 >= 900. However, if we change the predicate on n2 to be n2=900 then the operation suddenly becomes illegal for some reason:

SQL> select /*+ index(t1) */ max(n1) from t1 where n1 < 900 and n2 = 900;

   MAX(N1)
----------
       100

Execution Plan
----------------------------------------------------------
Plan hash value: 3963813734

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     8 |   902   (0)| 00:00:11 |
|   1 |  SORT AGGREGATE   |      |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| T1_I |     1 |     8 |   902   (0)| 00:00:11 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("N2"=900 AND "N1"<900)
       filter("N2"=900)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        902  consistent gets
          0  physical reads
          0  redo size
        334  bytes sent via SQL*Net to client
        338  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Quite a bit to this isn’t there?

One last thing: there do not appear to be specific hints for these operations. There are hints like INDEX_FFS (fast full scan), INDEX_RS_ASC (range scan) and so on for most index operations. However, if we try and identify the operation from the stored outline we just get the “old fashioned” INDEX hint:

SQL> select * from table(dbms_xplan.display(null,null,'outline')) ;

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

Plan hash value: 1582015564

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     4 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     4 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| T1_I |  1000 |  4000 |            |          |
-----------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      INDEX(@"SEL$1" "T1"@"SEL$1" ("T1"."N1" "T1"."N2" "T1"."C1"))
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.4')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Advertisements

Blog at WordPress.com.