Tony’s Oracle Tips

May 7, 2011

MAXIMUM/MINIMUM Column Statistics – Part 2

Filed under: Uncategorized — tonyhasler @ 7:33 pm

Well it is now over four months since I wrote the part 1 of this blog. Apologies for the delay but I have been working long hours and travelling so my blog never made it to the top of the pile.

Let us now address what happens when we try to use NULL for maximum/minimum values by modifying the script in part 1:

DROP TABLE t1;
CREATE TABLE t1 (a NUMBER, b DATE, c TIMESTAMP, d TIMESTAMP WITH LOCAL TIME ZONE,
 e RAW(32), f VARCHAR2(32),g NVARCHAR2(32),h CHAR(32),
i INTERVAL DAY TO SECOND,j INTERVAL YEAR TO MONTH, k ROWID,
l TIMESTAMP WITH TIME ZONE,m NCHAR(32));


DECLARE
   srec   DBMS_STATS.statrec;
BEGIN
   DBMS_STATS.set_table_stats (ownname            => USER,
                               tabname            => 'T1',
                               numrows            => 20000,
                               numblks            => 1000,
                               avgrlen            => 400,
                               no_invalidate      => FALSE);

   FOR r IN (SELECT *
               FROM user_tab_cols
              WHERE table_name = 'T1')
   LOOP
      srec.epc := 2;                       -- Two endpoints
      srec.bkvals := NULL;                  -- No histogram
      DBMS_STATS.delete_column_stats
                                  (ownname      => USER,
                                   tabname      => 'T1',
                                   colname      => r.column_name);
      DBMS_STATS.prepare_column_values
                                 (srec,
                                  DBMS_STATS.rawarray (NULL,
                                                       
                                                       -- Minimum
                                                       NULL
                                                           -- Maximum
                                  ));
      -- srec.novals := dbms_stats.numarray(-9.9e125,9.9e125) ; -- Special fix
      DBMS_STATS.set_column_stats (ownname       => USER,
                                   tabname       => 'T1',
                                   partname      => NULL,
                                   colname       => r.column_name,
                                   distcnt       => NULL,
                                   density       => NULL,
                                   nullcnt       => 0,
                                   srec          => srec,
                                   avgclen       => 100);
   END LOOP;
END;
/

SET autotrace off
COLUMN lowval format a10
COLUMN hival format a10
SET null '(null)'

SELECT distcnt, density, h.lowval, h.hival, h.MINIMUM,
       h.maximum
  FROM SYS.obj$ o, SYS.hist_head$ h
 WHERE o.NAME = 'T1'
   AND o.owner# = USERENV ('SCHEMAID')
   AND o.obj# = h.obj#;

SET autotrace traceonly explain
SELECT *
  FROM t1
 WHERE a > -1;
SELECT *
  FROM t1
 WHERE a > 0;
SELECT *
  FROM t1
 WHERE a = 0;
SELECT *
  FROM t1
 WHERE a  0;
SELECT *
  FROM t1
 WHERE a = 1;
SELECT *
  FROM t1
 WHERE a  1;

The above script generates the following output:

   DISTCNT    DENSITY LOWVAL     HIVAL         MINIMUM    MAXIMUM
---------- ---------- ---------- ---------- ---------- ----------
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0
         0          0 (null)     (null)              0          0

13 rows selected.

SQL>
SQL> SET autotrace traceonly explain
SQL> SELECT *
  2    FROM t1
  3   WHERE a > -1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">(-1))

SQL> SELECT *
  2    FROM t1
  3   WHERE a > 0;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   400 |   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   400 |   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">0)

SQL> SELECT *
  2    FROM t1
  3   WHERE a = 0;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   400 |   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   400 |   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=0)

SQL> SELECT *
  2    FROM t1
  3   WHERE a  0;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"0)

SQL> SELECT *
  2    FROM t1
  3   WHERE a = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   400 |   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   400 |   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"=1)

SQL> SELECT *
  2    FROM t1
  3   WHERE a  1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 19999 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 19999 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"1)

SQL> select * from t1 where a  3000;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 19999 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 19999 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"3000)

SQL> select * from t1 where a  -1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 19999 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 19999 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"(-1))

SQL> select * from t1 where a  0;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"0)

SQL> select * from t1 where a < 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A" select * from t1 where a < 0;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   400 |   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   400 |   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A" select * from t1 where a < -1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   400 |   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   400 |   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A" select * from t1 where a < 2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A" select * from t1 where a <= 2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<=2)

The first thing to note is that zeroes have been added for the DISTCNT, DENSITY, MINIMUM and MAXIMUM columns in HIST_HEAD$. This causes the selectivity estimate of queries involving a range predicate to be 100% if the range includes 0 and 0% (cardinality 1) when the range does not include 0. Equality and inequality predicates yield 0% and 100% respectively regardless of whether zero is specified as the value (although not the slight discrepancy in the inequality value when zero is not specified).

If you change the script to specify non-zero values for the density and a distcnt greater than 1 you will find that the behaviour changes as follows:

> -1 : 100%
> 0 : (1-density)*100%
= 0 : density*100%
<> 0 : (1-density)*100%
= 1 : density*100%
<> 1 : (1-density)*100%

The bottom line is that this mechanism is probably too complex to be of practical use and the technique of using very diffent values documented in part 1 of this blog will work much better.

Blog at WordPress.com.