Tony’s Oracle Tips

January 23, 2011

MAXIMUM/MINIMUM Column Statistics – Part 1

Filed under: Uncategorized — tonyhasler @ 9:32 pm

I presented a talk at the 2010 UKOUG national conference entitled “Stabilising Statistics” (materials available here) that suggested setting the minimum and maximum values for column statistics to very low and very high values respectively. This technique contributes to execution plan stability. I will refer to this technique as VDV (very different values) for the rest of this note.

During that presentation Wolfgang Breitling posed the question “What happens if you set the maximum and minimum values to NULL”?

I couldn’t answer him at the time and I promised that I would post an answer on my blog.

You will have to wait for the answer a little bit more I am afraid because I must first post some principles that I discovered whilst researching the answer.

If we dive down into the data dictionary we can find a a table called HIST_HEAD$. This table contains the minimum and maximum values for column statistics. What is interesting is that there are two sets of minimum and maximum values. These pairs of values are MINIMUM/MAXIMUM (type NUMBER) and LOWVAL/HIVAL (type RAW(32)). As far as I can see, only datatypes CHAR,NCHAR, VARCHAR2, and NVARCHAR2 use the LOWVAL/HIVAL pair in cardinality calculations. All other datatypes use the MINIMUM/MAXIMUM pair (including RAW!).

Unfortunately, when you look at views such as USER_TAB_COLS all you see is the LOWVAL/HIVAL pair from sys.HIST_HEAD$ reported as LOW_VALUE/HIGH_VALUE. The other pair is completely hidden.

“So what”? You may ask. These values are equivalent aren’t they?

Well some of the time, yes. But not always.

Here is how you might set MIN/MAX column statistics yourself:

DROP TABLE t1;
CREATE TABLE t1 (a NUMBER, b DATE, c TIMESTAMP, d TIMESTAMP WITH LOCAL TIME ZONE,
 e RAW(32), f VARCHAR2(32),g NVARCHAR2(32),h CHAR(32),
i INTERVAL DAY TO SECOND,j INTERVAL YEAR TO MONTH, k ROWID,
l TIMESTAMP WITH TIME ZONE,m nchar(32));

DECLARE
   srec   DBMS_STATS.statrec;
BEGIN
   DBMS_STATS.set_table_stats (ownname            => USER,
                               tabname            => 'T1',
                               numrows            => 20000,
                               numblks            => 1000,
                               avgrlen            => 400,
                               no_invalidate      => FALSE);

   FOR r IN (SELECT *
               FROM user_tab_cols
              WHERE table_name = 'T1')
   LOOP
      srec.epc := 2;                       -- Two endpoints
      srec.bkvals := NULL;                  -- No histogram
      DBMS_STATS.prepare_column_values
                                (srec,
                                 DBMS_STATS.rawarray (
                                 hextoraw('000000000000000000000000000000'),
                                                      -- Minimum
                                 hextoraw('ffffffffffffffffffffffffffffff')
                                                       -- Maximum
                                 ));
      -- srec.novals := dbms_stats.numarray(-9.9e125,9.9e125) ; -- Special fix
      DBMS_STATS.set_column_stats
                                 (ownname       => USER,
                                  tabname       => 'T1',
                                  partname      => NULL,
                                  colname       => r.column_name,
                                  distcnt       => 2,
                                  density       => 0.5,
                                  nullcnt       => 0,
                                  srec          => srec,
                                  avgclen       => 100);
   END LOOP;
END;
/
set autotrace off
column lowval format a30
column hival format a30
select h.lowval,h.hival,h.minimum,h.maximum
from sys.obj$ o,sys.hist_head$ h
where o.name='T1'
  and o.owner# = userenv('SCHEMAID')
  and o.obj#=h.obj# ;
--
set autotrace traceonly explain
select * from t1 where a > 1e30 ;
select * from t1 where a > 1e37 ;
select * from t1 where a < 1e30 ;
select * from t1 where a < 1e37 ;
select * from t1 where a >= 1e30 ;
select * from t1 where a >= 1e37 ;
select * from t1 where a <= 1e30 ;
select * from t1 where a <= 1e37 ;

This code sets the columns statistics to VDV and examines the consequences.
This is what we get:

LOWVAL                         HIVAL                             MINIMUM    MAXIMUM
------------------------------ ------------------------------ ---------- ----------
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF          0 1.3292E+36
13 rows selected.

SQL> --
SQL> set autotrace traceonly explain
SQL> select * from t1 where a > 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">1e30)

SQL> select * from t1 where a > 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   400 |   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   400 |   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">1e37)

SQL> select * from t1 where a < 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  3906K|   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  3906K|   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<1e30) SQL> select * from t1 where a < 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<1e37) SQL> select * from t1 where a >= 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">=1e30)

SQL> select * from t1 where a >= 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   400 |   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   400 |   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">=1e37)

SQL> select * from t1 where a <= 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  3906K|   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  3906K|   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<=1e30) SQL> select * from t1 where a <= 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<=1e37)
 

You can see that cardinality estimates are badly affected when very large values are supplied in the predicates. We can get much more intelligible – and consistent – results by setting NOVALS ourselves. This is what happens when you uncomment the line labeled ‘Special Fix’ above.

LOWVAL                         HIVAL                             MINIMUM    MAXIMUM
------------------------------ ------------------------------ ---------- ----------
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
000000000000000000000000000000 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF -9.90E+125 9.900E+125
13 rows selected.

SQL> --
SQL> set autotrace traceonly explain
SQL> select * from t1 where a > 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  3906K|   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  3906K|   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">1e30)

SQL> select * from t1 where a > 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  3906K|   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  3906K|   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">1e37)

SQL> select * from t1 where a < 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  3906K|   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  3906K|   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<1e30) SQL> select * from t1 where a < 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  3906K|   222   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  3906K|   222   (1)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<1e37) SQL> select * from t1 where a >= 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">=1e30)

SQL> select * from t1 where a >= 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A">=1e37)

SQL> select * from t1 where a <= 1e30 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<=1e30) SQL> select * from t1 where a <= 1e37 ;

Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 20000 |  7812K|   223   (2)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| T1   | 20000 |  7812K|   223   (2)| 00:00:03 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"<=1e37)
Advertisements

1 Comment »

  1. Faking Oracle statistics for TIMESTAMP columns

    Oracle uses object statistics to estimate the amount of data that a query will probably return. This is used to find the best execution plan for the specific case. Having stale statistics is perhaps a major reason for bad execution plans. But sometimes…

    Trackback by BlogNotes to Myself — October 28, 2014 @ 8:49 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: