Tony’s Oracle Tips

January 17, 2010

Cardinality estimates in the CBO

Filed under: Uncategorized — tonyhasler @ 9:58 pm

There is a rather fundamental point about cardinality and cost calculation in the CBO that seems to have escaped the public attention.

Even if you, like I, have bought Jonathan Lewis’s book on the CBO (e.g., from Amazon) you may still be oblivious. Jonathan assures me, however, that the detail in this blog will be covered in his next book.

Consider the following script that joins two identical tables:


set linesize 180
set pages 0
DROP TABLE t1;
DROP TABLE t2;

CREATE TABLE t1 (n1 INTEGER);

INSERT INTO t1
       SELECT   ROWNUM
         FROM   DUAL
   CONNECT BY   LEVEL  USER
                                   ,tabname => table_name
                                   ,partname => NULL
                                   ,estimate_percent => 100
                                   ,block_sample => TRUE
                                   ,method_opt => 'for all columns size 1'
                                   ,degree => NULL
                                   ,granularity => 'default'
                                   ,cascade => TRUE);
   END;
BEGIN
   my_gather('T1');
   my_gather('T2');
END;
/

SELECT   /*+ gather_plan_statistics leading(t2) use_nl(t1) */
         COUNT( * )
  FROM   t1, t2
 WHERE   t1.n1 = t2.n1;

SELECT   * FROM table(DBMS_XPLAN.display_cursor(NULL, NULL, 'IOSTATS LAST'));

The execution plan that results on 10.2.0.4 is:


--------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.78 |     100K|
|   2 |   NESTED LOOPS      |      |      1 |    100K|    100K|00:00:01.00 |     100K|
|   3 |    TABLE ACCESS FULL| T2   |      1 |    100K|    100K|00:00:00.10 |     157 |
|*  4 |    INDEX RANGE SCAN | I_T1 |    100K|      1 |    100K|00:00:00.57 |     100K|
--------------------------------------------------------------------------------------

You will see that the estimated number of rows in operation 4 is 1 but the actual number of rows is 100,000. Now in addition to buying Jonathan Lewis’s book I have both read and taught the concept of Tuning by Cardinality Feedback a tuning methodology invented by Wolfgang Breitling many moons ago. His assertion is that when the estimated and actual cardinalities of an operation in an explain plan differ by several orders of magnitide it is an indication that the CBO is working from a false set of assumptions.

In this case, however, the CBO is not remotely confused. The point is that when it comes to estimated costs and cardinalities the figures reflect a single execution of the operation and the actual figures reflect the total for all executions of the operation. Because T2 has 100,000 rows the index range scan will be executed 100,000 times (as confirmed by the STARTS column above) and each execution will return exactly one row.

So if you want to use Wolfgang’s methodology you need to divide the column LAST_OUTPUT_ROWS in V$SQL_PLAN_STATISTICS_ALL by LAST_STARTS to get a meaningful comparison with CARDINALITY.


WITH X
       AS (SELECT   sql_id
             FROM   v$sql
            WHERE   sql_text LIKE
                       'SELECT   /*+ gather_plan_statistics leading(t2) %'
                    AND ROWNUM = 1)
  SELECT   DEPTH
          ,LPAD(' ', 2 * DEPTH) || operation operation
          ,options
          ,object_name
          ,last_starts
          ,CARDINALITY "EST ROWS"
          ,last_output_rows "ACTUAL ROWS"
          ,last_output_rows / last_starts "ACTUAL ROWS PER OP"
    FROM      v$sql_plan_statistics_all
           NATURAL JOIN
              x
ORDER BY   id;

The output is:



     DEPTH OPERATION                      OPTIONS    OBJECT_NAME LAST_STARTS   EST ROWS ACTUAL ROWS ACTUAL ROWS PER OP
---------- ------------------------------ ---------- ----------- ----------- ---------- ----------- ------------------
         1   SORT                         AGGREGATE                        1          1           1              1
         2     NESTED LOOPS                                                1     100000      100000         100000
         3       TABLE ACCESS             FULL       T2                    1     100000      100000         100000
         3       INDEX                    RANGE SCAN I_T1             100000          1      100000              1

Incidentally, Jonathan has advertised this himself, albeit more briefly, in his blog here

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: