Tony’s Oracle Tips

March 3, 2017

Object oriented logging

Filed under: Uncategorized — tonyhasler @ 11:34 am

It is a very common requirement. You need to be able to log the start and end of procedure and function calls. You also need to log error and exception events and sometimes some informational messages as well. This is typically done by writing a simple logging package that uses autonomous transactions so that a transaction rollback does not roll back the diagnostic logging information as well. Apart from the message text (and usually other things as well), the package typically logs the name of the package and procedure performing the logging. Here is a simplified example that might be implemented at a mythical AJAX company.
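The log table itself is never shown in this post. A minimal sketch of the sort of DDL the examples assume might look like this (the column list matches the INSERT statements below; real implementations would usually add columns such as the username and session identifier):

```sql
-- Hypothetical DDL: only the four columns used by the logging package
CREATE TABLE ajax_log_table
(
   log_timestamp    TIMESTAMP
  ,package_name     VARCHAR2(30)
  ,procedure_name   VARCHAR2(30)
  ,message          VARCHAR2(4000)
);
```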

create or replace PACKAGE ajax_log_pkg IS
    PROCEDURE log_message (
        p_package_name     VARCHAR2,
        p_procedure_name   VARCHAR2,
        p_message          VARCHAR2
    );
END ajax_log_pkg;
/

create or replace PACKAGE BODY ajax_log_pkg IS
    PROCEDURE log_message (
        p_package_name     VARCHAR2,
        p_procedure_name   VARCHAR2,
        p_message          VARCHAR2 
    ) IS
        PRAGMA autonomous_transaction;
    BEGIN
        INSERT INTO ajax_log_table (
            log_timestamp,
            package_name,
            procedure_name,
            message
        ) VALUES (
            systimestamp,
            p_package_name,
            p_procedure_name,
            p_message
        );

        COMMIT;
    END log_message;

END ajax_log_pkg;
/

The package has a prefix of the company name and a suffix of “_pkg”, a curiously common convention. Here is how such a package might be used (once again simplified).

create or replace PACKAGE ajax_log_demo_pkg AS
    PROCEDURE inner_proc;
    PROCEDURE outer_proc;
END ajax_log_demo_pkg;
/

create or replace PACKAGE BODY ajax_log_demo_pkg AS
    g_package_name     CONSTANT VARCHAR2(30) := 'log_demo_pkg';

    PROCEDURE inner_proc IS
        v_procedure_name   CONSTANT VARCHAR2(50) := 'inner_proc';
    BEGIN
        ajax_log_pkg.log_message(g_package_name, v_procedure_name, 'START');
        -- Do something
        ajax_log_pkg.log_message(g_package_name, v_procedure_name, 'END');
    END inner_proc;

    PROCEDURE outer_proc IS
        v_procedure_name   CONSTANT VARCHAR2(50) := 'outer_proc';
    BEGIN
        ajax_log_pkg.log_message(g_package_name, v_procedure_name, 'START');
        ajax_log_pkg.log_message(g_package_name, v_procedure_name, 'Before call to inner_proc');
        inner_proc();
        ajax_log_pkg.log_message(g_package_name, v_procedure_name, 'After call to inner_proc');
        ajax_log_pkg.log_message(g_package_name, v_procedure_name, 'END');
    END outer_proc;

END ajax_log_demo_pkg;
/

We can see that there is a lot of wasted typing here. The calling package name and calling procedure name are specified repeatedly on each call. Still, cut-and-paste can limit the productivity cost. What else is there to do? Presumably nothing? Every logging implementation that I have seen has looked very similar to this.

This is roughly what the entries in AJAX_LOG_TABLE might look like after a call to ajax_log_demo_pkg.outer_proc:

03-MAR-17 10.05.27 log_demo_pkg outer_proc START
03-MAR-17 10.05.27 log_demo_pkg outer_proc Before call to inner_proc
03-MAR-17 10.05.27 log_demo_pkg inner_proc START
03-MAR-17 10.05.27 log_demo_pkg inner_proc END
03-MAR-17 10.05.27 log_demo_pkg outer_proc After call to inner_proc
03-MAR-17 10.05.27 log_demo_pkg outer_proc END

The other day a curious thought sped briefly past my mind as I typed CTRL/V. “This would be a lot easier in Java”, I thought. Then I paused and started a short conversation with myself:

“Tony, why do you think ANYTHING in Java would be easier than PL/SQL”?
“Well, Tony, in Java you have classes, not packages. You could create an instance of a logging object and avoid specifying the calling package and procedure names on each call.”
“Good point, Tony…..”

At that point I sat straight upright in my chair and opened my eyes wide.

“Tony, Oracle has that capability! It has been there since 8i; it is just that we have all forgotten about it!”

It took just moments to implement an object version of AJAX_LOG_PKG.

create or replace TYPE ajax_log_typ AS OBJECT (
    o_package_name     VARCHAR2(30),
    o_procedure_name   VARCHAR2(30),

    MEMBER PROCEDURE log_message (
        p_message   VARCHAR2
    )
) ;
/

create or replace TYPE BODY ajax_log_typ AS
    MEMBER PROCEDURE log_message (
        p_message   VARCHAR2 
    ) IS
        PRAGMA autonomous_transaction;
    BEGIN
        INSERT INTO ajax_log_table (
            log_timestamp,
            package_name,
            procedure_name,
            message
        ) VALUES (
            systimestamp,
            o_package_name,
            o_procedure_name,
            p_message
        );

        COMMIT;
    END log_message;

END;
/

The object type specification replaces the package specification and the type body replaces the package body. I then added two attributes to the type for the calling package name and the calling procedure name. As a consequence, we can eliminate these two parameters from the member procedure or procedures. Look at how the object-oriented logging implementation is used:

create or replace PACKAGE ajax_log_demo_pkg2 AS
    PROCEDURE inner_proc;
    PROCEDURE outer_proc;
END ajax_log_demo_pkg2;
/

create or replace PACKAGE BODY ajax_log_demo_pkg2 AS
    g_package_name     CONSTANT VARCHAR2(30) := 'log_demo_pkg';

    PROCEDURE inner_proc IS
        o_log   ajax_log_typ := ajax_log_typ(g_package_name, 'inner_proc');
    BEGIN
        o_log.log_message('START');
        -- Do something
        o_log.log_message('END');
    END inner_proc;

    PROCEDURE outer_proc IS
        o_log   ajax_log_typ := ajax_log_typ(g_package_name, 'outer_proc');
    BEGIN
        o_log.log_message('START');
        o_log.log_message('Before call to inner_proc');
        inner_proc();
        o_log.log_message('After call to inner_proc');
        o_log.log_message('END');
    END outer_proc;

END ajax_log_demo_pkg2;
/

Each procedure creates a new instance of the AJAX_LOG_TYP object type and specifies the package and procedure name. Subsequent calls just need to reference the local variable and the message to log; the rest of the context is retrieved from the object instance.
Hardly rocket science and yet I am quite sure that I am very far from alone in missing this neat trick for more years than I care to admit.

February 9, 2016

Parallel partition scans

Filed under: Uncategorized — tonyhasler @ 9:51 pm

When a table is accessed by multiple members of a parallel query server set, the execution plan may show the use of block range granules (PX BLOCK ITERATOR) or partition granules (PX PARTITION [RANGE|LIST|HASH] ITERATOR or PX PARTITION [RANGE|LIST|HASH] ALL).

The basic ideas surrounding these concepts are discussed in numerous blogs and books, including my own, but discussion of partition range granules is normally restricted to either partition wise joins or the creation and rebuild of partitioned indexes.

However, a colleague came to me recently with a problem with another use case, one that a quick Google search was unable to resolve, so after I worked it out I thought it was time for one of my rare blogs.

Let us begin by creating a table partitioned by range using the BUSINESS_DATE and AMOUNT columns.


CREATE TABLE part2
(
   business_date   DATE
  ,amount          NUMBER
)
PARTITION BY RANGE (business_date, amount)
   (PARTITION pdefault VALUES LESS THAN (maxvalue, maxvalue));

BEGIN
   FOR r
      IN (    SELECT TO_CHAR (DATE '2000-01-01' + TRUNC ((ROWNUM-1) / 3), 'YYYYMMDD')
                        partition_date
                    ,DECODE (MOD (ROWNUM-1, 3),  0, '10',  1, '100',  'MAXVALUE')
                        partition_amount
                FROM DUAL
          CONNECT BY LEVEL <= 510)
   LOOP
      EXECUTE IMMEDIATE
            '
ALTER TABLE PART2
   SPLIT PARTITION PDEFAULT
      AT (TO_DATE('''
         || r.partition_date
         || ''',''YYYYMMDD''),'
         || r.partition_amount
         || ')
      INTO (PARTITION P'
         || r.partition_date
         || '_'
         || r.partition_amount
         || ', PARTITION PDEFAULT)';
   END LOOP;
END;
/

After creating the table with one partition, I added 510 more based on 170 dates with three amount ranges each.
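If you want to check the arithmetic, a quick query against the data dictionary should confirm the partition count at this point (1 original + 510 splits = 511):

```sql
-- Count the partitions created so far
SELECT COUNT (*)
  FROM user_tab_partitions
 WHERE table_name = 'PART2';
```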

Let us begin by looking at the execution plans of a few queries using 8 parallel query servers:


SET HEADING OFF PAGES 0 FEED OFF
EXPLAIN PLAN
   FOR
      SELECT /*+ parallel(8) */
            * FROM part2;

SELECT *
  FROM TABLE (DBMS_XPLAN.display (format => 'BASIC +PARTITION +OUTLINE'));

EXPLAIN PLAN
   FOR
      SELECT /*+ parallel(8) */
            *
        FROM part2
       WHERE business_date = DATE '2000-01-01';

SELECT *
  FROM TABLE (DBMS_XPLAN.display (format => 'BASIC +PARTITION +OUTLINE'));

EXPLAIN PLAN
   FOR
      SELECT /*+ parallel(8) */
            *
        FROM part2
       WHERE business_date = :b1;

SELECT *
  FROM TABLE (DBMS_XPLAN.display (format => 'BASIC +PARTITION +OUTLINE'));
  
Plan hash value: 3087476059                                                     
                                                                                
---------------------------------------------------------                       
| Id  | Operation            | Name     | Pstart| Pstop |                       
---------------------------------------------------------                       
|   0 | SELECT STATEMENT     |          |       |       |                       
|   1 |  PX COORDINATOR      |          |       |       |                       
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |       |       |                       
|   3 |    PX BLOCK ITERATOR |          |     1 |   511 |                       
|   4 |     TABLE ACCESS FULL| PART2    |     1 |   511 |                       
---------------------------------------------------------                       
                                                                                
Outline Data                                                                    
-------------                                                                   
                                                                                
  /*+                                                                           
      BEGIN_OUTLINE_DATA                                                        
      FULL(@"SEL$1" "PART2"@"SEL$1")                                            
      OUTLINE_LEAF(@"SEL$1")                                                    
      SHARED(8)                                                                 
      ALL_ROWS                                                                  
      DB_VERSION('12.1.0.1')                                                    
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')                                     
      IGNORE_OPTIM_EMBEDDED_HINTS                                               
      END_OUTLINE_DATA                                                          
  */
Plan hash value: 3618659711                                                     
                                                                                
---------------------------------------------------------                       
| Id  | Operation            | Name     | Pstart| Pstop |                       
---------------------------------------------------------                       
|   0 | SELECT STATEMENT     |          |       |       |                       
|   1 |  PX COORDINATOR      |          |       |       |                       
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |       |       |                       
|   3 |    PX BLOCK ITERATOR |          |     1 |     3 |                       
|   4 |     TABLE ACCESS FULL| PART2    |     1 |     3 |                       
---------------------------------------------------------                       
                                                                                
Outline Data                                                                    
-------------                                                                   
                                                                                
  /*+                                                                           
      BEGIN_OUTLINE_DATA                                                        
      FULL(@"SEL$1" "PART2"@"SEL$1")                                            
      OUTLINE_LEAF(@"SEL$1")                                                    
      SHARED(8)                                                                 
      ALL_ROWS                                                                  
      DB_VERSION('12.1.0.1')                                                    
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')                                     
      IGNORE_OPTIM_EMBEDDED_HINTS                                               
      END_OUTLINE_DATA                                                          
  */
Plan hash value: 3618659711                                                     
                                                                                
---------------------------------------------------------                       
| Id  | Operation            | Name     | Pstart| Pstop |                       
---------------------------------------------------------                       
|   0 | SELECT STATEMENT     |          |       |       |                       
|   1 |  PX COORDINATOR      |          |       |       |                       
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |       |       |                       
|   3 |    PX BLOCK ITERATOR |          |   KEY |   KEY |                       
|   4 |     TABLE ACCESS FULL| PART2    |   KEY |   KEY |                       
---------------------------------------------------------                       
                                                                                
Outline Data                                                                    
-------------                                                                   
                                                                                
  /*+                                                                           
      BEGIN_OUTLINE_DATA                                                        
      FULL(@"SEL$1" "PART2"@"SEL$1")                                            
      OUTLINE_LEAF(@"SEL$1")                                                    
      SHARED(8)                                                                 
      ALL_ROWS                                                                  
      DB_VERSION('12.1.0.1')                                                    
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')                                     
      IGNORE_OPTIM_EMBEDDED_HINTS                                               
      END_OUTLINE_DATA                                                          
  */

The first query selects all rows in the table, the second all rows for a specific hard-coded date, and the third all rows for a date specified by a bind variable. All three execution plans show the PX BLOCK ITERATOR operation indicating the use of block range granules.

Now let us add one more partition to bring the total number of partitions to 512:

ALTER TABLE part2
   SPLIT PARTITION pdefault
      AT (DATE '2000-06-19', maxvalue)
      INTO (PARTITION p20000619_maxvalue, PARTITION pdefault);

Let us see which execution plans change.

Plan hash value: 1807983963                                                     
                                                                                
-------------------------------------------------------------                   
| Id  | Operation                | Name     | Pstart| Pstop |                   
-------------------------------------------------------------                   
|   0 | SELECT STATEMENT         |          |       |       |                   
|   1 |  PX COORDINATOR          |          |       |       |                   
|   2 |   PX SEND QC (RANDOM)    | :TQ10000 |       |       |                   
|   3 |    PX PARTITION RANGE ALL|          |     1 |   512 |                   
|   4 |     TABLE ACCESS FULL    | PART2    |     1 |   512 |                   
-------------------------------------------------------------                   
                                                                                
Outline Data                                                                    
-------------                                                                   
                                                                                
  /*+                                                                           
      BEGIN_OUTLINE_DATA                                                        
      FULL(@"SEL$1" "PART2"@"SEL$1")                                            
      OUTLINE_LEAF(@"SEL$1")                                                    
      SHARED(8)                                                                 
      ALL_ROWS                                                                  
      DB_VERSION('12.1.0.1')                                                    
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')                                     
      IGNORE_OPTIM_EMBEDDED_HINTS                                               
      END_OUTLINE_DATA                                                          
  */
Plan hash value: 3618659711                                                     
                                                                                
---------------------------------------------------------                       
| Id  | Operation            | Name     | Pstart| Pstop |                       
---------------------------------------------------------                       
|   0 | SELECT STATEMENT     |          |       |       |                       
|   1 |  PX COORDINATOR      |          |       |       |                       
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |       |       |                       
|   3 |    PX BLOCK ITERATOR |          |     1 |     3 |                       
|   4 |     TABLE ACCESS FULL| PART2    |     1 |     3 |                       
---------------------------------------------------------                       
                                                                                
Outline Data                                                                    
-------------                                                                   
                                                                                
  /*+                                                                           
      BEGIN_OUTLINE_DATA                                                        
      FULL(@"SEL$1" "PART2"@"SEL$1")                                            
      OUTLINE_LEAF(@"SEL$1")                                                    
      SHARED(8)                                                                 
      ALL_ROWS                                                                  
      DB_VERSION('12.1.0.1')                                                    
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')                                     
      IGNORE_OPTIM_EMBEDDED_HINTS                                               
      END_OUTLINE_DATA                                                          
  */
Plan hash value: 2732640421                                                     
                                                                                
------------------------------------------------------------------              
| Id  | Operation                     | Name     | Pstart| Pstop |              
------------------------------------------------------------------              
|   0 | SELECT STATEMENT              |          |       |       |              
|   1 |  PX COORDINATOR               |          |       |       |              
|   2 |   PX SEND QC (RANDOM)         | :TQ10000 |       |       |              
|   3 |    PX PARTITION RANGE ITERATOR|          |   KEY |   KEY |              
|   4 |     TABLE ACCESS FULL         | PART2    |   KEY |   KEY |              
------------------------------------------------------------------              
                                                                                
Outline Data                                                                    
-------------                                                                   
                                                                                
  /*+                                                                           
      BEGIN_OUTLINE_DATA                                                        
      FULL(@"SEL$1" "PART2"@"SEL$1")                                            
      OUTLINE_LEAF(@"SEL$1")                                                    
      SHARED(8)                                                                 
      ALL_ROWS                                                                  
      DB_VERSION('12.1.0.1')                                                    
      OPTIMIZER_FEATURES_ENABLE('12.1.0.1')                                     
      IGNORE_OPTIM_EMBEDDED_HINTS                                               
      END_OUTLINE_DATA                                                          
  */

We can see that block range granules continue to be used when a hard-coded date is specified but that in the other two cases partition granules are now used. Why?

The answer can be found by looking at the description of two hidden parameters:


NAME                            VALUE   DEFLT   TYPE    DESCRIPTION
_px_partition_scan_enabled      TRUE    TRUE    boolean enables or disables parallel
                                                        partition-based scan 
_px_partition_scan_threshold      64    TRUE    number  least number of partitions per slave
                                                        to start partition-based scan

What we learn from these descriptions is that once the number of partitions reaches 64 times the degree of parallelism (DOP), the CBO will elect to use partition granules. The DOP is 8, meaning that when the number of partitions reached 512 (64 x 8 for those of you with arithmetical challenges) the CBO switched to using partition granules. When partition elimination occurred as a result of the hard-coded date, the CBO knew that only 3 partitions would be referenced and so block range granules were selected. But what about the bind variable case?
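The crossover arithmetic can be sketched in SQL, assuming the default threshold of 64:

```sql
-- Default "_px_partition_scan_threshold" (64) x DOP (8):
-- partition granules are considered once the partition count reaches this
SELECT 64 * 8 AS partition_granule_threshold FROM DUAL;
```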

The issue here can be a bit tricky for some people to spot (mea culpa). Although we know that there are at most three partitions for each date, this is not enforced and the CBO has no way of checking. Accordingly, in this specific case, partition elimination is not considered when determining which type of granule should be used. But we know that block range granules should be used if we are to utilize all 8 parallel query servers and not just 3.

Interestingly enough the outline hints do not change when the granule type changes, meaning that execution plans may be unstable even if you have a SQL Plan Baseline defined for a statement.

The good news is that if you recognize the problem you do have the option of adding an opt_param('_px_partition_scan_enabled','false') hint to force the use of block range granules (and that will propagate to your baseline). Furthermore, if you know your application well enough you may reasonably consider setting "_px_partition_scan_enabled" to FALSE at the system level. Do not worry. This won’t affect partition-wise joins!
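For example, the bind variable query might be hinted along these lines (a sketch; as always with hidden parameters, test before relying on this in production):

```sql
-- Force block range granules for this statement only
SELECT /*+ parallel(8) opt_param('_px_partition_scan_enabled','false') */
       *
  FROM part2
 WHERE business_date = :b1;
```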

Three more quick points before I close:

1. You will have noticed that empty partitions are not discounted. Yet another reason not to create lots of them in advance!
2. If you are wondering about whether bind variable peeking affects the result, I can tell you that it doesn’t!
3. I tested this on 11.2.0.2 and 12.1 with identical results. I don’t have access to anything earlier or later.

Update on composite partitioned tables

When we look at composite partitioned tables, life gets a bit more complicated. I originally thought that I would need another post to address it but at the end of the day the conclusions can be stated reasonably concisely.

  • If no equality predicate is specified for the partitioning column(s) and an equality predicate is specified for the subpartitioning column (or equality predicates specified for all subpartitioning columns) then the CBO knows that there will be exactly one subpartition accessed per partition.  Partition granules will be used when the number of partitions is 64 x DOP or greater.
  • If no equality predicate is specified for the subpartitioning column(s) and an equality predicate is specified for the partitioning column (or equality predicates specified for all partitioning columns) then the CBO divides the total number of subpartitions in the table by the number of partitions.  Partition granules are used if this calculated average exceeds 64 x DOP.
  • When the predicates are any more complicated then partition elimination is not considered and partition granules are used whenever the total number of subpartitions exceeds 64 x DOP.

In my opinion this is all fuzzy thinking.  The high value of "_px_partition_scan_threshold" suggests a mantra of “if in doubt use block range granules” but by ignoring the effect of empty partitions or complex predicates the mantra “if in doubt use partition granules” comes to mind.  The latter is surely dangerous.

September 13, 2015

The importance of the GLOBAL_STATS statistic on partitioned tables

Filed under: Uncategorized — tonyhasler @ 9:22 pm

I am currently working in a team alongside a gentleman known as Roberto Rigliaco. You are unlikely to have heard of Roberto as he is not (yet) a blogger and not (yet) known on the conference circuit.

Nevertheless, Roberto has done some excellent work recently analyzing the use of object statistics in partitioned tables. I haven’t seen his findings published anywhere else and so I am glad to say that Roberto has accepted my offer to publish his findings on my blog.

It turns out that the setting of the GLOBAL_STATS column in ALL_TAB_STATISTICS is critical to the analysis. Don’t be confused into thinking that the column GLOBAL_STATS has anything to do with the level (table, partition or subpartition) that the statistics refer to. In fact, GLOBAL_STATS indicates whether the table statistics have been aggregated from partition level statistics (“NO”) or explicitly gathered or set (“YES”). At the partition level a value of GLOBAL_STATS=”NO” would mean that the partition level statistics were aggregated from subpartition level statistics.
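You can inspect the setting at every level with a query along these lines (USER_TAB_STATISTICS returns one row for the table, one per partition, and one per subpartition; the table name is that of the test table used below):

```sql
-- GLOBAL_STATS at table, partition and subpartition level
SELECT partition_name
      ,subpartition_name
      ,num_rows
      ,global_stats
  FROM user_tab_statistics
 WHERE table_name = 'STATS_TEST'
 ORDER BY partition_name, subpartition_name;
```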

I will begin by giving you the punch lines and then providing some examples to prove the points. First some intuitive stuff:

  • If more than one partition is referenced then table level statistics will be used if available. No partition or subpartition level statistics will be used.
  • If partition elimination occurs but the optimizer does not know what single partition will be used then table level statistics will be used if available as above.
  • If partition elimination occurs (and in the case of composite partitioned tables elimination of all but one subpartition has not occurred) and Oracle knows what partition will be used then the statistics for that partition will be used if available.
  • If subpartition elimination occurs and the optimizer knows neither the partition nor subpartition that will be used then table level statistics will be used if available.
  • If subpartition elimination occurs and Oracle knows both the partition and subpartition that will be used then subpartition level statistics will be used if available.

The word “knows” in the above list needs to be taken with a pinch of salt: the optimizer might use bind variable peeking to create an execution plan based on statistics for one (sub)partition and then use that plan on an entirely different (sub)partition later on. As always with bind variable peeking, the execution plan obtained through the explain plan statement may differ from the one seen at runtime.

Now comes the not so intuitive stuff:

  • If Oracle knows what partition will be used (and in the case of composite partitioned tables elimination of all but one subpartition has not occurred) but partition level statistics are not available then table level statistics will be used only if GLOBAL_STATS is set to “YES” for the table level statistics.
  • If subpartition elimination occurs and the optimizer knows what single partition will be used but not which single subpartition then table level statistics will be used if available regardless of the value of GLOBAL_STATS.
  • If Oracle knows what single subpartition will be used and subpartition statistics are not available, then table level statistics will be used only if GLOBAL_STATS is set to “YES” for the table level statistics. Interestingly, partition level statistics will not be used even if GLOBAL_STATS=”YES”.

Roberto provided me with a full suite of test cases to verify the observations above and I have checked them on 11gR2 and 12cR1. For the purpose of this blog I will provide just three cases.

Let us begin by creating our test table

CREATE TABLE stats_test
(
   part_key      INTEGER
  ,subpart_key   INTEGER
)
PARTITION BY RANGE (part_key)
   SUBPARTITION BY LIST (subpart_key)
   (PARTITION p1 VALUES LESS THAN (2)
       (SUBPARTITION p1_sp1 VALUES (1), SUBPARTITION p1_sp2 VALUES (2))
   ,PARTITION p2 VALUES LESS THAN (3)
       (SUBPARTITION p2_sp1 VALUES (1), SUBPARTITION p2_sp2 VALUES (2)));

BEGIN
   DBMS_STATS.set_table_stats (ownname    => USER
                              ,tabname    => 'STATS_TEST'
                              ,partname   => 'P1'
                              ,numrows    => 100);
   DBMS_STATS.set_column_stats (ownname    => USER
                               ,tabname    => 'STATS_TEST'
                               ,partname   => 'P1'
                               ,colname    => 'PART_KEY'
                               ,distcnt    => 5
                               ,density    => 1 / 5);
   DBMS_STATS.set_column_stats (ownname    => USER
                               ,tabname    => 'STATS_TEST'
                               ,partname   => 'P1'
                               ,colname    => 'SUBPART_KEY'
                               ,distcnt    => 5
                               ,density    => 1 / 5);
   DBMS_STATS.set_table_stats (ownname    => USER
                              ,tabname    => 'STATS_TEST'
                              ,partname   => 'P2'
                              ,numrows    => 200);
   DBMS_STATS.set_column_stats (ownname    => USER
                               ,tabname    => 'STATS_TEST'
                               ,partname   => 'P2'
                               ,colname    => 'PART_KEY'
                               ,distcnt    => 2
                               ,density    => 1 / 2);
   DBMS_STATS.set_column_stats (ownname    => USER
                               ,tabname    => 'STATS_TEST'
                               ,partname   => 'P2'
                               ,colname    => 'SUBPART_KEY'
                               ,distcnt    => 2
                               ,density    => 1 / 2);
END;
/

STATS_TEST has two partitions, each of which has two subpartitions. The table is empty but I have set relevant statistics at the partition level only. This means that:

  • There are no subpartition level statistics
  • The partition level statistics have GLOBAL_STATS=YES
  • The table level statistics have GLOBAL_STATS=NO because they have been aggregated from the partition level statistics.

Now let us run a couple of EXPLAIN PLAN statements and examine the associated execution plans. Let us start without using bind variables.

EXPLAIN PLAN
   FOR
      SELECT COUNT (*)
        FROM stats_test
       WHERE part_key = 1 AND subpart_key = 1;

SELECT *
  FROM TABLE (
          DBMS_XPLAN.display (NULL, NULL, 'basic +rows +partition +note'));
----------------------------------------------------------------------
| Id  | Operation               | Name       | Rows  | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT        |            |     1 |       |       |
|   1 |  SORT AGGREGATE         |            |     1 |       |       |
|   2 |   PARTITION RANGE SINGLE|            |     1 |     1 |     1 |
|   3 |    PARTITION LIST SINGLE|            |     1 |     1 |     1 |
|   4 |     TABLE ACCESS FULL   | STATS_TEST |     1 |     1 |     1 |
----------------------------------------------------------------------

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Notice that the precise subpartition is known because literals have been used in the predicates. However, there are no subpartition statistics to use. The partition level statistics have, mysteriously, been bypassed, and the table level statistics have not been used because GLOBAL_STATS=NO. The fact that no statistics have been used is confirmed by both the row count of 1 and the use of dynamic sampling.

Now let us replace one literal with a bind variable.

EXPLAIN PLAN
   FOR
      SELECT COUNT (*)
        FROM stats_test
       WHERE part_key = 1 AND subpart_key = :b1;

SELECT *
  FROM TABLE (
          DBMS_XPLAN.display (NULL, NULL, 'basic +rows +partition +note'));
----------------------------------------------------------------------
| Id  | Operation               | Name       | Rows  | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT        |            |     1 |       |       |
|   1 |  SORT AGGREGATE         |            |     1 |       |       |
|   2 |   PARTITION RANGE SINGLE|            |    12 |     1 |     1 |
|   3 |    PARTITION LIST SINGLE|            |    12 |   KEY |   KEY |
|   4 |     TABLE ACCESS FULL   | STATS_TEST |    12 |   KEY |   KEY |
----------------------------------------------------------------------

In this example, the optimizer knows what partition is used but not what subpartition. In this case, the optimizer uses the table level statistics regardless of the fact that GLOBAL_STATS=NO. Notice that the row count is 12, obtained by dividing the number of rows the statistics indicate are in the table as a whole by the number of distinct values of the two columns (300/5/5=12).
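
If you want to sanity-check that arithmetic, the standard selectivity calculation for independent equality predicates is easy to reproduce. This little Python sketch is mine, not Oracle's code:

```python
# Reproduce the optimizer's cardinality estimate for
# part_key = 1 AND subpart_key = :b1 using the table level statistics.
num_rows = 100 + 200          # table level row count aggregated from P1 and P2
ndv_part_key = 5              # distinct values recorded for PART_KEY
ndv_subpart_key = 5           # distinct values recorded for SUBPART_KEY

# Independent equality predicates: multiply the selectivities.
estimate = num_rows * (1 / ndv_part_key) * (1 / ndv_subpart_key)
print(round(estimate))        # matches the Rows column in the plan
```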

Finally, let us actually run the query with the bind variable in place:

DECLARE
   dummy       PLS_INTEGER;
   v_subpart   PLS_INTEGER := 1;
BEGIN
   SELECT COUNT (*)
     INTO dummy
     FROM stats_test
    WHERE part_key = 1 AND subpart_key = v_subpart;
END;
/

SELECT *
  FROM TABLE (
          DBMS_XPLAN.display_cursor ('c9uw6vg3204v0'
                                    ,NULL
                                    ,'basic +rows +partition +peeked_binds +note'));
SELECT COUNT (*) FROM STATS_TEST WHERE PART_KEY = 1 AND SUBPART_KEY =
:B1

Plan hash value: 3184952168

----------------------------------------------------------------------
| Id  | Operation               | Name       | Rows  | Pstart| Pstop |
----------------------------------------------------------------------
|   0 | SELECT STATEMENT        |            |       |       |       |
|   1 |  SORT AGGREGATE         |            |     1 |       |       |
|   2 |   PARTITION RANGE SINGLE|            |     1 |     1 |     1 |
|   3 |    PARTITION LIST SINGLE|            |     1 |   KEY |   KEY |
|   4 |     TABLE ACCESS FULL   | STATS_TEST |     1 |   KEY |   KEY |
----------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :B1 (NUMBER): 1

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

The use of bind variable peeking means that the optimizer now knows that it wants to use statistics for subpartition P1_SP1 even though the actual subpartition at runtime might vary. The optimizer reverts to the use of dynamic sampling as with the use of literals.

I don’t know if this behaviour is deliberate or accidental but if it is deliberate the logic escapes me.

December 9, 2014

UKOUG TECH QUIZ TECH14 Answers

Filed under: Uncategorized — tonyhasler @ 10:05 am

The questions in the quiz were, of course, all trick questions.

Question 1: Answer D

SELECT * FROM T1 LEFT NATURAL JOIN T2;

SELECT * FROM T1 NATURAL LEFT JOIN T2;

The second example shows the correct syntax for a natural outer join.  In the first example, LEFT is interpreted as a table alias for T1 and so an inner join is used.

QUESTION 2: Answer E

SELECT * FROM T1 INNER JOIN T2 ON C1=C2;

SELECT * FROM T1 OUTER JOIN T2 ON C1=C2;

This is the same sort of issue.  Any outer join needs to specify either LEFT or RIGHT, but in the second example above OUTER is interpreted as a table alias for T1.  The keyword INNER is optional.  A third construct that also generates identical results is:

SELECT * FROM T1 OUTER INNER JOIN T2 ON C1=C2;

Question 3: Answer D

SELECT *
FROM t1, t2, t3
WHERE t1.c1 = t2.c2(+) AND t2.c2(+) = t3.c3;

SELECT *
FROM t1
LEFT JOIN t2 ON t1.c1 = t2.c2
RIGHT JOIN t3 ON t2.c2 = t3.c3;

Those of you who attended my talk on Monday will know the answer to this one.  The ambiguity in the first statement meant that it was illegal in releases prior to 12c, but it has now become legal for performance reasons.  To express the results of the first query in ANSI syntax you would have to say:

SELECT * FROM T1 CROSS JOIN T3 LEFT JOIN T2 ON T1.C1=T2.C2 AND T3.C3=T2.C2;

Question 4: Answer E

SELECT c1, COUNT (*)
FROM t1
GROUP BY c1
HAVING COUNT (*) = 1;

SELECT c1, COUNT (*)
FROM t1
HAVING COUNT (*) = 1
GROUP BY c1;

I wouldn’t consider it best practice but the HAVING clause may precede the GROUP BY clause.

Tie breaker 1:

It is actually possible to solve the Paul and Sam problem using just two query blocks.  Here is my solution:


WITH base
AS (   SELECT TRUNC (LEVEL / 97) + 2 a,
              MOD (LEVEL, 97) + 3 b,
              TRUNC (LEVEL / 97) + MOD (LEVEL, 97) + 5 s,
              (TRUNC (LEVEL / 97) + 2) * (MOD (LEVEL, 97) + 3) p,
              COUNT (*)
                 OVER (PARTITION BY (TRUNC (LEVEL / 97) + 2)
                                    * (MOD (LEVEL, 97) + 3))
                 p_cnt1,
              COUNT (*)
                 OVER (PARTITION BY (TRUNC (LEVEL / 97) + 2)
                                    + (MOD (LEVEL, 97) + 3))
                 s_cnt1
         FROM DUAL b
        WHERE TRUNC (LEVEL / 97) <= MOD (LEVEL, 97)
   CONNECT BY LEVEL <= 97 * 97)
-- Main query block
SELECT a, b, p, s
  FROM base b LEFT JOIN base p ON p.p_cnt1 = 1 AND b.s = p.s
 WHERE b.s_cnt1 > 1 AND p.s IS NULL
GROUP BY b.p
HAVING COUNT (*) = 1
MODEL UNIQUE SINGLE REFERENCE RETURN UPDATED ROWS
   DIMENSION BY (COUNT (*) OVER (PARTITION BY MAX (b.s)) s_cnt)
   MEASURES (MAX (b.a) a, MAX (b.b) b, b.p, MAX (b.s) s)
   RULES
      (a [1] = a[1]);

The first query block is easy enough to understand.  We generate all possible pairs of numbers where A is the lower and B is the higher.  P_CNT1 is the number of rows with the same product and S_CNT1 is the number of rows with the same sum.

To understand the main query block we need to build it up in stages.

Let us start with this:
.....
SELECT a, b, p, s
  FROM base b
 WHERE b.s_cnt1 > 1
   AND b.s NOT IN (SELECT s
                     FROM base p
                    WHERE p.p_cnt1 = 1);

Here we select just the rows where Sam doesn't know the answer and where Sam knew that Paul didn't know, because there is no pair of numbers with the same sum for which Paul could know.  We can get rid of the subquery with an outer join.


SELECT b.a,
       b.b,
       b.p,
       b.s
  FROM base b LEFT JOIN base p ON p.p_cnt1 = 1 AND b.s = p.s
 WHERE b.s_cnt1 > 1 AND b.p_cnt1 > 1 AND p.s IS NULL;

In the above example the last predicate selects just the rows preserved by the outer join by virtue of not matching.

We can now move on to identify the rows where Paul can now declare that he knows.  For example, if Paul had been given the product 18, he would know that the numbers couldn’t be 3 and 6 because then Sam would have been given the number 9 and Sam wouldn’t have known in advance that Paul hadn’t been given the number 14 and hence have known the answer.  Once Paul is able to eliminate 3 and 6 from consideration he knows that the numbers must be 2 and 9.

We can identify the 86 rows where Paul now knows by using a GROUP BY with a HAVING clause.

...
SELECT MAX (b.a) a,
       MAX (b.b) b,
       b.p,
       MAX (b.s) s,
       COUNT (*) OVER (PARTITION BY MAX (b.s)) s_cnt
  FROM base b LEFT JOIN base p ON p.p_cnt1 = 1 AND b.s = p.s
 WHERE b.s_cnt1 > 1 AND b.p_cnt1 > 1 AND p.s IS NULL
GROUP BY b.p
HAVING COUNT (*) = 1;

The MAX aggregate functions could be MIN or even AVG because in the result set we are aggregating just one row.  The analytic function identifies the number of possibilities that Sam can now consider: the number out of the remaining 86 rows with identical values for the sum.  A visual inspection of these 86 rows yields the one possibility where Sam now knows the answer.  But how can we select that row without another query block?

The answer is to use the MODEL clause.

First of all, let us get the MODEL clause set up without attempting to restrict the rows.  When a MODEL clause is used, the aggregate and analytic expressions move out of the SELECT list and into the PARTITION, DIMENSION, and MEASURES sub-clauses.  We ultimately want to look up using S_CNT so that goes in the DIMENSION clause.

...
SELECT a, b, p, s, s_cnt
  FROM paul_knows b LEFT JOIN paul_knows p ON p.p_cnt1 = 1 AND b.s = p.s
 WHERE b.s_cnt1 > 1 AND p.s IS NULL
GROUP BY b.p
HAVING COUNT (*) = 1
MODEL UNIQUE SINGLE REFERENCE
   DIMENSION BY (COUNT (*) OVER (PARTITION BY MAX (b.s)) s_cnt)
   MEASURES (MAX (b.a) a, MAX (b.b) b, b.p, MAX (b.s) s)
   RULES
      ();

The above construct generates the same 86 rows as the previous example.  The UNIQUE SINGLE REFERENCE bit is used to suppress a check the SQL engine makes: a reference to a measure such as a[3] would appear to reference a single cell but would actually reference six cells.  By specifying UNIQUE SINGLE REFERENCE we promise not to do anything naughty like that and the error is suppressed.

To restrict the rows further we add the keywords RETURN UPDATED ROWS.  If that is all we did we would get no rows returned because there are as yet no rules to update rows.  The final thing to do is to update the single row that is our puzzle answer.  a[1] = a[1] is a dummy update that identifies the one row (a = 4 and b = 13) that is our puzzle answer.  We could have used any measure, for example b[1] = b[1] + 0.
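
For anyone who wants to check the reasoning without a database to hand, here is a Python re-implementation of the same three steps.  It mirrors the SQL above (including the CONNECT BY row generation); the variable names are mine, not part of the SQL:

```python
from collections import Counter

# Reproduce the BASE query block: the same pairs the CONNECT BY generates.
pairs = []
for level in range(1, 97 * 97 + 1):
    i, j = level // 97, level % 97
    if i <= j:                        # TRUNC(LEVEL/97) <= MOD(LEVEL,97)
        pairs.append((i + 2, j + 3))  # (a, b)

prod_cnt = Counter(a * b for a, b in pairs)   # p_cnt1
sum_cnt = Counter(a + b for a, b in pairs)    # s_cnt1

# Sums for which some pair has a unique product (Paul could have known).
paul_could_know = {a + b for a, b in pairs if prod_cnt[a * b] == 1}

# Step 1: Sam doesn't know, and Sam knew Paul didn't know.
step1 = [(a, b) for a, b in pairs
         if sum_cnt[a + b] > 1 and (a + b) not in paul_could_know]

# Step 2: Paul now knows -- his product occurs exactly once in step 1.
step1_prod_cnt = Counter(a * b for a, b in step1)
paul_knows = [(a, b) for a, b in step1 if step1_prod_cnt[a * b] == 1]

# Step 3: Sam now knows -- his sum occurs exactly once among those rows.
final_sum_cnt = Counter(a + b for a, b in paul_knows)
answer = [(a, b) for a, b in paul_knows if final_sum_cnt[a + b] == 1]
print(len(paul_knows), answer)
```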

June 10, 2012

tony-hasler-cv

Filed under: Uncategorized — tonyhasler @ 1:50 pm

Here is my latest CV/resume:

Tony Hasler 2016

March 26, 2012

Aggregate Functions and Sorts

Filed under: SQL Fun,Uncategorized — tonyhasler @ 3:46 pm

I have been prompted to write one of my rare blogs because I have just discovered that the topic of aggregate functions is not quite as straightforward as I previously thought.

Before we start, I want you to look long and hard at the following query and tell me (figuratively speaking) how many sorts are involved in it:

SELECT AVG (c1)
      ,MIN (c1)
      ,MAX (c1)
      ,STDDEV (c2)
      ,RANK (2) WITHIN GROUP (ORDER BY c1 DESC)
      ,STDDEV (c1) KEEP (DENSE_RANK FIRST ORDER BY c2)
      ,VARIANCE (c2) KEEP (DENSE_RANK LAST ORDER BY c3)
  FROM t1;

How are you doing? Let us have a look at the execution plan and see if that helps.

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    39 |     6   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    39 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |  6000 |   228K|     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Has that changed your opinion?

So the correct answer is……zero!

If you are in any doubt, you can create the table I used for the query and try it yourself:

CREATE TABLE t1
AS
       SELECT ROWNUM c1, MOD (ROWNUM, 2) c2, MOD (ROWNUM, 3) c3
         FROM DUAL
   CONNECT BY LEVEL <= 6000;

You can use the following query before and after the query under test to check the sort statistics:

SELECT name, VALUE
  FROM v$mystat NATURAL JOIN v$statname
 WHERE name LIKE 'sort%';

WARNING: Do not use autotrace when checking the stats, as autotrace itself can perform a sort.

Let us see how it is possible for this statement to complete without any sorts.

The AVG function calculates the mean of a set of values. AVG(c1) can be calculated as the rows are scanned by simply keeping a running total of c1 and a count of the number of rows where c1 is not null and then performing a division at the end. No sort is required. It should be clear that no sort is required for MIN or MAX either. It may be less clear (until you Google it) that STDDEV and VARIANCE aggregations can be performed in a single pass with no sort. In my query, the RANK function simply counts the number of rows that have a value greater than or equal to 2.
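
For anyone wondering how a variance can be computed in a single pass with no sort, here is a sketch of Welford's algorithm, which is the textbook way of doing it.  I am not claiming Oracle uses exactly this formulation:

```python
import math

def single_pass_stddev(values):
    """Sample standard deviation in one pass, no sort required."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        if x is None:              # aggregate functions ignore NULLs
            continue
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)   # running sum of squared deviations
    return math.sqrt(m2 / (n - 1)) if n > 1 else None

print(single_pass_stddev([2, 4, 4, 4, 5, 5, 7, 9]))
```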

This is all very straightforward but what about the last two calculations? Those ORDER BY clauses surely imply some kind of sort don’t they?

Well no. In my query, the FIRST function returns the standard deviation of all values of c1 amongst the rows that have the lowest value of c2.

In general, the FIRST function almost certainly operates by keeping track of the lowest value of c2 encountered so far and on encountering a new row does the following:

  1. If the value of c2 is greater than the stored minimum value of c2, ignore the row for the purposes of this aggregate function.
  2. If the value of c2 equals the minimum encountered so far then perform the aggregation calculations on c1 defined on the left as if no FIRST function was specified.
  3. If the value of c2 is less than the minimum stored so far, replace the stored minimum, delete any stored aggregate values of c1 calculated so far and then process the row as per step 2.
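
The three steps above are easy to mock up.  Here is a speculative Python sketch of the mechanism behind an aggregate with KEEP (DENSE_RANK FIRST ORDER BY c2); it is a guess at how the single pass works, not Oracle's actual code:

```python
def keep_first(rows):
    """rows is a sequence of (c1, c2) pairs.
    Return the c1 values kept for the aggregate: the c1 values of the
    rows holding the minimum c2 seen so far, maintained in one pass."""
    min_c2 = None
    kept = []                  # running aggregate state for c1
    for c1, c2 in rows:
        if min_c2 is None or c2 < min_c2:
            min_c2 = c2        # step 3: new minimum, discard old state
            kept = [c1]
        elif c2 == min_c2:
            kept.append(c1)    # step 2: aggregate this row's c1
        # step 1: c2 > min_c2, so the row is ignored
    return kept

print(keep_first([(10, 2), (3, 1), (7, 1), (9, 5), (4, 1)]))  # → [3, 7, 4]
```

In practice the kept state would be the running aggregate itself (for STDDEV, a count, mean, and sum of squared deviations) rather than a list of values, but a list keeps the sketch short.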

Now, so far I have been very careful in my selection of aggregate functions. Some aggregate functions do require a sort. Let us make a tiny change to the select statement:

SELECT AVG (c1)
      ,MIN (c1)
      ,MAX (c1)
      ,STDDEV (c2)
      ,DENSE_RANK (2) WITHIN GROUP (ORDER BY c1 DESC)
      ,STDDEV (c1) KEEP (DENSE_RANK FIRST ORDER BY c2)
      ,VARIANCE (c2) KEEP (DENSE_RANK LAST ORDER BY c3)
  FROM t1;

The only difference is that I have replaced the RANK function call with a DENSE_RANK function call. The DENSE_RANK function requires a sort as confirmed by the statistics. This is the new execution plan:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    39 |     6   (0)| 00:00:01 |
|   1 |  SORT GROUP BY     |      |     1 |    39 |            |          |
|   2 |   TABLE ACCESS FULL| T1   |  6000 |   228K|     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

Note that the SORT AGGREGATE operation has been replaced by a SORT GROUP BY. So we have discovered something interesting: SORT AGGREGATE does not perform a sort! Ever!

What happens if we need more than one sort?

SELECT DENSE_RANK (3000) WITHIN GROUP (ORDER BY c1 DESC), MEDIAN (c2) FROM t1;

The execution plan remains the same and the statistic “sorts (memory)” seems to indicate that there has been only one sort but this is impossible. In fact, if you look at “sorts (rows)” you will see that it has been incremented by 9000. 6000 of these rows are for the MEDIAN function but DENSE_RANK only needs to sort the 3000 rows where c1 has a value of less than or equal to 3000. So SORT GROUP BY just counts one sort in the statistics, no matter how many are actually performed.

Let us introduce a GROUP BY clause:

SELECT DENSE_RANK (3000) WITHIN GROUP (ORDER BY c1 DESC) from t1 GROUP BY c2;

The execution plan remains unchanged and the statistics once again show only one sort. In this case, however, there may be several sorts: the first sort is by c2, the others will be sorts to evaluate DENSE_RANK within each group. The “sorts (rows)” statistic seems to support this theory.

I think that is almost enough for this blog. I will post another soon on the difference between the FIRST and LAST functions and the FIRST_VALUE and LAST_VALUE functions. But let me close this note with a comment about the HASH GROUP BY operation introduced in Oracle 10g.

HASH GROUP BY doesn’t perform a sort (as you might expect) and so can only be used for (sub)queries that do not involve aggregates that require sorts. Consider this query:

SELECT MAX(c1) FROM t1 GROUP BY c2;

Here is the execution plan:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  6000 |   152K|     7  (15)| 00:00:01 |
|   1 |  HASH GROUP BY     |      |  6000 |   152K|     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T1   |  6000 |   152K|     6   (0)| 00:00:01 |
---------------------------------------------------------------------------

In this case we simply keep an array of maximums (one for each distinct value of c2) as the table is scanned.
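
That mechanism is essentially a hash table of running maximums, one slot per group key.  A minimal sketch in Python (mine, for illustration only):

```python
def hash_group_by_max(rows):
    """rows is a sequence of (c1, c2) pairs.
    Return {c2: MAX(c1)} built in one scan with no sort."""
    maxes = {}
    for c1, c2 in rows:
        if c2 not in maxes or c1 > maxes[c2]:
            maxes[c2] = c1     # keep the running maximum per group
    return maxes

# With c1 = ROWNUM and c2 = MOD(ROWNUM, 2), as in the test table:
print(hash_group_by_max([(r, r % 2) for r in range(1, 7)]))
```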

December 26, 2011

FORCE_MATCH for Stored Outlines and/or SQL Baselines????? – follow up

Filed under: Uncategorized — tonyhasler @ 1:45 pm

As has happened a couple of times recently, comments on a blog of mine have helped me understand the subject matter on which I am commenting far better.

In this case a dialogue with Dom Brooks on my previous blog on how to simulate a stored outline/SQL baseline with FORCE_MATCH using SQL profiles has provided me with a simplified way of approaching the creation of said profiles.

The observation is that there is a column OTHER_XML that appears in a number of tables and views from which you can directly copy the outline hints necessary to fix the execution plan of a statement – and use FORCE_MATCH to boot.

It seems that we are not the first to experiment with this technique. See, for example, this blog by Kerry Osborne published a while ago, though I think these posts have a slightly different emphasis.

The OTHER_XML column appears in PLAN_TABLE/SYS.PLAN_TABLE$, V$SQL_PLAN, and DBA_HIST_SQL_PLAN. The outline hints from these views are often displayed using the OUTLINE or ADVANCED formatting options to DBMS_XPLAN.DISPLAY, DBMS_XPLAN.DISPLAY_CURSOR, or DBMS_XPLAN.DISPLAY_AWR table functions respectively.

There is also an OTHER_XML column in DBA_SQLTUNE_PLANS that provides an alternative way to accept a SQL profile. Let me demonstrate. Let us setup some tables first:


-- 
-- Setup tables for test and gather stats 
-- 
DROP TABLE t1; 
DROP TABLE t2; 
DROP TABLE t3; 

BEGIN 
   FOR r IN (SELECT name FROM dba_sql_profiles) 
   LOOP 
      DBMS_SQLTUNE.drop_sql_profile (r.name); 
   END LOOP; 
END; 
/ 

CREATE TABLE t1 
AS 
       SELECT ROWNUM * 2 c1 
         FROM DUAL 
   CONNECT BY LEVEL <= 100000; 

CREATE TABLE t2 
AS 
       SELECT ROWNUM * 2 - 1 c1 
         FROM DUAL 
   CONNECT BY LEVEL <= 50000 
   UNION ALL 
   SELECT 2 FROM DUAL; 

CREATE TABLE t3 
AS 
       SELECT ROWNUM * 2 c1 
         FROM DUAL 
   CONNECT BY LEVEL <= 200000; 

EXEC dbms_stats.gather_table_stats(user,'T1'); 
EXEC dbms_stats.gather_table_stats(user,'T2'); 
EXEC dbms_stats.gather_table_stats(user,'T3'); 

CREATE INDEX t3_i1 
   ON t3 (c1); 

These tables are set up in a way that deliberately confuses the CBO. Now, let us run the SQL Tuning advisor.

 
BEGIN 
   DBMS_SQLTUNE.drop_sql_profile (name => 'Tony''s SQLTUNE profile'); 
EXCEPTION 
   WHEN OTHERS 
   THEN 
      NULL; 
END; 
/ 

VARIABLE task_name VARCHAR2(20) 

BEGIN 
   :task_name := 
      DBMS_SQLTUNE.create_tuning_task ( 
         sql_text => ' 
SELECT * 
        FROM t1, t2, t3 
       WHERE t1.c1 = t2.c1 AND t1.c1 = t3.c1 
 '); 
   DBMS_SQLTUNE.execute_tuning_task (:task_name); 
END; 
/ 

SET LONG 100000 LONGC 100000 PAGES 10000 

SELECT DBMS_SQLTUNE.report_tuning_task (:task_name) FROM DUAL 
/ 

-- 
-- This is the standard way to accept a profile (with FORCE_MATCHING) 
-- 

--BEGIN 
--   DBMS_SQLTUNE.accept_sql_profile (task_name => :task_name, REPLACE => TRUE, name => 'Tony''s SQLTUNE profile', force_match => TRUE);

--END; 
--/ 

--EXPLAIN PLAN 
--   FOR 
--      SELECT * 
--        FROM t1, t2, t3 
--       WHERE t1.c1 = t2.c1 AND t1.c1 = t3.c1; 

--SELECT * FROM TABLE (DBMS_XPLAN.display); 

The hints stored in the SQL profile by the commented-out method are not guaranteed to (and are not intended to) fix the execution plan in quite the same way as a stored outline or SQL baseline would. By extracting the outline hints from the advisor task, however, we can fix the plan with the same level of certainty as a SQL baseline or stored outline.

 

DECLARE 
   v_sqlprof_xml   CLOB; 
BEGIN 
   SELECT REGEXP_SUBSTR (other_xml, '<outline_data>.*</outline_data>') 
     INTO v_sqlprof_xml 
     FROM dba_sqltune_plans JOIN dba_advisor_tasks USING (task_id) 
    WHERE     attribute = 'Using SQL profile' 
          AND other_xml IS NOT NULL 
          AND task_name = :task_name; 

   DBMS_SQLTUNE.import_sql_profile (name => 'Tony''s SQLTUNE profile' 
                                    , description => 'Example profile from Tony Hasler''s blog' 
                                    , category => 'DEFAULT' 
                                    , sql_text => q'[ 
SELECT * 
        FROM t1, t2, t3 
       WHERE t1.c1 = t2.c1 AND t1.c1 = t3.c1 
]' 
                                    , REPLACE => TRUE, 
                                    force_match => TRUE, 
                                    profile_xml => v_sqlprof_xml); 
END; 
/ 

EXPLAIN PLAN 
   FOR 
      SELECT * 
        FROM t1, t2, t3 
       WHERE t1.c1 = t2.c1 AND t1.c1 = t3.c1; 

SELECT * FROM TABLE (DBMS_XPLAN.display); 

Note that REGEXP_SUBSTR seems adequate for extracting the outline hints from the OTHER_XML column.
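
The same extraction is easy to demonstrate outside the database.  In this Python sketch the two hints are copied from the outline listing elsewhere in this post, but the surrounding XML skeleton is made up for the example rather than being a verbatim Oracle document:

```python
import re

# An illustrative fragment in the shape of an OTHER_XML value.
other_xml = ('<other_xml><info type="db_version">11.2.0.1</info>'
             '<outline_data>'
             '<hint><![CDATA[FULL(@"SEL$58A6D7F6" "T1"@"SEL$1")]]></hint>'
             '<hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint>'
             '</outline_data></other_xml>')

# The same pattern the REGEXP_SUBSTR call uses.
outline = re.search('<outline_data>.*</outline_data>', other_xml).group(0)

# The individual hints can then be picked out of the CDATA sections.
hints = re.findall(r'<!\[CDATA\[(.*?)\]\]>', outline)
print(hints)
```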

December 16, 2011

FORCE_MATCH for Stored Outlines and/or SQL Baselines?????

Filed under: Uncategorized — tonyhasler @ 2:11 pm

Note that there is a follow up to this post here that you should read after this post.

Stored outlines were introduced in Oracle 9i as a way of helping stabilise execution plans.  In 11gR1 they were deprecated in favour of enterprise-edition-only SQL baselines (sorry standard edition users )-:) but as of 11gR2 neither facility has the capability, built into SQL profiles, of matching a plan to any statement whose text matches when literal values and whitespace are ignored.  I heard this raised as an enhancement request at last week's UK Oracle User Group conference.

By a strange coincidence, I was faced with this precise problem this week.  The issue arose with a large application that was very time consuming and expensive to regression test.  It involved a dynamically generated SQL statement against partitioned tables that avoided the use of bind variables for literals, including a couple of dates.  Apart from the literal values, the query was identical from call to call.

When historic data was queried, the execution plan was fine.  However, when the current day was queried the plan was often very poor.  Neither gathering statistics during the day nor copying partition statistics from one day to the next helped.

My long term plan is to identify why the statistics copying isn’t working well and either to customise the copy process or to abandon the use of partition level statistics altogether.  But that is root cause analysis.  What about a recovery plan for my client?

The funny thing is that SQL baselines, stored outlines, and SQL profiles all store hints.  Normally, the types of hints that are stored in a SQL profile are very different from those stored in a SQL baseline or in a stored outline but they don’t have to be.

Let me demonstrate with a two table join for Christmas Day:

CREATE TABLE t1 (d1, c1 DEFAULT 'x') 
PARTITION BY RANGE 
   (d1) 
   ( 
      PARTITION p1 VALUES LESS THAN (DATE '2012-01-01'), 
      PARTITION pdefault VALUES LESS THAN (maxvalue)) 
AS 
       SELECT DATE '2011-12-25', RPAD ('x', 2000) 
         FROM DUAL 
   CONNECT BY LEVEL <= 10000; 

CREATE TABLE t2 
AS 
       SELECT DATE '2011-12-25' d1 
         FROM DUAL 
   CONNECT BY LEVEL <= 100; 

EXEC dbms_stats.gather_table_stats(ownname=>user,tabname=>'T1',no_invalidate=>false); 
EXEC dbms_stats.gather_table_stats(ownname=>user,tabname=>'T2',no_invalidate=>false); 

EXPLAIN PLAN 
   FOR 
      SELECT * 
        FROM t1 NATURAL JOIN t2 
       WHERE d1 = DATE '2011-12-25'; 

SET PAGES 0 LINES 132 

SELECT * FROM TABLE (DBMS_XPLAN.display (NULL, NULL, 'outline')); 

This is the output I get on my 11.2.0.1 test database:

Plan hash value: 937381588             

----------------------------------------------------------------------------------------------------
| Id  | Operation                | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT         |         |  1000K|  1923M|   932   (1)| 00:00:12 |       |       | 
|   1 |  HASH JOIN               |         |  1000K|  1923M|   932   (1)| 00:00:12 |       |       | 
|   2 |   PART JOIN FILTER CREATE| :BF0000 |   100 |   800 |     3   (0)| 00:00:01 |       |       |
|   3 |    TABLE ACCESS FULL     | T2      |   100 |   800 |     3   (0)| 00:00:01 |       |       |
|   4 |   PARTITION RANGE SINGLE |         | 10000 |    19M|   923   (1)| 00:00:12 |KEY(AP)|KEY(AP)| 
|   5 |    TABLE ACCESS FULL     | T1      | 10000 |    19M|   923   (1)| 00:00:12 |     1 |     1 | 
---------------------------------------------------------------------------------------------------- 

Outline Data                           
-------------                          

  /*+   
      BEGIN_OUTLINE_DATA               
      USE_HASH(@"SEL$58A6D7F6" "T1"@"SEL$1")                          
      LEADING(@"SEL$58A6D7F6" "T2"@"SEL$1" "T1"@"SEL$1")              
      FULL(@"SEL$58A6D7F6" "T1"@"SEL$1")                              
      FULL(@"SEL$58A6D7F6" "T2"@"SEL$1")                             
      OUTLINE(@"SEL$1")                
      OUTLINE(@"SEL$2")               
      MERGE(@"SEL$1")                  
      OUTLINE_LEAF(@"SEL$58A6D7F6")    
      ALL_ROWS                         
      DB_VERSION('11.2.0.1')           
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')                           
      IGNORE_OPTIM_EMBEDDED_HINTS      
      END_OUTLINE_DATA                 
  */

Now what if I re-gather statistics and issue the same statement for New Years day?

EXEC dbms_stats.gather_table_stats(ownname=>user,tabname=>'T1',no_invalidate=>false); 
EXEC dbms_stats.gather_table_stats(ownname=>user,tabname=>'T2',no_invalidate=>false); 
EXPLAIN PLAN 
   FOR 
      SELECT * 
        FROM t1 NATURAL JOIN t2 
       WHERE d1 = DATE '2012-01-01'; 

SELECT * FROM TABLE (DBMS_XPLAN.display (NULL, NULL, 'outline -predicate'));

This is the plan for that date:

Plan hash value: 2331183230                                 
------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     1 |  1019 |     6  (17)| 00:00:01 |       |       |
|   1 |  HASH JOIN              |      |     1 |  1019 |     6  (17)| 00:00:01 |       |       |
|   2 |   PARTITION RANGE SINGLE|      |     1 |  1011 |     2   (0)| 00:00:01 |     2 |     2 |
|   3 |    TABLE ACCESS FULL    | T1   |     1 |  1011 |     2   (0)| 00:00:01 |     2 |     2 |
|   4 |   TABLE ACCESS FULL     | T2   |     1 |     8 |     3   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------
Outline Data            
-------------           
  /*+                   
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SEL$58A6D7F6" "T2"@"SEL$1")                
      LEADING(@"SEL$58A6D7F6" "T1"@"SEL$1" "T2"@"SEL$1")    
      FULL(@"SEL$58A6D7F6" "T2"@"SEL$1")                    
      FULL(@"SEL$58A6D7F6" "T1"@"SEL$1")                    
      OUTLINE(@"SEL$1") 
      OUTLINE(@"SEL$2") 
      MERGE(@"SEL$1")   
      OUTLINE_LEAF(@"SEL$58A6D7F6")                         
      ALL_ROWS          
      DB_VERSION('11.2.0.1')                                
      OPTIMIZER_FEATURES_ENABLE('11.2.0.1')                 
      IGNORE_OPTIM_EMBEDDED_HINTS                           
      END_OUTLINE_DATA  
  */

Now what if I know that the first plan is desirable and the second one isn’t? Well, I can load the outline hints associated with the good plan into a SQL profile:

BEGIN 
   DBMS_SQLTUNE.import_sql_profile ( 
      name          => 'BLOG profile', 
      description   => 'Example profile from Tony Hasler''s blog', 
      category      => 'DEFAULT', 
      sql_text      => q'[ 
 SELECT * 
        FROM t1 NATURAL JOIN t2 
       WHERE d1 = DATE '2011-12-25' 
   ]', 
      REPLACE       => TRUE, 
      force_match   => TRUE, 
      profile       => sqlprof_attr('BEGIN_OUTLINE_DATA', 
'USE_HASH(@"SEL$58A6D7F6" "T1"@"SEL$1")', 
'LEADING(@"SEL$58A6D7F6" "T2"@"SEL$1" "T1"@"SEL$1")', 
'FULL(@"SEL$58A6D7F6" "T1"@"SEL$1")', 
'FULL(@"SEL$58A6D7F6" "T2"@"SEL$1")', 
'OUTLINE(@"SEL$1")', 
'OUTLINE(@"SEL$2")', 
'MERGE(@"SEL$1")', 
'OUTLINE_LEAF(@"SEL$58A6D7F6")', 
'ALL_ROWS', 
'DB_VERSION(''11.2.0.1'')', 
'OPTIMIZER_FEATURES_ENABLE(''11.2.0.1'')', 
'IGNORE_OPTIM_EMBEDDED_HINTS', 
'END_OUTLINE_DATA')); 
END;

Careful editing is required: each hint has to be an element in the SQLPROF_ATTR array, so be careful with multi-line hints. If you have a very long hint you may have to use the overloaded variant of this undocumented routine that allows you to specify XML rather than an array (profile_xml rather than profile). In my case, I was able to simply replace a long list of columns in an index hint with an index name. Also note the need to deal with single quotes inside the strings (as in OPTIMIZER_FEATURES_ENABLE).

Note the important FORCE_MATCH parameter.

With this profile in place, we get the same plan for both Christmas Day and New Years Day.

You may wonder what happens if the SQL text already contains hints. Well, you have to include these embedded hints in the SQL text when you create the profile, but the BEGIN_OUTLINE_DATA, IGNORE_OPTIM_EMBEDDED_HINTS, and END_OUTLINE_DATA hints in the SQL profile combine to allow the embedded hints to be overridden.

July 24, 2011

‘log file sync’ and ‘log file parallel write’ – part 2

Filed under: Uncategorized — tonyhasler @ 8:24 pm

I mentioned at the very end of part 1 that ‘log file parallel write’ wait times are in fact unlikely to be close to ‘log file sync’ wait times.  There have been many questions and blog entries on this topic over the years.  One common starting point seems to be Metalink article 34592.1.  Another good article is this one:

http://kevinclosson.wordpress.com/2007/07/21/manly-men-only-use-solid-state-disk-for-redo-logging-lgwr-io-is-simple-but-not-lgwr-processing/.

However, it seems that everybody has missed a fundamental complication of the changes in commit processing introduced in 10g.  The point has probably been missed because it is normally very hard to demonstrate.

In Oracle 9i and below, Oracle had the concept of a ‘log buffer’.  You would be in good company if you thought that later versions had a log buffer as well.  After all, there is an initialisation parameter LOG_BUFFER that specifies how big it is.

In reality, Oracle now uses a combination of ‘private redo strands’ and ‘public redo strands’.

Private redo strands do not exist in instances within a RAC cluster.  In a non-RAC environment each transaction gets its own redo strand (subject to availability).  These strands are used in combination with in-memory undo (IMU) to reduce the amount of data that is logged but, more importantly, to reduce latching conflicts; data is normally copied from a private redo strand only at commit time.  Data will also be copied out if the private redo strand fills up before a commit is issued.  These private redo strands are allocated from the shared pool.

RAC or no RAC, there is also a minimum of two public redo strands.  In larger systems there may be more (the number of CPUs divided by 16).  These public redo strands also contribute to latch reduction.  The number of these strands can be changed by use of the hidden parameter “_log_parallelism_max”.
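As a rough sketch, the strand count rule just described can be expressed in a few lines. The divisor of 16 is observed behaviour on the releases discussed here, not a documented formula:

```python
# Sketch of the public redo strand count described above: a minimum of two,
# growing with CPU count (CPUs / 16). The exact rule is undocumented
# behaviour, not a guarantee.
def public_redo_strands(cpu_count):
    return max(2, cpu_count // 16)

print(public_redo_strands(8))    # -> 2 (the minimum)
print(public_redo_strands(64))   # -> 4 (matching the 64-CPU system discussed below)
```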

In relatively quiet systems only one public redo strand may be in use but even with as few as two concurrent sessions busy generating redo you can see the second redo strand being used.  When that happens, LGWR writes out the data in the public redo strands sequentially!

How do I know this?

Well, I was performing a rather artificial insert-only performance test on one instance of an 11.2 RAC database on a 64-CPU system with a suspected faulty I/O subsystem (don’t ask).  LGWR was spending 90% of its time in ‘log file parallel write’ (LFPW) and about 10% of the time on the CPU.  I had taken the advice in Kevin’s article above and “reniced” LGWR so I was fairly sure that LGWR rescheduling was not a big issue: throughput was limited by the I/O subsystem.

Nevertheless, ‘log file sync’ (LFS) wait events averaged 4 or 5 times the LFPW times despite the average CPU utilisation being only about 25%.  Theoretically, one would expect the average LFS time to be somewhat more than 50% higher than LFPW in a system that is limited by LGWR I/O: if LGWR has just started an I/O at the time that you commit then you have to wait for that I/O to finish as well as your own that follows.  If you are luckier, LGWR may finish an I/O just as you commit and now you only have to wait for your own I/O. This gives an average LFS/LFPW ratio of 1.5.  Given that you are still in LFS while LGWR is on the CPU doing its post-processing, I thought that LFS should be a little more than 50% above LFPW but certainly not 300% or 400% higher.
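The 1.5 ratio can be checked with a tiny numerical model. This is only a sketch of the reasoning above (back-to-back LGWR writes, with commits arriving at uniformly spaced points within the current write), not a simulation of real LGWR behaviour:

```python
# Model: LGWR issues back-to-back writes, each taking d ms (I/O-bound system).
# A session committing at a uniformly distributed point within the current
# write waits out the remainder of that write plus one full write of its own.
d = 10.0          # 'log file parallel write' time in ms
n = 100_000       # commit arrival points to average over
arrivals = (i * d / n for i in range(n))
avg_lfs = sum((d - t) + d for t in arrivals) / n
print(round(avg_lfs / d, 3))   # average LFS / LFPW ratio -> 1.5
```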

Explanations revolving around ‘CPU storms’ didn’t seem plausible so I had to dig deeper.  Well, by ‘digging deeper’ I mean (a) posting a question on OTN and asking for help from ‘knowledgeable friends’ and, when that failed, (b) guessing.  I eventually guessed correctly when I set “_log_parallelism_max” to 1 (it was previously 4 on my 64-CPU system).

The faulty I/O subsystem had (has) the characteristic that throughput is several times higher when the I/O size is larger.  The reason for this is not yet known.  However, when the initialisation parameter change was made:

  • The average size of the I/O increased several times.
  • The (faulty) I/O subsystem throughput increased several fold.
  • The LFS time came back down to just over 50% above the LFPW time.
  • LGWR started waiting on the idle event ‘rdbms ipc message’; the system was no longer limited by I/O.
  • Latching increased, thus proving that the long-term solution was to fix the I/O subsystem, not to make unsupported changes to initialisation parameters!

I then set about trying to devise some kind of experiment that others can repeat to verify my conclusions.  The problem is that the only way to get more than one public redo strand into play is to have sufficient redo being generated that a redo allocation latch collision occurs.  You can easily generate this scenario.  You can also prove that the number of transactions being committed per write (redo size/commit size) is much less than the number of processes in LFS, but I can’t think of a way to prove that this discrepancy is caused by I/O serialisation without a faulty I/O subsystem.  Sure, you can run a trace and see that there are multiple LFPW events between idle ‘rdbms ipc message’ events, but that is hardly conclusive.

Any ideas on how to devise an experiment would be warmly welcomed.  In the meantime I guess you will just have to take my word for it!

‘Log file sync’ and ‘log file parallel write’ – part 1

Filed under: Uncategorized — tonyhasler @ 8:15 pm

Preamble
This is the first part of a three part blog about LGWR. If you want a thorough understanding of LGWR, ‘log file sync’, and ‘log file parallel write’ read on. If you are an Oak Table member you might want to skip to the more advanced topics in parts 2 and 3 (unless you want to point out errors and omissions of course :-)).

You are a programmer or a DBA and you are trying for the first time in your life to analyze an Oracle performance problem. What is the most important concept you need to understand to get started?

Well, there are a number of right answers to this question but, rightly or wrongly, ‘the Oracle wait interface’ would be many people’s answer. A very basic approach would be to look at the highest wait event on the system and focus on that. So, for example, if you see a lot of ‘db file sequential read’ wait events you might conclude that faster disks are the answer. You may be a bit smarter and realise that these events are generally associated with indexed access, try to work out which indexes are being accessed, and figure out whether they (or the queries that use them) can be redesigned.

There are, of course, numerous books and web articles that explain the deficiencies of this type of approach but in many cases it seems to work. The success of this type of basic approach will be quite high in simple systems where performance has been neglected for a long time and there are one or two ‘quick wins’ that can have a significant benefit. In complex systems, or ones that have been properly managed over the years this sort of basic analysis is unlikely to be enough.

Cary Millsap wrote a book with Jeff Holt many years ago entitled ‘Optimizing Oracle Performance’ in which he explained the limitations of any system-wide tuning approach. Even though the book was written many years ago, I would say that this is still mandatory reading for any Oracle performance specialist. The next most important book, in my opinion, is a more modern work by Christian Antognini (http://antognini.ch/top/), but I digress.

Is ‘LOG FILE SYNC’ (LFS) relevant?

Arguably the single most misleading wait event in the Oracle database is ‘log file sync’ (LFS). Let me explain why. Suppose you have a single client process and it spends 1% of its time in LFS. Would you gain a lot by engaging in a tuning exercise that aimed exclusively at reducing this wait event? Well, obviously the upper limit on the gain you might make is 1% even if you eliminated these events in their entirety; the answer is, therefore, NO. You are unlikely to waste your time though because your Statspack or AWR report will probably not highlight the event in the first place.

On the other hand, suppose you have 10,000 sessions connecting to your database and 300 of these are active at any one time. Suppose that 1% of the active time of each session is spent in LFS; would you gain much by tuning the event? The answer is the same as before: NO. However, you might be misled by Statspack or AWR because now the total wait time for the database for LFS will be 3 hours in any one hour! This is because at any one time we have 300 active sessions and on average three of them will be in LFS. That comes out to be 3 hours per hour waiting! So before you conclude that LFS is a major performance issue, first check out what the average wait time per active session is. Of course, what you should really do is find out the percentage of wait time for your business-critical processes, but I digress again. Let us get back to the technicalities of LFS.
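The ‘3 hours per hour’ arithmetic is worth spelling out:

```python
# 300 active sessions, each spending 1% of its active time in 'log file sync'.
active_sessions = 300
lfs_fraction = 0.01

# On average this many sessions are sitting in LFS at any instant, so the
# database clocks up that many hours of LFS wait per wall-clock hour.
sessions_in_lfs = active_sessions * lfs_fraction
lfs_hours_per_hour = sessions_in_lfs
print(lfs_hours_per_hour)   # -> 3.0
```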

‘Log File Sync’ (LFS) versus ‘log file parallel write’ (LFPW)

When a transaction commits it will usually wait for its redo data to be recorded on disk before returning control to the application. That wait is LFS. There are three main exceptions to this:

  • When the commit is issued within a PL/SQL block. In this case the wait will be deferred until the block returns. If the PL/SQL block contains multiple commit statements then only one wait will occur when the block returns.
  • When the commit statement is actually COMMIT WORK NOWAIT. In this case, there is no LFS wait. See part 3 in this series for an explanation of why this is dangerous.
  • When the session parameter COMMIT_WAIT (COMMIT_WRITE in earlier releases) is set to NOWAIT. This parameter can be set at the instance level, but that is dangerous practice; it should be set at the session level, and only with care. See part 3 in this series for an explanation of why this is dangerous.

The waiting client process does not do the I/O itself. LGWR does. When LGWR does the I/O it waits on ‘log file parallel write’ (LFPW). So does this mean that the number of LFPW events should equal the number of LFS events? NO! This is because LGWR employs a tactic known as ‘group write’ and may write out the redo data for multiple transactions and multiple sessions in one go. Suppose you have 5 client processes each of which is waiting for its transaction to complete. These transactions will be waiting for LFS. LGWR will be waiting for LFPW. If the I/O takes 10ms then:

  • The system will accumulate 50ms of time in LFS (10ms for each client process). 5 wait events will be recorded.
  • LGWR will accumulate 10ms. One wait event will be recorded.

That is 6 wait events totalling 60ms clocked up for one 10ms I/O! You see the danger in oversimplifying?
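The group-write accounting in the example above can be sketched numerically:

```python
# Group write: one 10 ms LGWR write covers five committing sessions.
io_time_ms = 10
committers = 5

lfs_time = committers * io_time_ms   # each client clocks the full write: 50 ms
lfpw_time = io_time_ms               # LGWR clocks one wait: 10 ms
wait_events = committers + 1         # 5 LFS events plus 1 LFPW event

print(wait_events, lfs_time + lfpw_time)   # -> 6 60: six wait events, 60 ms, for one 10 ms I/O
```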

In the simplified example above I have suggested that the average wait time for LFS should be about the same as the average wait time for LFPW as they are both associated with waiting for the redo data to be written to disk. In fact, this is itself a misleading simplification. Onto part 2.
