Tony’s Oracle Tips

November 17, 2008

Analytic Functions Aren’t Always best

Filed under: SQL Fun — tonyhasler @ 11:20 am

Whilst preparing a talk for the UKOUG national conference I was surprised to see the following bad behaviour of analytic functions.  Here is the test:


ALTER SESSION SET current_schema=scott;
DROP TABLE empc;

CREATE TABLE empc
(
  empno     NUMBER(38),
  ename     VARCHAR2(100 BYTE),
  job       VARCHAR2(100 BYTE),
  mgr       NUMBER(38),
  hiredate  DATE,
  sal       NUMBER(38),
  comm      NUMBER(38),
  deptno    NUMBER(38)
);

INSERT INTO empc
            (empno, ename, job, mgr, hiredate, sal, comm, deptno)
   WITH rowss AS
        (SELECT     ROWNUM rn
               FROM DUAL
         CONNECT BY LEVEL <= 3000000)
   SELECT ROWNUM, 'Emp ' || ROWNUM, 'Job' || MOD (ROWNUM, 10), CEIL (ROWNUM / 100) + 1, SYSDATE,
          ROWNUM, ROWNUM, MOD (rn, 100) + 1
     FROM rowss;

EXEC dbms_stats.gather_table_stats('SCOTT','EMPC');

COMMIT;

ALTER SYSTEM FLUSH BUFFER_CACHE;

SET TIMING ON

PROMPT Analytic version

WITH main_query AS
     (SELECT /*+ NO_MERGE */
             empc.*, AVG (sal) OVER (PARTITION BY deptno) avgsal,
             SUM (sal) OVER (PARTITION BY deptno) sumsal
        FROM empc)
SELECT MAX (avgsal)
  FROM main_query;

PROMPT Aggregate version

ALTER SYSTEM FLUSH BUFFER_CACHE;

WITH qstat AS
     (SELECT   deptno, AVG (sal) avgsal, SUM (sal) sumsal
          FROM empc
      GROUP BY deptno),
     main_query AS
     (SELECT /*+ NO_MERGE */
             *
        FROM empc JOIN qstat USING (deptno)
             )
SELECT MAX (avgsal)
  FROM main_query;
 

And here are the results on my PC running 11g:


Session altered.
Table dropped.
Table created.
3000000 rows created.
PL/SQL procedure successfully completed.
Commit complete.
System switch log altered.
Analytic version

MAX(AVGSAL)
-----------
    1500050

1 row selected.
Elapsed: 00:00:18.64
Aggregate version
System switch log altered.
Elapsed: 00:00:00.11

MAX(AVGSAL)
-----------
    1500050

1 row selected.
Elapsed: 00:00:07.46

As you can see, the analytic version of the query ran considerably slower than the use of the simple aggregate.  I am not sure as to the reason for this.  Perhaps the analytic function is performing an unnecessary sort?  Anybody got any ideas?:

Footnote: The answer is because the analytic function sorts wide rows and spills onto disk.  If the select list is reduced then the analytic function works comparably to the aggegate function.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: