Tony’s Oracle Tips

June 11, 2012

Ordering Non-sorting aggregate functions

Filed under: SQL Fun — tonyhasler @ 3:49 pm

Aggregate functions fall into two categories. The first is what I call “Sorting aggregate functions” that need their data sorted in some way to operate. Examples of sorting aggregate functions are MEDIAN and DENSE_RANK. The second category is non-sorting aggregate functions that do not require a sort to operate. These include COUNT, AVG, and RANK.

In Oracle 10g the concept of HASH AGGREGATION was introduced as an optimisation of the GROUP BY operation. The idea is that non-sorting aggregate functions (let me call them NSAFs from now on) do not need to be kept in a sorted list; you just keep a hash table of the various accumulating counters that are needed for that function’s operation. Since a hash table is quicker to access than a sorted list this offers some performance benefit, more so when the list of groups is large.

Hash aggregation obviously only works for NSAFs. If you include MEDIAN(X) in your select list or HAVING clauses then the input data will all need to be sorted. One big list is created, sorted first by the GROUP BY columns and then by, in this case, column X.

For some reason, there are five NSAFs that do not take advantage of hash aggregation. These are FIRST, LAST, RANK, SYS_XMLAGG and XMLAGG. I can’t imagine why not but there it is.

Even if an NSAF supports hash aggregation the CBO doesn’t always use it when the data needs to be sorted anyway, for example as a result of an ORDER BY operation.

Guy Harrison has suggested in his blog that the logic is “seriously flawed” because if you have 1,000,000 rows and 100 groups it would be best to defer the sort until after the aggregation because then you have only 100 rows to sort and not 1,000,000. He has even done some experiements to show that the CPU overhead is much less if you use the USE_HASH_AGGREGATION hint.

However, it turns out that Oracle does not sort 1,000,000 rows after all. Just the 100. Let us create a test case.

CREATE TABLE sort_test
AS
   WITH q1
        AS (    SELECT ROWNUM rn
                  FROM DUAL
            CONNECT BY LEVEL <= 100)
   SELECT MOD (a.rn, 2) grp, a.rn + b.rn + c.rn myint1, b.rn myint2
     FROM q1 a, q1 b, q1 c;

This table has 1,000,000 rows but there are only two distinct values for the GRP column. Let us set a 1K SORT_AREA_SIZE as follows:

ALTER SESSION SET workarea_size_policy=manual;
ALTER SESSION SET sort_area_size=1024;

Now let us see what the execution plan looks like for a statement that truly requires a sort of all 1,000,000 rows:

  SELECT grp, MEDIAN (myint1) myavg
    FROM sort_test
GROUP BY grp
ORDER BY grp;

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |   966K|    23M|   603   (9)| 00:00:08 |
|   1 |  SORT GROUP BY     |           |   966K|    23M|   603   (9)| 00:00:08 |
|   2 |   TABLE ACCESS FULL| SORT_TEST |   966K|    23M|   559   (2)| 00:00:07 |
--------------------------------------------------------------------------------

The 23M of space that we require is much greater than the 1K we have allocated and so when we run it we get a multi-pass sort that on my laptop took 51 minutes.

Let us change the function to a NSAF:

  SELECT grp, AVG (myint1) myavg
    FROM sort_test
GROUP BY grp
ORDER BY grp;

The execution plan reported by the CBO is identical. However, the query ran in under 1 second and the session statistics report a memory sort!

What is clearly happening is that only the accumulating counters are being maintained not the values from each row. I did the experiment on 10.2.0.4 and 11.2.0.1 so I don’t think it is a new feature.

Incidentally, the same mechansim is used for NSAFs that don’t support hash aggregation:

  SELECT grp, MIN (myint1) KEEP (DENSE_RANK FIRST ORDER BY myint2) myvalue
    FROM sort_test
GROUP BY grp
ORDER BY grp;

The above query also finishes very quickly.

So why was Guy able to demonstrate a performance improvement with the use of the USE_HASH_AGGREGATION hint? Well the answer is that wandering up and down a tree of sorted items can be much more expensive than performing a direct hash. Particularly as Guy had 200,000 groups. However, this benefit may be small, or even negative, if the number of groups is very small as in my artificial case.

So Guy is right about the CBO being “flawed” bacause a) the execution plan costs the operation incorrectly and b) the sorted tree is going to be expensive to walk up and down a huge number of times. However, things aren’t quite as bad as they first appear.

About these ads

2 Comments »

  1. Hi,
    Can you please also give some idea on why the first query has to use disk sort and second was done in the memory itself. Is this same case with top n query with rownum in outer query.Would it be possible for you to explain how oracle is going to sort in both cases..Thanks

    Comment by bILU — August 16, 2013 @ 3:38 am | Reply

    • bILU,

      The first query has to sort all of the rows within a group to determine the median. You don’t need to sort the elements in a group to determine the average. So in the second case Oracle just keeps running tallies for the sum and count for each group. Since I constructed the example with only two groups the sorted list has only two elements (one for grp=0 and the other for grp=1).

      As far as the sorting algorithm that is a big topic. See chapter 13 of Cost Based Oracle by Jonathan Lewis for one of many sources of information on this.

      Comment by tonyhasler — August 16, 2013 @ 10:24 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 45 other followers

%d bloggers like this: