Wednesday, September 17, 2014

Ignoring outliers in aggregate function

This is another aide-memoire for myself really.  I want to calculate the average load times per page for an application from timings stored in the database, and see which pages need attention. However, the stats can be skewed by the odd exceptional load that takes much longer than a typical load for reasons that are probably irrelevant to me.

Here is a fictitious example:

create table timings (id int, timing number);

insert into timings
select rownum, case when rownum=50 then 1000 else 1 end
from dual
connect by rownum <= 100;

This example has 99 timings of 1 second plus an oddity of 1000 seconds.

A simple average gives a skewed picture:

SQL> select avg(timing) from timings;

AVG(TIMING)
-----------
      10.99

It suggests that users are waiting 11 seconds on average for a page to load, when in fact it is usually 1 second.

The analytic function NTILE(n) can solve this.  This divides the set of results into n "buckets" and then tells us which bucket a particular value falls into.  If we do that with a suitable number of buckets, say 10, we will be able to exlude the highest 10% and lowest 10% of the values:

SQL> select avg(timing) from 
  2  (select timing, ntile(10) over(order by timing) bucket
  3   from timings)
  4  where bucket between 2 and 9;

AVG(TIMING)
-----------
          1


Thursday, September 11, 2014

Why use CASE when NVL will do?

I've found that many developers are reluctant to use "new" features like CASE expressions and ANSI joins. (By new I mean: this millennium.)

But now they have started to and they get carried away.  I have seen this several times recently:

    CASE WHEN column1 IS NOT NULL THEN column1 ELSE column2 END

Before they learned to use CASE I'm sure they would have written the much simpler:

    NVL (column1, column2)

Now I now that NVL is Oracle-specific and CASE is portable, but (a) we aren't ever going to be porting our millions of lines of PL/SQL, and (b) I can guarantee they didn't do it for that reason. They have got a new hammer and now everything looks like a nail.