SQL Brain

This is your Brain on SQL

  • About this Blog

    I use PostgreSQL and love it a lot.

    Whenver I discover or conjeur something neat, I post it here!

  • Subscribe

Archive for the ‘Clever Queries’ Category

Using DISTINCT with CASE

Posted by jrotenstein on September 23, 2010

I had an interesting situation where I wanted to count DISTINCT rows with variable values.

Imagine a table like this:

Customer City VIP
Acme Sydney true
Bally Sydney false
Bally Melbourne true
Costco Melbourne false
David Jones Sydney true
David Jones Melbourne true
Embraer Sydney false
Fastway Sydney true

Now, how would you count the number of distinct customers in each city by VIP status?

I’m looking for this result:

City VIPs Not VIPs
Melbourne 2 1
Sydney 3 2

You can’t just group by City and use COUNT(DISTINCT customer) because it has to be divided by VIP too. That’s when I realised that the DISTINCT keyword looks at the following expression rather than just a column name.

Here’s the code that can be used:

select
  city,
  count(distinct case when     vip then customer else null end) as VIPs,
  count(distinct case when not vip then customer else null end) as Not_VIPs
from table
group by city

The COUNT automatically ignores the null values, so it’s only count the number of distinct cities that are/are not VIPs.

That’s a very different way to use DISTINCT.

Posted in Clever Queries | Leave a Comment »

Avoid SQL Divide-by-zero errors by using NULLIF

Posted by jrotenstein on October 6, 2009

I picked up this tip while reading SQL Hacks.

Let’s say you’re receiving occasional divide by zero errors:

select
  count(name) / count(*) as "Average"
from xyz

This will result in a division by error if there are no rows returned.

It can be avoided by using the NULLIF function, which returns a null value if the two passed-in expressions are equal:

select
  count(name) / NULLIF( count(*), 0)
from xyz

So, if count(*) = 0, a null value is returned. The resulting division therefore equates to null. While that isn’t a particularly useful result, it’s better than getting an error!

Posted in Clever Queries | Leave a Comment »

Using logaritms for grouping large numbers

Posted by jrotenstein on August 20, 2009

I had a bit of fun today. I had to show a summary of Revenue from some companies that I had downloaded from Jigsaw. (Jigsaw is a community site for gathering and sharing Company and Contact information. They’ve got a free API for accessing company information, too!)

The revenue data was fairly random, like this:

5000000
891900000
5000000
150000000
500000
30000000
5000000
1000000000
5000000

Even grouping the data wasn’t much better:

revenue       count  
------------  -----  
5000000       2543   
500000        1636   
30000000      1128   
75000000      432    
150000000     339    
1000000000    338    
350000000     252    
750000000     156    
1200000000    11     
1100000000    10     
1300000000    9      
103630000000  9      
2100000000    4
etc.

So, what would be the best way to present this information?

There’s too many possible values to group into units of $million. But perhaps grouping into orders of magnitude would work:

select
  to_char(power(10, ceiling(log(revenue / 1000000))), 'FM$999,999,999"m"') as revenue,
  count(*)
from jigsaw
where
  revenue > 1000000
group by ceiling(log(revenue / 1000000))
order by ceiling(log(revenue / 1000000))

This results in:

revenue      count  
-----------  -----  
$1m          1      
$10m         2552   
$100m        1642   
$1,000m      1287   
$10,000m     234    
$100,000m    184    
$1,000,000m  24     

Much nicer!

Posted in Clever Queries | Leave a Comment »

 
Follow

Get every new post delivered to your Inbox.