Thursday, November 19, 2009

Using the PARTITION BY function in Teradata SQL

The PARTITION BY command in Teradata is a very powerful function. The best use that I have found is when you have to find the minimum or maximum value, it can retrieve that row in one step instead of two.

One way to retrieve the first/last row across a certain criteria is to first build a table (or subquery), group by the attribute, and then use the min/max function. Then take that dataset and join it back to the original dataset to retrieve the row. The PARTITION BY command will do this in one step.

The syntax of the PARTITION BY command can be daunting at first, but it really isn't that complicated:

QUALIFY ROW_NUMBER() OVER (PARTITION BY X ORDER BY Y DESC) = 1

You put this directly under the chunk of code that you want to partition. The only parts of this code that you will usually change are X, Y, and DESC.

X is the field that you want to break on. It is the field that you would group by in the first step of the two-step technique.

Y is the field that you want to sort by.

= 1 means that you will get the first row returned for each value X. So, if you set the order for Y to be ASC, you will get the first row (MIN). If you set the order for Y to be DESC, you will get the last row (MAX).

Hope this helps!

No comments:

Post a Comment