Friday, December 05, 2008

Row_Number() function in SQL Server 2005

 
As we all know, SQL Server 2005 has new features when compared to SQL Server 2000. One of the features that we are going to discuss in this article is the Row_Number() function.
Traditionally we used TOP keyword, temporary tables and co-related sub-queries to generate calculated row numbers in a query. Now SQL Server 2005 provides a function, which replaces all of the additional resources we used to generate row numbers. It is not only used to generate row number for each row but also used for other purposes as well.

Let us assume that we have the following database [Test] and the following table [Orders] in the database. You can use the below query to create the database, table and all the corresponding rows.
-- Create Table Name Products
CREATE TABLE Products(
Prod_NO VARCHAR(10) NOT NULL primary Key,
Prod_Name VARCHAR(100),
Prod_Amt INT
)
GO


-- Insert into Products Tables with some values
INSERT INTO Products VALUES ('P000000001', 'Chemicals 1', 10000)
INSERT INTO Products VALUES ('P000000002', 'Chemicals 2', 13000)
INSERT INTO Products VALUES ('P000000003', 'Chemicals 3', 8000)
INSERT INTO Products VALUES ('P000000004', 'Chemicals 4', 5000)
INSERT INTO Products VALUES ('P000000005', 'Chemicals 5', 21000)
GO


-- Verify Values
SELECT * FROM Products

ROW_NUMBER is an analytic function. It assigns a unique number to each row to which it is applied (either each row in the partition or each row returned by the query), in the ordered sequence of rows specified in the order_by_clause, beginning with 1.

1) Generating Row Number for Each Row after ordering by Prod_Amt.
Following Query will return all result from Products table ordered by with the Amount and assigning each row a Row_Number beginning with 1.

SELECT ROW_NUMBER() OVER(Order by Prod_Amt) as Row_ID, * from Products

Result:
Row_ID Prod_NO Prod_Name Prod_Amt
1 P000000004 Chemicals 4 5000
2 P000000003 Chemicals 3 8000
3 P000000001 Chemicals 1 10000
4 P000000002 Chemicals 2 13000
5 P000000005 Chemicals 4 21000

2) Implement top-N, bottom-N, and inner-N reporting Using Sub Query
By nesting a sub query using ROW_NUMBER inside a query that retrieves the ROW_NUMBER values for a specified range, you can find a precise subset of rows from the results of the inner query. This use of the function lets you implement top-N, bottom-N, and inner-N reporting. For consistent results, the query must ensure a deterministic sort order.
E.g. we want to retrieve Top 2 Product Range which is not more than 10000 in price.

SELECT * FROM (
SELECT ROW_NUMBER() OVER(Order by Prod_Amt DESC) as Row_ID, * from Products
WHERE Prod_Amt <=10000) TopProduct WHERE Row_ID <=2

Result:
Row_ID Prod_NO Prod_Name Prod_Amt
1 P000000001 Chemicals 1 10000
2 P000000003 Chemicals 3 8000

3) Finding Nth maximum Amount of Product
By nesting a sub query using ROW_NUMBER inside a query that retrieves the ROW_NUMBER values for distinct values for a Product Amount, you can find Nth Maximum Product Amount from the results of the inner query

-- add two products with same amount
INSERT INTO Products VALUES ('P000000006', 'Chemicals 6', 13000)
INSERT INTO Products VALUES ('P000000007', 'Chemicals 7', 8000)

SELECT Prod_Amt FROM (
SELECT ROW_NUMBER() OVER(Order by Prod_Amt) as Row_ID, Prod_Amt
FROM (Select Distinct(Prod_Amt) from Products) DistinctAmts) Nth_Amount
WHERE Row_ID = 2
–- This will return you 2nd maximum product price
Result:
Prod_Amt
8000

4) The same Row_Number function can be used to delete duplicate entries. (that is exercise to you)

1 comment:

Unknown said...

Thanks.. it did help to explain ROW_Number concept.