SQL Server ROW_NUMBER()

I took a vacation with the family up north recently, and did some research before I left. Since I was going to be fishing mostly clear water, I did some reading on how best to fish musky in clear water. I’ve always suspected that deeper, open water was the way to go but never had the confidence to really attack it. However, after readying some great information on Andy Meyer’s website, I was ready to give it a good shot. I identified some new water with varying depths that I wanted to try out and planned on when to try each spot. While none of the new spots panned out, I did try a deeper spot off one of my regular spots and was rewarded with this 43″ muskie.
June Muskie.jpg
I was jerking a bulldawg very aggressively in 30 – 35 feet of water. I was alternating casts into shore, away from shore, and parallel to shore. The fish struck about a fourth of the way into the retrieve of a cast I had made parallel to shore. I think this was the deepest water I had caught a muskie in. Given my recent problems with keeping the fishin hooked, I made sure to drive the hooks into the fish’s mouth. It worked well because the bulldawg was completely inside the fish’s mouth and four of the seven hooks were in the fish.

SQL Server is a set based language. For this reason we’re generally not interested in individual rows, more so just the set of rows that a query returns. This is why DBA’s shun cursors and loops. However, sometimes it is necessary to sort out individual rows for sorting or filtering purposes. ROW_NUMBER is a window function that allows us to add a row number to each returned row. We can use OVER() clause to figure out what rows should start with number 1 and when we should stop the current sequence and start over at 1.
To illustrate, let’s look at the following set of data:

CREATE TABLE #CatchRecord
    (
      Fish VARCHAR(25) ,
      CatchDateTime DATETIME2 ,
      Lake VARCHAR(25) ,
      LureUsed VARCHAR(25) ,
      Inches DECIMAL(5, 2)
    );

INSERT  INTO #CatchRecord
        ( Fish, CatchDateTime, Lake, LureUsed, Inches )
VALUES  ( 'Perch', '2016-05-05 10:00:00 AM', 'Little Cedar', 'Jig', 12.5 ),
        ( 'Crappie', '2016-05-06 08:00:00 PM', 'Little Cedar', 'Live Bait', 10.25 ),
        ( 'Northern Pike', '2016-06-05 10:00:00 AM', 'Big Cedar', 'Spoon', 22.75 ),
        ( 'LM Bass', '2016-05-07 11:45:00 AM', 'Silver Lake', 'Skitter Pop',  16 ),
        ( 'Perch', '2016-07-05 06:15:00 PM', 'Big Cedar', 'Jig', 8.5 ),
        ( 'Northern Pike', '2016-08-05 3:45:00 PM', 'Pewaukee', 'Spinner', 19.5 ),
        ( 'SM Bass', '2016-05-08 10:15:00 PM', 'Okauchee', 'Fat Rap', 17.25 ),
        ( 'Perch', '2016-06-09 6:15:00 AM', 'Tomahawk', 'Live Bait', 10.75 ),
        ( 'Muskie', '2016-07-10 11:00:00 PM', 'Oneida', 'Bucktail', 38.5 ),
        ( 'Muskie', '2016-08-11 8:15:00 PM', 'Tomahawk', 'Bulldawg', 42.25 ),
        ( 'Crappie', '2016-09-12 04:15:00 PM', 'Little Cedar', 'Live Bait', 11.5 ),
        ( 'LM Bass', '2016-05-13 12:15:00 AM', 'Pewaukee', 'Wacky Worm', 14.75 ),
        ( 'Northern Pike', '2016-06-14 10:30:00 AM', 'Okauchee', 'Spoon', 26 ),
        ( 'Walleye', '2016-07-15 10:30:00 PM', 'Big Cedar', 'Lindy Rig', 16.25 ),
        ( 'Walleye', '2016-08-16 10:30:00 AM', 'Rainbow Flowage', 'Flicker Shad', 20 ),
        ( 'Muskie', '2016-09-17 09:15:00 PM', 'Pewaukee', 'Bucktail', 36.75 ),
        ( 'Muskie', '2016-06-18 9:15:00 AM', 'Big Arbor Vitae', 'Bulldawg', 41 );

To simply apply a row number to each row, I can use the following query:

SELECT  Fish ,
        CatchDateTime ,
        Lake ,
        LureUsed ,
        Inches ,
        ROW_NUMBER() OVER ( ORDER BY CatchDateTime ) rn
FROM    #CatchRecord;

20160701 rownum1
Note that the OVER() clause requires the ORDER BY clause. This tells SQL Server which row to start with and how to sort the data. This result set isn’t particularily interesting – each row has a distinct rn value. We can get the numbers to repeat by using the PARTITION BY clause. Let’s say we want separate the numbering by fish type starting with the biggest:

SELECT  Fish ,
        CatchDateTime ,
        Lake ,
        LureUsed ,
        Inches ,
        ROW_NUMBER() OVER ( PARTITION BY Fish ORDER BY Inches DESC ) rn
FROM    #CatchRecord;

20160701 rownum2
Let’s think of some interesting questions we could answer with this setup.
1.) What was the size of the second biggest fish caught from each species?
2.) What was the smallest fish caught each month?
3.) What was the species of the biggest fish caught with each type of lure?
4.) What lake was the second last fish caught at each month?

To find these, it helps to assign the row numbers in a common table expression, then returns the desired results in a query off that cte.
1)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY Fish ORDER BY Inches DESC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.Fish ,
            cte.Inches
    FROM    cte
    WHERE   rn = 2;

20160701 rownum3
2)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY MONTH(catchdatetime) ORDER BY Inches ASC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.Fish ,
            cte.CatchDateTime ,
            cte.Lake ,
            cte.LureUsed ,
            cte.Inches ,
            cte.rn
    FROM    cte
    WHERE   rn = 1;

20160701 rownum4
3)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY LureUsed ORDER BY Inches DESC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.LureUsed ,
            cte.Fish
    FROM    cte
    WHERE   rn = 1;

20160701 rownum5
4)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY MONTH(catchdatetime) ORDER BY CatchDateTime DESC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.Lake,
           cte.Fish,
           cte.CatchDateTime
    FROM    cte
    WHERE   rn = 2
   ORDER BY cte.CatchDateTime;

20160701 rownum6

These examples should help you understand some of the great things you can do with the ROW_NUMBER function.

Advertisements

One thought on “SQL Server ROW_NUMBER()

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s