The FIRST and LAST functions in Oracle – Using the KEEP Clause

This weekend I attended the Milwaukee Muskie Expo at State Fair Park. It’s a fun way to stay connected to muskie fishing during the winter months when we can’t fish for them (The muskie season in WI goes from May through December). Basically, there are a lot of lure manufacturers, rod/reel makers, boat dealers, taxidermists, and fishing guides that set up in booths to try and get you to consider their products. My favorite to look at are the Lax replicas and Fittante replicas – it’s great to see some big fish that other fisherman have caught. I also keep my eye out for new and exciting lures – muskie often get used to seeing the same things over and over, so it can help to give them something new. This year I picked out a new Bondy Bait, the Royal Orba. I’ve never thought Bondy Baits looked that great, but I know they’ve gotten results. This new one has a nice tail and trails double blades, so it should be something the fish are not used to seeing. The only sad thing is that I’ve got to wait until May to test it out.
Finding a guide can be a challenging experience. Most guides cost $200 – $400 a day, so there is pressure to find a good one who really knows his stuff. Word of mouth is the main way to find a reliable guide, but if you’re going to an area where you don’t know too many people, like on a vacation in another part of the state or country, you’re going to have to find another reference. It’d be helpful if we had an internet database where people could rate the guides based on how the trip went. If such a database existed, I’d use Oracle’s FIRST and LAST functions to find the right guide.
Oracle’s documentation describes the functions: “When you need a value from the first or last row of a sorted group, but the needed value in not the sort key, the FIRST and LAST functions eliminate the need for self-joins or views and enable better performance.” So the real challenge that is being addressed is when you are selecting with an aggregate function and you want to get records that are tied to the fields you are grouping by, but don’t want them included in the GROUP BY clause, you use the FIRST or LAST functions. As a very simple example, let’s say I have this table:

CREATE TABLE groupbyexample (lake VARCHAR2(25), rating INT, otherdata VARCHAR2(25));
INSERT ALL
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Pewaukee',3,'get me')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Pewaukee',4,'this value')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Okauchee',3,'return')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Okauchee',4,'shallow')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Okauchee',5,'trophy')
SELECT * FROM dual;

Now I want to get the record with the highest rating for each lake. This is a very simple aggregate query:

SELECT lake, MAX(rating)
FROM groupbyexample
GROUP BY lake;

However, if I also want to display the otherdata field with my results, the common solution would be to join my result set back to the initial table like this:

SELECT g.lake,
  g.rating,
  g.otherdata
FROM groupbyexample g
INNER JOIN
  ( SELECT lake, MAX(rating) AS rating FROM groupbyexample GROUP BY lake
  ) ij
ON g.lake    = ij.lake
AND g.rating = ij.rating;

20150216 FIRST LAST simple example This is the standard way to do this in SQL Server 2008 R2 (the later versions have a version of the FIRST and LAST functions). This example was pretty simple, so lets look at a complex example using fishing guides as an example data set. In this example each guide will be assigned to a certain region of Wisconsin, and will specialize in two fish, a primary and a secondary. They will also be rated based on past user experience. These guide names are all made up and any names that happen to really exist are coincidental.

CREATE TABLE FishSpecies
  (FishID INT, FishName VARCHAR2(50)
  );
CREATE TABLE Regions
  (RegionID INT, RegionName VARCHAR2(100)
  );
CREATE TABLE FishingGuides
  (
    GuideID          INT,
    GuideName        VARCHAR2(50),
    PrimarySpecies   INT,
    SecondarySpecies INT,
    Region           INT,
    Rating           NUMBER(3,0)
  );

INSERT ALL
  INTO FishSpecies (FishID, Fishname) VALUES (1,'Musky')
  INTO FishSpecies (FishID, Fishname) VALUES (2,'Largemouth Bass')
  INTO FishSpecies (FishID, Fishname) VALUES (3,'Smallmouth Bass')
  INTO FishSpecies (FishID, Fishname) VALUES (4,'Panfish')
  INTO FishSpecies (FishID, Fishname) VALUES (5,'Crappie')
  INTO FishSpecies (FishID, Fishname) VALUES (6,'Perch')
  INTO FishSpecies (FishID, Fishname) VALUES (7,'Walleye')
  INTO FishSpecies (FishID, Fishname) VALUES (8,'Northern Pike')
  INTO FishSpecies (FishID, Fishname) VALUES (9,'Steelhead')
  INTO FishSpecies (FishID, Fishname) VALUES (10,'Rainbow Trout')
  INTO FishSpecies (FishID, Fishname) VALUES (11,'Brown Trout')
  INTO FishSpecies (FishID, Fishname) VALUES (12,'Lake Trout')
SELECT * FROM DUAL;

INSERT ALL
  INTO Regions (RegionID, RegionName) VALUES (1,'Southeast WI')
  INTO Regions (RegionID, RegionName) VALUES (2,'Southwest WI')
  INTO Regions (RegionID, RegionName) VALUES (3,'Central WI')
  INTO Regions (RegionID, RegionName) VALUES (4,'Northeast WI')
  INTO Regions (RegionID, RegionName) VALUES (5,'Northwest WI')
  INTO Regions (RegionID, RegionName) VALUES (6,'Southern Lake Michigan')
  INTO Regions (RegionID, RegionName) VALUES (7,'Northern Lake Michigan/Green Bay')
  INTO Regions (RegionID, RegionName) VALUES (8, 'Lake Winnebago')
SELECT * FROM DUAL;  

INSERT ALL 
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (1,'Dave''s Muskie Guide Service', 1,8,1,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (2,'Madison Action', 2,8,1,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (3,'Big Fish Guide Service', 7,1,5,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (4,'Good Eatin''', 4,7,2,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (5,'Muskie Mike', 1,8,4,2)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (6,'WTD Inc.', 2,8,5,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (7,'Fred''s Fishing Charters', 7,3,3,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (8,'Joe Woz', 6,5,4,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (9,'Walt and Willy', 1,8,5,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (10,'Robo Fishing', 10,11,6,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (11,'The Wiz Fishery', 7,3,8,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (12,'H and H', 12,8,6,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (13,'Rhinelander''s Best', 3,8,4,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (14,'Iron Mountain Fishing', 5,4,5,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (15,'Milwaukee Mike''s', 7,3,1,2)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (16,'Eagle River Guide Service', 2,3,4,1)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (17,'Green Bay Charters', 11,9,7,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (18,'Hayward Trophy Fishing', 5,4,5,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (19,'Winnebago Fishing', 7,8,8,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (20,'Lake Michigan Fishing Charters', 9,12,6,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (21,'Tom at Lake Tom', 1,2,4,2)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (22,'HTG Fishing', 1,3,4,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (23,'Greg and Gavin', 7,1,4,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (24,'Captain Hook', 1,8,4,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (25,'The Better Netter', 2,1,4,4)
SELECT * FROM DUAL;

Here are what a select from the tables would look like:
20150216 FIRST LAST real example simple select
Let’s say I want to get the highest rated guide in each region, regardless of the specialized fish type. Using the “old way”, my query would look like this:

WITH cte AS
  (SELECT r.RegionName,
    fg.Rating,
    f.Fishname  AS PrimarySpecies,
    f2.Fishname AS SecondarySpecies,
    fg.GuideName,
    row_number() OVER (partition BY fg.region ORDER BY fg.rating DESC) rn
  FROM FishingGuides fg
  INNER JOIN Regions r ON fg.Region = r.RegionID
  INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
  INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
  )
SELECT * FROM cte WHERE rn=1 ORDER BY RegionName;

I use a windowing function to get the row number for each region ordered by rating, then just look at results where the row number is 1. This looks easier to understand than my simple example, but still is not as efficient. Using the LAST function, I can rewrite the query like this:

SELECT  r.RegionName, 
        MAX(fg.Rating) AS Rating, 
        MAX(f.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS PrimarySpecies, 
        MAX(f2.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS SecondarySpecies, 
        MAX(fg.GuideName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Guidename
FROM FishingGuides fg
INNER JOIN Regions r ON fg.Region = r.RegionID
INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
GROUP BY r.RegionName
ORDER BY r.Regionname;

20150216 FIRST LAST real example best guide by region Both queries return the exact same results. Let’s say I now want to find the top rated guide in each region who has musky as either their primary or secondary fish. I would just add a WHERE clause to get this info:

--Highest rated guide in each region
SELECT  r.RegionName, 
        MAX(fg.Rating) AS Rating, 
        MAX(f.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS PrimarySpecies, 
        MAX(f2.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS SecondarySpecies, 
        MAX(fg.GuideName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Guidename
FROM FishingGuides fg
INNER JOIN Regions r ON fg.Region = r.RegionID
INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
WHERE fg.PRIMARYSPECIES = 1 OR fg.SECONDARYSPECIES = 1
GROUP BY r.RegionName
ORDER BY r.Regionname;

20150216 FIRST LAST real example best musky guide by region For my last example, let’s say we’re taking a trip up north and we want to find the highest rated guide by primary species. From this I’ll decide what species I want to target, and what guide to use. I’ll change up my query to only look at northern locations, then I’ll change the GROUP BY field to Primary Species:

SELECT  f.Fishname AS PrimarySpecies, 
        MAX(fg.Rating) AS Rating, 
        MAX(r.RegionName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Region,
        MAX(f2.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS SecondarySpecies, 
        MAX(fg.GuideName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Guidename
FROM FishingGuides fg
INNER JOIN Regions r ON fg.Region = r.RegionID
INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
WHERE r.REGIONNAME LIKE '%North%'
GROUP BY f.Fishname
ORDER BY f.Fishname;

20150216 FIRST LAST real example best guide by fish north
Now I can see if I want to go for musky and northern, I can get a five star guide in Captain Hook. However, if I’d rather go for musky and walleye, I can book Greg and Gavin and go with a four star guide instead.

I hope this has helped explain how to use the LAST and FIRST functions available in Oracle. As I stated, SQL Server has similar functions available starting in 2012.

Advertisements