SQL Server ROW_NUMBER()

I took a vacation with the family up north recently, and did some research before I left. Since I was going to be fishing mostly clear water, I did some reading on how best to fish musky in clear water. I’ve always suspected that deeper, open water was the way to go but never had the confidence to really attack it. However, after readying some great information on Andy Meyer’s website, I was ready to give it a good shot. I identified some new water with varying depths that I wanted to try out and planned on when to try each spot. While none of the new spots panned out, I did try a deeper spot off one of my regular spots and was rewarded with this 43″ muskie.
June Muskie.jpg
I was jerking a bulldawg very aggressively in 30 – 35 feet of water. I was alternating casts into shore, away from shore, and parallel to shore. The fish struck about a fourth of the way into the retrieve of a cast I had made parallel to shore. I think this was the deepest water I had caught a muskie in. Given my recent problems with keeping the fishin hooked, I made sure to drive the hooks into the fish’s mouth. It worked well because the bulldawg was completely inside the fish’s mouth and four of the seven hooks were in the fish.

SQL Server is a set based language. For this reason we’re generally not interested in individual rows, more so just the set of rows that a query returns. This is why DBA’s shun cursors and loops. However, sometimes it is necessary to sort out individual rows for sorting or filtering purposes. ROW_NUMBER is a window function that allows us to add a row number to each returned row. We can use OVER() clause to figure out what rows should start with number 1 and when we should stop the current sequence and start over at 1.
To illustrate, let’s look at the following set of data:

CREATE TABLE #CatchRecord
    (
      Fish VARCHAR(25) ,
      CatchDateTime DATETIME2 ,
      Lake VARCHAR(25) ,
      LureUsed VARCHAR(25) ,
      Inches DECIMAL(5, 2)
    );

INSERT  INTO #CatchRecord
        ( Fish, CatchDateTime, Lake, LureUsed, Inches )
VALUES  ( 'Perch', '2016-05-05 10:00:00 AM', 'Little Cedar', 'Jig', 12.5 ),
        ( 'Crappie', '2016-05-06 08:00:00 PM', 'Little Cedar', 'Live Bait', 10.25 ),
        ( 'Northern Pike', '2016-06-05 10:00:00 AM', 'Big Cedar', 'Spoon', 22.75 ),
        ( 'LM Bass', '2016-05-07 11:45:00 AM', 'Silver Lake', 'Skitter Pop',  16 ),
        ( 'Perch', '2016-07-05 06:15:00 PM', 'Big Cedar', 'Jig', 8.5 ),
        ( 'Northern Pike', '2016-08-05 3:45:00 PM', 'Pewaukee', 'Spinner', 19.5 ),
        ( 'SM Bass', '2016-05-08 10:15:00 PM', 'Okauchee', 'Fat Rap', 17.25 ),
        ( 'Perch', '2016-06-09 6:15:00 AM', 'Tomahawk', 'Live Bait', 10.75 ),
        ( 'Muskie', '2016-07-10 11:00:00 PM', 'Oneida', 'Bucktail', 38.5 ),
        ( 'Muskie', '2016-08-11 8:15:00 PM', 'Tomahawk', 'Bulldawg', 42.25 ),
        ( 'Crappie', '2016-09-12 04:15:00 PM', 'Little Cedar', 'Live Bait', 11.5 ),
        ( 'LM Bass', '2016-05-13 12:15:00 AM', 'Pewaukee', 'Wacky Worm', 14.75 ),
        ( 'Northern Pike', '2016-06-14 10:30:00 AM', 'Okauchee', 'Spoon', 26 ),
        ( 'Walleye', '2016-07-15 10:30:00 PM', 'Big Cedar', 'Lindy Rig', 16.25 ),
        ( 'Walleye', '2016-08-16 10:30:00 AM', 'Rainbow Flowage', 'Flicker Shad', 20 ),
        ( 'Muskie', '2016-09-17 09:15:00 PM', 'Pewaukee', 'Bucktail', 36.75 ),
        ( 'Muskie', '2016-06-18 9:15:00 AM', 'Big Arbor Vitae', 'Bulldawg', 41 );

To simply apply a row number to each row, I can use the following query:

SELECT  Fish ,
        CatchDateTime ,
        Lake ,
        LureUsed ,
        Inches ,
        ROW_NUMBER() OVER ( ORDER BY CatchDateTime ) rn
FROM    #CatchRecord;

20160701 rownum1
Note that the OVER() clause requires the ORDER BY clause. This tells SQL Server which row to start with and how to sort the data. This result set isn’t particularily interesting – each row has a distinct rn value. We can get the numbers to repeat by using the PARTITION BY clause. Let’s say we want separate the numbering by fish type starting with the biggest:

SELECT  Fish ,
        CatchDateTime ,
        Lake ,
        LureUsed ,
        Inches ,
        ROW_NUMBER() OVER ( PARTITION BY Fish ORDER BY Inches DESC ) rn
FROM    #CatchRecord;

20160701 rownum2
Let’s think of some interesting questions we could answer with this setup.
1.) What was the size of the second biggest fish caught from each species?
2.) What was the smallest fish caught each month?
3.) What was the species of the biggest fish caught with each type of lure?
4.) What lake was the second last fish caught at each month?

To find these, it helps to assign the row numbers in a common table expression, then returns the desired results in a query off that cte.
1)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY Fish ORDER BY Inches DESC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.Fish ,
            cte.Inches
    FROM    cte
    WHERE   rn = 2;

20160701 rownum3
2)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY MONTH(catchdatetime) ORDER BY Inches ASC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.Fish ,
            cte.CatchDateTime ,
            cte.Lake ,
            cte.LureUsed ,
            cte.Inches ,
            cte.rn
    FROM    cte
    WHERE   rn = 1;

20160701 rownum4
3)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY LureUsed ORDER BY Inches DESC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.LureUsed ,
            cte.Fish
    FROM    cte
    WHERE   rn = 1;

20160701 rownum5
4)

WITH    cte
          AS ( SELECT   Fish ,
                        CatchDateTime ,
                        Lake ,
                        LureUsed ,
                        Inches ,
                        ROW_NUMBER() OVER ( PARTITION BY MONTH(catchdatetime) ORDER BY CatchDateTime DESC) rn
               FROM     #CatchRecord
             )
    SELECT  cte.Lake,
           cte.Fish,
           cte.CatchDateTime
    FROM    cte
    WHERE   rn = 2
   ORDER BY cte.CatchDateTime;

20160701 rownum6

These examples should help you understand some of the great things you can do with the ROW_NUMBER function.

Advertisements

The FIRST and LAST functions in Oracle – Using the KEEP Clause

This weekend I attended the Milwaukee Muskie Expo at State Fair Park. It’s a fun way to stay connected to muskie fishing during the winter months when we can’t fish for them (The muskie season in WI goes from May through December). Basically, there are a lot of lure manufacturers, rod/reel makers, boat dealers, taxidermists, and fishing guides that set up in booths to try and get you to consider their products. My favorite to look at are the Lax replicas and Fittante replicas – it’s great to see some big fish that other fisherman have caught. I also keep my eye out for new and exciting lures – muskie often get used to seeing the same things over and over, so it can help to give them something new. This year I picked out a new Bondy Bait, the Royal Orba. I’ve never thought Bondy Baits looked that great, but I know they’ve gotten results. This new one has a nice tail and trails double blades, so it should be something the fish are not used to seeing. The only sad thing is that I’ve got to wait until May to test it out.
Finding a guide can be a challenging experience. Most guides cost $200 – $400 a day, so there is pressure to find a good one who really knows his stuff. Word of mouth is the main way to find a reliable guide, but if you’re going to an area where you don’t know too many people, like on a vacation in another part of the state or country, you’re going to have to find another reference. It’d be helpful if we had an internet database where people could rate the guides based on how the trip went. If such a database existed, I’d use Oracle’s FIRST and LAST functions to find the right guide.
Oracle’s documentation describes the functions: “When you need a value from the first or last row of a sorted group, but the needed value in not the sort key, the FIRST and LAST functions eliminate the need for self-joins or views and enable better performance.” So the real challenge that is being addressed is when you are selecting with an aggregate function and you want to get records that are tied to the fields you are grouping by, but don’t want them included in the GROUP BY clause, you use the FIRST or LAST functions. As a very simple example, let’s say I have this table:

CREATE TABLE groupbyexample (lake VARCHAR2(25), rating INT, otherdata VARCHAR2(25));
INSERT ALL
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Pewaukee',3,'get me')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Pewaukee',4,'this value')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Okauchee',3,'return')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Okauchee',4,'shallow')
  INTO groupbyexample (lake, rating, otherdata) VALUES ('Okauchee',5,'trophy')
SELECT * FROM dual;

Now I want to get the record with the highest rating for each lake. This is a very simple aggregate query:

SELECT lake, MAX(rating)
FROM groupbyexample
GROUP BY lake;

However, if I also want to display the otherdata field with my results, the common solution would be to join my result set back to the initial table like this:

SELECT g.lake,
  g.rating,
  g.otherdata
FROM groupbyexample g
INNER JOIN
  ( SELECT lake, MAX(rating) AS rating FROM groupbyexample GROUP BY lake
  ) ij
ON g.lake    = ij.lake
AND g.rating = ij.rating;

20150216 FIRST LAST simple example This is the standard way to do this in SQL Server 2008 R2 (the later versions have a version of the FIRST and LAST functions). This example was pretty simple, so lets look at a complex example using fishing guides as an example data set. In this example each guide will be assigned to a certain region of Wisconsin, and will specialize in two fish, a primary and a secondary. They will also be rated based on past user experience. These guide names are all made up and any names that happen to really exist are coincidental.

CREATE TABLE FishSpecies
  (FishID INT, FishName VARCHAR2(50)
  );
CREATE TABLE Regions
  (RegionID INT, RegionName VARCHAR2(100)
  );
CREATE TABLE FishingGuides
  (
    GuideID          INT,
    GuideName        VARCHAR2(50),
    PrimarySpecies   INT,
    SecondarySpecies INT,
    Region           INT,
    Rating           NUMBER(3,0)
  );

INSERT ALL
  INTO FishSpecies (FishID, Fishname) VALUES (1,'Musky')
  INTO FishSpecies (FishID, Fishname) VALUES (2,'Largemouth Bass')
  INTO FishSpecies (FishID, Fishname) VALUES (3,'Smallmouth Bass')
  INTO FishSpecies (FishID, Fishname) VALUES (4,'Panfish')
  INTO FishSpecies (FishID, Fishname) VALUES (5,'Crappie')
  INTO FishSpecies (FishID, Fishname) VALUES (6,'Perch')
  INTO FishSpecies (FishID, Fishname) VALUES (7,'Walleye')
  INTO FishSpecies (FishID, Fishname) VALUES (8,'Northern Pike')
  INTO FishSpecies (FishID, Fishname) VALUES (9,'Steelhead')
  INTO FishSpecies (FishID, Fishname) VALUES (10,'Rainbow Trout')
  INTO FishSpecies (FishID, Fishname) VALUES (11,'Brown Trout')
  INTO FishSpecies (FishID, Fishname) VALUES (12,'Lake Trout')
SELECT * FROM DUAL;

INSERT ALL
  INTO Regions (RegionID, RegionName) VALUES (1,'Southeast WI')
  INTO Regions (RegionID, RegionName) VALUES (2,'Southwest WI')
  INTO Regions (RegionID, RegionName) VALUES (3,'Central WI')
  INTO Regions (RegionID, RegionName) VALUES (4,'Northeast WI')
  INTO Regions (RegionID, RegionName) VALUES (5,'Northwest WI')
  INTO Regions (RegionID, RegionName) VALUES (6,'Southern Lake Michigan')
  INTO Regions (RegionID, RegionName) VALUES (7,'Northern Lake Michigan/Green Bay')
  INTO Regions (RegionID, RegionName) VALUES (8, 'Lake Winnebago')
SELECT * FROM DUAL;  

INSERT ALL 
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (1,'Dave''s Muskie Guide Service', 1,8,1,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (2,'Madison Action', 2,8,1,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (3,'Big Fish Guide Service', 7,1,5,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (4,'Good Eatin''', 4,7,2,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (5,'Muskie Mike', 1,8,4,2)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (6,'WTD Inc.', 2,8,5,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (7,'Fred''s Fishing Charters', 7,3,3,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (8,'Joe Woz', 6,5,4,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (9,'Walt and Willy', 1,8,5,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (10,'Robo Fishing', 10,11,6,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (11,'The Wiz Fishery', 7,3,8,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (12,'H and H', 12,8,6,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (13,'Rhinelander''s Best', 3,8,4,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (14,'Iron Mountain Fishing', 5,4,5,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (15,'Milwaukee Mike''s', 7,3,1,2)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (16,'Eagle River Guide Service', 2,3,4,1)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (17,'Green Bay Charters', 11,9,7,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (18,'Hayward Trophy Fishing', 5,4,5,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (19,'Winnebago Fishing', 7,8,8,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (20,'Lake Michigan Fishing Charters', 9,12,6,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (21,'Tom at Lake Tom', 1,2,4,2)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (22,'HTG Fishing', 1,3,4,3)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (23,'Greg and Gavin', 7,1,4,4)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (24,'Captain Hook', 1,8,4,5)
  INTO FishingGuides (GuideID, GuideName, PrimarySpecies, SecondarySpecies, Region, Rating) VALUES (25,'The Better Netter', 2,1,4,4)
SELECT * FROM DUAL;

Here are what a select from the tables would look like:
20150216 FIRST LAST real example simple select
Let’s say I want to get the highest rated guide in each region, regardless of the specialized fish type. Using the “old way”, my query would look like this:

WITH cte AS
  (SELECT r.RegionName,
    fg.Rating,
    f.Fishname  AS PrimarySpecies,
    f2.Fishname AS SecondarySpecies,
    fg.GuideName,
    row_number() OVER (partition BY fg.region ORDER BY fg.rating DESC) rn
  FROM FishingGuides fg
  INNER JOIN Regions r ON fg.Region = r.RegionID
  INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
  INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
  )
SELECT * FROM cte WHERE rn=1 ORDER BY RegionName;

I use a windowing function to get the row number for each region ordered by rating, then just look at results where the row number is 1. This looks easier to understand than my simple example, but still is not as efficient. Using the LAST function, I can rewrite the query like this:

SELECT  r.RegionName, 
        MAX(fg.Rating) AS Rating, 
        MAX(f.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS PrimarySpecies, 
        MAX(f2.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS SecondarySpecies, 
        MAX(fg.GuideName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Guidename
FROM FishingGuides fg
INNER JOIN Regions r ON fg.Region = r.RegionID
INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
GROUP BY r.RegionName
ORDER BY r.Regionname;

20150216 FIRST LAST real example best guide by region Both queries return the exact same results. Let’s say I now want to find the top rated guide in each region who has musky as either their primary or secondary fish. I would just add a WHERE clause to get this info:

--Highest rated guide in each region
SELECT  r.RegionName, 
        MAX(fg.Rating) AS Rating, 
        MAX(f.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS PrimarySpecies, 
        MAX(f2.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS SecondarySpecies, 
        MAX(fg.GuideName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Guidename
FROM FishingGuides fg
INNER JOIN Regions r ON fg.Region = r.RegionID
INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
WHERE fg.PRIMARYSPECIES = 1 OR fg.SECONDARYSPECIES = 1
GROUP BY r.RegionName
ORDER BY r.Regionname;

20150216 FIRST LAST real example best musky guide by region For my last example, let’s say we’re taking a trip up north and we want to find the highest rated guide by primary species. From this I’ll decide what species I want to target, and what guide to use. I’ll change up my query to only look at northern locations, then I’ll change the GROUP BY field to Primary Species:

SELECT  f.Fishname AS PrimarySpecies, 
        MAX(fg.Rating) AS Rating, 
        MAX(r.RegionName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Region,
        MAX(f2.Fishname) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS SecondarySpecies, 
        MAX(fg.GuideName) KEEP (DENSE_RANK LAST ORDER BY fg.rating, fg.PRIMARYSPECIES, fg.SECONDARYSPECIES) AS Guidename
FROM FishingGuides fg
INNER JOIN Regions r ON fg.Region = r.RegionID
INNER JOIN FishSpecies f ON fg.PrimarySpecies = f.FishID
INNER JOIN FishSpecies f2 ON fg.SecondarySpecies = f2.FishID
WHERE r.REGIONNAME LIKE '%North%'
GROUP BY f.Fishname
ORDER BY f.Fishname;

20150216 FIRST LAST real example best guide by fish north
Now I can see if I want to go for musky and northern, I can get a five star guide in Captain Hook. However, if I’d rather go for musky and walleye, I can book Greg and Gavin and go with a four star guide instead.

I hope this has helped explain how to use the LAST and FIRST functions available in Oracle. As I stated, SQL Server has similar functions available starting in 2012.

Send Database Mail in HTML Table Format from Oracle

As a fishing trip approaches, I like to plan out my expected strategy for my first day. I look at the expected weather (especially wind direction and speed), and try to decide where active fish may be biting. However, once I get out there plans can change very quickly. I remember a trip up north quite a few years ago where, once out on the water, I was facing 25 – 30 mph winds. I was also fishing in a rowboat, and was unable to fish where I wanted because I was drifting through the prime areas too quickly. I had to make an adjustment and focus on spots on the upwind side of the lake.
Database administrators also need to be able to make adjustments. It is fairly common to try something one way and find it doesn’t work well or doesn’t work as expected, and have to find another way. In my last post, I demonstrated how to use ListAgg to make a common delimited list in Oracle. I had planned to use this function to build a stored procedure that would take a query as its parameters and send an email in an html table format with the results of that query. Upon building the query, I found ListAgg has a severe limitation. It returns its results in a VARCHAR2 format only. This means the result set, which is a string of html, could only use 4000 characters, the limit of VARCHAR2. This may seem like a lot, but for an entire result set along with html tags it may not be sufficient. Instead, I used some of Oracle’s XML functions to complete the procedure. First I’ll show the procedure and a sample of its use, then I’ll do a brief description of the xml functions used.
Here is the SQL for the stored procedure:

CREATE OR REPLACE PROCEDURE       SCOTT.SENDDBEMAILWITHTABULARQUERY 
(
  QSELECT IN VARCHAR2
, FIELDLIST IN VARCHAR2
, QFROM IN VARCHAR2
, QWHERE IN VARCHAR2 
, QGROUPBY IN VARCHAR2 
, QHAVING IN VARCHAR2 
, QORDERBY IN VARCHAR2 
, RECIPIENTS IN VARCHAR2 
, SUBJECT IN VARCHAR2 
, TITLE IN VARCHAR2 
, SENDEMAILWITHNORESULTS IN VARCHAR2 
) AS 

  XML CLOB;
  body CLOB;
  varsql      VARCHAR2(4000);
  tblfieldheader NVARCHAR2(2000) := '';
  tempfield NVARCHAR2(2000)      := '';
  i          INT                          := 1;
  j          INT                          := 1;
  SendEmail  INT;
  splitcnt   INT;
  fieldcount INT ;
TYPE AllFieldNames
IS
  TABLE OF VARCHAR2(2000);
  TempFields AllFieldNames;
BEGIN
  --Find the number of fields in the query
  splitcnt := LENGTH(fieldlist)-LENGTH(REPLACE(fieldlist,'|',''));
  --Loop through the fields and put each on into the #Fields temp table as a new record
  FOR j IN 1..splitcnt
  LOOP
    SELECT x.s BULK COLLECT
    INTO TempFields
    FROM
      (SELECT regexp_substr (fieldlist, '[^|]+', 1, rownum) s,
        rownum rn
      FROM dual
        CONNECT BY LEVEL <= LENGTH (regexp_replace (fieldlist, '[^|]+')) + 1
      ) x;
  END LOOP;
  --SELECT fieldcount = splitcnt + 1 --Will be the splitcnt + 1, otherwise MAX(ID) FROM TempFields
  --Start setting up the sql statement for the query.
  varsql := qSELECT || ' xmlagg(xmlelement("tr",xmlforest(';
  --Loop through the #Fields table to get the field list
  FOR i IN TempFields.first..TempFields.last
  LOOP
    --------------------------------------------------------------------------------------------------------------------------------------------------------------
    --This next section is required in case a field is aliased.  For the xml, we need to get rid of the aliases, the table header will only require the aliases.
    --NULL values need to be shown as a string = 'NULL' or the html table will just skip the cell and all values after that in the row will be shifted left.
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------
    IF SUBSTR(TempFields(i),-1) = ']' OR INSTR(UPPER(TempFields(i)),' AS ') = 0 THEN
      --Set the xml field to be the entire field name
      varsql := varsql || 'NVL(CAST(' || TempFields(i) || ' AS VARCHAR2(2000)),''NULL'') as "td", ';
      --Set the table header field to be the entire field name
      tblfieldheader := tblfieldheader || '<th>' || TempFields(i) || '</th>';
    ELSE
      --Set the xml field to be the field name minus the alias
      varsql := varsql || 'NVL(CAST(' || SUBSTR(TempFields(i),1,(INSTR(UPPER(TempFields(i)),' AS ',1))-1) || ' AS VARCHAR2(2000)),''NULL'') as "td", ';
      --Set the table header field to be the field name's alias
      tblfieldheader := tblfieldheader || '<th>' || SUBSTR(TempFields(i),INSTR(UPPER(TempFields(i)),' AS ',-1)+4) || '</th>';
    END IF;
    --Increment the counter.
  END LOOP;
  --Trim the extra two characters of the end of sql.
  varsql := SUBSTR(varsql,1, LENGTH(varsql)-2);
  --Add the end of the table tag
  varsql := varsql || '))).GetClobVal() ';
  --Add the from, where, group by, having, and order by clause to the select statement.
  varsql := varsql || qFROM || ' ' || qWHERE || ' ' || qGroupBy || ' ' || qHaving || ' ' || qOrderBy;
  --Run the sql that will create the xml.
  EXECUTE IMMEDIATE varsql INTO XML;
  --Capture whether or not any rows were returned
  IF LENGTH(XML) > 0 THEN
    --Create the body of the email, which contains the xml results of the query.
    body := to_clob('<html><body><H3>') || Title || '</H3><table border = 1><tr>' || tblfieldheader || '</tr>' || XML || '</table></body></html>';
    --If rows were returned, send the email.
    sys.utl_mail.SEND( 'Oracledbmail@domain.com', recipients,NULL, NULL, subject , boddy, 'text/html', 3);
    --If no results, check whether user wants an email anyway
  ELSIF SendEmailWithNoResults != '0' THEN
    sys.utl_mail.SEND( 'Oracledbmail@domain.com', recipients,NULL, NULL, subject, 'Query returned no results', 'text/html', 3);
  END IF;
END SENDDBEMAILWITHTABULARQUERY;

To call the stored procedure, we’d use this syntax:

DECLARE
  QSELECT VARCHAR2(200);
  FIELDLIST VARCHAR2(200);
  QFROM VARCHAR2(200);
  QWHERE VARCHAR2(200);
  QGROUPBY VARCHAR2(200);
  QHAVING VARCHAR2(200);
  QORDERBY VARCHAR2(200);
  RECIPIENTS VARCHAR2(200);
  SUBJECT VARCHAR2(200);
  TITLE VARCHAR2(200);
  SENDEMAILWITHNORESULTS VARCHAR2(200);
BEGIN
  QSELECT := 'SELECT';
  FIELDLIST := 'emp.ename AS Employee|emp.job|emp.sal as Salary|dept.dname AS Department|dept.loc';
  QFROM := 'FROM scott.emp
            INNER JOIN scott.dept
            ON emp.deptno = dept.deptno';
  QWHERE := 'WHERE emp.sal > 1000';
  QGROUPBY := NULL;
  QHAVING := NULL;
  QORDERBY := NULL;
  RECIPIENTS := 'dgugg@domain.com';
  SUBJECT := 'Scott''s Employees';
  TITLE := 'Salary Greater Than 1,000';
  SENDEMAILWITHNORESULTS := '1';

  Scott.SENDDBEMAILWITHTABULARQUERY(
    QSELECT => QSELECT,
    FIELDLIST => FIELDLIST,
    QFROM => QFROM,
    QWHERE => QWHERE,
    QGROUPBY => QGROUPBY,
    QHAVING => QHAVING,
    QORDERBY => QORDERBY,
    RECIPIENTS => RECIPIENTS,
    SUBJECT => SUBJECT,
    TITLE => TITLE,
    SENDEMAILWITHNORESULTS => SENDEMAILWITHNORESULTS
  );
END;

Here’s how the resulting email looks:
20150209 SendHTML Received Mail
A couple of things to be aware of:

  • You first need to configure database mail before you can send email from the database.  You can do this in Database/Enterprise manager by going to Setup -> Notification Methods

20150209 SendHTML Setup DBMail

  • You need to setup a Access Control List for your user to get to the email server.  This can be done with these SQL statements, run as SYS.
--Create a new ACL
BEGIN
  --Creat the list
  DBMS_NETWORK_ACL_ADMIN.CREATE_ACL(
    acl         => 'utl_mail.xml',
    description => 'utl mail for Scott',
    principal   => 'Scott',
    is_grant    => true,
    privilege   => 'connect',
    start_date  => NULL,
    end_date    => NULL);
  --Add an initial privilege to it
  DBMS_NETWORK_ACL_ADMIN.ADD_PRIVILEGE(
    acl         => 'utl_mail.xml',
    principal   => 'Scott',
    is_grant    => true,
    privilege   => 'connect',
    start_date  => NULL,
    end_date    => NULL);
  --Assign a host to the ACL
  DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL(
    acl        => 'utl_mail.xml',
    host       => 'hostname',
    lower_port => 25,
    upper_port => 25);
END;
/
COMMIT;

Without this you will get the following error:
20150209 SendHTML DBMail ACL

Now I’ll briefly touch on the functionality of the XML functions.

  1. xmlagg – This takes multiple rows of xml and changes it into a single row.  This is needed for this stored procedure because we want the html table to control the line breaks, not line break characters created by SQL.
  2. xmlelement – This simply turns a returned column into an xml element.  The first argument is the xml element name and the second argument is the field.
  3. xmlforest – This behaves similarly to xmlelement, but it lets you do multiple columns at the same time.  You use comma to separate the columns.  Confusingly, opposite from xmlelement you have to first specify the field, than you put the xml element name second.

When I first got the requirement to send emails from Oracle, I took a look around the web and couldn’t find any ready-made solutions for sending out query results in an html table.  Hopefully this post will help some of you who are looking for the same thing.

Creating a comma separated list with ListAgg in Oracle

When moving to a new location on a lake to fish I do a mental inventory on my lures to identify which ones would work for that location. First I’ll categorize my lures by type (spinner, topwater, jerkbait, crankbait), then I’ll think about which lures in each category will work in that location. If there are weeds coming up I don’t want anything that’ll dive too deep and get fouled each cast. If there are no weeds and I’m working bottom structure I want a lure that will get down deep enough, closer to the bottom. If I’m in clear water I want natural colored lures, if I’m in stained water I want brightly colored lures. If the fish have been active I’ll want lures that can be worked fast, so I can cover as much water as possible. If they’ve been passive I’ll choose a lure that can be worked slowly, giving the fish plenty of time to decide to strike. Once my mental inventory is finished, I have a list of lures to choose from that will give me the best chance to catch a fish.
I’ve already demonstrated how to create a comma separated list in SQL Server here. Creating one in Oracle is a bit different, and in my opinion, easier. This method works starting in 11g, which most people are on by now.  To demonstrate, let’s start by creating a set of sample data:

CREATE TABLE DGUGG.LureNames
  (ID NUMBER PRIMARY KEY , Lure VARCHAR2(25), PriceRange VARCHAR2(8));
  
CREATE sequence DGUGG.LureNameID_seq;

CREATE OR REPLACE TRIGGER DGUGG.LureNamesID_trg before
  INSERT ON DGUGG.LureNames FOR EACH row BEGIN
  SELECT LureNameID_seq.nextval INTO :new.ID FROM dual;
END;
/

INSERT ALL
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Mepps Muskie Killer', '$10-$15' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Harasser', '$10-$15' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Buchertail 700','$10-$15')
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Reef Runner', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Creek Chub Pike Minnow', '$10-$15' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Grandma', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Jake', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Shallow Raider', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Bulldawg', '$10-$15' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Suick Thriller', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Believer', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Swim Whizz', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Double Cowgirl', '$20-$25' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Medussa', '$20-$25' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Doubledawg', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Shallow Invader', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Top Raider', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Depth Raider', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Pacemaker', '$15-$20' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Hawg Wobbler', '$25-$30' )
  INTO DGUGG.LureNames ( Lure, PriceRange ) VALUES  ( 'Softtail Phantom', '$25-$30' )
SELECT * FROM dual;

I’ve created a table called LureNames and inserted several lures names into it, as well as an approximate price range. A simple example would get all the lure names into a comma separated list:

SELECT listagg(Lure,', ')WITHIN GROUP (ORDER BY 1) AllLures FROM DGUGG.LureNames;

The Listagg() function takes two arguments, the source of data (the field Lure in this example) and the delimiter. I have made the delimiter a comma and space. After WITHIN GROUP we are able to choose how to order the results within the list.
20150203 Oracle Comma Delimited List simple
A slightly more complex example will group all the lures together into a comma separated list within each price range bucket:

SELECT PriceRange, listagg(Lure,', ') WITHIN GROUP (ORDER BY Lure DESC) LureList 
FROM DGUGG.LureNames 
GROUP BY PriceRange;

This time I added the PriceRange field so I also had to group by it. I also switched the order by Lure to descending so the lists start with the lowest alphabetical name and work up.
20150203 Oracle Comma Delimited List complex
So that is how to create a comma separate list in Oracle.  It is a bit easier since it only requires one function, not a second to clip off the leading delimiter like in SQL Server.  Next post I will be using this functionality to create a database email with an HTML table of query results in Oracle.

Creating a comma separated list with FOR XML PATH and STUFF in SQL Server

I can’t really think of a story or tip that ties into creating a comma separated list, so I’ll just use one that’s completely unrelated.  It’s fun sharing experiences with loved ones, and fishing is no different.  Many of my recent favorite fishing memories involve fishing with my now-four year old son.  Two summers ago when he was two years old we were catching panfish off the dock at my Grandpa’s cabin in northern Wisconsin.  I was hooking them and letting my son net them with a small butterfly net before unhooking them and letting them go.  I had just hooked another one and was trying to pull it close enough for my son to net when suddenly a large northern pike darted in and completely inhaled the hooked bluegill.  Upon seeing such a large fish my son backed off quickly.  The northern pike wasn’t hooked at all, but it still fought me for about half a minute before spitting out the bluegill.  It was something most people don’t get to see, and it was fun to experience it with my son, who was just getting into fishing.

If you need to create a comma separated list in SQL Server, the best way to do this is to use the FOR XML PATH clause.  This takes a set of data and puts it into a format that can be used in XML.  The syntax can be a bit complicated, so let’s take a closer look.  First I’ll build a temp table and fill it with some values – we’ll use muskie lures:

CREATE TABLE #LureNames
    (
      ID INT IDENTITY(1, 1) ,
      Lure VARCHAR(25)
    )
INSERT  INTO #LureNames
        ( Lure )
VALUES  ( 'Mepps Muskie Killer' ),
        ( 'Harasser' ),
        ( 'Buchertail 700' ),
        ( 'Reef Runner' ),
        ( 'Creek Chub Pike Minnow' ),
        ( 'Grandma' ),
        ( 'Jake' ),
        ( 'Shallow Raider' ),
        ( 'Bulldawg' ),
        ( 'Suick Thriller' ),
        ( 'Believer' ),
        ( 'Swim Whizz' ),
        ( 'Double Cowgirl' ),
        ( 'Medussa' ),
        ( 'Doubledawg' ),
        ( 'Shallow Invader' ),
        ( 'Top Raider' ),
        ( 'Depth Raider' ),
        ( 'Pacemaker' ),
        ( 'Hawg Wobbler' ),
        ( 'Softtail Phantom' )

If we do a simple select from this table, we get what we’d expect:

20150128 comma list simple select

If we want to get a comma separate list, we use this syntax:

SELECT  STUFF(( SELECT  ', ' + Lure
                FROM    ( SELECT DISTINCT
                                    Lure
                          FROM      #LureNames
                        ) x
              FOR
                XML PATH('')
              ), 1, 2, '') AllLures

To understand what this does let’s start from the inside and work our way out. The very inside has a rather simple SELECT DISTINCT:

SELECT DISTINCT
     Lure
FROM      #LureNames

This is going to simply give us each lure name once:
20150128 comma list select distinct In the next step we’re going to add the commas and put in the FOR XML PATH clause.

SELECT  ', ' + Lure
FROM    ( SELECT DISTINCT
                    Lure
          FROM      #LureNames
        ) x
FOR     XML PATH

20150128 comma list simple xml You can see the results are now in an xml format. Since we didn’t specify a column name the entries are simply called <row>. We now have the basic comma separated list, but there are extra characters in there. First off are the <row> xml tags, and on top of that the very first entry starts with a comma and space. To get rid of the first issue, we’ll just change FOR XML PATH to FOR XML PATH(”). This tells SQL Server to suppress the row tags.

SELECT  ', ' + Lure
FROM    ( SELECT DISTINCT
                    Lure
          FROM      #LureNames
        ) x
FOR     XML PATH('')

20150128 comma list xml no tags Now the only thing to fix is the leading comma and space. To get rid of this we use the STUFF function. The STUFF function takes a string and deletes a set amount of characters from a specified position and inserts new characters there. For a simple example if I want to change fisherman to fisherwoman, I would go like this:

SELECT  STUFF('fisherman', 7, 0, 'wo')

So I took my initial string, went to the seventh character (just after the ‘r’), deleted 0 characters and inserted ‘wo’ in that location:
20150128 comma list simple stuff For our example above, we’re going to start at the first position and delete two characters, and replace them with an empty string. Now we have our comma separated list:

20150128 comma list with space

It’s worth noting that I put a space between my words in this example.  If we didn’t want the space, we’d change the lurename and comma concatenation to exclude the space, and also only delete the leading one character with the stuff function.

SELECT  STUFF(( SELECT  ',' + Lure
                FROM    ( SELECT DISTINCT
                                    Lure
                          FROM      #LureNames
                        ) x
              FOR
                XML PATH('')
              ), 1, 1, '') AllLures

20150128 comma list no space

That is how you create a comma separate list in SQL Server.

The Differences Between COALESCE and ISNULL

Two muskie lures that are both effective and quite similar are the Musky Innovations Bulldawg and Chaos Tackle Medussa. These lures are both hard rubber lures without any natural action; the angler is responsible for making the lure move in a way that will get the fish to bite. They can be swam, jerked, bounced, ripped, or any other action you can think of to try to get fish to bite. Although they look quite similar, the action has subtle differences, and knowing these differences tells you which lure to use, depending on your situation. The bulldawg will cut through the water a bit better, and rolls over on its side as you let it fall. The medussa will almost hovers when you let it fall, slowly sinking with a back-and-forth wobble. Though similar, knowing these subtle differences can mean the difference between catching or not catching a fish.
Two SQL Server functions that also have rather subtle but important differences are the ISNULL() and COALESCE() functions. Both functions are used to return another value if the first evaluates to NULL. This can be useful for making sure your query is always returning a result and not grabbing those meaningless NULL values.
Let’s take a quick look at when this might be useful. Let’s say we’re running a bait shop, Super SQL Fishing Baits! To keep track of the baits we sell and what price and quantity they come in, we’ll create a table called dbo.LiveBait, and insert some data:

CREATE TABLE dbo.LiveBait
    (
      ID INT IDENTITY(1, 1) ,
      BaitName VARCHAR(255) NOT NULL ,
      Quantity INT ,
      Price MONEY NOT NULL ,
      SalePrice MONEY NULL
    )
INSERT  INTO dbo.LiveBait
        ( BaitName, Quantity, Price, SalePrice )
VALUES  ( 'Nightcrawlers', 24, 3.99, NULL ),
        ( 'Redworms', 48, 2.50, NULL ),
        ( 'Crappie Minnows', 12, 3.50, 2.49 ),
        ( 'Redtail Chubs', 12, 4.00, NULL ),
        ( 'Musky Suckers', 1, 8.00, 5.99 )

Now our table has a Price and a SalesPrice field. When we want to put a product on sale, we put in a sales price, otherwise we leave it NULL and charge the regular price. To get what the price will be, we can use ISNULL or COALESCE. Both of these queries return the same results:

SELECT ID ,
       BaitName ,
       Quantity ,
       ISNULL(SalePrice,Price) AS CurrentPrice
FROM dbo.LiveBait

SELECT ID ,
       BaitName ,
       Quantity ,
       COALESCE(SalePrice,Price) AS CurrentPrice
FROM dbo.LiveBait

20150105 Coalesce Isnull Same Results

Now let’s discuss some of the subtle differences between ISNULL and COALESCE. First, ISNULL can only accept two arguments. If the first argument is NULL, the other is returned, even if it too is NULL. COALESCE can accept any number of arguments, and will return NULL only if every one of those arguments are NULL.

Secondly, the data type of the result will be different depending on which is used. ISNULL uses the data type of the first parameter, whereas COALESCE returns the data type of value with the highest precedence. Let’s look at an example from our previously created table. If we write:

SELECT  ISNULL(SalePrice, 'Muskie')
FROM    dbo.LiveBait

We’re going to get an error. The first argument’s data type is money, so ISNULL tries to change the string ‘Muskie’ into a money, and when it can’t, it spits out an error:

20150105 Coalesce Isnull Isnull data type

Coalesce will find the data type with the highest precedence. Here is a list of the data type precedence from Microsoft. Money is higher than char, so we’d still get an error.

The last main difference is that COALESCE is actually rewritten by the query optimizer as a CASE statement. This means that our statement earlier

SELECT ID ,
       BaitName ,
       Quantity ,
       COALESCE(SalePrice,Price) AS CurrentPrice
FROM dbo.LiveBait

Is actually rewritten by the optimizer as:

SELECT  ID ,
        BaitName ,
        Quantity ,
        CASE WHEN SalePrice IS NOT NULL THEN Salesprice
             ELSE Price
        END AS CurrentPrice
FROM    dbo.LiveBait

This may seem like a rather small detail, but does cause a few side affects. The first is that using a subquery in the case statement will cause the subquery to be evaluated multiple times. Consider this statement:

SELECT  COALESCE(( SELECT TOP 1
                            SalePrice
                   FROM     dbo.LiveBait
                   ORDER BY Price ASC
                 ), Price) AS field
FROM    dbo.LiveBait

If we look at the execution code, we will see that the subquery, shown by a LiveBait tables scan with a sort, has been performed twice; once to see if it returns NULL and once to return the value.

20150105 Coalesce Isnull Coalesce Case Subquery

If we add the same subquery again we will see the execution plan now shows it being executed four times:

20150105 Coalesce Isnull Coalesce Case Subquery 2

The next affect involves what can possibly happen, based on the database’s transaction isolation level, if someone is making changes to the table involved in the subquery during the execution of the COALESCE statement. Consider that COALESCE executes the subquery and finds a non-NULL value. It now needs to execute it again to return that value as the result. However, another statment has come in and updated the table so now the subquery does return a NULL value. You can get around this by changing the isolation level to at least SNAPSHOT ISOLATION, or rewriting the subquery as a subselect like this:

SELECT  CASE WHEN X IS NOT NULL THEN X
             ELSE Price
        END AS CurrentPrice
FROM    ( SELECT    ( SELECT TOP 1
                                SalePrice
                      FROM      dbo.LiveBait
                      ORDER BY  Price ASC
                    ) AS X ,
                    Price
          FROM      dbo.LiveBait
        ) AS T

The last difference is the NULLability of the resulting expression is different for ISNULL and COALESCE. The ISNULL return value is always assumed to be NOT NULL, whereas the COALESCE return value is NULL. This makes a difference in the NULLABILITY of computed columns and key constraints. As an example, let’s add a computed column to our table to hold our current price:

ALTER TABLE dbo.LiveBait ADD CurrentPrice AS (COALESCE(SalePrice,Price))

This new computed column is considered NULLable, so when we try to make it our primary key:

20150105 Coalesce Isnull Coalesce Results NULLABLE

However, let’s drop that column and re-add it using ISNULL instead of COALESCE, and we can see we are successful in creating the primary key constraint:

20150105 Coalesce Isnull ISNULL Results NONNULLABLE

And here is the view of the table in SQL Server Management Studio:

20150105 Coalesce Isnull ISNULL Results NONNULLABLE table view

So there are the main differences between ISNULL and COALESCE. I hope this post helped you understand which to use and when.