Creating a comma separated list with FOR XML PATH and STUFF in SQL Server

I can’t really think of a story or tip that ties into creating a comma separated list, so I’ll just use one that’s completely unrelated.  It’s fun sharing experiences with loved ones, and fishing is no different.  Many of my recent favorite fishing memories involve fishing with my now-four year old son.  Two summers ago when he was two years old we were catching panfish off the dock at my Grandpa’s cabin in northern Wisconsin.  I was hooking them and letting my son net them with a small butterfly net before unhooking them and letting them go.  I had just hooked another one and was trying to pull it close enough for my son to net when suddenly a large northern pike darted in and completely inhaled the hooked bluegill.  Upon seeing such a large fish my son backed off quickly.  The northern pike wasn’t hooked at all, but it still fought me for about half a minute before spitting out the bluegill.  It was something most people don’t get to see, and it was fun to experience it with my son, who was just getting into fishing.

If you need to create a comma separated list in SQL Server, the best way to do this is to use the FOR XML PATH clause.  This takes a set of data and puts it into a format that can be used in XML.  The syntax can be a bit complicated, so let’s take a closer look.  First I’ll build a temp table and fill it with some values – we’ll use muskie lures:

      ID INT IDENTITY(1, 1) ,
      Lure VARCHAR(25)
        ( Lure )
VALUES  ( 'Mepps Muskie Killer' ),
        ( 'Harasser' ),
        ( 'Buchertail 700' ),
        ( 'Reef Runner' ),
        ( 'Creek Chub Pike Minnow' ),
        ( 'Grandma' ),
        ( 'Jake' ),
        ( 'Shallow Raider' ),
        ( 'Bulldawg' ),
        ( 'Suick Thriller' ),
        ( 'Believer' ),
        ( 'Swim Whizz' ),
        ( 'Double Cowgirl' ),
        ( 'Medussa' ),
        ( 'Doubledawg' ),
        ( 'Shallow Invader' ),
        ( 'Top Raider' ),
        ( 'Depth Raider' ),
        ( 'Pacemaker' ),
        ( 'Hawg Wobbler' ),
        ( 'Softtail Phantom' )

If we do a simple select from this table, we get what we’d expect:

20150128 comma list simple select

If we want to get a comma separate list, we use this syntax:

                FROM    ( SELECT DISTINCT
                          FROM      #LureNames
                        ) x
                XML PATH('')
              ), 1, 2, '') AllLures

To understand what this does let’s start from the inside and work our way out. The very inside has a rather simple SELECT DISTINCT:

FROM      #LureNames

This is going to simply give us each lure name once:
20150128 comma list select distinct In the next step we’re going to add the commas and put in the FOR XML PATH clause.

SELECT  ', ' + Lure
          FROM      #LureNames
        ) x

20150128 comma list simple xml You can see the results are now in an xml format. Since we didn’t specify a column name the entries are simply called <row>. We now have the basic comma separated list, but there are extra characters in there. First off are the <row> xml tags, and on top of that the very first entry starts with a comma and space. To get rid of the first issue, we’ll just change FOR XML PATH to FOR XML PATH(”). This tells SQL Server to suppress the row tags.

SELECT  ', ' + Lure
          FROM      #LureNames
        ) x
FOR     XML PATH('')

20150128 comma list xml no tags Now the only thing to fix is the leading comma and space. To get rid of this we use the STUFF function. The STUFF function takes a string and deletes a set amount of characters from a specified position and inserts new characters there. For a simple example if I want to change fisherman to fisherwoman, I would go like this:

SELECT  STUFF('fisherman', 7, 0, 'wo')

So I took my initial string, went to the seventh character (just after the ‘r’), deleted 0 characters and inserted ‘wo’ in that location:
20150128 comma list simple stuff For our example above, we’re going to start at the first position and delete two characters, and replace them with an empty string. Now we have our comma separated list:

20150128 comma list with space

It’s worth noting that I put a space between my words in this example.  If we didn’t want the space, we’d change the lurename and comma concatenation to exclude the space, and also only delete the leading one character with the stuff function.

                FROM    ( SELECT DISTINCT
                          FROM      #LureNames
                        ) x
                XML PATH('')
              ), 1, 1, '') AllLures

20150128 comma list no space

That is how you create a comma separate list in SQL Server.


Sending Alerts when Reporting Services job fail

There’s a lot of gear to remember when you head out for a fishing trip. You need your rods/reels, lures, live bait, net, fishing license, sunscreen, bug spray, hat, water, snacks, beer, and one of the most important things to remember – put the plug in the boat before you launch! A recent fishing trip to Lake Winnebago got off to a rocky start. First, I did forget to put the plug in. Once I noticed water pouring in I quickly screwed in the plug and got the bilge pump running. Next, after we motored out about a half mile to an offshore reef I looked in the cooler only to discover that I’d left the worms at home. We had to motor all the way back and swing by a nearby gas station to pick up some more.
It can be frustrating when we are expecting something to be there, but it’s not. We have several SQL Server Reporting Services (SSRS) reports that are scheduled to run at certain times throughout the week. These reports are sent to internal users for making everyday business decisions. Unfortunately, if the reports fail for any reason, they are simply not sent. I wanted to make sure everyone was getting their reports, so I set up a SQL Agent job to alert me for job failures. I was able to take advantage of the ReportDB.dbo.Subscriptions table to looks for statuses that weren’t “Mail sent to” or “New Subscrip”.

SELECT  c.Name ,
        S.LastRunTime ,
        S.LastStatus ,
        S.Description ,
FROM    LW_ReportDB.dbo.Subscriptions S
        LEFT OUTER JOIN LW_ReportDB.dbo.Catalog c ON c.ItemID = S.Report_OID
WHERE   LEFT(S.LastStatus, 12) != 'Mail sent to'
        AND LEFT(S.LastStatus, 12) != 'New Subscrip' 

This query will show any scheduled jobs that didn’t run as expected. Using my stored procedure that I created to send emails in html table format, I run the query like this:

EXEC DBMaint.dbo.SendDBEmailwithTabularQuery 
    @fieldlist = N'C.Name|S.LastRunTime|S.LastStatus|S.Description|C.Path', 
    @qFROM = N'FROM LW_ReportDB.dbo.Subscriptions AS S
				LEFT OUTER JOIN LW_ReportDB.dbo.[Catalog] AS C
				ON C.ItemID = S.Report_OID',
    @qWHERE = N'WHERE LEFT (S.LastStatus, 12) != ''Mail sent to''
				AND LEFT (S.LastStatus, 12) != ''New Subscrip''',
    @recipients = N'', 
    @subject = N'SSRS Reports that have failed'

And I get results like this:
20150126 SSRS Report Fail Notification
The last step is just to set up the query in a SQL Agent job to run each morning. Now when I get in each day I can investigate any failures before the report user has a chance to even notice it failed.

Send Database Mail in HTML Table Format from SQL Server

I gave a sharp tug on my rod, jerking the Bulldawg into the edge of my field of vision. It was still about 15 feet from the boat, and the morning sun reflecting off the water left nothing in view but a silloutte of the lure hoovering a few feet below the glassy surface. I paused a moment before giving another hard jerk, and this time when the lure stopped I felt a hard tug on the end of my pole. I quickly looked over to see the muskie that had chomped down on my lure, still holding the soft rubber in its mouth. I yanked hard to drive the hooks into the hard boney mouth, but I ended up coming up empty. The fish spit the lure and swam off. I was disappointed to have missed the opportunity to fight with the big fish, but I was grateful to have even seen him. I wouldn’t have made contact with the fish were it not for my attention to presentation. The muskie hit out of reaction, not because it was hungry. I was able to elicit the reaction strike by using hard jerks, which caused the lure to pop into view of the muskie range and then stop.
It’s a common practice to have the database send out alerts regarding data inconsistency or problems to affected internal users. I have created a stored procedure that will send out these emails in an html format, which is considerably easier to read than the standard database mail.
To demonstrate, I’ll create a simple set of data to email:

CREATE TABLE ##temp (LakeName VARCHAR(50), Acreage DECIMAL(8,2), Rating INT)
        ( LakeName, Acreage, Rating )
VALUES  ( 'Lake Tomahawk',3462,4),
       ('Big Arbor Vitae',1070,3),
       ('Random Lake',212,1)

Next I’ll query this table using the normal databasemail stored procedure:

EXEC msdb.dbo.sp_send_dbmail
       @recipients = '',
       @subject = 'My Lake List',
       @body = Lakes,
       @query = 'select top 5 lakename, acreage, rating from ##temp order by rating';

You can see the results look a bit hard to read:
20150121 DBMail PlainImagine how much harder to read it would be with more columns and records!

Next I’ll query using my specially built stored procedure:

EXEC DBMaint.dbo.SendDBEmailwithTabularQuery 
    @fieldlist = N'LakeName|Acreage|Rating', 
    @qFROM = N'FROM ##temp', 
    @qWHERE = '', 
    @qGroupBy = '',
    @qHaving = '', 
    @qOrderBy = N'ORDER BY Rating', 
    @recipients = N'', 
    @subject = N'My Lake List', 
    @Title = N'Lakes', 
    @SendEmailWithNoResults = 1 

Much easier to read!

20150121 DBMail Tabular

So here is the stored procedure. It is set up to send no email if there are no results, but that can be toggled when you call it.

CREATE PROCEDURE [dbo].[SendDBEmailwithTabularQuery]
   @qSELECT NVARCHAR(100), --The select part of the sql statement, which can include top X
   @fieldlist NVARCHAR(MAX), --Pipe delimited list of fields, which can include aliases
   @qFROM NVARCHAR(MAX), --The from part of the sql statment, which can include joins
   @qWHERE NVARCHAR(MAX) = '', --The where part of the sql statement
   @qGroupBy NVARCHAR(MAX) = '',--The group by clause
   @qHaving NVARCHAR(MAX) = '',--The having clause
   @qOrderBy NVARCHAR(MAX) = '', --The order by part of the sql statement
   @recipients NVARCHAR(4000), --The recipients of the email
   @subject NVARCHAR(400), --The subject of the email
   @Title NVARCHAR(4000) = '', --The title of the html table that holds the query results
   @SendEmailWithNoResults BIT = 0 --Send an email even if no records are generated


   --Declare initial variable.
   DECLARE @resultexist NVARCHAR(MAX)
   DECLARE @tblfieldheader NVARCHAR(MAX) = ''
   DECLARE @tempfield NVARCHAR(MAX) = ''
   DECLARE @i INT = 1, @j INT = 0, @SendEmail INT
   DECLARE @splitcnt INT
   DECLARE @fieldcount INT  
   --Find the number of fields in the query  
   SELECT @splitcnt = LEN(@fieldlist)-LEN(REPLACE(@fieldlist,'|',''))
   --Loop through the fields and put each on into the #Fields temp table as a new record
   WHILE @j <= @splitcnt
           INSERT INTO #Fields ( field ) SELECT s FROM DBMaint.dbo.SplitString(@fieldlist,'|') WHERE zeroBasedOccurance = @j
           SET @j += 1
   SELECT @fieldcount = @splitcnt + 1 --Will be the splitcnt + 1, otherwise MAX(ID) FROM #Fields  

   --Start setting up the sql statement for the query.
   SET @sql = @qSELECT
   --Loop through the #Fields table to get the field list
   WHILE @i <= @fieldcount
           SELECT @tempfield = field FROM #Fields WHERE ID = @i
           --This next section is required in case a field is aliased.  For the xml, we need to get rid of the aliases, the table header will only require the aliases.
           --NULL values need to be shown as a string = 'NULL' or the html table will just skip the cell and all values after that in the row will be shifted left.
           IF RIGHT(@tempfield,1) = ']' OR CHARINDEX(' as ',@tempfield) = 0
                   --Set the xml field to be the entire field name
                   SET @sql = @sql + ' ISNULL(CAST(' + @tempfield + ' AS NVARCHAR(4000)),''NULL'') AS ''td'','''','
                   --Set the table header field to be the entire field name
                   SET @tblfieldheader = @tblfieldheader + '<th>' + @tempfield + '</th>'
                   --Set the xml field to be the field name minus the alias
                   SET @sql = @sql + ' ISNULL(CAST(' + LEFT(@tempfield,LEN(@tempfield) - (CHARINDEX(' sa ',REVERSE(@tempfield))+3)) + ' AS NVARCHAR(4000)),''NULL'') AS ''td'','''','
                   --Set the table header field to be the field name's alias
                   SET @tblfieldheader = @tblfieldheader + '<th>' + RIGHT(@tempfield,CHARINDEX(' sa ',REVERSE(@tempfield))-1) + '</th>'
           --Increment the counter.
           SET @i += 1
   --Trim the extra four characters of the end of @sql.      
   SET @sql = LEFT(@sql, LEN(@sql)-4)
   --Add the from, where, group by, having, and order by clause to the select statement.
   SET @sql = @sql + ' ' + @qFROM + ' ' + @qWHERE + ' ' +  @qGroupBy + ' ' + @qHaving + ' ' + @qOrderBy
   --Put the set xml command around the sql statement.
   SET @sql = 'SET @XML = CAST(( ' + @sql + ' FOR XML PATH(''tr''),ELEMENTS ) AS NVARCHAR(MAX))'
   --Run the sql that will create the xmll.
   EXEC sp_executesql @sql, N'@xml nvarchar(max) output', @xml OUTPUT
   --Create the body of the email, which contains the xml results of the query.
   SET @body = '<html><body><H3>' + @Title + '</H3><table border = 1><tr>' + @tblfieldheader + '</tr>' + @xml + '</table></body></html>'

   --Drop the fields temp table.
   DROP TABLE #Fields

   --Set the variable that will be tested to verify there was at least one result.
   SET @resultexist = 'IF NOT EXISTS(SELECT TOP 1 1 ' + @qFROM + ' ' + @qWHERE + ' ' + @qGroupBy + ' ' + @qHaving + ') SET @SendEmail = 0 ELSE SET @SendEmail = 1'
   --Capture whether or not any rows were returned
   EXEC sp_executesql @resultexist, N'@SendEmail int output', @SendEmail OUTPUT
   --Check the variable.
   IF @SendEmail = 1
           --If rows were returned, send the email.
           EXEC msdb.dbo.sp_send_dbmail
                   @recipients = @recipients,
                   @subject = @subject,
                   @body = @body,
                   @body_format = 'HTML';
   ELSE IF @SendEmailWithNoResults = 1
           EXEC msdb.dbo.sp_send_dbmail
                   @recipients = @recipients,
                   @subject = @subject,
                   @body = 'Query returned no results';


This stored procedure uses a string splitting table valued function. Many versions of this can be found online, but here is the one I use. Please note I am not the original author, but I am unsure of where I found this, so I can’t give credit to the correct persion.

CREATE FUNCTION [dbo].[SplitString] 
        @str NVARCHAR(4000), 
        @separator CHAR(1)
    RETURN (
        WITH tokens(p, a, b) AS (
                CHARINDEX(@separator, @str)
            UNION ALL
                p + 1, 
                b + 1, 
                CHARINDEX(@separator, @str, b + 1)
            FROM tokens
            WHERE b > 0
            p-1 zeroBasedOccurance,
                CASE WHEN b > 0 THEN b-a ELSE 4000 END) 
            AS s
        FROM tokens

Querying Active Directory Through Oracle

It’s usually easy to tell when a muskie fisherman in another boat sees a good sized fish.  He’ll usually use the trolling motor to slowly move around a small area, picking it apart from different angles with his casts.  He may stay in an area of 100 square yards for a couple hours, trying to find the right presentation and angle to get the follower to bite.  Seeing a big fish follow your bait to the boat is exciting, but if you want to catch more of these followers, you need to be ready to change go at them with something different.  If bucktail gets them to follow lazily, try a jerkbait.  If casting from shallow water brings them in too late, try casting from the deep.

In my previous post, I showed how to query Active Directory from SQL Server.  This was nice and easy to do; since they’re both Microsoft products, they integrate really easily.  However, what if you need to do it in Oracle?  Fortunately, Oracle realized this would be a need and included a package called DBMS_LDAP to help us get at this information.  Here’s how to use this package to retrieve data from Active Directory into Oracle:

The first step is to add a table to hold the data.  I created a table called ACTIVEDIRECTORY:

    "EMAIL"      VARCHAR2(255 BYTE),
    "MOBILE"     VARCHAR2(25 BYTE),
    "RTYPE"      CHAR(1 BYTE)
  'E=Employee, R=Room, G=Group, V=Vendor';

Nothing too special about this table, I am going to be capturing the Displayname, Email, Extension, DirectDial, and Mobile phone numbers from Active Directory. I create a Type field that I am going to be manually adding to differentiate the type of records there. All rows should have a Displayname and a Type.

Next I need to create a stored procedure which will capture the data and insert it into the table.  I had to Google some examples to get the code working.  The most helpful example I found was at Oracle-Base.

  l_ldap_host   VARCHAR2(256) := '';
  l_ldap_port   VARCHAR2(256) := '389'; --Default
  l_ldap_user   VARCHAR2(256) := domain\username;
  l_ldap_passwd VARCHAR2(256) := ********;
  l_ldap_base   VARCHAR2(256) := 'OU=Users,OU=ougroup,dc=dcgroup,dc=com';
  dispname      VARCHAR2(255);
  email         VARCHAR2(255);
  Extension     VARCHAR2(10);
  DirectDial    VARCHAR2(25);
  Mobile        VARCHAR2(25);
  RType         CHAR(1) := 'E';
  l_retval PLS_INTEGER;
  l_session DBMS_LDAP.session;
  l_attrs DBMS_LDAP.string_collection;
  l_message DBMS_LDAP.MESSAGE;
  l_attr_name VARCHAR2(256);
  l_ber_element DBMS_LDAP.ber_element;
  l_vals DBMS_LDAP.string_collection;

  -- Choose to raise exceptions.
  -- Connect to the LDAP server.
  l_session := DBMS_LDAP.init(hostname => l_ldap_host, portnum => l_ldap_port);
  l_retval  := DBMS_LDAP.simple_bind_s(ld => l_session, dn => l_ldap_user, passwd => l_ldap_passwd);
  -- Get all attributes
  l_attrs(1)                                                   := 'displayname';
  l_attrs(2)                                                   := 'mail';
  l_attrs(3)                                                   := 'mobile';
  l_attrs(4)                                                   := 'homephone';
  l_attrs(5)                                                   := 'ipphone';
  l_retval                                                     := DBMS_LDAP.search_s(ld => l_session, base => l_ldap_base, scope => DBMS_LDAP.SCOPE_SUBTREE, filter => '(&(&(objectclass=user)(!(physicaldeliveryofficename=no phone)))(!(useraccountcontrol=514)))', attrs => l_attrs, attronly => 0, res => l_message);
  IF DBMS_LDAP.count_entries(ld => l_session, msg => l_message) > 0 THEN
    -- Get all the entries returned by our search.
    l_entry := DBMS_LDAP.first_entry(ld => l_session, msg => l_message);
    << entry_loop >>
    WHILE l_entry IS NOT NULL
      -- Get all the attributes for this entry.
      l_attr_name := DBMS_LDAP.first_attribute(ld => l_session, ldapentry => l_entry, ber_elem => l_ber_element);
      --Clear variables
      dispname   := NULL;
      email      := NULL;
      Extension  := NULL;
      DirectDial := NULL;
      Mobile     := NULL;
      << attributes_loop >>
      WHILE l_attr_name IS NOT NULL
        -- Get all the values for this attribute.
        l_vals := DBMS_LDAP.get_values (ld => l_session, ldapentry => l_entry, attr => l_attr_name);
        << values_loop >>
        FOR i IN l_vals.FIRST .. l_vals.LAST
          IF l_attr_name    = 'displayName' THEN
            dispname       := SUBSTR(l_vals(i),1,200);
          ELSIF l_attr_name = 'ipPhone' THEN
            DirectDial     := SUBSTR(l_vals(i),1,200);
          ELSIF l_attr_name = 'mail' THEN
            email          := SUBSTR(l_vals(i),1,200);
          ELSIF l_attr_name = 'homePhone' THEN
            Extension      := SUBSTR(l_vals(i),1,200);
          ELSIF l_attr_name = 'mobile' THEN
            Mobile         := SUBSTR(l_vals(i),1,200);
          END IF ;
        END LOOP values_loop;
        l_attr_name := DBMS_LDAP.next_attribute(ld => l_session, ldapentry => l_entry, ber_elem => l_ber_element);
      END LOOP attibutes_loop;
      l_entry := DBMS_LDAP.next_entry(ld => l_session, msg => l_entry);
    END LOOP entry_loop;
  -- Disconnect from the LDAP server.
  l_retval := DBMS_LDAP.unbind_s(ld => l_session);
  --Delete non-person users
  WHERE DisplayName IN('Laptop','Report User','huffman_01','time test','Time Clock','manager','VPN Test','Tanya VPN');

An important thing to note about this stored procedure is the letter case. We keep our SQL Server collation such that case sensitivity doesn’t matter, but in Oracle it does. Pay attention to your collation settings to determine whether or not you need to worry about capitalization.
At this point I had the code working properly in the SQL Developer window, but when I called the stored procedure I got an error:

ORA-24247: network access denied by access control list (ACL).

Oracle uses access control lists to enforce security on network locations. To get around this, I created an access control list that allowed access to my domain controller:

    acl          => 'ldap_access.xml',
    description  => 'Permissions to access LDAP servers.',
    principal    => 'dbname',
    is_grant     => TRUE,
    privilege    => 'connect');

    acl          => 'ldap_access.xml',
    host         => '',
    lower_port   => 389,
    upper_port   => 389

You can see access control lists in your database with these queries:


20150115 Oracle LDAP
20150115 Oracle LDAP 2

The last step is to set up a job to run the stored procedure to refresh the table. I’ve chosen a job that runs daily.

  DBMS_SCHEDULER.CREATE_JOB ( job_name => 'Refresh_ActiveDirectory', 
                              job_type => 'STORED_PROCEDURE', 
                              job_action => 'REFRESH_ACTIVEDIRECTORY', 
                              start_date => '13-JAN-15 11:00:00 PM', 
                              repeat_interval => 'FREQ=DAILY');

Now I have data from Active Directory flowing into my database daily.

Querying Active Directory Through SQL Server

One great thing about fishing is being out in nature. It’s great seeing bald eagles cruising around looking for food, hearing the eerie call of a loon, and smelling the air that seems so much fresher than the air on the land. Observing the tranquility of nature can be very relaxing and very fulfilling.
SQL Server is great because it’s a Microsoft product and that helps it to easily “observe” the data in other Microsoft projects. I was recently asked to import some data from Active Directory into a SQL Server table. This data was going to be used as a feed for our company directory. A quick Google search showed I had two options. Both involve using OPENQUERY(), but the first has you create a linked server whereas the second queries the data source with the definition included. The second method requires Ad Hoc Distributed Queries to be enabled on the server, and my server does not have that enabled, so I went ahead and created a linked server:

USE [master]

EXEC MASTER.dbo.sp_addlinkedserver @server = N'ADSI', @srvproduct=N'Active Directory Service Interfaces', @provider=N'ADSDSOObject', @datasrc=N'adsdatasource'
EXEC MASTER.dbo.sp_addlinkedsrvlogin @rmtsrvname=N'ADSI',@useself=N'False',@locallogin=NULL,@rmtuser=N'DomainName\ServiceAccount',@rmtpassword='########'


EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'collation compatible', @optvalue=N'false'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'data access', @optvalue=N'true'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'dist', @optvalue=N'false'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'pub', @optvalue=N'false'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'rpc', @optvalue=N'false'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'rpc out', @optvalue=N'false'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'sub', @optvalue=N'false'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'connect timeout', @optvalue=N'0'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'collation name', @optvalue=NULL

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'lazy schema validation', @optvalue=N'false'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'query timeout', @optvalue=N'0'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'use remote collation', @optvalue=N'true'

EXEC MASTER.dbo.sp_serveroption @server=N'ADSI', @optname=N'remote proc transaction promotion', @optvalue=N'true'

My next step was to create a table to hold the information I wanted. There are a TON of fields from Active Directory that can be captured, but I was only interested in a few. I also added a ‘Type’ field at the end so I could put in some other data and differentiate it with the employee data.

CREATE TABLE [dbo].[LW_ActiveDirectory](
   [DisplayName] [VARCHAR](255) NOT NULL,
   [Email] [VARCHAR](355) NULL,
   [Extension] [VARCHAR](10) NULL,
   [DirectDial] [VARCHAR](25) NULL,
   [Mobile] [VARCHAR](25) NULL,
   [Type] [CHAR](1) NOT NULL

Next I created a stored procedure that captures the data using the Linked Server, and inserts it into my new table:

CREATE PROCEDURE [dbo].[RefreshADTable]



   --Clear the table first
   TRUNCATE TABLE DatabaseName.dbo.LW_ActiveDirectory

   --Latest employee information
   INSERT  INTO DatabaseName.dbo.LW_ActiveDirectory
           ( DisplayName ,
             Email ,
             Extension ,
             DirectDial ,
             Mobile ,
           SELECT  displayname ,
                   mail ,
                   homephone ,
                   ipphone ,
                   mobile ,
                   'E' AS TYPE
           FROM    OPENQUERY(ADSI,
                             'SELECT displayName, mail, mobile, homephone,ipphone
	  FROM  ''LDAP://,OU=MyOU,DC=DC1,DC=com'' 
	  WHERE objectClass =  ''User''
	  AND useraccountcontrol <> 514
	  AND physicaldeliveryofficename <> ''no phone''
	  ') AS tblADSI

   --Delete non-person users
   FROM DatabaseName.dbo.LW_ActiveDirectory
   WHERE DisplayName IN('Laptop','Report User','time test','manager')


A few notes on this code:

  1. The LDAP tree is navigated from right to left.  This means I was looking for
  2. I excluded a useraccountcontrol value to get rid of inactive employees.  This value could be different for you.
  3. There were several employees with no phone information.  I had to go into Active Directory and put “no phone” into an unused field [Office] to exclude these from my extract.
  4. I also had to delete a few legacy users that couldn’t be deleted from Active Directory, such as Laptop and manager.

Next I create a SQL Agent job to run this stored procedure once a day.  We don’t need data more up-to-date than that for our purposes.

The last step was to go into Active Directory and add any missing information to each user.  This was a pain, but worth the effort.  When new employees are added the HR person here will enter this information.  Now we have a employee directory that will have the latest information in the database each day.  We used a reporting piece to create a printable directory sheet.

An upcoming blog post will show how to do this same thing in Oracle.  Stay tuned!

Sargability – make sure your indexes are being used!

My favorite time of year to fish is fall. The fish are bulking up for the winter, so they are fatter than usual. They also will often binge feed, providing some great fishing. However, fall can be rather uncomfortable here in Wisconsin. We get freezing temperatures, blustery days, and snow or rainfail for extended periods in the fall. Just this last October I was fishing cold conditions, with a pretty strong wind and light snow that turned heavy every so often. I remember looking around, seeing no one else on the lake, and thinking to myself “This is living!”. I’ve come to enjoy the sting of the weather because I associate it with the joy of fishing and being in the outdoors.
One thing that can really sting database end users is a slow query. Web pages that take seconds to load instead of miliseconds and BI queries that take minutes instead of seconds can make people awfully grumpy toward the DBA. One way to make sure you can aviod this sting is to ensure your queries are sargable. Sarbable is a made up term that comes from the expression Search ARGument ABLE. Basically, you want to make sure any fields that are referenced in the WHERE, ORDER BY, GROUP BY, and HAVING clauses are able to use an index. The main way to do this is to avoid functions in your predicate. Let’s take a look at a quick example. I’ll create a table with a clustered primary key and a some nonclustered indexes and fill it with some data:

--Create table with primary key clustered
CREATE TABLE dbo.BaitPurchases (ID INT IDENTITY(1,1), Baitname VARCHAR(255) NOT NULL, Purchasedate DATETIME2 NULL, Price MONEY NULL
([ID] ASC)

--Create nonclustered index on PurchaseDate
CREATE NONCLUSTERED INDEX IX_BaitPurchases_PurchaseDate ON [dbo].BaitPurchases
   Purchasedate ASC
)INCLUDE (Baitname)

--Create nonclustered index on Baitname
CREATE NONCLUSTERED INDEX IX_BaitPurchases_Baitname ON [dbo].BaitPurchases
   Baitname ASC
)INCLUDE (Price)

--Create nonclustered index on Price
CREATE NONCLUSTERED INDEX IX_BaitPurchases_Price ON [dbo].BaitPurchases
   Price ASC
)INCLUDE (Baitname)

--Throw in some data
INSERT INTO dbo.BaitPurchases
        ( Baitname, Purchasedate, Price )
VALUES  ( 'Bulldawg','2010-04-29',15.99),
( 'Double Cowgirl','2010-04-30',16.99),
( 'Shallow Raider','2010-05-29',15.49),
( 'Depth Raider','2010-05-30',18.99),
( 'Reef Runner','2011-05-29',14.99),
( 'Mepps Muskie Killer','2011-04-29',12.49),
( 'Suick Thriller','2011-04-30',16.09),
( 'Softtail Phantom','2011-05-30',26.99),
( 'Shallow Invader','2012-04-29',18.29),
( 'Medussa','2012-04-30',17.99),
( 'Grandma','2012-05-29',15.39),
( 'Jake','2012-05-30',15.39),
( 'Believer','2013-04-29',16.59),
( 'Harasser','2013-04-30',13.49),
( 'Doubledawg','2013-05-29',17.99),
( 'Buchertail 700','2013-05-30',13.89),
( 'Hawg Wobbler','2014-04-29',21.99),
( 'Pacemaker','2014-04-30',20.99),
( 'Top Raider','2014-05-29',18.50),
( 'Swim Whizz','2014-05-30',16.98),
( 'Creek Chub Pike Minnow','2015-01-12',15.00)

Now we’ll look for all items that were purchased in 2012, attempting to utilize the nonclustered index that was created on the Purchasedate field:

SELECT Baitname, Purchasedate
FROM dbo.BaitPurchases
WHERE Purchasedate BETWEEN '2013-01-01' AND '2013-12-31 23:59:59.999'

20150112 Sargability Date Seek

You can see from the execution plan that the database engine is doing an index seek on our nonclustered index, just as we were expecting.  However, your junior DBA comes across this code and thinks it’ll be much more readable if he re-writes it with a function to look at the purchase year:

SELECT  Baitname ,
FROM    dbo.BaitPurchases
WHERE   YEAR(Purchasedate) = 2013

This does look more readable to me, but if we take a look at the execution plan, we find a very bad negative consequence of this change:

20150112 Sargability Date Scan

We are now doing a nonclustered index scan, meaning the entirety of the index was read. This doesn’t make a big difference in our current tiny table, but with a table with millions of rows you could be seeing the query execution time going from less than a second to minutes.

Let’s take a look at one more common filtering technique that causes index scans instead of seeks. Let’s say you can’t quite remember the name of that lure, but you’re sure it started with an ‘S’. The obvious way to search for this would be:

SELECT  Baitname
FROM    dbo.BaitPurchases
WHERE   Baitname LIKE 'S%'

Looking at the execution plan, we can see we got a nonclustered index seek, so we’re happy with our results.
20150112 Sargability String Seek

However, now we remembered the name didn’t start with an ‘S’, but it definitely had ‘Raider’ in it somewhere. Now we change up our query like this:

SELECT  Baitname
FROM    dbo.BaitPurchases
WHERE   Baitname LIKE '%Raider%'

Now we’re getting a nonclustered index scan:
20150112 Sargability String Scan

Before, the database engine could look at the beginning of each baitname to find the results we were looking for. With the index, the engine jumped right to the section of the index that had the ‘S’s and returned the results from there. Now it has to look in every single bait name and see if it contains the string ‘Raider’. Searches like the second one are usually going to be slow, but SQL Server does provide one tool for getting that data a bit faster, and that is Full-Text indexes. I’ll discuss those in a later blog post.

So take the sting out of database queries by making sure your indexes are being used.

Backup size and time – database total space versus used space

There is a rather rough rule that fisherman follow when looking for trophy fish. The rule goes like this: the bigger the lake, the bigger the fish. As I stated, this is a rough rule, and there are lots of exceptions, but in many cases it proves to be true. There is a very small lake here in southeastern Wisconsin called Random lake. This lake holds muskie, but very few, if any, above the 40″ mark. Contrast that with Green Bay, a huge amount of water where many 50″+ muskie are caught every year. Much of this is the results of the forage available and the density of the target fish, but either way, if I’m looking to catch a true trophy, I’m heading to large lakes.
At work I came across a situation that made me wonder about backup sizes. I had inherited a ~100 GB database, and after some initial benchmarking of the system, I decided to compress some of the largest tables. I ended up reducing the amount of used space down by almost 53 gigabytes, which reduced the space needed and increased performance (less I/O). However, I now have these large data files (this database has 10 files) that are about 2/3rds empty. I wondered to myself whether having large files, even if they are mostly empty, increases the backup or the restore time of the database. I decided to run a quick test on my Sandbox machine to see if there was any real difference.
The database started out with an initial size of 0.4 GB.

20150109 Backup Restore Size Small DB

I used this script to do the backup. The script backups up the database and also does a test restore to ensure the backup is usable. This is an important step in any backup strategy.

BACKUP DATABASE [Sandbox] TO  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.SQLSANDBOX\MSSQL\Backup\Sandbox.bak' WITH NOFORMAT, NOINIT,  NAME = N'Sandbox-Full Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10
SELECT  @backupSetId = position
FROM    msdb..backupset
WHERE   database_name = N'Sandbox'
        AND backup_set_id = ( SELECT    MAX(backup_set_id)
                              FROM      msdb..backupset
                              WHERE     database_name = N'Sandbox'
IF @backupSetId IS NULL
        RAISERROR(N'Verify failed. Backup information for database ''Sandbox'' not found.', 16, 1)
RESTORE VERIFYONLY FROM  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.SQLSANDBOX\MSSQL\Backup\Sandbox.bak' WITH  FILE = @backupSetId,  NOUNLOAD,  NOREWIND

20150109 Backup Restore Size Small Backup

The backup took 16 seconds. Here is the backup in the directory. You can see the backup size is 373 MB.

20150109 Backup Restore Size Small Backup Size

Lastly, I’ll do a restore to check how long that takes. You can see in the screenshot below it took 23 seconds.

USE [master]
RESTORE DATABASE [Sandbox]   FROM  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL12.SQLSANDBOX\MSSQL\Backup\Sandbox.bak' WITH  FILE = 1,  
                           NOUNLOAD,  REPLACE,  STATS = 5


20150109 Backup Restore Size Small Restore

Next I grow the database file. Since I’m using SQL Server Express, I’ll grow it to its max size, 10 GB.

20150109 Backup Restore Size Large DB

I use the same backup script and it now takes 17 seconds. This is a very slight difference considering the database file grew by 2400%. We’ll call it a negligible difference

20150109 Backup Restore Size Large Backup

The size of the backup has also grown by a very small amount, up to 383 MB.

20150109 Backup Restore Size Large Backup Size

The last thing to check is the restore time. This time the restore took 23 seconds. This is in line with what we saw from the restore with the small size database file.

20150109 Backup Restore Size Large Restore

So this experiment has shown that data file size makes a tiny, if any at all, difference in the amount of time to backup and restore and the size of the backups. Shrinking database data files almost always causes high fragmentation in the database, so unless you absolutely need more disk space on your drive, there is really no reason to shrink the database files, even if they have a lot of free space.