SQL Agent Properties

It’s been a rather windy summer thus far, making it less fun to be out on the water.  I don’t know many guys who enjoyed being blasted by the wind while bobbing up and down on the waves for hours on end.  I went out on Pewaukee Lake a few weeks ago with a buddy from work.  We had picked the day in advance since it was supposed to be dry and calm.  We got the dry, but not the calm.  We had a stiff wind blowing out of the west that drove us back into the western half of the lake after trying to fish the narrows in the middle.

I spent the day focusing my fishing efforts on making the bait look real.  I tried hard to avoid retrieving the lure in a rhythmic fashion.  The effort paid off with a nice upper-30s muskie:

[Image: June 2017 Pewaukee Muskie.jpg]

My fishing buddy hooked one a short time later, but couldn’t keep it pinned and we lost it.

Recently, I blogged about migrating the SQL Server installation onto a different drive.  After that move I found one problem I had to address: an issue with the SQL Agent that I wasn't initially able to diagnose.  If I remember correctly, it was actually an SSRS subscription that failed, and I needed details to find out why.  It turns out the SQL Agent has its own set of properties, and the error log path was still pointing at the previous location on the C: drive, which no longer existed.  In addition to viewing these properties in the SSMS UI, you can execute a stored procedure to see them:

EXEC msdb..sp_get_sqlagent_properties

[Image: 20170712 SQL Agent Properties.PNG]

Lastly, just update the value with the corresponding SET stored procedure and restart the SQL Agent:

EXEC msdb..sp_set_sqlagent_properties 
   @errorlog_file = N'D:\SQLSERVERINSTALL\SQLAGENTERRORLOG.log'

Now your SQL Agent properties have been updated.

Addressing login trigger failures in SQL Server

As I get older I have come to enjoy watching others fish, especially my children.  The thrill of catching a big fish is magnified by seeing the smile on someone else’s face when he/she is the one bringing it in.  Below is a nice sized largemouth bass my son caught on a recent fishing trip.

Two Sisters LM.jpg

In my previous post I showed how to create a login trigger to log sysadmin access to a SQL Server instance.  Almost immediately I received a comment describing how a failure in the trigger could prevent almost everyone from logging into the instance.  This is a major problem!

The reason this occurs makes sense if you think about it.  While attempting to log in, the user executes the code in the trigger.  If that code is invalid, the trigger fails and aborts, and when that happens the login aborts as well.  What could cause the trigger to fail?  Well, if the table (or any other object) accessed within the trigger is inaccessible to the user, or doesn't exist at all, the trigger will fail.

I tested this by using my working trigger, which logged sysadmin logins to a table called dbo.sysadminLogging.  Next I renamed the table to dbo.sysadminLogging1.

[Image: 20170606 Renamed table]
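
For reference, the rename itself is just a call to sp_rename.  Here is a sketch, assuming the logging table lives in the DBMaint database referenced by the trigger later in this post:

USE DBMaint;
GO
-- Rename the logging table so the trigger's INSERT target no longer exists
EXEC sp_rename 'dbo.sysadminLogging', 'sysadminLogging1';
GO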

Next I tried to log in from a new window in SSMS:

[Image: 20170606 Failed login]

First, let’s talk about how to get back into a server that has this issue.  We need to log into SQL Server using SQLCMD over the dedicated administrator connection (DAC), then disable the trigger:

[Image: 20170606 Disable trigger]
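
In script form, the recovery looks something like this (a sketch; the server name below is a placeholder):

-- From a command prompt, connect over the dedicated administrator connection (DAC), e.g.:
--    sqlcmd -S admin:YourServerName -E
-- Then disable the problematic logon trigger:
DISABLE TRIGGER [servertrigger_CheckForSysAdminLogin] ON ALL SERVER;
GO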

After doing this everyone should now be able to log back into SQL Server as normal.

To prevent this type of event from happening in the first place, I suggest a small edit to my original trigger: make sure the referenced objects exist, and if they don't, do nothing.  It may also be a good idea to send an email to the DBA so they can investigate; I’ve noted that in the comments.

CREATE TRIGGER [servertrigger_CheckForSysAdminLogin] ON ALL SERVER
    FOR LOGON
AS
    BEGIN
       IF OBJECT_ID('DBMaint.dbo.sysadminLogging') IS NULL
           BEGIN
               --Possibly send an email to the DBA, indicating the trigger is not working as expected
               GOTO Abort;--Do nothing
           END

        IF IS_SRVROLEMEMBER('sysadmin') = 1
            BEGIN
                INSERT  INTO DBMaint.dbo.sysadminLogging
                        ( [Login] , LoginDate )
                VALUES  ( ORIGINAL_LOGIN() , GETDATE() );
            END;

Abort:

    END;

GO

This newer version of the trigger should cut down on the chances that this functionality will come back to bite you. Special thanks to james youkhanis for pointing this out.

Logging sysadmin logins in SQL Server

Our yearly cabin opening “men’s weekend” was last weekend. The fishing was a bit below average, but we still had a great time. I brought my six-year-old son up, and two of my six-year-old nephews were brought up by their dads as well. The first day of fishing we went over to the Rainbow Flowage, which was a bit of a bust. We ended up hooking five or six northern pike, but only landed one because they kept biting off the line. The boys in the other boat caught some bluegill and bass, but overall the fishing wasn’t as hot as in years past. This was probably due to all the rain they got up there; there were places where the rivers were over the road. Below is the one pike we managed to land.
[Image: Rainbow Flowage NP.jpg]
Security these days is as important as ever. SQL Server provides lots of ways to improve the security of your data and environment. One small way I like to keep an eye on my system is to log sysadmin logins to the server. A sysadmin can do anything in SQL Server, and by reviewing my log from time to time I can be sure that no one is using this type of login to compromise the system.
The first thing to do is to create a table to hold the log data:

USE [DBA_DB]
GO

CREATE TABLE [dbo].[sysadminLogging](
   [SAL_id] [INT] IDENTITY(1,1) NOT NULL,
   [Login] [VARCHAR](255) NOT NULL,
   [LoginDate] [DATETIME2](7) NOT NULL,
 CONSTRAINT [PK_sysadminLogging] PRIMARY KEY NONCLUSTERED 
(
   [SAL_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

All that is left is to create a server-level login trigger to record any sysadmin logins to that table:

CREATE TRIGGER [servertrigger_CheckForSysAdminLogin] ON ALL SERVER
    FOR LOGON
AS
    BEGIN
        IF IS_SRVROLEMEMBER('sysadmin') = 1
            BEGIN
                INSERT  INTO DBA_DB.dbo.sysadminLogging
                        ( [Login] , LoginDate )
                VALUES  ( ORIGINAL_LOGIN() , GETDATE() );
            END;
    END;
GO

Check this table regularly to get a good idea of who is logging into your SQL Server with these higher privileges.
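
A simple query for that periodic review might look like this (a sketch; order and filter however suits your review process):

SELECT  [Login] , LoginDate
FROM    DBA_DB.dbo.sysadminLogging
ORDER BY LoginDate DESC;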

EDIT:

In the comments, james youkhanis points out a problem that can occur with this trigger.  The problem could make logging in quite difficult.  I have posted a follow-up where I explain the problem, provide a workaround to allow logging in (as james demonstrates below), and provide an updated trigger to minimize the risk of this occurring.

Upgrade Oracle Database from 11gR2 to 12c

The fishing opener has come and gone.  I was able to get out for the morning on the day of the opener, from about 6 AM to 11:30 AM.  I went with my brother-in-law down to Pewaukee Lake, where we tried to find some early season muskies.  I ended up with two follows: one from a fish that fled as soon as it saw the boat, and another from a beauty of a fish that was very aggressive.  It followed the lure in hot, right on its tail.  It was a bigger fish (somewhere between 40 and 45 inches by my estimation), so instead of doing a figure 8 I went with a big continuous circle next to the boat.  I ended up circling somewhere around 10 times with the fish following closely.  Each time I swung the lure out away from the boat, the fish would cut the turn and make like it was going to t-bone the lure, but stopped just short.  Finally, as I swung the lure back toward the boat, the large fish simply nudged it without even opening its mouth.  It clearly was wary, and didn’t want to commit without investigating a bit more.  After the nudge the fish took off, I’m sure because it felt the difference between a real fish and my plastic Shallow Raider.  The follow was a great start to the fishing season, but I’m hoping for more catches this year, not just follows.  I did get this largemouth bass as a consolation catch:

[Image: Open Day LM Bass.jpg]

Last week I completed an upgrade of the Oracle database in our Production environment.  We only have a single application running on our Oracle database, compared to numerous applications running against SQL Server, so I’m not nearly as familiar with working in Oracle.  To put it bluntly, I was quite nervous to do the upgrade, and I practiced several times in our Test environment before proceeding with the Production upgrade.  As it turned out, I was nervous for good reason: the upgrade did not go smoothly at all.

In our Test environment, it took me 3 or 4 tries before I was able to successfully complete the upgrade.  I found a few major hurdles to clear:

  • I needed a new user created.  We had been running the existing Oracle database service under a domain user, but the installer complained that this user had some flavor of domain admin rights, and it would not let me proceed with that user.
  • Our startup parameter file had a parameter called os_authent_prefix, which was set to domain\.  When the upgrade ran I kept getting error messages caused by the \.  I guess the upgrade assistant was not smart enough to deal with it, so I removed that parameter before the upgrade, then added it back in afterward.  This is an important note!  If you are doing an upgrade in a Windows environment, you will probably run into this issue.
  • I had to set the job_queue_processes parameter to a higher number, 100.
  • I dropped a few deprecated startup parameters, remote_os_authent and cursor_space_for_time.  I ended up adding the remote_os_authent back in after the upgrade had completed.
  • Lastly, before the upgrade I compiled invalid objects with the UTLRP.SQL script and emptied the Oracle recycle bin.  A rough sketch of these parameter changes and cleanup steps follows this list.
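
In SQL*Plus, those parameter changes and the pre-upgrade cleanup might look roughly like this.  This is only a sketch; the SCOPE/SID clauses and exact values are assumptions for illustration, and os_authent_prefix and remote_os_authent were added back after the upgrade completed:

-- Remove os_authent_prefix before the upgrade (re-add it afterward)
ALTER SYSTEM RESET os_authent_prefix SCOPE=SPFILE SID='*';

-- Raise job_queue_processes
ALTER SYSTEM SET job_queue_processes = 100 SCOPE=BOTH;

-- Drop deprecated parameters
ALTER SYSTEM RESET remote_os_authent SCOPE=SPFILE SID='*';
ALTER SYSTEM RESET cursor_space_for_time SCOPE=SPFILE SID='*';

-- Recompile invalid objects and empty the recycle bin
@?/rdbms/admin/utlrp.sql
PURGE DBA_RECYCLEBIN;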

Those issues caused the upgrade process to completely bomb.  Once I fixed them and got the upgrade to complete, I still had some cleanup to do: the PATH and TNS_ADMIN environment variables had to point to the new Oracle home folder structure, and the tnsnames.ora and listener.ora files in the new Oracle home needed to be updated.  By the last practice attempt I was getting through the upgrade process in around an hour.

Finally, the night arrived for the Production upgrade.  The application is only used during business hours, so I started the upgrade at 5 PM once the normal users were finished for the day.  The first thing I noticed was a difference in environment variables between our Production and Test environments: Production has the ORACLE_HOME environment variable set, and Test does not.  This somehow caused the first upgrade attempt to completely fail.  Not only did it fail, it somehow erased the OracleDB service from the list of services in Windows.  It took me quite a while to get it back and working again so I could make a second attempt at the upgrade.  Although I received some error messages, the second attempt did end up completing successfully.  The one thing I wasn’t expecting, though, was the amount of time it took.  While a database of the same size on inferior hardware took less than an hour to upgrade in the Test environment, the Production upgrade took well over three hours.

I had to perform similar cleanup on the Production environment as in Test, but I also ran into one additional hiccup.  After the upgrade I found that all accounts except sys were expired and locked.  Unlocking an account is not a big problem, but there is no good way to unexpire an account.  This is a big problem because many of the accounts are service accounts, where no actual user is signing in, so there is no prompt for a new password and no way to unexpire.  Fortunately I found a great workaround in a blog post by Simon Krenger.  His method involves replacing each existing encrypted password with that same encrypted password.  Once I executed the SQL output by his method, all the accounts were unexpired.
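
The core of that trick is Oracle’s ALTER USER ... IDENTIFIED BY VALUES syntax, which re-applies an existing password hash and clears the expired status in the process.  A minimal sketch, where the account name and hash are placeholders (pull each account’s real hash before generating statements like this):

-- Re-apply the account's current password hash, then unlock it
ALTER USER app_service_acct IDENTIFIED BY VALUES 'S:placeholder_hash_value';
ALTER USER app_service_acct ACCOUNT UNLOCK;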

The last step was to gather stats on the relevant databases, and the upgrade was complete.

Removing a Schema from Oracle

I took my kids to the park on the local lake here this weekend.  While we went mainly to play on the playground, we also brought along a fishing pole and some worms.  I didn’t think the fish had moved up in the shallows yet, but we brought the pole just in case.  Unfortunately, I was right.  The weeds were still brown and low, and there was no sign of any fish.  They should be coming up pretty soon though.  We have a little over two weeks until the fishing opener!

We recently stopped using a third-party application that had its back end in our Oracle database.  After waiting a sufficient amount of time, I was finally able to remove the user and its data from the database.  I’d never done a large data removal like this from Oracle before.  My first step was to export a copy of the data using Data Pump.  Although the data should never be needed again, it doesn’t hurt to grab one last copy before deleting it into oblivion.

[Image: 20170418 expdp]
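
The export command looked something like the following (a sketch; the directory object, schema name, and file names are placeholders):

expdp system DIRECTORY=dp_dir SCHEMAS=usertodrop DUMPFILE=usertodrop.dmp LOGFILE=usertodrop_exp.log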

I zipped that up and placed it into archive storage, probably never to be touched again.

Next I dropped the user.  In order to remove a user that owns objects, you must use the CASCADE option.

DROP USER usertodrop CASCADE;

Lastly, I noticed that the tablespaces created to house that schema's objects were still there, so I needed to drop them as well. I made a mistake the first time around by omitting INCLUDING CONTENTS AND DATAFILES. Without that part of the command, the datafiles stay on disk and you have to delete them manually through the OS. In my case, I had to wait several hours for the Oracle service to relinquish its lock on the files. The second time around I used the proper command, and the disk space was returned to the server immediately:

DROP TABLESPACE ts_usertodrop_pd INCLUDING CONTENTS AND DATAFILES;
DROP TABLESPACE ts_usertodrop_td INCLUDING CONTENTS AND DATAFILES;

After this, the user was gone!

Data Warehouse Fact Tables

Being an avid fisherman, I love good fishing art.  I found the picture above using a Google search.  The pictures online are good, but aren’t comparable to seeing them in person.  I recently saw a picture of a muskie called Windfall Muskie, painted by Terry Doughty.  This is the best painting of a muskie I’ve seen so far, a beautiful piece of art.  I love seeing how an artist can capture the thrill of fishing on a canvas.

Previously, I described how to begin setting up a data warehouse by choosing a business process, selecting the grain, and building the dimension tables.  This post will focus on the last step, building the fact tables.  The fact tables contain a foreign key link to each associated dimension table as well as the measures that we will be using in the reports.  The measures are usually numeric values that can be aggregated in the report.  For example, for retail orders I may want to measure product price, discount, tax paid, and other metrics related to price and cost paid in an order.

Fact tables will usually be very long and narrow, as opposed to dimension tables which are generally short and wide.  Fact tables will hold one line for each transaction in your business process, based on your chosen grain.  For retail orders, each line in a transaction table might be an order line.  The table would have an Order ID, Order Line Number, Line Price, Quantity Purchased, and then foreign keys to all the descriptive dimension tables, such as Customer, Product, Purchase Date, Purchase Location, etc.
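
As a sketch, the fact table used in the query below might be defined like this (the column names follow that query, while the Order ID and Order Line Number columns, data types, and key choices are assumptions):

CREATE TABLE [dbo].[factOrderPlaced](
   [OrderID] [INT] NOT NULL,
   [OrderLineNumber] [SMALLINT] NOT NULL,
   [CustomerKey] [INT] NOT NULL,
   [ProductKey] [INT] NOT NULL,
   [OrderDateKey] [INT] NOT NULL,
   [OrdChanKey] [INT] NOT NULL,
   [OrderQuantity] [INT] NULL,
   [SalePrice] [DECIMAL](18, 2) NULL,
 CONSTRAINT [PK_factOrderPlaced] PRIMARY KEY CLUSTERED 
(
   [OrderID] ASC,
   [OrderLineNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]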

Once the fact table is built and loaded, you can easily find totals at a level aggregated by any of the descriptors.  For example, you can get the quantity purchased by store, the total order value by customer, or the most popular product by date.  Here is an example of how you can join the fact to dimension tables:

SELECT  dc.CustomerState ,
        dp.LongSkuDivision ,
        dd.[Month] ,
        oc.OrderChannel ,
        SUM(fop.OrderQuantity) ,
        SUM(fop.SalePrice)
FROM    dbo.factOrderPlaced fop
INNER JOIN dbo.dimCustomer dc ON dc.CustomerKey = fop.CustomerKey
INNER JOIN dbo.dimProduct dp ON dp.ProductKey = fop.ProductKey
INNER JOIN dbo.dimDate dd ON dd.DateKey = fop.OrderDateKey
INNER JOIN dbo.dimOrderChannel oc ON oc.OrdChanKey = fop.OrdChanKey
GROUP BY dc.CustomerState ,
        dp.LongSkuDivision ,
        dd.[Month] ,
        oc.OrderChannel;

I hope this three part series has given you a basic idea of how to go about starting a data warehouse.

Data Warehouse Dimension Tables

There is a joke among fisherman about what boat really means. The common acronyms are Break Out Another Thousand or Bankruptcy On A Trailer. These jokes are funny, but also true – owning a boat is expensive. This winter I decided to take the boat in to the marina to have it tuned up. It was having trouble starting consistently, but aside from that it ran pretty well. Now that I’ve gotten it back, I’m anxious to take it out and see how much better it performs. It feels good to know that it should be running better, but it did set me back a pretty penny.
This is part 2 in my three-part series giving a high-level overview of setting up a data warehouse. My previous post dealt with the first two decisions, selecting the business process and choosing the level of granularity. This part will focus on identifying and building the dimension tables. Dimensions provide the who, what, when, where, why, and how context surrounding the business process. For my example business process, retail orders, these will include:

  • Who is making each purchase?
  • What are they buying?
  • Which purchase channel or store are they purchasing it from?
  • When are they making the purchase?

To answer these questions, I’ve identified four distinct dimension tables that will be needed: Customer, Product, Order Channel, and Date.
I’ll go over a few best practices for creating dimension tables, provide examples for creating the first three, and then describe the Date table in a bit more detail.
Each table needs a primary key that will be linked to the fact table through the use of a foreign key. The best practice is to use a surrogate key. Although using the system of record’s natural key may seem like a good idea, this can cause major complications down the line, should anything in the source system change. Additionally, this allows for the easy integration of multiple source systems, improves performance, and supports attribute change tracking (I may get into slowly changing attribute dimension tables in a later post). That said, I would include the natural key as an attribute in the dimension table.
Each table should be as denormalized as possible. Using a star schema, with all or most dimension tables joining only to the fact table, will get much better performance than the snowflake schema, where a dimension may have to go through one or more other dimension tables to eventually be related to the fact table. As an example, if you have a product table where each product has a department, category, and color, you would include the descriptions for these right in the dimension table rather than having a separate product department, product category, and product color tables that link to the product dimension table. Although this uses more space in the database, the space used by dimension tables pales in comparison to the fact tables. Additionally, the performance is much better without having to do all the additional joins to get the data that is needed.
Each table needs to have a row for the unknown value. This will allow us to report when we don’t have all the information. For example, let’s say someone returns a product that we have never sold. The employee working in the returns area accepts it back by mistake. Now to get rid of it, we list it as a non-SKU’d product for a low price. We still want to report on the sale in our data warehouse, but we don’t have any real product to tie it to. In that case, we’d tie it to a product called “Unknown”. By doing this we can keep NULLs out of our fact table foreign keys.
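
As a sketch, the unknown member could be seeded like this, using the product dimension defined later in this post (reserving -1 as the surrogate key for the unknown row is an assumption; any consistent convention works):

SET IDENTITY_INSERT dbo.dimProduct ON;

INSERT INTO dbo.dimProduct ( ProductKey , SkuCode , SkuDescription , ColorName )
VALUES ( -1 , 'UNKNOWN' , 'Unknown' , 'Unknown' );

SET IDENTITY_INSERT dbo.dimProduct OFF;
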
The last thing I’ll note on dimension tables is the load process. Because of the relative simplicity of our system, I do the ETL entirely within stored procedures in SQL Server. Many data warehouses will require the use of Integration Services (SSIS). I would recommend sticking with stored procedures when possible, for simplicity’s sake; however, there will be scenarios where SSIS is required. I have fact tables in my environment that are truncated and reloaded each night, as well as fact tables that are incrementally loaded (only new and changed rows are loaded). Because of this, I am required to preserve the surrogate keys in my dimension tables. With that in mind, my load process is as follows (a sketch of such a load follows the list below):

  1. Load any newly created attributes (new products, customers, etc.)
  2. Update any aggregate info (customer lifetime order value, product last purchase date, etc.).
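
A minimal sketch of this two-step load for the customer dimension is below; the procedure name and the source tables and columns are made up for illustration:

CREATE PROCEDURE dbo.usp_LoadDimCustomer
AS
BEGIN
    SET NOCOUNT ON;

    -- 1. Insert any newly created customers
    INSERT INTO dbo.dimCustomer ( CustomerID , CustomerName , CustomerCity , CustomerStateAbbrv , CustomerCountry )
    SELECT  s.CustomerID , s.CustomerName , s.City , s.StateAbbrv , s.Country
    FROM    SourceDB.dbo.Customer s    -- hypothetical source system table
    WHERE   NOT EXISTS ( SELECT 1 FROM dbo.dimCustomer d WHERE d.CustomerID = s.CustomerID );

    -- 2. Update aggregate info, such as lifetime order value
    UPDATE  d
    SET     d.LifetimeOrderValue = o.TotalValue
    FROM    dbo.dimCustomer d
    INNER JOIN ( SELECT CustomerID , SUM(OrderTotal) AS TotalValue
                 FROM   SourceDB.dbo.OrderHeader    -- hypothetical source system table
                 GROUP BY CustomerID ) o ON o.CustomerID = d.CustomerID;
END;
GO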

I include logging on the load process, but I’ll show that in greater detail in the next post when I discuss fact tables.
So, to show examples for our scenario, I might have the following tables:

CREATE TABLE [dbo].[dimCustomer](
   [CustomerKey] [INT] IDENTITY(1,1) NOT NULL,
   [CustomerID] [DECIMAL](9, 0) NOT NULL,
   [CustomerName] [VARCHAR](100) NULL,
   [CustomerAddress] [VARCHAR](100) NULL,
   [CustomerCity] [VARCHAR](75) NULL,
   [CustomerStateAbbrv] [VARCHAR](3) NULL,
   [CustomerCountry] [VARCHAR](30) NULL,
   [CustomerZIP] [VARCHAR](20) NULL,
   [CustomerEmail] [VARCHAR](255) NULL,
   [LifetimeOrderValue] [DECIMAL](18, 2) NULL,
   [FirstOrderChannel] [VARCHAR](25) NULL,
   [FirstOrderHadFreeShipping] [BIT] NULL,
 CONSTRAINT [PK_dimCustomer] PRIMARY KEY CLUSTERED 
(
   [CustomerKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]


CREATE TABLE [dbo].[dimProduct](
   [ProductKey] [int] IDENTITY(1,1) NOT NULL,
   [Style] [varchar](20) NULL,
   [StyleDescription] [varchar](40) NULL,
   [SkuCode] [varchar](14) NULL,
   [SkuDescription] [varchar](40) NULL,
   [ColorCode] [varchar](4) NULL,
   [ColorName] [varchar](40) NULL,
   [LongSkuDivision] [varchar](30) NULL,
   [LongSkuDepartment] [varchar](30) NULL,
   [CountryOfOrigin] [varchar](30) NULL,
 CONSTRAINT [PK_dimProduct] PRIMARY KEY CLUSTERED 
(
   [ProductKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[dimOrderChannel](
   [OrdChanKey] [INT] IDENTITY(1,1) NOT NULL,
   [OrderChannel] [VARCHAR](50) NULL,
 CONSTRAINT [PK_dimOrderChannel] PRIMARY KEY CLUSTERED 
(
   [OrdChanKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

The date table requires a bit more explanation. This is the only dimension table where using a logical primary key, rather than a simple identity integer, makes sense. I prefer creating the primary key as an integer in yyyymmdd format.
Virtually every model will need a date table, because almost every process being reported on occurred at or during a specific time. The date table can be built in advance, can be used across multiple models, and can be used in a single model multiple times. For example, Amazon may want to report on orders by both the purchase date/time and the ship date/time; both roles can use a single date dimension table. It is helpful to put plenty of descriptors into your date table. It is not uncommon to see an array of info including day of week, day of year, fiscal day of year, holiday indicator, etc. Below is the code I used to create my date dimension table.

CREATE TABLE [dbo].[dimDate](
   [DateKey] [INT] NOT NULL,
   [Date] [DATETIME2](0) NULL,
   [Year] [SMALLINT] NULL,
   [Quarter] [TINYINT] NULL,
   [Month] [TINYINT] NULL,
   [MonthName] [VARCHAR](20) NULL,
   [WeekOfMonth] [TINYINT] NULL,
   [WeekOfYear] [TINYINT] NULL,
   [DayOfWeek] [TINYINT] NULL,
   [DayOfWeekName] [VARCHAR](14) NULL,
   [DayOfMonth] [TINYINT] NULL,
   [DayOfYear] [SMALLINT] NULL,
 CONSTRAINT [PK_dimDate] PRIMARY KEY CLUSTERED 
(
   [DateKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

SELECT TOP 10000
        IDENTITY( INT,1,1 ) AS N
INTO    #tally
FROM    MASTER.dbo.syscolumns sc1
        CROSS JOIN MASTER.dbo.syscolumns sc2;

WITH    cte
          AS ( SELECT   DATEADD(DAY, N - 1, '2000-01-01') AS Date
               FROM     #tally
             )
    INSERT  INTO dbo.dimDate
    SELECT  YEAR(cte.Date) * 10000 + MONTH(cte.Date) * 100 + DAY(cte.Date) AS DateKey ,
            cte.Date ,
            YEAR(cte.Date) AS [Year] ,
            DATEPART(QUARTER, cte.Date) AS [Quarter] ,
            MONTH(cte.Date) AS [Month] ,
            DATENAME(MONTH, cte.Date) AS [MonthName] ,
            DATEPART(ww, cte.Date) + 1 - DATEPART(ww, CAST(DATEPART(mm, cte.Date) AS VARCHAR) + '/1/' + CAST(DATEPART(yy, cte.Date) AS VARCHAR)) AS WeekOfMonth ,
            DATEPART(wk, cte.Date) AS WeekOfYear ,
            DATEPART(dw, cte.Date) AS [DayOfWeek] ,
            DATENAME(dw, cte.Date) AS DayOfWeekName ,
            DAY(cte.Date) AS DayOfMonth ,
            DATEPART(DAYOFYEAR, cte.Date) AS DayOfYear
    FROM    cte
    UNION
    SELECT  -1 ,    -- the "Unknown" row, so fact rows with no date can avoid NULL keys
            NULL ,
            NULL ,
            NULL ,
            NULL ,
            'Unknown' ,
            NULL ,
            NULL ,
            NULL ,
            'Unknown' ,
            NULL ,
            NULL;

DROP TABLE #tally;

So there you have a high-level overview of creating the dimension tables in a data warehouse. Stay tuned for part three, where I’ll look at the fact tables.