Data Warehouse Creation

Spring fishing for walleye on the Wolf River can be really hot.  When the walleye are running up-river to spawn in the marshes, they can be extremely thick.  Catching them can be fairly easy.  The one bad thing about this is that almost every angler knows it.  As you can see in the picture above, boats can get stacked right on top of each other.  I was hoping to head up one day to try to get a limit of eaters, but I haven’t been in the mood to fight the crowds lately.

I’ve recently implemented a data warehouse at work.  A data warehouse is a central store for information collected from a wide range of sources across an organization.  It must make it easy for users to access the information they need in a timely manner, it must be consistent, it must be adaptable to change, and most importantly it must be trustworthy.  This is the first time I’ve ever set up a data warehouse, so I’m going to spend the next couple of posts explaining the steps I followed in setting it up.

I started by studying the Ralph Kimball method for dimensional modeling, using The Data Warehouse Toolkit, 3rd Edition.  I feel it’s very important to spend time researching and planning up front, because a poor design can be very difficult and costly to fix later.

The Kimball method proposes a four step dimensional design process:

  1. Select the business process
  2. Declare the grain
  3. Identify the dimensions
  4. Identify the facts

We chose retail orders as the business process we wanted to report on.  It’s a good idea to choose a fairly simple process to start with.

I’m going to save the dimension and fact creation for later blog posts, but I will discuss the grain here.  The grain is the detail level of the fact table.  The Kimball method suggests starting at the atomic grain, the lowest level at which data is captured by a given business process.  For my purpose, since I began with retail orders, the lowest level is the order line.  Other grains I could have considered would have been the order level, or even the daily, weekly, or yearly order level.  Every time you go up a level you lose detail about the order.  For example, at the order line level I can see a separate product for each line, but at the order level I can no longer see the individual products within the order.  If I go up another level and look at all orders taken on a day, I lose the individual customers who placed the orders.
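
To make the grain concrete, here is a minimal sketch of an order-line-grain fact table.  The table and column names are illustrative only, not my actual warehouse schema:

--Atomic grain: one row per order line
CREATE TABLE dbo.FactOrderLine
    (
      OrderDateKey INT          --surrogate key to the date dimension
    , CustomerKey INT           --surrogate key to the customer dimension
    , ProductKey INT            --surrogate key to the product dimension
    , OrderNumber VARCHAR(20)
    , OrderLineNumber INT
    , Quantity INT
    , SalesAmount DECIMAL(10 , 2)
    );

--Rolling up to a daily grain loses the product, customer, and order detail
SELECT  OrderDateKey
      , SUM(SalesAmount) AS SalesAmount
FROM    dbo.FactOrderLine
GROUP BY OrderDateKey;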

The only advantage of a higher-level grain is that the data has already been aggregated, so there is less of it and processing runs faster.  To offset that performance disadvantage at the lower grains, Analysis Cubes can be used.  These cubes pre-aggregate various cross sections of the data so analysis can be performed quickly at the aggregate level while the underlying detail is preserved.

Stay tuned for my next post where I will define and describe dimension table creation.

Credentials in Powershell

I had the opportunity to attend the Madison Fishing Expo a few weekends ago.  It was a great way to stay excited for the upcoming fishing year during these cold winter months.  I didn’t get any new information, but I did let my son pick out a few cheap lures to add to his tackle box.

Choosing fishing lures

The warm weather has completely melted the ice off our area lakes (nice and early!), but we, along with almost the entire rest of the country, got a round of winter weather this week, so we’re back to almost a foot of snow on the ground.  It’ll be at least a few more weeks before I launch the boat for the first time this year.

The company I work for has been in the process of strengthening its security posture for the last few years.  Recently, they took the step of creating separate administrator accounts to use when we are doing things that require administrative permissions.  Up until now, I only had one account – an administrator-level account.  I expected at least a few things to break once they turned off my higher privileges, and those expectations were met.  The thing I’m going to touch on today is Powershell privileges.

I use a Powershell script that runs daily to collect various health statistics about my SQL databases and servers.  The script is run from Windows Task Scheduler on my laptop under my Windows AD account.  Once that account lost its admin privileges, a few of the collection methods failed.  In order to get them to work, I needed to plug in my admin account for those specific methods.  I found a neat way to do that using Powershell’s Credential object.

First, I stored the admin account’s password, encrypted, in a text file on disk by using the following Powershell command:

20170316 Powershell Credential Create Encrypted PW
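
The command was along these lines – a sketch, assuming the same file path used in the script below:

#Prompt for the admin password and write it to disk encrypted with DPAPI,
#so it can only be decrypted by this same user on this same machine
read-host "Enter the admin password" -assecurestring | convertfrom-securestring | out-file "C:\Temp\Password.txt"

Because the encryption is tied to my Windows account, the file only decrypts for that same account on that same machine, which also means the scheduled task has to run under the account that created the file.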

Opening the new file contains the following:

20170316 Powershell Credential Encrypted PW

So you can see that the password is definitely encrypted.

Now I can reference that file whenever I need to enter credentials.

#Create a credential for connecting to the server
$user = "Domain\adminuser"
$pw = cat "C:\Temp\Password.txt" | convertto-securestring
$cred = new-object -typename System.Management.Automation.PSCredential -argumentlist $user, $pw

#Access the Disk Info using my new credential
$disks = Get-WmiObject -ComputerName $instance -Class Win32_LogicalDisk -Filter "DriveType = 3" -Credential $cred;

Using this method you can pass credentials to your Powershell script without having to store them in plain text on your computer. The only downside in my case is I will have to update my encrypted password file whenever my admin account has a password change.

Cross Database Certificates – Trouble with Triggers

The weather has been awesome here for the last few days.  Sixty-plus degree temperatures have made it feel more like May than February.  It isn’t supposed to last much longer, but I have enjoyed it.  I took the boat in for an engine tune-up this weekend, which means I should get it back just in time for most of the ice to be coming off the lakes.  I’m hoping to take a couple more shots at the Wolf River walleye run this spring.  Last year didn’t provide good results.

I took my sons to a park on the edge of a lake this past weekend and happened to be watching while an unfortunate ice fisherman’s ATV fell through the ice.  I’m not sure how these ice fishermen know what ice is good versus what ice is bad, but you can see from the main picture above that not all of them know either.  Fortunately, only the front tires went through and another ATV came over and pulled him out.

I ran into an issue with cross database certificates recently.  I have blogged about how to set these certificates up here – they are a handy way to enable permissions across databases.  However, I ran into a problem where the permission chain failed due to a trigger on the original table that updated a separate table.  Here is the SQL  to replicate the issue:

CREATE LOGIN [GuggTest] WITH PASSWORD=N'abcd', DEFAULT_DATABASE=[master], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF

CREATE DATABASE A;
CREATE DATABASE B;

USE A;

CREATE TABLE dbo.SPtoUpdate
    (
      ID INT
    , ILoveFishing VARCHAR(255)
    );
INSERT INTO dbo.SPtoUpdate
        ( ID , ILoveFishing )
VALUES  ( 1,'Musky'),( 2,'Pike'),( 3,'Yellow Perch');
CREATE TABLE dbo.TriggerToInsert
    (
      ID INT
    , ILoveFishing VARCHAR(255)
    , ChangeDate DATETIME2
    );
GO

CREATE TRIGGER dbo.SPtoUpdateTrigger ON dbo.SPtoUpdate
    FOR UPDATE
AS
    DECLARE @datetime DATETIME2;
    SELECT  @datetime = GETDATE()

    INSERT  INTO dbo.TriggerToInsert
            ( ID , ILoveFishing , ChangeDate )
    VALUES  ( 1 , 'Yes' , @datetime );
GO

CREATE CERTIFICATE BExecutor
   ENCRYPTION BY PASSWORD = 'Obfuscated'
   WITH SUBJECT = 'Execute sp from B to A',
   START_DATE = '20140101', EXPIRY_DATE = '20300101'
GO

BACKUP CERTIFICATE BExecutor TO FILE = 'C:\temp\crossdbcert.cer'
WITH PRIVATE KEY (FILE = 'C:\temp\crossdbcert.pvk' ,
                  ENCRYPTION BY PASSWORD = 'Obfuscated',
                  DECRYPTION BY PASSWORD = 'Obfuscated')
GO

CREATE USER BExecutor FROM CERTIFICATE BExecutor

GRANT UPDATE ON dbo.SPtoUpdate TO BExecutor
GRANT SELECT ON dbo.SPtoUpdate TO BExecutor
--Also give insert on dbo.TriggerToInsert
GRANT INSERT ON dbo.TriggerToInsert TO BExecutor

USE B
GO

CREATE USER [GuggTest] FOR LOGIN [GuggTest];
EXEC sp_addrolemember N'db_owner', N'GuggTest'
GO

CREATE PROCEDURE dbo.UpdateTableInA
AS
    BEGIN
        UPDATE  A.dbo.SPtoUpdate
        SET     ILoveFishing = 'Walleye'
        WHERE   ID = 2;
    END

GO


CREATE CERTIFICATE BExecutor FROM FILE = 'C:\temp\crossdbcert.cer'
WITH PRIVATE KEY (FILE = 'C:\temp\crossdbcert.pvk' ,
                  ENCRYPTION BY PASSWORD = 'Obfuscated',
                  DECRYPTION BY PASSWORD = 'Obfuscated')
GO

EXEC MASTER..xp_cmdshell 'DEL C:\temp\crossdbcert.*', 'no_output'
GO

ADD SIGNATURE TO dbo.UpdateTableInA BY CERTIFICATE BExecutor
    WITH PASSWORD = 'Obfuscated'
GO

--Log In or Change execution context to GuggTest, then EXEC dbo.UpdateTableInA
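
If you execute the procedure as GuggTest at this point, the UPDATE itself is authorized, but the trigger’s insert into dbo.TriggerToInsert fails with a permission error and rolls the statement back, because the signature on the procedure doesn’t extend to the trigger.  A quick way to reproduce it from a sysadmin session (a sketch – the exact error text may vary by version):

--Impersonate the GuggTest login and call the signed procedure
EXECUTE AS LOGIN = 'GuggTest';
EXEC B.dbo.UpdateTableInA;   --fails in the trigger until the counter signature below is added
REVERT;
GO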

It turns out you can counter sign a trigger with the certificate, and this will allow the permission chain to succeed. By doing this, you don’t even need to grant the certificate user permission to the second table. Here is the syntax to do that:

ADD COUNTER SIGNATURE TO dbo.SPtoUpdateTrigger
BY CERTIFICATE BExecutor
WITH PASSWORD = 'Obfuscated';

Use this technique to work with cross database permissions that have to access tables with triggers.

Recursive Common Table Expressions

Wind can be an ally or an enemy of the fisherman.  Both in terms of comfort and in changing the mood and location of the fish, wind is something that can’t be ignored.  As it relates to the fish, wind can often turn fish on.  The term “muskie chop” refers to medium-sized waves that can help create good conditions for fishing.  The wind does a couple things: it restricts the light by creating waves that break up the sun, and it also creates a current that can move fish to specific locations that can be targeted.  The other factor to consider related to wind is fisherman comfort.  I love fishing the colder months, but you’d better make sure you’re dressed for the weather.  There is no indoors in a fishing boat, so if it’s going to be windy and cold, bundle up.  At the same time, on those hot, sunny, humid July days, you may not want to even be out unless there is some wind to cool you down.  Keeping all these factors in mind, it’s important to remember that wind is strongest when it has a large open space to build up its force.  If you want to avoid the wind, head to the upwind side of the lake.  If you want to embrace the wind, head to the downwind side.

In SQL Server, a recursive common table expression (CTE) could be compared to wind building up power as it moves across the lake.  A recursive CTE calls itself, and in doing so uses the previous iteration’s results to build up the final result set.

I recently had a perfect use case for this concept.  I had to take dollars given to me on a monthly level and distribute it to each day within the month.  Using a recursive CTE, I told SQL Server to give me the monthly total divided by the days in the month for each day in the month.  Below is an example of how I set it up:

CREATE TABLE #SalesTotalsByMonth
    (
      FirstOfMonth DATE
    , Channel VARCHAR(10)
    , SalesTotal DECIMAL(10 , 2)
    );
INSERT  INTO #SalesTotalsByMonth
        ( FirstOfMonth , Channel , SalesTotal )
VALUES  ( '2016-01-01' , 'Web' , 165473.99 ),
        ( '2016-01-01' , 'In-store' , 56998.45 ),
        ( '2016-01-01' , 'Mail' , 4645.85 )
,       ( '2016-02-01' , 'Web' , 27463.56 ),
        ( '2016-02-01' , 'In-store' , 61423.78 ),
        ( '2016-02-01' , 'Mail' , 5341.56 )
,       ( '2016-03-01' , 'Web' , 487356.67 ),
        ( '2016-03-01' , 'In-store' , 15734.56 ),
        ( '2016-03-01' , 'Mail' , 3104.85 )
,       ( '2016-04-01' , 'Web' , 478236.78 ),
        ( '2016-04-01' , 'In-store' , 24675.67 ),
        ( '2016-04-01' , 'Mail' , 1024.56 )
,       ( '2016-05-01' , 'Web' , 167524.89 ),
        ( '2016-05-01' , 'In-store' , 31672.78 ),
        ( '2016-05-01' , 'Mail' , 1798.67 )
,       ( '2016-06-01' , 'Web' , 347652.19 ),
        ( '2016-06-01' , 'In-store' , 41675.19 ),
        ( '2016-06-01' , 'Mail' , 801.78 )
,       ( '2016-07-01' , 'Web' , 247653.02 ),
        ( '2016-07-01' , 'In-store' , 59713.02 ),
        ( '2016-07-01' , 'Mail' , 2097.19 )
,       ( '2016-08-01' , 'Web' , 891642.23 ),
        ( '2016-08-01' , 'In-store' , 67134.23 ),
        ( '2016-08-01' , 'Mail' , 3752.02 )
,       ( '2016-09-01' , 'Web' , 342591.24 ),
        ( '2016-09-01' , 'In-store' , 77123.24 ),
        ( '2016-09-01' , 'Mail' , 2406.23 )
,       ( '2016-10-01' , 'Web' , 246758.25 ),
        ( '2016-10-01' , 'In-store' , 81214.24 ),
        ( '2016-10-01' , 'Mail' , 3012.24 )
,       ( '2016-11-01' , 'Web' , 267423.26 ),
        ( '2016-11-01' , 'In-store' , 91023.26 ),
        ( '2016-11-01' , 'Mail' , 2034.24 )
,       ( '2016-12-01' , 'Web' , 265219.56 ),
        ( '2016-12-01' , 'In-store' , 34167.02 ),
        ( '2016-12-01' , 'Mail' , 1010.26 );

WITH    recurse
          AS ( SELECT   stbm.Channel
                      , stbm.SalesTotal / DATEDIFF(DAY , stbm.FirstOfMonth , DATEADD(MONTH , 1 , stbm.FirstOfMonth)) AS Revenue
                      , DATEDIFF(DAY , stbm.FirstOfMonth , DATEADD(MONTH , 1 , stbm.FirstOfMonth)) AS daysleft
                      , stbm.FirstOfMonth AS [Sales Day]
               FROM     #SalesTotalsByMonth stbm
               UNION ALL
               SELECT   recurse.Channel
                      , recurse.Revenue
                      , recurse.daysleft - 1
                      , DATEADD(DAY , 1 , recurse.[Sales Day])
               FROM     recurse
               WHERE    recurse.daysleft > 1
             )
    SELECT  recurse.[Sales Day]
          , recurse.Channel
          , SUM(recurse.Revenue) AS Revenue
    FROM    recurse
    GROUP BY recurse.Channel
          , recurse.[Sales Day];

DROP TABLE #SalesTotalsByMonth;

The important thing to note here is the general pattern for a recursive CTE – an anchor query, followed by a UNION ALL to a recursive member that references the CTE itself.  Be sure to put an upper limit in the WHERE clause of the recursive member to avoid infinite recursion.
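
One related detail, not shown in my query above: SQL Server also caps recursion at 100 levels by default, so a CTE that needs to recurse deeper (spreading a yearly total across 365 days, for example) has to raise the limit with the MAXRECURSION query hint.  A minimal sketch:

--Generate one row per day of the year; the default limit of 100 recursions would error out here
WITH    numbers
          AS ( SELECT   1 AS n
               UNION ALL
               SELECT   numbers.n + 1
               FROM     numbers
               WHERE    numbers.n < 365
             )
    SELECT  numbers.n
    FROM    numbers
OPTION  ( MAXRECURSION 365 );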

My final results gave me the total per day.

Moving the SQL Server Installation to a Different Drive

Following fishing regulations is very important.  We as a society are called to be responsible stewards of our natural resources, and that includes fish.  Overfishing, poaching, and spreading invasive species can all decimate a lake’s fish population, ruining it for everyone else.  I was disheartened to see a news article this week about a man caught with over 2,500 panfish in his freezer.  The legal limit is 50 per species, so he would have been allowed to possess 150 fish.  Hopefully the story of his guilt will dissuade other poachers, but given his rather light sentence, I doubt that will be the case.

I recently needed to install SQL Server Analysis Services (SSAS) on our test server to begin experimenting with it.  However, the C drive, where SQL Server was installed, had only a few hundred MB of space left.  When installing SSAS on an existing instance of SQL Server, you are forced to use the same drive, and I didn’t have enough space there.  I decided to move the existing installation from the C drive to the D drive, which had plenty of available space.

There isn’t any way to move the existing installation, so I was forced to uninstall SQL Server on the C drive, then install it on the D drive.  Here are the steps I followed:

  1. Take a backup of all the databases, just in case.  This is always a good first step when making any significant changes to your environment.
  2. Run the Uninstall Wizard through Windows Control Panel to remove all SQL Server components.
  3. Reinstall SQL Server on the D drive.  I found I had to use an actual ISO to do the install rather than the extracted contents of the ISO.  When I tried to use the extracted contents I kept running into errors about missing MSI files.
  4. Apply any service pack and patches to the installation so it is at least at the same version as the uninstalled instance.  If you skip this step you will not be able to restore/attach any of your existing databases to the new instance.
  5. At this point I expected to be able to move my existing master database file into the new default data folder, but I found my existing master database file had disappeared!  The uninstall must have deleted it.
  6. Instead, I started up SQL Server with the -m parameter in the SQL Server Configuration Manager’s SQL Server Advanced Properties.  This causes SQL Server to start up in single user mode, and only the master database comes online.
  7. Now restore the last backup of the master database:
    C:\> sqlcmd  
    1> RESTORE DATABASE master FROM DISK = 'Z:\SQLServerBackups\master.bak' WITH REPLACE;  
    2> GO
  8. When the restore is complete the service will stop.  Remove the -m parameter and start SQL back up.
  9. At this point everything came up as expected.  There were a few cleanup tasks to complete before I was finished:
  • Reconfigure Reporting Services.
    • During the install I had chosen to install but not configure so that I could plug into my existing SSRS databases.
  • Configure Powershell
    • The msdb.dbo.syssubsystems table contains information about Powershell that SQL Server uses when executing a PS script.  Mine was pointing to a subsystem dll and agent exe that were in the old installation location.  I updated this directly in the table with an UPDATE statement, similar to the sketch after this list.
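
Here is a sketch of the kind of update I mean.  The paths are illustrative only – check the current values in syssubsystems on your own instance before changing anything:

--Point the Powershell subsystem at the new installation path (illustrative paths)
UPDATE  msdb.dbo.syssubsystems
SET     subsystem_dll = REPLACE(subsystem_dll , 'C:\Program Files\Microsoft SQL Server' , 'D:\Program Files\Microsoft SQL Server')
      , agent_exe = REPLACE(agent_exe , 'C:\Program Files\Microsoft SQL Server' , 'D:\Program Files\Microsoft SQL Server')
WHERE   subsystem = N'PowerShell';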

Once complete, SQL was ready to use, I had SSAS installed, and I had opened up an additional 3 GB of hard drive space on the C drive, relieving the fear of filling the drive and crashing the OS.

Float to varchar – conversion confusion

I don’t own my own ice fishing gear.  Between the shanty, the auger, the tip-ups, the rods/reels, and all the other miscellaneous equipment, you’re looking at a $500 investment minimum.  If you want to do it comfortably, it’s probably closer to $1,000.  So I have been relying on friends and family to go out about once a winter.  Since I’m not familiar with the winter patterns, this is probably better anyway.

I recently asked my brother-in-law if he wanted to go out, along with our kids.  He said we would find a weekend when it would work, but looking at the forecast, I’m thinking that may not be for quite a while.

Weather Forecast.PNG

Rain is not good for ice.  Since it’s already mid-January, we may be looking at February before the ice hardens back up enough to trust.

I recently ran into an issue that caused me a few minutes of confusion.  I was given an Excel file that contained IDs.  I needed to update some values in a table that uses these IDs as its primary key.  I uploaded the data into SQL using the Data Upload wizard in SSMS.  I used all the defaults except for giving the table a unique name that included the date.  I use this type of nomenclature so I can periodically drop all the tables that have been created by ad-hoc uploads.

I did a quick SELECT from the newly created table to ensure everything looked correct, and it did.

20170117-conversion-confusion-initial-select

Next I joined to the table that needed to be updated.  I found that the field to join on, OrderNumber, had been created as a float in my newly uploaded table.  In the table to be updated, the column was a varchar(100).  I did a simple CAST to try to join them together.  I was surprised to see no results returned.  I tried again while trimming each of the joined columns, and again no results.

20170117-conversion-confusion-joined-result

This was not making any sense to me.  I next picked an order I knew was in both data sets and SELECTed the rows from each one separately to see whether they should match.  This showed the same order in both data sets.

20170117-conversion-confusion-both-tables-separately

From what I could see, both these columns should join together perfectly.  I tried formulating the query differently, but this again returned no results.

20170117-conversion-confusion-different-query-setup

Lastly, I ran just the subquery separately.  This gave me the clue I needed to figure out what was happening.

20170117-conversion-confusion-subquery-only

The conversion from float to varchar was bringing over the scientific notation form of the number as characters.  That was clearly never going to match the order numbers in the other table.  To fix it, I used the STR function, and was able to make my update.

20170117-conversion-confusion-actual-update
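
Here is a minimal sketch of the behavior, using a made-up order number (the values and column sizes are illustrative):

DECLARE @OrderNumber FLOAT = 4078858;

--A plain CAST renders the float in scientific notation, which will never match the varchar key
SELECT  CAST(@OrderNumber AS VARCHAR(100)) AS CastResult;    --'4.07886e+006'

--STR renders the full number; LTRIM strips the padding STR adds when right-justifying
SELECT  LTRIM(STR(@OrderNumber , 25 , 0)) AS StrResult;      --'4078858'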

So next time you are converting a float to a varchar, remember to use the STR function.  If not, you may get unexpected results.

SQL Server moving system databases Part 2 – master

I decided not to participate in the Rhinelander Hodag Muskie Challenge this past year.  While I enjoyed the experience in 2015, I didn’t feel compelled to try again quite yet.  Having caught no fish, it felt like a bit of a waste of money.  Instead of fishing for free and enjoying the peace and serenity of the lakes, it cost $250 and we were jammed into small lakes with several other boats, trying to find some free water to fish.  I definitely see myself giving it another shot some time in the future, maybe even this year (2017).  The thrill of catching a big fish would only be magnified by the thrill of hoping our catch was greater than all the other boats’ catches.  It’s just something that needs to be enjoyed in moderation, so it doesn’t become a waste of money when fishing is slow.

In my previous post, I demonstrated moving the msdb and model system databases.  This time I’m going to show how to move master, which is more complicated.  The master database holds all the system-level information for the instance – all the logins, linked servers, endpoints, and other system-wide configuration settings.  The master database also holds information about all the other databases and the locations of their files.  Because it is the “main” database, moving its files is more difficult – and dangerous.  If you lose the master database, you are going to have problems with basic requirements such as logging in.  Microsoft’s documentation states “SQL Server cannot start if the master database is unavailable”.

The first step in moving the database is to open SQL Server Configuration Manager and go to the Properties for SQL Server (MSSQLSERVER).

20170109 Move master database.PNG

Under the Advanced tab, you will see the current file locations specified in the Startup Parameters field.

20170109-move-master-db-current-location

The -d argument is the location of the data file, and the -l argument is the location of the log file.  The -e argument is the location of the error log, in case you want to move that as well.
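
As an example of the format (with hypothetical paths – your instance’s directories will differ), the Startup Parameters field looks something like this:

-dD:\SQLData\master.mdf;-eC:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log\ERRORLOG;-lD:\SQLLogs\mastlog.ldf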

Update the file locations in the startup parameters to wherever you plan to move the files.

20170109-move-master-db-new-location

You will get a message that the changes were saved, but they won’t take effect until the service is restarted.

20170109-move-master-service-restart-required

Stop the service.

20170109-move-master-stop-service

Now move the files from their current location:

20170109-move-master-current-files-in-folder

To the new locations you specified in the startup parameters.

20170109-move-master-new-folder-location-for-data

20170109-move-master-new-folder-location-for-log

Lastly, start up the SQL Server Service, and you should be good to go!  You can verify the new file locations by running this query:

SELECT  name
      , physical_name AS CurrentLocation
      , state_desc
FROM    sys.master_files
WHERE   database_id = DB_ID('master');
GO

20170109 move master new location query.PNG