Being an avid fisherman, I love good fishing art. I found the picture above using a Google search. The pictures online are good, but aren’t comparable to seeing them in person. I recently saw a picture of a muskie called Windfall Muskie, painted by Terry Doughty. This is the best painting of a muskie I’ve seen so far, a beautiful piece of art. I love seeing how an artist can capture the thrill of fishing on a canvas.
Previously, I described how to begin setting up a data warehouse by choosing a business process, selecting the grain, and building the dimension tables. This post will focus on the last step, building the fact tables. The fact tables contain a foreign key link to each associated dimension table as well as the measures that we will be using in the reports. The measures are usually numeric values that can be aggregated in the report. For example, for retail orders I may want to measure product price, discount, tax paid, and other metrics related to price and cost paid in an order.
Fact tables will usually be very long and narrow (many rows, few columns), as opposed to dimension tables, which are generally short and wide. A fact table holds one row for each transaction in your business process, based on your chosen grain. For retail orders, each row in the fact table might be an order line. The table would have an Order ID, Order Line Number, Line Price, Quantity Purchased, and then foreign keys to all the descriptive dimension tables, such as Customer, Product, Purchase Date, Purchase Location, etc.
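As a rough sketch, an order-line fact table along those lines could be declared like this in T-SQL. The data types, the measure column names (QuantityPurchased, LinePrice), and the constraint names are assumptions for illustration, and the script assumes the four dimension tables already exist with the referenced key columns as their primary keys.

```sql
-- Sketch of an order-line fact table: one row per order line (the grain).
-- Column types, measure names, and constraint names are assumed, not prescribed.
CREATE TABLE dbo.factOrderPlaced (
    OrderID            INT            NOT NULL,  -- degenerate dimension: the order number
    OrderLineNumber    INT            NOT NULL,  -- line within the order
    CustomerKey        INT            NOT NULL,  -- foreign key to dimCustomer
    ProductKey         INT            NOT NULL,  -- foreign key to dimProduct
    OrderDateKey       INT            NOT NULL,  -- foreign key to dimDate
    OrdChanKey         INT            NOT NULL,  -- foreign key to dimOrderChannel
    QuantityPurchased  INT            NOT NULL,  -- measure (assumed name)
    LinePrice          DECIMAL(12, 2) NOT NULL,  -- measure (assumed name)
    CONSTRAINT PK_factOrderPlaced
        PRIMARY KEY (OrderID, OrderLineNumber),
    CONSTRAINT FK_factOrderPlaced_dimCustomer
        FOREIGN KEY (CustomerKey) REFERENCES dbo.dimCustomer (CustomerKey),
    CONSTRAINT FK_factOrderPlaced_dimProduct
        FOREIGN KEY (ProductKey) REFERENCES dbo.dimProduct (ProductKey),
    CONSTRAINT FK_factOrderPlaced_dimDate
        FOREIGN KEY (OrderDateKey) REFERENCES dbo.dimDate (DateKey),
    CONSTRAINT FK_factOrderPlaced_dimOrderChannel
        FOREIGN KEY (OrdChanKey) REFERENCES dbo.dimOrderChannel (OrdChanKey)
);
```

The table stays narrow because every descriptive attribute lives in a dimension table. The fact row carries only keys and numeric measures.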
Once the fact table is built and loaded, you can easily find totals at a level aggregated by any of the descriptors. For example, you can get the quantity purchased by store, the total order value by customer, or the most popular product by date. Here is an example of how you can join the fact to dimension tables:
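-- Measure column names (QuantityPurchased, LinePrice) are assumed for illustration
SELECT dc.CustomerState,
       SUM(fop.QuantityPurchased) AS TotalQuantity,
       SUM(fop.LinePrice) AS TotalOrderValue
FROM dbo.factOrderPlaced fop
INNER JOIN dbo.dimCustomer dc ON dc.CustomerKey = fop.CustomerKey
INNER JOIN dbo.dimProduct dp ON dp.ProductKey = fop.ProductKey
INNER JOIN dbo.dimDate dd ON dd.DateKey = fop.OrderDateKey
INNER JOIN dbo.dimOrderChannel oc ON oc.OrdChanKey = fop.OrdChanKey
GROUP BY dc.CustomerState;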
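A query only needs to join the dimensions it actually groups or filters by. As a hedged sketch of the "most popular product by date" idea, the query below totals quantity per product per day and lists the best sellers first within each date. The columns dp.ProductName and fop.QuantityPurchased are assumed names, not taken from a real schema.

```sql
-- Quantity sold per product per order date, best sellers first within each date.
-- ProductName and QuantityPurchased are assumed column names.
SELECT dd.DateKey,
       dp.ProductName,
       SUM(fop.QuantityPurchased) AS TotalQuantity
FROM dbo.factOrderPlaced fop
INNER JOIN dbo.dimProduct dp ON dp.ProductKey = fop.ProductKey
INNER JOIN dbo.dimDate dd ON dd.DateKey = fop.OrderDateKey
GROUP BY dd.DateKey, dp.ProductName
ORDER BY dd.DateKey, TotalQuantity DESC;
```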
I hope this three-part series has given you a basic idea of how to go about starting a data warehouse.