Site icon Paul Turley's SQL Server BI Blog

SQL, M or DAX?

[ Related posts in this series: SQL, M or DAX: When Does it Matter? SQL, M or Dax? – part 2 SQL or M? – SSAS Partitions Using SQL Server Table-Valued Functions (UDFs) SQL or M? – SSAS Partitions in Power Query/M ]

We live in a world of choices and we have many tools at our disposal.  In Microsoft Business Intelligence solutions using tools like Power BI and SQL Server Analysis Services, you have at least three different ways to perform data collection, transformations and calculations.  A question I get all the time is: “Which database or BI tool should be used to perform routine tasks?  Is it best to shape and transform data at the source, in Power Query using M script, or in the data model using DAX?”

In this series, I’ll demonstrate options for creating utility and dimension tables, columns and calculations using each option and discuss the advantages, disadvantages and recommended practice for each.

I welcome your questions and ideas on these topics.  Please post comments to this post with your questions and challenges.  Let’s get started with one of the most common examples…

Creating a Date Dimension Table

A Date dimension table is an essential component in most any data warehouse or reporting database so techniques to generate these tables have been around for a long time.  The foundation of a Date dimension table is a table containing one row per contiguous date in a range that includes every possible transaction date or fact record.  To make reporting easier, it is common practice to have multiple date dimensions in the semantic model.  For example, if sales transaction facts have an Order Date and a Delivery Date, and both are used independently for reporting; there may be an Order Date dimension and a Delivery Date dimension in the model.

A common practice for building the dimension table is to just populate a single Date type column with the sequential date values.  After these rows are inserted, date part functions may be used to populate additional columns by referencing the Date value in an expression.  Most every language includes, for example, a MONTH() and YEAR() function to convert a date value into these date parts.

SQL

If you have a data warehouse or a relational database specifically suited to support your Power BI and reporting models, use that to define all of your tables using conventional techniques like T-SQL.  Examples for generating a date reference or dimension table are easy to find online, primarily because this is the oldest and most enduring technique, used for many years in conventional data warehouse design.  T-SQL is a flexible language but the SQL date part functions are pretty bare bones.  In the end, it really comes down to preference and language familiarity.

I think there is a good argument to be made for not only defining a date dimension using familiar SQL script but for persisting the table in the data warehouse along with other standardized dimension tables.  This approach is optimal when you are working with SQL Server or another relational database as your primary data source.

Reporting, BI and dashboard projects don’t always reply on a data warehouse.  Self-service BI solutions usually start with ad-hoc data mashups to support analytic reports rather than a holistic IT-driven solution.  If you aren’t using a relational database as the primary data source, you may be better off using a tool managed within Power BI or SSAS.

Example

There are several different techniques that include using a cursor or a WHILE loop to iterate through each date in a range, one row at a time.  One of the best techniques I’ve found is this example from Aaron Bertrand.  Adding special columns to keep track of holidays or special calendar periods (like Fiscal, 4-5-4, ISO, etc.) can require a lot of complex code.

— Date dimension script by Aaron Bertrand:

https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server


CREATE TABLE #dim
(
[date]       DATE PRIMARY KEY,
[day]        AS DATEPART(DAY,      [date]),
[month]      AS DATEPART(MONTH,    [date]),
FirstOfMonth AS CONVERT(DATE, DATEADD(MONTH, DATEDIFF(MONTH, 0, [date]), 0)),
[MonthName]  AS DATENAME(MONTH,    [date]),
[week]       AS DATEPART(WEEK,     [date]),
[ISOweek]    AS DATEPART(ISO_WEEK, [date]),
[DayOfWeek]  AS DATEPART(WEEKDAY,  [date]),
[quarter]    AS DATEPART(QUARTER,  [date]),
[year]     AS DATEPART(YEAR,     [date]),
FirstOfYear  AS CONVERT(DATE, DATEADD(YEAR,  DATEDIFF(YEAR,  0, [date]), 0)),
Style112     AS CONVERT(CHAR(8),   [date], 112),
Style101     AS CONVERT(CHAR(10),  [date], 101)
);

— use the catalog views to generate as many rows as we need

INSERT #dim([date])

SELECT d
FROM
(
SELECT d = DATEADD(DAY, rn – 1, @StartDate)
FROM
(
SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate))
rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
— on my system this would support > 5 million days
ORDER BY s1.[object_id]
) AS x
) AS y;

Power Query/M

In my opinion, Power Query is the best choice when other query transformations are also managed in Power Query.  For simplicity, you can keep all of your query and transformation logic in one place.  If you are just getting started with Power BI and aren’t inclined to use a different technique, use this one.

Example

Nearly every step in this process can be performed using menu selections and simple features in the Power Query user interface.  It just takes a little creativity to get started.  I’ve done this using a few different approaching until arriving at this one.  It’s easiest and most flexible.

  • Start by creating two parameters named “Dates From” and “Dates To”.  Assign them values to define the range of dates you need in the date dimension table; like January 1, 2010 and December 31, 2018.
  • Use the Get Data menu to create a Blank Query
  • The first two steps need to be entered manually.  Open the Advanced Editor and pastes these first two lines on a new line after the “let” command:

     DateCount = Duration.Days(Duration.From( #”Dates To” – #”Dates From” )),
Source = List.Dates(#”Dates From”, DateCount, #duration(1,0,0,0))

  • Switch back to the Transform ribbon tab and then click Convert > To Table
  • Change the name of the new date columna dn change the data type to Date
  • At this point, you can simply use the menus on the Add Columns ribbon to generate all of the date part columns you need in the date dimension table

The resulting M query can be viewed in the Advanced Editor:

let
DateCount = Duration.Days(Duration.From( #”Dates To” – #”Dates From” )),
Source = List.Dates(#”Dates From”, DateCount, #duration(1,0,0,0)),
TableFromList = Table.FromList(Source, Splitter.SplitByNothing()),
#”Renamed Columns” = Table.RenameColumns(TableFromList,{{“Column1”, “Date”}}),
#”Changed Type” = Table.TransformColumnTypes(#”Renamed Columns”,{{“Date”, type date}}),
#”Inserted Year” = Table.AddColumn(#”Changed Type”, “Year”, each Date.Year([Date]), Int64.Type),
#”Inserted Month” = Table.AddColumn(#”Inserted Year”, “Month”, each Date.Month([Date]), Int64.Type),
#”Inserted Month Name” = Table.AddColumn(#”Inserted Month”, “Month Name”, each Date.MonthName([Date]), type text),
#”Inserted Quarter” = Table.AddColumn(#”Inserted Month Name”, “Quarter”, each Date.QuarterOfYear([Date]), Int64.Type),
#”Inserted Week of Year” = Table.AddColumn(#”Inserted Quarter”, “Week of Year”, each Date.WeekOfYear([Date]), Int64.Type),
#”Inserted Week of Month” = Table.AddColumn(#”Inserted Week of Year”, “Week of Month”, each Date.WeekOfMonth([Date]), Int64.Type),
#”Inserted Day” = Table.AddColumn(#”Inserted Week of Month”, “Day”, each Date.Day([Date]), Int64.Type),
#”Inserted Day of Week” = Table.AddColumn(#”Inserted Day”, “Day of Week”, each Date.DayOfWeek([Date]), Int64.Type),
#”Inserted Day of Year” = Table.AddColumn(#”Inserted Day of Week”, “Day of Year”, each Date.DayOfYear([Date]), Int64.Type),
#”Inserted Day Name” = Table.AddColumn(#”Inserted Day of Year”, “Day Name”, each Date.DayOfWeekName([Date]), type text),
#”Renamed Columns1″ = Table.RenameColumns(#”Inserted Day Name”,{{“Month”, “Month Number”}, {“Quarter”, “Quarter of Year Number”}}),
#”Added Custom” = Table.AddColumn(#”Renamed Columns1″, “Quarter Name”, each “Q” & Number.ToText([Quarter of Year Number])),
#”Renamed Columns2″ = Table.RenameColumns(#”Added Custom”,{{“Day”, “Day of Month”}}),
#”Changed Type1″ = Table.TransformColumnTypes(#”Renamed Columns2″,{{“Quarter Name”, type text}}),
#”Reordered Columns” = Table.ReorderColumns(#”Changed Type1″,{“Date”, “Year”, “Month Number”, “Month Name”, “Quarter of Year Number”, “Quarter Name”, “Week of Year”, “Week of Month”, “Day of Month”, “Day of Week”, “Day of Year”, “Day Name”})

in
#”Reordered Columns”

Beyond ordinary Gregorian calendar date parts, specialized columns like Fiscal periods, holiday flags and components of a 4-5-4 calendar are a little easier to do in M because

the language includes advanced functions to support complex formulas.  I’ll share some of these advanced techniques in a later post.

DAX

Calculated tables were recently added to the tabular model designer in both Power BI Desktop and the tabular model project editor in SQL Server Data Tools (SSDT) for Visual Studio.  This feature uses a handful of new table-based DAX functions, which include CALENDAR and CALENDARAUTO, for easily defining date dimension tables directly in the model.

Getting started is simple:

       My Calendar = CALENDAR(date(2018,1,1), date(2018, 12, 31))

This post on the Power BI Tips site demonstrates a few variations of DAX-generated calendar tables: https://powerbi.tips/2017/11/creating-a-dax-calendar/

Each column in the table will be a separate expression or calculated column.  With regard to performance or model optimization, there is no additional overhead or good argument not to use DAX to generate a date dimension table.  However, using Power Query & M to transform data and create some tables and then DAX to generate other tables in the model can be more messy than keeping everything in one place.

So, why are there two different ways to create tables in Power BI?

This is an excellent question and it is really just an artifact of the evolution of the product and its constituent technologies.  The modeling tools behind Power BI (DAX, VertiPaq & the SSAS Tabular model) were created first and became the Power Pivot add-in for Excel.  As the DAX language evolved, that development team gave us the ability to generate tables using DAX Script.  Not long after that, a separate product team created Power Query and the M data mashup language.  Power Query was also made available as an add-in for Excel.  Eventually both tools found their way into the Power BI Desktop product.

Final Recommendation

If you have a SQL Server data warehouse, you can use SQL to create date dimension tables.  It’s usually best to unify all of the reporting data in the data warehouse or data mart to create a single version of the truth for reporting.  You can also use tools like SSIS or Azure Data Factory to build and manage these objects before the data is imported into Power BI or the Analysis Services data model.

If using Power BI only, use the Get Data tools to build all the tables, including the date dimension(s).  There is nothing wrong with using the DAX techniques but that is my second choice in the Power BI toolbox, for this particular need.

Exit mobile version