SQL or M? – SSAS Partitions

Part 1: Using SQL Server Table-Valued Functions (UDFs)

In SQL Server Analysis Services projects, as of SQL Server Data Tools 2017, you can define table partitions using Power Query.  Of course, we still have the option to use SQL Server database objects like views or user-defined functions.  So, which of these two options makes the most sense?  The same concepts and decision points apply to Power BI data models, although the design experience is quite a bit different.

The following steps will bring us to a question: Using the new SSDT partition design method for SSAS 2017, should I define partition filtering logic in SQL or in Power Query/M?

The objective is to define three partitions in the data model for the Sales fact table in the ContosoDW database:

  • New transactions added in the current week
  • Adjusting entries for the current month
  • Historic records prior to the current month

New sales transactions in the source database need to be refreshed in the data model every hour for reporting.  Reprocessing only the records since the beginning of the current week takes seconds to minutes.  If we schedule that partition to refresh every hour, users can have up-to-date reports throughout the day.  In addition to new transactions, adjusting entries are made weekly, but only to records in the current month before the end-of-month closing of the books.  Records in the current month that are older than the current week might be updated on occasion, but changes don't need to be available until the weekend.  Records older than a month rarely change and don't need to be refreshed more than once a month.  By scheduling only the first or second partition to process, data can be updated without reloading tens of millions of historical records.

Partitioning with a SQL User-Defined Function

I'll step through the more conventional method we've been using for many years.  I've written the following T-SQL table-valued user-defined function named fnSalesPartitionForPeriod.  Three possible input parameter values allow the function to return rows for the current week, for the current month prior to the current week, or for all dates before the current month.

Here is the T-SQL script for a table-valued user-defined function created in SQL Server.  Passing in one of three parameter values will cause it to return the desired records.

/******************************
     User-defined function used to partition the Sales fact table in an SSAS tabular model
     @Period values:
         BeforeThisMonth
         ThisMonthBeforeThisWeek
         ThisWeek
*******************************/
create function dbo.fnSalesPartitionForPeriod
     ( @Period varchar(100) )
returns table
return
     select * from [ContosoDW].[dbo].[FactSalesCompleteDates]
     where
     (@Period = 'BeforeThisMonth'
         and
         [DateKey] < dateadd(month, datediff(month, 0, getdate()), 0)
     )
     or
     (@Period = 'ThisWeek'
         and
         [DateKey] >= dateadd(week, datediff(week, 0, getdate()), 0)
     )
     or
     (@Period = 'ThisMonthBeforeThisWeek'
         and
         [DateKey] >= dateadd(month, datediff(month, 0, getdate()), 0)
         and
         [DateKey] < dateadd(week, datediff(week, 0, getdate()), 0)
     )
;
go

To create the three Sales table partitions using this UDF, I start by importing one table.  Here’s the Import Table dialog for the new Sales table in the data model.  I’ve selected the new UDF and entered the parameter value ‘BeforeThisMonth’ to define the first partition.

[Screenshot: Import Table dialog for the new Sales table, with the UDF selected and the parameter value entered]

This part gets tricky and, quite honestly, I rarely get the steps right the first time through.  I haven't quite decided yet whether my routine struggles with the SSDT Power Query editor are because I expect it to work like it does in Power BI Desktop, or whether it truly has some quirks that catch me off guard.  Regardless, I'm careful to save copies of my work, and if something doesn't work, I delete the query and repeat the steps.

The query editor was smart enough to create an M function from the UDF query and this function needs to be invoked to generate the new Sales table.  Enter the parameter value once again and click the Invoke button. 

[Screenshot: Invoke Function dialog with the parameter value entered]

Change the name of the new query to “Sales” and make sure that the query is set to “Create New Table”, then click the Import button on the toolbar.

[Screenshot: query editor with the renamed Sales query set to Create New Table]

After the table is imported, click the Partitions button on the SSDT toolbar.  As you can see, the Power Query “M” script for the Sales table calls the function and passes the parameter value I had set.  This default partition should be renamed and the other two partitions should be added using different parameter values.

[Screenshot: Partition Manager showing the default partition and its M script]
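For reference, the generated partition expression looks roughly like the following M query.  This is a sketch based on how the query editor exposes a SQL Server table-valued function; the server name and step names are placeholders and will vary in your project:

let
    // connect to the source database (server name is a placeholder)
    Source = Sql.Database("localhost", "ContosoDW"),
    // navigate to the table-valued function created earlier
    dbo_fnSalesPartitionForPeriod = Source{[Schema = "dbo", Item = "fnSalesPartitionForPeriod"]}[Data],
    // invoke the function with the parameter value for this partition;
    // the other two partitions pass "ThisMonthBeforeThisWeek" and "ThisWeek"
    InvokedFunction = dbo_fnSalesPartitionForPeriod("BeforeThisMonth")
in
    InvokedFunction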

Updating and adding the partitions is fairly simple, using these steps:

  1. Copy the original partition
  2. Rename the new partition
  3. Change the function parameter value

Rename the current partition with a friendly name.  Clicking the Copy button twice gives me two copies of the partition.  You can see that I've commented the code with the valid UDF parameter values.

[Screenshot: Partition Manager showing the three renamed partitions and the commented M script]

Now the table can be refreshed incrementally, and only new transaction records for the current week or month need to be reprocessed during scheduled refresh cycles.

Partitioning with Power Query

No matter what the data source is, and whether you use table-valued UDFs, views or in-line SQL, you are still using Power Query to define tables.  So why not just use Power Query without creating database objects?
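As a quick preview, here is a minimal sketch of what the current-week filter might look like expressed directly in M against the fact table, with no database objects involved.  It assumes the DateKey column holds date/datetime values and that the week starts on Monday (the day-of-week convention is an assumption); the server name is a placeholder:

let
    Source = Sql.Database("localhost", "ContosoDW"),
    Sales = Source{[Schema = "dbo", Item = "FactSalesCompleteDates"]}[Data],
    // start of the current week
    WeekStart = Date.StartOfWeek(Date.From(DateTime.LocalNow()), Day.Monday),
    // keep only rows dated on or after the start of the current week
    ThisWeek = Table.SelectRows(Sales, each Date.From([DateKey]) >= WeekStart)
in
    ThisWeek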

In another post, I’ll repeat the exercise using only Power Query to define the same partitions.  Stay tuned.

Hands-on Workshops at the Pacific Northwest Power BI Symposium

Please join Power BI authors and community leaders for an afternoon and evening of deep learning.  Featured presenters include “Guy In A Cube” Adam Saxton and international author and Excel MVP, Matt Allington.  Deepen your skills with Power BI and get a recap from the Microsoft Business Applications Summit held in Seattle earlier in the week.

Jul 26, 2018 from 3:00 PM to 8:30 PM Portland, OR

Associated with  Portland Power BI User Group


AGENDA

3:00 – Check-in and registration

3:45 – Guide guests to workshops

4:00 – Workshops commence (only register for one):

Workshop 1 – TRISTAN MALHERBE (AZEO) – Power Query to Create a Calendar Table

Register here: https://bit.ly/2K3U0lT

Workshop 2 – PAUL TURLEY (CSG Pro) – Model and Visualize Financial and Accounting Data with Power BI and Excel

Register here: https://bit.ly/2JOJy5v

5:00 – Workshops wrap up, seating in main event area (dinner served / networking)

5:35 – Presentation intro – RON ELLIS GAUT (CSG Pro / Portland Power BI User Group Leader)

6:00 – Presentation 1 – BRIAN GRANT (CSG Pro) – Shining a New Light on Calculate

6:45 – Presentation 2 – ADAM SAXTON (Microsoft / Guy in a Cube) – Business Applications Summit Recap

7:30 – Presentation 3 – MATT ALLINGTON (Excelerator BI) – DAX as a Query Language

8:15 – Closing remarks

8:30 PM – Event ends

Questions?  Contact Gregory Petrossian: gregp@csgpro.com for sponsorship options

SQL, M or DAX: When Does it Matter?

 

Column-based calculations are part of every BI project. Some of the most common examples include building a street address column from individual fields, concatenating a person's full name from First, Middle and Last Name fields, or creating a location string from City, State and Country fields. More complex examples might require a lookup or join operation to get a reference value used in a calculation that is then stored as a column on each entity record.  Keep in mind that we are strictly talking about calculated column values that are stored for each row, not dynamic calculations that run in the context of filters and slicers.  Those are measures, and that is a separate topic.
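For example, a location string built from City, State and Country fields is a one-line custom column in Power Query.  This is just an illustrative sketch; the table and column names are hypothetical:

let
    // hypothetical source table with City, State and Country columns
    Customer = #table(
        type table [City = text, State = text, Country = text],
        {{"Portland", "OR", "USA"}, {"Vancouver", "BC", "Canada"}}),
    // the calculated column: concatenate the parts into a single Location string
    AddedLocation = Table.AddColumn(Customer, "Location",
        each [City] & ", " & [State] & ", " & [Country], type text)
in
    AddedLocation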

It Depends, or It Matters (one or the other)

If the data source is a relational database that supports queries, should you perform these calculations in SQL, Power Query “M”, or in a DAX calculated column? The standard tongue-in-cheek answer from a consultant is usually “it depends”. That was the answer I grew up with, but apparently popular language has changed in the last generation. My kids, who are all now young adults, say “it matters”. Back in the day, if I said “Hey, Dad. Can we get ice cream on the way to the store?” He would say, “It depends on whether you get your chores done.” My kids would say “it matters whether I get back from the beach on time.” So, it either depends or it matters, I guess.

Self-service BI is all about having the freedom to create reports that make an impact and bring important value to business users and leaders.  When importing, shaping and modeling data, getting the simple and mundane tasks out of the way leaves time and energy to move on to more important things.  If you can just get the core table structures in place, with unique keys, calculated columns, and numeric columns for summaries and aggregate measures, you can design the more impactful parts of the solution to support the report design.

Many BI projects start out the same way: with aspirations to import data from several different sources, work out the complexities of cleansing and matching records in various tables, and create a nice uniform data model used to build all kinds of beautiful dashboards and interactive reports.  Our optimism about making quick progress at the beginning of the project is often squelched when we realize that the data source for a lookup table isn't reliable, and that the system of record is an application controlled by a different business group in some remote corner of the organization. The data is in a different format, access is restricted, and the person in charge of managing it is on extended leave. We get caught up in the complications of just getting essential data into the model and then deliver far less than expected. I can't tell you how common this scenario is – especially in larger projects.

For calculated columns that end up stored in a data model table, there is rarely a difference in performance, storage or report query speed based on the technique used to calculate the column value. In cases where there is a technical advantage, the decision should be clear – use the most optimal method that is feasible. In the majority of cases where there is no strong technical argument for one method or the other, use the method that simplifies development and maintenance, and offers more control.

You should have a standard method for managing calculated columns, so you know how to maintain them down the road. This might seem trivial so why does it matter so much?

The data model schema is the foundation for your reporting solution, and making changes after the rest of the solution is designed can be catastrophic if you don't plan and manage future changes. A semantic data model is a house of cards: deleting, renaming or changing the data type of a column could break every calculation and report visual referencing that column. Whether you should create these calculated values in a source query using hand-written SQL or a database object like a view or user-defined function, in Power Query, or as a calculated column using DAX will depend on who maintains the Power BI or SSAS model and who should manage the design in the future.

SQL and Database Objects

As a general rule of thumb, in formal SSAS projects built on a relational data mart or data warehouse that is managed by the same project team as the BI data model, I typically recommend that every table in the model import data from a corresponding view or UDF stored and managed in the relational database. Keep in mind that this is the way we've been designing Microsoft BI projects for several years. Performing simple tasks like renaming columns in the SSAS data model designer was slow and cumbersome, so performing that part of the data prep in T-SQL was much easier than in SSDT. With the recent advent of Power Query in SQL Server Data Tools, there is a good argument to be made for managing those transformations there, but the tool is still new and frankly I'm still testing the water. Again, keep changes in one place for future maintenance.

Do your absolute best to avoid writing complex SQL query logic that cannot be traced back to the sources. Complicated queries can become a black box – and a Pandora’s box if they aren’t documented, annotated and easy to decipher.

Power Query/M

For less formal projects in Power BI data models, Power Query is king. We’ve never had a tool so flexible and easy to use. If I’m importing data from multiple sources into a single model, you bet I’m going to use Power Query instead of SQL queries because I’ll know where to find and manage all the query definitions.

DAX Calculated Columns

Why not use DAX calculated columns? There is a good argument for using DAX. It's quick and easy, and sometimes more convenient. If I add a custom column to a multi-million row table defined in Power Query, I have to re-process the table to see the new column. If I use DAX, I don't have to wait. If the calculation relies on another DAX calculation residing in the model, DAX is the clear winner. These cases are less common though. Once again, I'll make the argument to manage calculations, as much as possible, in one place.

IT Process, Business Culture, Team Dynamics, Rules & Restrictions

Now that you have some clear criteria for always implementing column calculations in either SQL, Power Query or DAX, let me inject some reality back into the "it depends or it matters" equation.

I do my best to put cynicism aside and focus on what it takes to get IT projects over the finish line. I've found that good BI project practitioners are positive, optimistic and tough-skinned, although there are many forces at work to change this disposition. If you know what I mean, no further explanation is needed. If you don't, you will. Wherever you choose to work, just do what you can to maintain your perspective throughout your career.

I’ll give one example that represents situations I’ve encountered on several larger, formal BI projects over the years:

The BI project Architect, Database Administrator, Lead Developer and IT Director all agree that any schema dependencies on the data warehouse or data mart should be managed using database views. This is a paramount rule in the solution architecture. The SSAS data model and Power BI data model developers should import tables from these views. The team is using a pure Agile methodology and will use JIRA to manage and assign tasks performed in two-week team sprints.

Based on high-level report requirements documented by the Business Systems Analyst, the Lead Architect creates views in the database build script. The ETL developer must stage the source data and then the data warehouse ETL developer populates the dimension table before the view can be created, which takes 3 sprints or six weeks. The Power BI data model developer adds the table to the model in the 4th sprint. After a prototype report is created, the BSA gets feedback from a stakeholder user who tells us that customer names should be in a single column rather than separate first name and last name columns. A task is added in JIRA to modify the view with another task to refresh the data model, so it takes two weeks to add the column.

After the data warehouse is in production, the Power BI report developer gets word that the customer city, state and zip code need to be concatenated into a single column and marked as a geographical location, so they can be used in a map visual. The data warehouse is in production and managed by an offshore DBA group. A support ticket is created to request that the view used to populate the customer table be altered and a CustomerLocation column be added. Three days later, the contracted help desk determines this is not in their area of responsibility and closes the ticket as “completed” while the model developer continues to wait for a call or email. The email goes to the IT Director, who happily dismisses it since it was marked as “completed”. Two weeks later, the issue resurfaces and the Project Manager organizes a meeting with the IT Director, BI Lead Architect, BI Lead Developer, Database Developer, In-house DBA and Help Desk Contractor Liaison to resolve the issue. In the meantime, users have exported their report to Excel and are working around the issue using a copy of the data.

A month after the request, the Power BI Developer spends 2 minutes creating a DAX calculated column and then creates the map report.

SQL, M or DAX? – Part 2

This is a post about a post about a post.  Thanks to those of you who are entering comments in the original May 12 post titled SQL, M or DAX?  This is a popular topic. And thanks to Adam Saxton for mentioning this post in his Guy in A Cube Weekly Roundup.

This is a HUUUUGE topic and I can tell that I’ve struck a chord with many BI practitioners by bringing it up.  Please post your comments and share your ideas.  I’m particularly interested in hearing your challenging questions and your thoughts about the pros-and-cons of some less-obvious choices about whether to implement transformations & calculations in SQL, M or DAX.

This week, I have had engaging conversations on this topic while working on a Power BI consulting project for a large municipal court system.  As a consultant, I’ve had three weeks of experience with their data and business environment.  The internal staff have spent decades negotiating the intricacies and layers upon layers of business process so of course, I want to learn from their experience but I also want to cautiously pursue opportunities to think outside the box.  That’s why they hired me.

Tell me if this situation resonates with you…  Working with a SQL Server database developer who is really good with T-SQL but fairly new to Power BI & tabular modeling, we’re building a data model and reports sourced from a line-of-business application’s SQL Server database.  They’ve been writing reports using some pretty complicated SQL queries embedded in SSRS paginated reports.  Every time a user wants a new report, a request is sent to the IT group.  A developer picks up the request, writes some gnarly T-SQL query with pre-calculated columns and business rules.  Complex reports might take days or weeks of development time.  I needed to update a dimension table in the data model and needed a calculated column to differentiate case types.  Turns out that it wasn’t a simple addition and his response was “I’ll just send you the SQL for that…you can just paste it”.  The dilemma here is that all the complicated business rules had already been resolved using layers of T-SQL common table expressions (CTEs), nested subqueries and CASE statements.  It was very well-written SQL and it would take considerable effort to re-engineer the logic into a dimensional tabular model to support general-use reporting.  After beginning to nod-off while reading through the layers of SQL script, my initial reaction was to just paste the code and be done with it.  After all, someone had already solved this problem, right?

The trade-off of using the existing T-SQL code is that the calculations and business rules are applied at a fixed level of granularity and within a certain business context.  The query would need to be rewritten to answer different business questions.  If we take the "black box" approach and paste the working and tested SQL script into the Power Query table definition, chances are that we won't be able to explain the query logic in a few months, after we've moved on and forgotten this business problem.  If you are trying to create a general-purpose data model to answer yet-to-be-defined questions, then you need to use design patterns that allow developers and users to navigate the model at different levels of grain, across different dimension tables, and in different filtering contexts.

This isn't always the right answer, but in this case I am recommending that we do as little data merging, joining and manipulation as possible in the underlying source queries.  The table mappings between the source and the data model are not one-to-one, though.  In some cases, two or three source tables are combined using SQL joins into a flattened and simplified lookup table – containing only the necessary, friendly-named columns and keys, and no unnecessary clutter like CreatedDateTime, ModifiedDateTime and CreatedByUser columns.  Use custom columns in M/Power Query for row-level calculated values, and DAX measures to perform calculations in aggregate and within filter/slicing/grouping context.
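To make that concrete, here is a rough sketch of the kind of lightweight shaping described above: join a source table to a related lookup, expand only the columns the model needs, and drop the clutter.  The server, database, table and column names are all hypothetical placeholders:

let
    Source = Sql.Database("localhost", "CourtDB"),
    Cases = Source{[Schema = "dbo", Item = "Case"]}[Data],
    CaseTypes = Source{[Schema = "dbo", Item = "CaseType"]}[Data],
    // join the two source tables on the key column
    Merged = Table.NestedJoin(Cases, {"CaseTypeID"}, CaseTypes, {"CaseTypeID"}, "CaseType", JoinKind.LeftOuter),
    // expand only the columns the model needs, with friendly names
    Expanded = Table.ExpandTableColumn(Merged, "CaseType", {"CaseTypeName"}, {"Case Type"}),
    // remove audit columns and other clutter the model doesn't need
    Flattened = Table.RemoveColumns(Expanded, {"CreatedDateTime", "ModifiedDateTime", "CreatedByUser"}, MissingField.Ignore)
in
    Flattened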

I’d love to hear your thoughts and ideas on this topic.


SQL, M or DAX?

We live in a world of choices and we have many tools at our disposal.  In Microsoft Business Intelligence solutions using tools like Power BI and SQL Server Analysis Services, you have at least three different ways to perform data collection, transformations and calculations.  A question I get all the time is: “Which database or BI tool should be used to perform routine tasks?  Is it best to shape and transform data at the source, in Power Query using M script, or in the data model using DAX?”

In this series, I'll demonstrate how to create utility and dimension tables, columns and calculations using each of these options, and discuss the advantages, disadvantages and recommended practices for each.


I welcome your questions and ideas on these topics.  Please post comments to this post with your questions and challenges.  Let’s get started with one of the most common examples…

Creating a Date Dimension Table

A Date dimension table is an essential component in most any data warehouse or reporting database so techniques to generate these tables have been around for a long time.  The foundation of a Date dimension table is a table containing one row per contiguous date in a range that includes every possible transaction date or fact record.  To make reporting easier, it is common practice to have multiple date dimensions in the semantic model.  For example, if sales transaction facts have an Order Date and a Delivery Date, and both are used independently for reporting; there may be an Order Date dimension and a Delivery Date dimension in the model.
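One lightweight way to get those role-specific date tables (assuming a base Date query like the one built later in this post) is to reference the base query and rename its columns for each role.  A minimal sketch for an Order Date query:

let
    // reference the shared Date query (assumed to exist in the same file)
    Source = #"Date",
    OrderDate = Table.RenameColumns(Source, {{"Date", "Order Date"}, {"Year", "Order Year"}, {"Month Name", "Order Month"}})
in
    OrderDate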

A common practice for building the dimension table is to just populate a single Date type column with the sequential date values.  After these rows are inserted, date part functions may be used to populate additional columns by referencing the Date value in an expression.  Most every language includes, for example, a MONTH() and YEAR() function to convert a date value into these date parts.

SQL

If you have a data warehouse or a relational database specifically suited to support your Power BI and reporting models, use that to define all of your tables using conventional techniques like T-SQL.  Examples for generating a date reference or dimension table are easy to find online, primarily because this is the oldest and most enduring technique, used for many years in conventional data warehouse design.  T-SQL is a flexible language but the SQL date part functions are pretty bare bones.  In the end, it really comes down to preference and language familiarity.

I think there is a good argument to be made for not only defining a date dimension using familiar SQL script but for persisting the table in the data warehouse along with other standardized dimension tables.  This approach is optimal when you are working with SQL Server or another relational database as your primary data source.

Reporting, BI and dashboard projects don't always rely on a data warehouse.  Self-service BI solutions usually start with ad-hoc data mashups to support analytic reports rather than a holistic IT-driven solution.  If you aren't using a relational database as the primary data source, you may be better off using a tool managed within Power BI or SSAS.

Example

There are several different techniques, including using a cursor or a WHILE loop to iterate through each date in a range, one row at a time.  One of the best techniques I've found is this set-based example from Aaron Bertrand.  Adding special columns to keep track of holidays or special calendar periods (like Fiscal, 4-5-4, ISO, etc.) can require a lot of complex code.

-- Date dimension script by Aaron Bertrand:
-- https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server

-- define the date range to generate (values are placeholders; adjust as needed)
DECLARE @StartDate  date = '20100101';
DECLARE @CutoffDate date = '20301231';

CREATE TABLE #dim
(
  [date]       DATE PRIMARY KEY,
  [day]        AS DATEPART(DAY,      [date]),
  [month]      AS DATEPART(MONTH,    [date]),
  FirstOfMonth AS CONVERT(DATE, DATEADD(MONTH, DATEDIFF(MONTH, 0, [date]), 0)),
  [MonthName]  AS DATENAME(MONTH,    [date]),
  [week]       AS DATEPART(WEEK,     [date]),
  [ISOweek]    AS DATEPART(ISO_WEEK, [date]),
  [DayOfWeek]  AS DATEPART(WEEKDAY,  [date]),
  [quarter]    AS DATEPART(QUARTER,  [date]),
  [year]       AS DATEPART(YEAR,     [date]),
  FirstOfYear  AS CONVERT(DATE, DATEADD(YEAR,  DATEDIFF(YEAR,  0, [date]), 0)),
  Style112     AS CONVERT(CHAR(8),   [date], 112),
  Style101     AS CONVERT(CHAR(10),  [date], 101)
);

-- use the catalog views to generate as many rows as we need
INSERT #dim([date])
SELECT d
FROM
(
  SELECT d = DATEADD(DAY, rn - 1, @StartDate)
  FROM
  (
    SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate))
      rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    -- on my system this would support > 5 million days
    ORDER BY s1.[object_id]
  ) AS x
) AS y;

Power Query/M

In my opinion, Power Query is the best choice when other query transformations are also managed in Power Query.  For simplicity, you can keep all of your query and transformation logic in one place.  If you are just getting started with Power BI and aren’t inclined to use a different technique, use this one.

Example

Nearly every step in this process can be performed using menu selections and simple features in the Power Query user interface.  It just takes a little creativity to get started.  I've done this using a few different approaches before arriving at this one, which is the easiest and most flexible.

  • Start by creating two parameters named "Dates From" and "Dates To".  Assign them values to define the range of dates you need in the date dimension table; for example, January 1, 2010 and December 31, 2018.
  • Use the Get Data menu to create a Blank Query.
  • The first two steps need to be entered manually.  Open the Advanced Editor and paste these two lines on a new line after the "let" command:

     DateCount = Duration.Days(Duration.From( #"Dates To" - #"Dates From" )),
     Source = List.Dates(#"Dates From", DateCount, #duration(1,0,0,0))

  • Switch back to the Transform ribbon tab and then click Convert > To Table.
  • Change the name of the new date column and change the data type to Date.
  • At this point, you can simply use the menus on the Add Column ribbon to generate all of the date part columns you need in the date dimension table.

The resulting M query can be viewed in the Advanced Editor:

let
    DateCount = Duration.Days(Duration.From( #"Dates To" - #"Dates From" )),
    Source = List.Dates(#"Dates From", DateCount, #duration(1,0,0,0)),
    TableFromList = Table.FromList(Source, Splitter.SplitByNothing()),
    #"Renamed Columns" = Table.RenameColumns(TableFromList,{{"Column1", "Date"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Date", type date}}),
    #"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date]), Int64.Type),
    #"Inserted Month" = Table.AddColumn(#"Inserted Year", "Month", each Date.Month([Date]), Int64.Type),
    #"Inserted Month Name" = Table.AddColumn(#"Inserted Month", "Month Name", each Date.MonthName([Date]), type text),
    #"Inserted Quarter" = Table.AddColumn(#"Inserted Month Name", "Quarter", each Date.QuarterOfYear([Date]), Int64.Type),
    #"Inserted Week of Year" = Table.AddColumn(#"Inserted Quarter", "Week of Year", each Date.WeekOfYear([Date]), Int64.Type),
    #"Inserted Week of Month" = Table.AddColumn(#"Inserted Week of Year", "Week of Month", each Date.WeekOfMonth([Date]), Int64.Type),
    #"Inserted Day" = Table.AddColumn(#"Inserted Week of Month", "Day", each Date.Day([Date]), Int64.Type),
    #"Inserted Day of Week" = Table.AddColumn(#"Inserted Day", "Day of Week", each Date.DayOfWeek([Date]), Int64.Type),
    #"Inserted Day of Year" = Table.AddColumn(#"Inserted Day of Week", "Day of Year", each Date.DayOfYear([Date]), Int64.Type),
    #"Inserted Day Name" = Table.AddColumn(#"Inserted Day of Year", "Day Name", each Date.DayOfWeekName([Date]), type text),
    #"Renamed Columns1" = Table.RenameColumns(#"Inserted Day Name",{{"Month", "Month Number"}, {"Quarter", "Quarter of Year Number"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns1", "Quarter Name", each "Q" & Number.ToText([Quarter of Year Number])),
    #"Renamed Columns2" = Table.RenameColumns(#"Added Custom",{{"Day", "Day of Month"}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns2",{{"Quarter Name", type text}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Changed Type1",{"Date", "Year", "Month Number", "Month Name", "Quarter of Year Number", "Quarter Name", "Week of Year", "Week of Month", "Day of Month", "Day of Week", "Day of Year", "Day Name"})
in
    #"Reordered Columns"

Beyond ordinary Gregorian calendar date parts, specialized columns like fiscal periods, holiday flags and components of a 4-5-4 calendar are a little easier to do in M because the language includes advanced functions to support complex formulas.  I'll share some of these advanced techniques in a later post.
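As a small taste, here is a minimal sketch of two such columns: a fiscal year that starts in July (the fiscal boundary is an assumption) and a simple holiday flag driven by a hypothetical list of holiday dates.  In practice these steps would be appended to the date query above rather than to the stand-in table used here:

let
    // stand-in date table (in practice, continue from the date query above)
    Source = Table.TransformColumnTypes(
        Table.FromList(List.Dates(#date(2018, 1, 1), 365, #duration(1, 0, 0, 0)), Splitter.SplitByNothing(), {"Date"}),
        {{"Date", type date}}),
    // hypothetical list of holiday dates used for the flag column
    Holidays = {#date(2018, 1, 1), #date(2018, 7, 4), #date(2018, 12, 25)},
    // fiscal year beginning July 1: shift the date forward six months and take the year
    AddedFiscalYear = Table.AddColumn(Source, "Fiscal Year",
        each Date.Year(Date.AddMonths([Date], 6)), Int64.Type),
    // flag rows whose date appears in the holiday list
    AddedIsHoliday = Table.AddColumn(AddedFiscalYear, "Is Holiday",
        each List.Contains(Holidays, [Date]), type logical)
in
    AddedIsHoliday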

DAX

Calculated tables were recently added to the tabular model designer in both Power BI Desktop and the tabular model project editor in SQL Server Data Tools (SSDT) for Visual Studio.  This feature uses a handful of new table-based DAX functions, which include CALENDAR and CALENDARAUTO, for easily defining date dimension tables directly in the model.

Getting started is simple:

  • Click the New Table button on the Modeling ribbon
  • Enter the following script into the formula bar:

       My Calendar = CALENDAR(DATE(2018, 1, 1), DATE(2018, 12, 31))

  • Now you can add new calculated columns to the table and apply the appropriate DAX functions to create the date part columns.

This post on the Power BI Tips site demonstrates a few variations of DAX-generated calendar tables: https://powerbi.tips/2017/11/creating-a-dax-calendar/

Each column in the table will be a separate expression or calculated column.  With regard to performance or model optimization, there is no additional overhead or good argument against using DAX to generate a date dimension table.  However, using Power Query and M to transform data and create some tables, and then DAX to generate other tables in the model, can be messier than keeping everything in one place.

So, why are there two different ways to create tables in Power BI?

This is an excellent question and it is really just an artifact of the evolution of the product and its constituent technologies.  The modeling tools behind Power BI (DAX, VertiPaq & the SSAS Tabular model) were created first and became the Power Pivot add-in for Excel.  As the DAX language evolved, that development team gave us the ability to generate tables using DAX Script.  Not long after that, a separate product team created Power Query and the M data mashup language.  Power Query was also made available as an add-in for Excel.  Eventually both tools found their way into the Power BI Desktop product.

Final Recommendation

If you have a SQL Server data warehouse, you can use SQL to create date dimension tables.  It’s usually best to unify all of the reporting data in the data warehouse or data mart to create a single version of the truth for reporting.  You can also use tools like SSIS or Azure Data Factory to build and manage these objects before the data is imported into Power BI or the Analysis Services data model.

If using Power BI only, use the Get Data tools to build all the tables, including the date dimension(s).  There is nothing wrong with using the DAX techniques but that is my second choice in the Power BI toolbox, for this particular need.

Facebook Live Pop-up Session Recording

A big THANK YOU to everyone who attended the Facebook Live Pop-up session today. This was a fun event and I enjoyed taking and answering your questions. A recording of the live session is available right here:

We're not quite sure why the video jumped around a bit, but it didn't seem to be too much of a distraction. We tested everything and had no issues until the event (of course!). I recently upgraded my older LifeCam 1080p camera to the LifeCam Studio HD camera – so maybe blasting more bits through the service caused some unrest. With that exception, I'd love to have your feedback about the format and the whole live Q&A concept.

Another concept I’m kicking around is to provide a forum for you and others to request guided training content based on your questions.  It would be sort of a Q&A forum that would drive the way we build online training lessons.  What do you think?
Please post your comments below.

PASS Facebook Live Pop-up Expert Series

There are some great learning opportunities available from PASS and I am excited to participate in two online events this month!

Please join me on April 24 for a live chat about all things BI, reporting and data analytics.  Ask me anything you want about these or related topics and I’ll answer your questions, talk about my experience or find out what the community has to say.  The session is on Tuesday, April 24th at 6PM UTC (that’s 11 AM here in Pacific Time).  Follow the image link to put it on your calendar.  You can use the comments on the Facebook post or send an email if you’d like to queue up your questions ahead of time.

Here are some topics to get you started:

  • Is self-service reporting and data modeling really sustainable?
  • New features are released monthly.  How do we keep IT and business users up to speed?
  • Where can we find best practice guidance for our solutions?
  • What’s the best tool to use for a certain style of reporting solution?
  • Differences between Power BI in the service and on-premises
  • What is the future for SSRS and Power BI Report Server?
  • How do I license Power BI, Report Server and my users?
  • Can we expose reports externally?
  • What is the migration path from Power BI tabular data models to on-premises and Azure AS models?
  • What’s up with mobile reporting?
  • How do I get started with Power Query & M?
  • What’s the best way to learn and get support with DAX and calculations?
  • How do Excel, SSRS, Power BI and SSAS work together (or do they?)
  • What’s unique about your scenario and business rules?  How do we best proceed and meet those requirements?
  • What’s up with reports in SharePoint, external-facing application, embedding reports and self-service reporting?

There have already been some great sessions from Kendra Little and Bob Ward – which I have thoroughly enjoyed watching.  I’ve always loved Kendra’s presentation style and positive energy when she speaks.  Bob is a tried-and-true SQL Server expert with many years of experience on the SQL Server product engineering team.

Join me live, learn some good stuff and we’ll have some fun!

24 Hours of PASS

Every year, community speakers present 24 Hours of PASS (24HOP), which this year will be on April 25th.

24HOP Call for Speakers: Cross-Platform SQL Server Management

Every hour, a different presenter will deliver a 60-minute session on a specialized topic, from midnight to midnight UTC.  My talk, the Nine Realms of Power BI, covers the many different ways Power BI may be used along with other technologies to deliver Business Intelligence, reporting and analytic solutions.

My session is at 4PM Pacific Time on Wednesday, April 25th.  That’s 11PM UTC for you night owls in western Europe.  The rest of you can do the TZ math for your time zone.