SQL or M? – SSAS Partitions in Power Query/M

This is a continuation of this post

In the data platform industry, we have been working with SQL for decades.  It’s a powerful language and, over many years, we’ve learned to work with its strengths and to understand and work around its idiosyncrasies.  M is a considerably more modern and flexible query language.  Some best practices have evolved, but many of us are still learning the basic patterns of effective query design.  Reaching that stage with a technology often takes years of trial-and-error design and a community willing to share their learnings.  I will continue to share mine and appreciate so many in the community who share theirs.

Why Use M Instead of SQL?

For database professionals using SQL Server as the sole source of data for an SSAS or Power BI data model, there is a solid argument to be made in favor of encapsulating the query logic in database objects.  DBAs need to manage access to important databases.  A comment on an earlier post on this topic mentioned that SQL Server views can implement schema binding, which prevents a table or any other object the view references from being altered in a way that would break the view.  This is a good design pattern that should be followed if you are the database owner and have the necessary permission and flexibility to manage database objects as part of your BI solution design.  Ultimately, this is an organizational decision.  If the BI solution developer is not the DBA, you may have limited options.  If you don’t have control over the source database objects, if you are not using SQL Server, or if you otherwise prefer to manage everything in the SSAS or Power BI project, Power Query is probably the right place to manage all the query logic.

In my earlier post, I used a table-valued user-defined function to manage the partition filtering logic in SQL Server.  Rather than using SQL and database objects, we’ll use Power Query alone.  The working M script is shown below.

For brevity, I’m starting by showing the solution but I will show you the steps we went through to get there a bit later.

Updating the Partition Definitions

The three partitions defined in the earlier example are replaced using the following M scripts, which together return exactly the same columns and rows as before.

Here is the M script for each partition:

“This week” partition:

    let
        Source = #"SQL/localhost;ContosoDW",
        SalesData = Source{[Schema="dbo",Item="FactSalesCompleteDates"]}[Data],
        #"Filtered Rows" =
            Table.SelectRows( SalesData, each
                [DateKey] >=
                    Date.StartOfWeek( DateTime.FixedLocalNow() )
            )
    in
        #"Filtered Rows"

“This month before this week” partition:

    let
        Source = #"SQL/localhost;ContosoDW",
        SalesData = Source{[Schema="dbo",Item="FactSalesCompleteDates"]}[Data],
        #"Filtered Rows" =
            Table.SelectRows( SalesData, each
                [DateKey] >=
                    Date.StartOfMonth( DateTime.FixedLocalNow() )
                and
                [DateKey] <
                    Date.StartOfWeek( DateTime.FixedLocalNow() )
            )
    in
        #"Filtered Rows"

“Before this month” partition:

    let
        Source = #"SQL/localhost;ContosoDW",
        SalesData = Source{[Schema="dbo",Item="FactSalesCompleteDates"]}[Data],
        #"Filtered Rows" =
            Table.SelectRows( SalesData, each
                [DateKey] <
                    Date.StartOfMonth( DateTime.FixedLocalNow() )
            )
    in
        #"Filtered Rows"
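
One edge case is worth checking in your own model: when a month begins mid-week, the start of the current week falls before the start of the current month, so the “This week” filter and the “Before this month” filter would both match the last few days of the previous month. The following is only a sketch of one way to guard against that in the “Before this month” partition, by capping its upper boundary at the earlier of the two dates. The variable names (CurrentDateTime, StartOfThisWeek, StartOfThisMonth, UpperBound) are hypothetical, and passing Day.Sunday as the first day of the week is an assumption; without that argument the week start depends on the culture setting. I would expect this to fold because the boundary is a constant value by the time the filter runs, but confirm it with View Native Query.

```powerquery
let
    Source = #"SQL/localhost;ContosoDW",
    SalesData = Source{[Schema="dbo",Item="FactSalesCompleteDates"]}[Data],

    // Boundaries are evaluated once, outside the row filter
    CurrentDateTime = DateTime.FixedLocalNow(),
    StartOfThisWeek = Date.StartOfWeek( CurrentDateTime, Day.Sunday ),  // Day.Sunday is an assumption
    StartOfThisMonth = Date.StartOfMonth( CurrentDateTime ),

    // Use the earlier boundary so this partition cannot overlap "This week"
    UpperBound = List.Min( { StartOfThisMonth, StartOfThisWeek } ),

    #"Filtered Rows" =
        Table.SelectRows( SalesData, each [DateKey] < UpperBound )
in
    #"Filtered Rows"
```

With this guard, the “This month before this week” partition is simply empty during the first partial week of a month, because its lower boundary is later than its upper boundary.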

The Power Query Litmus Test: Query Folding

When connected to an enterprise data source like SQL Server, Power Query should be able to pass an important test.  Use the Design… button to view the Power Query Editor.  Select the last query step and then right-click to show the menu.

If the View Native Query menu option is enabled, you are good.  This means the query is being folded, and that’s a good thing.  Query folding converts the query steps into a native query for the database engine to execute.  This is the resulting T-SQL query script generated by Power Query:

    select [_].[SalesKey],
        [_].[DateKey],
        [_].[channelKey],
        [_].[StoreKey],
        [_].[ProductKey],
        [_].[PromotionKey],
        [_].[CurrencyKey],
        [_].[UnitCost],
        [_].[UnitPrice],
        [_].[SalesQuantity],
        [_].[ReturnQuantity],
        [_].[ReturnAmount],
        [_].[DiscountQuantity],
        [_].[DiscountAmount],
        [_].[TotalCost],
        [_].[SalesAmount],
        [_].[ETLLoadID],
        [_].[LoadDate],
        [_].[UpdateDate]
    from [dbo].[FactSalesCompleteDates] as [_]
    where [_].[DateKey] >= convert(datetime2, '2018-07-01 00:00:00')
        and [_].[DateKey] < convert(datetime2, '2018-07-15 00:00:00')

You don’t need to do anything with this information.  It’s just good to know.  End of story.

And Now… The Rest of The Story

Power Query is an awesome tool that does some amazingly smart things with the simple data transformation steps you create in the designer.  However, it is important to make sure Power Query produces efficient queries.  During the development of this solution, I created an early prototype that didn’t produce a query that would fold into T-SQL.  Thanks to Brian Grant, who is an absolute genius with Power Query and M, for figuring this out (BTW, you can visit Brian’s YouTube tutorial collection here).

In my original design, which I prototyped in Power BI Desktop, I thought it would make sense to create custom columns for each of the date parts needed to filter the partitions.  Here’s the prototype for the query I originally named “This Month Thru Last Week”:

    let
        Source = FactSales,
        #"Add DateTimeNow" = Table.AddColumn(Source, "DateTimeNow", each DateTime.LocalNow()),
        #"Change Type DateTime" = Table.TransformColumnTypes(#"Add DateTimeNow",{{"DateTimeNow", type datetime}}),
        #"Add StartOfThisWeek" = Table.AddColumn(#"Change Type DateTime", "StartOfThisWeek", each Date.StartOfWeek([DateTimeNow]), type date),
        #"Add StartOfThisMonth" = Table.AddColumn(#"Add StartOfThisWeek", "StartOfThisMonth", each Date.StartOfMonth([DateTimeNow]), type date),
        #"Add StartOfPreviousMonth" = Table.AddColumn(#"Add StartOfThisMonth", "StartOfPreviousMonth", each Date.StartOfMonth(Date.AddMonths([DateTimeNow], -1)), type date),
        #"Partition Filter" = Table.SelectRows(#"Add StartOfPreviousMonth", each ([DateKey] >= [StartOfThisMonth] and [DateKey] < [StartOfThisWeek]) ),
        #"Removed Columns" = Table.RemoveColumns(#"Partition Filter",{"ETLLoadID", "LoadDate", "UpdateDate", "DateTimeNow", "StartOfThisWeek", "StartOfThisMonth", "StartOfPreviousMonth"})
    in
        #"Removed Columns"

As you can see, I created separate columns using Transform menu options based on the current date and time, stored in a custom column named “DateTimeNow”:

  • StartOfThisWeek
  • StartOfThisMonth
  • StartOfPreviousMonth

The rest was simple: I just added filters using these columns and then removed the custom columns from the query in the last step.  All good with one small exception… it didn’t work.  We learned that Power Query can’t use custom column values to build a foldable filter expression.  Because the comparison values are computed per row in added columns, the filters just won’t translate into a T-SQL WHERE clause.

Checking the last query step with a right-click shows that the “View Native Query” menu option is grayed-out so No Folding For You!

Simple lesson: When query folding doesn’t work, do something else.  In this case, we just had to put the date comparison logic in the filter steps and not in custom columns.
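
As a rough sketch of that change applied to the prototype, the version below drops the custom columns and compares DateKey directly against date values computed outside the row filter. It assumes FactSales is the same foldable source query used in the prototype; the variable names CurrentDateTime, StartOfThisWeek and StartOfThisMonth are mine for illustration. Whether it actually folds in your environment is something to verify with View Native Query on the last step.

```powerquery
let
    Source = FactSales,

    // Boundary values are plain variables, not columns added to the table
    CurrentDateTime = DateTime.FixedLocalNow(),
    StartOfThisWeek = Date.StartOfWeek( CurrentDateTime ),
    StartOfThisMonth = Date.StartOfMonth( CurrentDateTime ),

    // The filter compares the source column to constants, which can become a WHERE clause
    #"Partition Filter" =
        Table.SelectRows( Source, each
            [DateKey] >= StartOfThisMonth and [DateKey] < StartOfThisWeek
        ),

    // Only the ETL columns from the source are left to remove
    #"Removed Columns" =
        Table.RemoveColumns( #"Partition Filter", {"ETLLoadID", "LoadDate", "UpdateDate"} )
in
    #"Removed Columns"
```

This is logically the same as the “This month before this week” partition script shown earlier, just with the boundaries named once instead of written inline.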
