SQL, M or DAX? – part 2

This is a post about a post about a post. Thanks to those of you who are entering comments in the original May 12 post titled SQL, M or DAX? This is a popular topic. And thanks to Adam Saxton for mentioning this post in his Guy in a Cube Weekly Roundup.

This is a HUUUUGE topic and I can tell that I’ve struck a chord with many BI practitioners by bringing it up. Please post your comments and share your ideas. I’m particularly interested in hearing your challenging questions and your thoughts about the pros and cons of some less-obvious choices about whether to implement transformations & calculations in SQL, M or DAX.

This week, I have had engaging conversations on this topic while working on a Power BI consulting project for a large municipal court system. As a consultant, I’ve had three weeks of experience with their data and business environment. The internal staff have spent decades negotiating the intricacies and layers upon layers of business process, so of course I want to learn from their experience, but I also want to cautiously pursue opportunities to think outside the box. That’s why they hired me.

Tell me if this situation resonates with you… Working with a SQL Server database developer who is really good with T-SQL but fairly new to Power BI & tabular modeling, we’re building a data model and reports sourced from a line-of-business application’s SQL Server database. They’ve been writing reports using some pretty complicated SQL queries embedded in SSRS paginated reports. Every time a user wants a new report, a request is sent to the IT group. A developer picks up the request and writes some gnarly T-SQL query with pre-calculated columns and business rules. Complex reports might take days or weeks of development time. I needed to update a dimension table in the data model and add a calculated column to differentiate case types. Turns out that it wasn’t a simple addition, and his response was “I’ll just send you the SQL for that…you can just paste it”. The dilemma here is that all the complicated business rules had already been resolved using layers of T-SQL common table expressions (CTEs), nested subqueries and CASE statements. It was very well-written SQL and it would take considerable effort to re-engineer the logic into a dimensional tabular model to support general-use reporting. After beginning to nod off while reading through the layers of SQL script, my initial reaction was to just paste the code and be done with it. After all, someone had already solved this problem, right?

The trade-off of using the existing T-SQL code is that the calculations and business rules are applied at a fixed level of granularity and within a certain business context. The query would need to be rewritten to answer different business questions. If we take the “black box” approach and paste the working and tested SQL script into the Power Query table definition, chances are that we won’t be able to explain the query logic in a few months, after we’ve moved on and forgotten this business problem. If you are trying to create a general-purpose data model to answer yet-to-be-defined questions, then you need to use design patterns that allow developers and users to navigate the model at different levels of grain, across different dimension tables and in different filtering contexts. This isn’t always the right answer, but in this case I am recommending that we do as little data merging, joining and manipulation as possible in the underlying source queries. Still, the table mappings between the source and the data model are not one-to-one. In some cases, two or three source tables are combined using SQL joins into a flattened and simplified lookup table – containing only the necessary, friendly-named columns and keys, and no unnecessary clutter like CreatedDateTime, ModifiedDateTime and CreatedByUser columns. Use custom columns in M/Power Query for row-level calculated values, and use DAX measures to perform calculations in aggregate and within filter/slicing/grouping context.
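To make that last point concrete, here is a minimal sketch of the division of labor (the table, column and measure names are hypothetical, not from the client’s model). A row-level value is computed once per row in Power Query:

#"Added Case Age" = Table.AddColumn(Source, "Case Age Days", each Duration.Days(Date.From(DateTime.LocalNow()) - [Filed Date]), Int64.Type)

…and the aggregation is left to a DAX measure, which responds to whatever filter, slicing and grouping context each report visual applies:

Average Case Age = AVERAGE ( Cases[Case Age Days] )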

I’d love to hear your thoughts and ideas on this topic.

SQL, M or DAX?

We live in a world of choices and we have many tools at our disposal.  In Microsoft Business Intelligence solutions using tools like Power BI and SQL Server Analysis Services, you have at least three different ways to perform data collection, transformations and calculations.  A question I get all the time is: “Which database or BI tool should be used to perform routine tasks?  Is it best to shape and transform data at the source, in Power Query using M script, or in the data model using DAX?”

In this series, I’ll demonstrate options for creating utility and dimension tables, columns and calculations using each option and discuss the advantages, disadvantages and recommended practice for each.

I welcome your questions and ideas on these topics.  Please post comments to this post with your questions and challenges.  Let’s get started with one of the most common examples…

Creating a Date Dimension Table

A Date dimension table is an essential component in almost any data warehouse or reporting database, so techniques to generate these tables have been around for a long time. The foundation of a Date dimension table is a table containing one row per contiguous date in a range that includes every possible transaction date or fact record. To make reporting easier, it is common practice to have multiple date dimensions in the semantic model. For example, if sales transaction facts have an Order Date and a Delivery Date, and both are used independently for reporting, there may be an Order Date dimension and a Delivery Date dimension in the model.
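In Power BI, one low-effort way to get that second table is to reference the first date query in Power Query, so both dimensions load from the same logic (a sketch; the query name here is just illustrative):

let
    // "Order Date" is the existing date dimension query; rename this referencing query to Delivery Date
    Source = #"Order Date"
in
    Source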

A common practice for building the dimension table is to populate a single Date-type column with the sequential date values. After these rows are inserted, date part functions may be used to populate additional columns by referencing the Date value in an expression. Almost every language includes, for example, a MONTH() and YEAR() function to convert a date value into these date parts.

SQL

If you have a data warehouse or a relational database specifically suited to support your Power BI and reporting models, use that to define all of your tables using conventional techniques like T-SQL.  Examples for generating a date reference or dimension table are easy to find online, primarily because this is the oldest and most enduring technique, used for many years in conventional data warehouse design.  T-SQL is a flexible language but the SQL date part functions are pretty bare bones.  In the end, it really comes down to preference and language familiarity.

I think there is a good argument to be made for not only defining a date dimension using familiar SQL script but for persisting the table in the data warehouse along with other standardized dimension tables.  This approach is optimal when you are working with SQL Server or another relational database as your primary data source.

Reporting, BI and dashboard projects don’t always rely on a data warehouse. Self-service BI solutions usually start with ad-hoc data mashups to support analytic reports rather than a holistic IT-driven solution. If you aren’t using a relational database as the primary data source, you may be better off using a tool managed within Power BI or SSAS.

Example

There are several different techniques that include using a cursor or a WHILE loop to iterate through each date in a range, one row at a time.  One of the best techniques I’ve found is this example from Aaron Bertrand.  Adding special columns to keep track of holidays or special calendar periods (like Fiscal, 4-5-4, ISO, etc.) can require a lot of complex code.

-- Date dimension script by Aaron Bertrand:
-- https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server

DECLARE @StartDate  date = '20100101';   -- declarations added here so the excerpt runs standalone
DECLARE @CutoffDate date = '20190101';

CREATE TABLE #dim
(
  [date]       DATE PRIMARY KEY,
  [day]        AS DATEPART(DAY,      [date]),
  [month]      AS DATEPART(MONTH,    [date]),
  FirstOfMonth AS CONVERT(DATE, DATEADD(MONTH, DATEDIFF(MONTH, 0, [date]), 0)),
  [MonthName]  AS DATENAME(MONTH,    [date]),
  [week]       AS DATEPART(WEEK,     [date]),
  [ISOweek]    AS DATEPART(ISO_WEEK, [date]),
  [DayOfWeek]  AS DATEPART(WEEKDAY,  [date]),
  [quarter]    AS DATEPART(QUARTER,  [date]),
  [year]       AS DATEPART(YEAR,     [date]),
  FirstOfYear  AS CONVERT(DATE, DATEADD(YEAR,  DATEDIFF(YEAR,  0, [date]), 0)),
  Style112     AS CONVERT(CHAR(8),   [date], 112),
  Style101     AS CONVERT(CHAR(10),  [date], 101)
);

-- use the catalog views to generate as many rows as we need
INSERT #dim([date])
SELECT d
FROM
(
  SELECT d = DATEADD(DAY, rn - 1, @StartDate)
  FROM
  (
    SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate))
        rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    -- on my system this would support > 5 million days
    ORDER BY s1.[object_id]
  ) AS x
) AS y;
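If you choose to persist the table in the data warehouse, as suggested above, a minimal sketch might look like this (dbo.DimDate is a hypothetical name; the computed columns materialize as regular columns):

SELECT [date], [day], [month], FirstOfMonth, [MonthName], [week], [ISOweek],
       [DayOfWeek], [quarter], [year], FirstOfYear, Style112, Style101
INTO dbo.DimDate
FROM #dim;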

Power Query/M

In my opinion, Power Query is the best choice when other query transformations are also managed in Power Query.  For simplicity, you can keep all of your query and transformation logic in one place.  If you are just getting started with Power BI and aren’t inclined to use a different technique, use this one.

Example

Nearly every step in this process can be performed using menu selections and simple features in the Power Query user interface. It just takes a little creativity to get started. I’ve tried a few different approaches before arriving at this one; it’s the easiest and most flexible.

  • Start by creating two parameters named “Dates From” and “Dates To”.  Assign them values to define the range of dates you need in the date dimension table; like January 1, 2010 and December 31, 2018.
  • Use the Get Data menu to create a Blank Query
  • The first two steps need to be entered manually.  Open the Advanced Editor and paste these two lines on a new line after the “let” command:

     DateCount = Duration.Days(Duration.From( #"Dates To" - #"Dates From" )) + 1,  // + 1 makes the range include the "Dates To" date
     Source = List.Dates(#"Dates From", DateCount, #duration(1,0,0,0))

  • Switch back to the Transform ribbon tab and then click Convert > To Table
  • Change the name of the new date column and change the data type to Date
  • At this point, you can simply use the menus on the Add Columns ribbon to generate all of the date part columns you need in the date dimension table

The resulting M query can be viewed in the Advanced Editor:

let
    DateCount = Duration.Days(Duration.From( #"Dates To" - #"Dates From" )) + 1,  // + 1 makes the range include the "Dates To" date
    Source = List.Dates(#"Dates From", DateCount, #duration(1,0,0,0)),
    TableFromList = Table.FromList(Source, Splitter.SplitByNothing()),
    #"Renamed Columns" = Table.RenameColumns(TableFromList,{{"Column1", "Date"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Date", type date}}),
    #"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date]), Int64.Type),
    #"Inserted Month" = Table.AddColumn(#"Inserted Year", "Month", each Date.Month([Date]), Int64.Type),
    #"Inserted Month Name" = Table.AddColumn(#"Inserted Month", "Month Name", each Date.MonthName([Date]), type text),
    #"Inserted Quarter" = Table.AddColumn(#"Inserted Month Name", "Quarter", each Date.QuarterOfYear([Date]), Int64.Type),
    #"Inserted Week of Year" = Table.AddColumn(#"Inserted Quarter", "Week of Year", each Date.WeekOfYear([Date]), Int64.Type),
    #"Inserted Week of Month" = Table.AddColumn(#"Inserted Week of Year", "Week of Month", each Date.WeekOfMonth([Date]), Int64.Type),
    #"Inserted Day" = Table.AddColumn(#"Inserted Week of Month", "Day", each Date.Day([Date]), Int64.Type),
    #"Inserted Day of Week" = Table.AddColumn(#"Inserted Day", "Day of Week", each Date.DayOfWeek([Date]), Int64.Type),
    #"Inserted Day of Year" = Table.AddColumn(#"Inserted Day of Week", "Day of Year", each Date.DayOfYear([Date]), Int64.Type),
    #"Inserted Day Name" = Table.AddColumn(#"Inserted Day of Year", "Day Name", each Date.DayOfWeekName([Date]), type text),
    #"Renamed Columns1" = Table.RenameColumns(#"Inserted Day Name",{{"Month", "Month Number"}, {"Quarter", "Quarter of Year Number"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns1", "Quarter Name", each "Q" & Number.ToText([Quarter of Year Number])),
    #"Renamed Columns2" = Table.RenameColumns(#"Added Custom",{{"Day", "Day of Month"}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns2",{{"Quarter Name", type text}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Changed Type1",{"Date", "Year", "Month Number", "Month Name", "Quarter of Year Number", "Quarter Name", "Week of Year", "Week of Month", "Day of Month", "Day of Week", "Day of Year", "Day Name"})
in
    #"Reordered Columns"

Beyond ordinary Gregorian calendar date parts, specialized columns like fiscal periods, holiday flags and components of a 4-5-4 calendar are a little easier to create in M because the language includes advanced functions to support complex formulas.  I’ll share some of these advanced techniques in a later post.
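For a small taste, here is a hedged sketch of a fiscal year column, assuming a fiscal year that begins July 1 (adjust the month offset to suit your calendar):

#"Added Fiscal Year" = Table.AddColumn(#"Reordered Columns", "Fiscal Year", each Date.Year(Date.AddMonths([Date], 6)), Int64.Type)

Shifting each date forward six months means that, for example, any date from July 2017 through June 2018 lands in calendar year 2018 and is labeled fiscal year 2018.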

DAX

Calculated tables were recently added to the tabular model designer in both Power BI Desktop and the tabular model project editor in SQL Server Data Tools (SSDT) for Visual Studio.  This feature uses a handful of new table-based DAX functions, which include CALENDAR and CALENDARAUTO, for easily defining date dimension tables directly in the model.

Getting started is simple:

  • Click the New Table button on the Modeling ribbon
  • Enter the following script into the formula bar:

       My Calendar = CALENDAR(DATE(2018, 1, 1), DATE(2018, 12, 31))

  • Now you can add new calculated columns, applying the appropriate DAX functions in the formula bar to create date part columns.
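For example, the date part columns might look something like this (a minimal sketch; each line is a separate calculated column on the My Calendar table, and the column names are just illustrative):

Year = YEAR ( 'My Calendar'[Date] )
Month Number = MONTH ( 'My Calendar'[Date] )
Month Name = FORMAT ( 'My Calendar'[Date], "MMMM" )
Quarter Name = "Q" & QUARTER ( 'My Calendar'[Date] )
Day of Week = WEEKDAY ( 'My Calendar'[Date] )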

This post on the Power BI Tips site demonstrates a few variations of DAX-generated calendar tables: https://powerbi.tips/2017/11/creating-a-dax-calendar/

Each column in the table will be a separate expression or calculated column.  With regard to performance or model optimization, there is no additional overhead or good argument not to use DAX to generate a date dimension table.  However, using Power Query & M to transform data and create some tables, and then DAX to generate other tables in the model, can be messier than keeping everything in one place.

So, why are there two different ways to create tables in Power BI?

This is an excellent question and it is really just an artifact of the evolution of the product and its constituent technologies.  The modeling tools behind Power BI (DAX, VertiPaq & the SSAS Tabular model) were created first and became the Power Pivot add-in for Excel.  As the DAX language evolved, that development team gave us the ability to generate tables using DAX Script.  Not long after that, a separate product team created Power Query and the M data mashup language.  Power Query was also made available as an add-in for Excel.  Eventually both tools found their way into the Power BI Desktop product.

Final Recommendation

If you have a SQL Server data warehouse, you can use SQL to create date dimension tables.  It’s usually best to unify all of the reporting data in the data warehouse or data mart to create a single version of the truth for reporting.  You can also use tools like SSIS or Azure Data Factory to build and manage these objects before the data is imported into Power BI or the Analysis Services data model.

If using Power BI only, use the Get Data tools to build all the tables, including the date dimension(s).  There is nothing wrong with using the DAX techniques but that is my second choice in the Power BI toolbox, for this particular need.

Facebook Live Pop-up Session Recording

A big THANK YOU to everyone who attended the Facebook Live Pop-up session today. This was a fun event and I enjoyed taking and answering your questions. A recording of the live session is available right here:

We’re not quite sure why the video jumped around a bit but it didn’t seem to be too much of a distraction. We tested everything and had no issues until the event (of course!). I recently upgraded my older LifeCam 1080p camera to the LifeCam Studio HD camera – so maybe blasting more bits through the service caused some unrest. That aside, I’d love to have your feedback about the format and the whole live Q&A concept.

Another concept I’m kicking around is to provide a forum for you and others to request guided training content based on your questions.  It would be sort of a Q&A forum that would drive the way we build online training lessons.  What do you think?
Please post your comments below.

PASS Facebook Live Pop-up Expert Series

There are some great learning opportunities available from PASS and I am excited to participate in two online events this month!

Please join me on April 24 for a live chat about all things BI, reporting and data analytics.  Ask me anything you want about these or related topics and I’ll answer your questions, talk about my experience or find out what the community has to say.  The session is on Tuesday, April 24th at 6PM UTC (that’s 11 AM here in Pacific Time).  Follow the image link to put it on your calendar.  You can use the comments on the Facebook post or send an email if you’d like to queue up your questions ahead of time.

Here are some topics to get you started:

  • Is self-service reporting and data modeling really sustainable?
  • New features are released monthly.  How do we keep IT and users up to speed?
  • Where can we find best practice guidance for our solutions?
  • What’s the best tool to use for a certain style of reporting solution?
  • Differences between Power BI in the service and on-premises
  • What is the future for SSRS and Power BI Report Server?
  • How do I license Power BI, Report Server and my users?
  • Can we expose reports externally?
  • What is the migration path from Power BI tabular data models to on-premises and Azure AS models?
  • What’s up with mobile reporting?
  • How do I get started with Power Query & M?
  • What’s the best way to learn and get support with DAX and calculations?
  • How do Excel, SSRS, Power BI and SSAS work together (or do they?)
  • What’s unique about your scenario and business rules?  How do we best proceed and meet those requirements?
  • What’s up with reports in SharePoint, external-facing application, embedding reports and self-service reporting?

There have already been some great sessions from Kendra Little and Bob Ward – which I have thoroughly enjoyed watching.  I’ve always loved Kendra’s presentation style and positive energy when she speaks.  Bob is a tried-and-true SQL Server expert with many years of experience on the SQL Server product engineering team.

Join me live, learn some good stuff and we’ll have some fun!

24 Hours of PASS

Every year, community speakers present 24 Hours of PASS (24HOP); this year’s event will be on April 25th.


Every hour, a different presenter will deliver a 60-minute session on a specialized topic, from midnight to midnight UTC.  My talk will be the Nine Realms of Power BI: the many different ways Power BI may be used along with other technologies to deliver Business Intelligence, reporting and analytic solutions.

My session is at 4PM Pacific Time on Wednesday, April 25th.  That’s 11PM UTC for you night owls in western Europe.  The rest of you can do the TZ math for your time zone.

Tour of the Power BI Solution Advisor

As a follow-up to my earlier post titled “Nine Realms of Power BI and the Power BI Solution Advisor”, I’ve recorded this 7-minute tour of the solution advisor:

At last count, the tool has been accessed about 650 times.  Thanks for visiting!

I’ll also follow up here with another tour to step through the “making of” the tool and a peek inside the design.

Using Power Query “M” To Encode Text As Numbers

I worked through a brain-teaser on a consulting project today that I thought I’d share in case it was useful for someone else in the community.  We needed to convert application user names into an encoded format that would preserve case sensitive comparison.  Here’s the story… A client of mine is using Power BI Desktop to munge data from several different source systems to create analytic reports.

Two-Phase BI Projects

I’m going to step out of the frame just a moment to make a soapbox speech:  I’m a believer in two-phase Business Intelligence project design.  What that means in a few words is that we rapidly work through a quick design, building a functional pilot or proof-of-concept to produce some reports that demonstrate the capability of the solution.  This gets stakeholders and folks funding the project on-board so we can get the support necessary to schedule and budget the more formal, production-scale long-term business solution.  Part of the negotiation is that we might use self-service BI tools to bend or even break the rules of proper design the first time through.  We agree to learn what we can from this experience, salvage what we can from the first phase project and then we adhere to proper design rules, using what we learned to build a production-ready solution in Phase Two.

Our project is in Phase One and we’re cutting corners all over the place to get reports done and ready to show our stakeholders.  Today I learned that the user login names stored in one of the source systems, which we will use to uniquely identify system users, allow different users to be set up using the same combinations of letters as long as the upper- and lower-case combinations don’t match.  I had to ask the business user to repeat that, and yes, I had heard it right the first time.  If there were two users named “Bob Smith” that were set up with login user names of “BOBSMITH” and “BobSmith”, that was perfectly acceptable per the rules enforced in the application.  No right-minded application developer on this planet or any other should have let that happen, but since their dink-wad software produces this data, we have to use it as it is.  In the Phase Two (production-ready) solution we will generate surrogate keys to define uniqueness, but in this version, created with Power BI Desktop, I have to figure out how to make the same user name strings, with different upper- and lower-case combinations, participate in relationships and serve as table key identifiers.

Wouldn’t it be nice if I could convert each UserName string to a numeric representation of each character (which would be different for each upper- or lower-case letter)?  I knew that to convert each character one at a time, I would need to bust each string apart into a list of characters.  Let’s see…  that’s probably done with a List object, but what method, and where do I find the answer?

It’s Off To The Web, Batman!

Yes, I Googled it (I actually used Bing) and found several good resources, although most of the official docs online weren’t very helpful.  I have a paper copy of Ken Puls’s book where he mentions List.Splitter, which seemed promising.  I have an e-copy of Chris Webb’s book – somewhere – and I know he eats and breathes this kind of stuff.  Running low on options, I came across Reza Rad’s December 2017 blog post and found Mecca.  Reza has an extensive post about parsing and manipulating lists – pulling them apart and putting them back together – and he helped me understand the mechanics of the List.Accumulate function, which is really powerful.  The post is here.  It didn’t entirely address my scenario, but it was educational, sent me in the right direction and got me thinking about the problem a certain way, and I figured the rest out on my own.  HOT DANG!

So Here’s The Deal

The first step was to tear each string down into a List object.  At that point, you have a collection of characters to have your way with.

I created a custom column and entered something like this:

=Text.ToList( [UserName] )

If you were to add this column in the query design and then scroll over to the new column, you’d see that it shows up as a List object placeholder, just waiting for you to click the magic link that navigates to the list of all the characters in the column string.

We don’t want to do this.

Beep Beep Beep…. Backing up The Bus

Removing the navigation step and looking at the column of List object placeholders…  I want to modify the M code for this step to do the following:

  1. Parse the items in the list (each character in the UserName field)
  2. For each character, convert it to a number
  3. Iterate through the list and concatenate the numbers into a new string of numerals

To enumerate over the elements of a list and put the list members back into some kind of presentable package (like a single string or a number), we can use the List.Accumulate function.

The Accumulator is a little machine with a crank handle on the side.  Every turn of the handle spits out one of the element values, using the current variable.  You can do whatever you want with the object in the current variable, but if you want to put it back into the machine for the next turn, you should combine it with the state variable, which represents the previous value (from the last time the handle was cranked).

My final desired result is a new column containing a numeric string that uniquely encodes each UserName value.

In a nutshell, List.Accumulate contains two internal variables that can be used to iterate over the elements of a list (sort of like an array) and assemble a new value.

The state variable holds the temporary value that you build on in each iteration, and the current variable represents the value of the current element.  With an example, this will be clear.
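Here is a trivial, standalone illustration of the mechanics (not from the project code):

List.Accumulate({1, 2, 3}, 0, (state, current) => state + current)          // returns 6
List.Accumulate({"a", "b", "c"}, "", (state, current) => state & current)   // returns "abc"

In both cases, state starts at the seed value (0 or "") and accumulates one element with each turn of the crank.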

The final code takes the output from Text.ToList, which builds a List object from the characters in the UserName field on that row.

Next, List.Accumulate iterates over each character, and my code uses Character.ToNumber to convert the current character to numeric form.

Adding this custom column generates this M code in the query:

= Table.AddColumn(#"Reordered Columns", "Encoded UserName 1", each
    List.Accumulate(
        Text.ToList([UserName]),    // break the string into a list of characters
        "",                         // seed value: start with an empty string
        (state, current) =>
            state & Number.ToText(Character.ToNumber(current), "000")
    )
)

Just like magic, now I have a unique numeric column representing the distinct upper- and lower-case characters in these strings, which can reliably be used as a key and join column.

Bad Data Happens

As I said earlier, in a solution where we can manage the data governance rules, perhaps we could prevent these mixed-case user names from being created.  In this project, however, they exist and we needed to use them.

Nine Realms of Power BI and the Power BI Solution Advisor

The use cases for Power BI, along with its many companion technologies, are numerous.  Many organizations are exploring the use of Power BI in enterprise-scale solutions and struggling with the myriad of options and choices.  I’ve grouped these options into nine categories that I call the “Nine Realms of Power BI”.  Along with my friends at CSG Pro – Brian, Greg & Ron, we have created a Power BI-based tool that you can use as a sort-of survey to assess your business and technical requirements and then recommend a reference solution architecture in one of these categories.  The options, components and reference architectures, capabilities, limits and cost guidelines are detailed later in this presentation.  I’ll also take you on a tour of the solution advisor tool, which I have published for public Internet users.

This is a presentation I prepared for the Redmond SQL Saturday that I will also use for some future presentations.

[Slide 1]

Let’s start by grouping requirements and solution criteria into eight categories.  In the solution advisor, you’ll choose one option from each of these. We’ll explore these categories in detail a bit later.

[Slide 3]

Why Nine Realms?  I actually came up with nine solution architectures before the “Nine Realms” theme came to me, but I found it fitting that these concepts seem to align with the Norse mythology depicted in the Thor movies from Marvel.  After doing a little reading, I found that these stories have been around for centuries and are rooted in real Viking folklore that has some real substance behind it.

In short, according to tradition, the nine realms or worlds are branches of the cosmological tree, Yggdrasil.  The realms include familiar worlds depicted in the stories we know, like Asgard – the home of the gods – and Midgard – the home of the humans, which is earth.

Not all the worlds in the Yggdrasil tree are necessarily “better” or “worse” than, or above or below, others but they are all different, with attributes better suited for their inhabitants.  I find this to be a relevant analogy.

Stay with me here and I’ll show you how this all relates to the various incarnations of Power BI solutions.

[Slide 4]

Asgard is the home of the gods and is a place resembling Utopia, or a perfect world where everything is meticulously architected and all questions have answers.

Likewise, in a perfect BI solution, every base is covered and the solution achieves something approaching perfection.  Delivering such a thing is a goal of many BI solutions but achieving perfection is costly and often extends the technical scope and delivery timeline of a solution.  The stresses of pursuing the utopian dream of a perfect BI solution can test the practical limits of not only time and money but also of patience and sanity: stakeholder commitment, interpersonal relationships among staff and leaders, work-life balance and the overall health of the team’s business culture.

[Slide 5]

Which of the worlds is right for you and your audience?  Which one of the worlds should you try to achieve?

I promise to get serious here soon, but please indulge me with the “Thor” theme for just a moment…

Start by understanding your capabilities and stay focused on your objectives.  Keep your enemies close… in other words, understand the forces working against your success and strategically plan to overcome them.

Every distraction that deviates from your planned solution – every new feature, every one-off promise to a stakeholder, every exception to the constrained list of in-scope deliverables – becomes your enemy.  Each of these metaphoric “friends” seems welcoming and well-intentioned until the schedule slips and the list of deliverables and challenges becomes insurmountable within your deadlines and technical capabilities.

[Slides 6 and 7]

This slide is key.  Power BI has a rich heritage of technologies that go back many years and are deeply ingrained into the desktop application and cloud service – but some of these technologies also have more capable counterparts outside of the desktop product.  For example, Power BI Desktop actually uses a scaled-down instance of SQL Server Analysis Services, which implements the VertiPaq tabular in-memory analytics engine.  If you need more horsepower than the Power BI Desktop modeling component provides, you can graduate to a full-blown SSAS instance and continue to work with a very similar, but more robust, data modeling tool that will scale on-prem or to the cloud to accommodate significantly more data and richer admin controls.  Be mindful, though, that making the leap from Power BI Desktop to enterprise SQL Server tools can be a big undertaking.

[Slide 8]

How about your audience?  Who and where are they?  How do you need to secure your solution, reports and data?

[Slides 9 and 10]

Where will you host your reports and how will users access them?  …in the cloud using the Power BI service – or on-premises using Power BI Report Server?

[Slides 11 and 12]

The Nine Realms of Power BI

As promised, here are the Nine Realms of Power BI.  They are roughly categorized into three or four different groups.

The top row contains solution options that utilize the Azure cloud-based Power BI service (PowerBI.com), either with the cached data model and reports deployed to the cloud service, or with reports in the cloud and data remaining on-prem.

The options in the second row are exclusively on-premises, with no reliance on cloud services or cloud storage.

The seventh item, “Azure SSAS – Deployed to Service”, is entirely cloud-based and requires no on-prem infrastructure at all.

The remaining two items are special use cases where reports and dashboards are embedded into and managed by a custom application; or data is fed in real time to live visuals.

[Slide 13]

Solutions are either cloud/on-prem hybrids, entirely on-prem, entirely cloud-based, or specialized solutions such as embedded or live-streaming.

[Slide 13: solution groups]

Now back to the solution requirement categories.  Here they are in detail.  Consider this like a survey.  The solution advisor asks the questions on the right for each of the categories:

[Slide 14]

Power BI Solution Advisor

You can access the Power BI Solution Advisor by clicking the slide image.

With a little help from my friends, we have built this tool – using Power BI of course – to assess the solution requirement criteria and recommend relevant solution architectures.

Let’s take a quick look at the tool and then we will explore it in detail a little later.  The recommended architectures are detailed in the slides that follow.  (3-18 update: Video Tour of the Power BI Solution Advisor)

[Slide 15]

1. Cached Data Model, Deployed to Service

For secure report sharing, Power BI Pro licenses are required for all users without Premium capacity licensing.

Premium capacity licensing covers unlimited read-only users. Pro licenses are required for publishing & sharing.

[Slide 16]

2. SSAS Direct Connect, Deployed to Service

In many respects, this is the most versatile mode for using the Power BI platform with high volume data managed on premises. The latest version of Power BI Desktop may be used with new and preview features. With reports published to the service, key features like dashboards, natural language Q&A, mobile access, alerts and subscriptions are supported. Connecting to SSAS through the gateway enables you to manage full-scale semantic models in tabular and multidimensional, using partitions for incremental data refresh. Compared to DirectQuery, this option has better performance and unlimited DAX calculation features.

In simple terms, data is read from the on-prem data model in real time as users interact with reports; but the service is even smarter than that.  To optimize performance and reduce unnecessary network traffic, query results get cached and reused for short periods.

Caching policy: https://docs.microsoft.com/en-us/power-bi/service-q-and-a-direct-query#what-data-is-cached-and-how-is-privacy-protected

[Slide 17]

3. DirectQuery, Deployed to Service

The goal of DirectQuery is to enable as much capability as possible without caching data in a persistent data model.  Rather than performing calculations on in-memory tables in a VertiPaq model, report interactions are translated into native queries for the data source to process and return aggregated results.  As a result, report query performance will lag and complex calculations are limited.  DAX functions that consume high data volumes are impacted the most (e.g. SUMMARIZE, CALCULATETABLE, YTD, PARALLELPERIOD, RANKX, etc.).

There will always be performance and functionality limits with this feature but it will likely continue to see investments to improve performance as much as feasibly possible.

DirectQuery is typically chosen when: 1) a Microsoft customer has not fully embraced cached-model or SSAS modeling concepts, or 2) a relational data warehouse/mart is performance-tuned to address specific query & report scenarios within acceptable limits.

[Slide 18]

4. Cached Data Model, Deployed On-Premises

Reports are deployed to an on-premises instance of SQL Server Reporting Services called “Power BI Report Server”.

SSRS catalog database requires SQL Server 2008+

Power BI Report Server licensing requirements: SQL Server Enterprise edition with Software Assurance, or Power BI Premium capacity.
Due to slower product release cycles, PBIRS features & capabilities lag behind Power BI Desktop/service by 1-4 months (PBIRS updates are about every quarter.)

Users could have two versions of Power BI Desktop installed (an older version for PBIRS and the latest version).  Be cautious with version control.

[Slide 19]

5. SSAS Direct Connect, Deployed On-Premises

This option provides for a fully-scaled out enterprise solution with no dependencies on cloud services.

No model data size limit.

Role-based, row-level security (RLS) is supported in SSAS.

Enterprise scaled architecture (PBIRS & SSAS on separate machines) will require constrained delegation/Kerberos configuration unless static credentials are stored.

Scale-out architecture is supported on each tier by load-balancing multiple SSAS machines and/or load-balancing multiple PBIRS machines.

PBIRS doesn’t support Power BI service features like dashboards, natural language Q&A, alerts, mobile app access & R visuals.

[Slide 20]

6. DirectQuery, Deployed On-Premises

This option also provides for a fully-scaled out enterprise solution with no dependencies on cloud services.

No data source size limit.

Performance degradation and DAX calculation limits apply (same as DirectQuery in the service).

Scale-out architecture is supported by load-balancing multiple PBIRS machines.

PBIRS doesn’t support Power BI service features like dashboards, natural language Q&A, alerts, mobile app access & R visuals.

[Slide 21]

7. Azure SSAS Direct Connect, Deployed to Service

In most respects, this option is identical to using SSAS on-premises except no gateway is required to connect to Azure SSAS.

No on-premises hardware investment is required for this option since everything is hosted in the Azure cloud.

No SSAS product licensing costs. Azure SSAS costs are billed hourly, depending on capacity & service tier (developer: $0.13 per hour; production: $0.43 to $20.76 per hour).

Requires Azure Active Directory, which can be federated to an on-premises domain.

Azure SSAS is tabular only, at the same or a slightly newer build than the latest boxed product (2017/1400), and supports older compatibility modes.

Capabilities & features are the same as using SSAS on-prem.

[Slide 22]

8. Embedded Service & Embedded Solutions

Power BI Embedded now supports all features of a solution deployed to the Power BI service.

Managed through Azure services in the Azure portal.

Capacity & usage-based costs range from $1 to $32 per hour.

Service may be paused & managed through the API.

[Slide 23]

This diagram depicts the components and interactions of an embedded solution.

Detailed information:

Power BI .NET SDK (server-side code): https://github.com/Microsoft/PowerBI-CSharp

Power BI JavaScript SDK (client-side code): https://github.com/Microsoft/PowerBI-JavaScript

Power BI REST API: https://msdn.microsoft.com/library/dn877544.aspx

https://docs.microsoft.com/en-us/power-bi/developer/embedding

https://azure.microsoft.com/en-us/pricing/details/power-bi-embedded/

https://docs.microsoft.com/en-us/power-bi/developer/embed-sample-for-customers

[Slide 24]

9. Live Streaming Solutions

Streaming is a capability for developing custom solutions on top of the Power BI service.

The feature set is light and simple.

No separate licensing is required.

Streaming types & capabilities:

  • Pushed dataset: Supports standard report visuals if “Historic data analysis” is switched on; caches data in a dynamically-created Azure SQL database.
  • Streaming dataset: Does not store data… only dashboard tiles are supported. Push from the REST API or as an endpoint from a streaming service, like Azure Stream Analytics.
  • PubNub: A streaming dataset tailored to consume standard PubNub channels.

https://docs.microsoft.com/en-us/power-bi/service-real-time-streaming

[Slide 25]

Now for a deeper-dive look at the Power BI Solution Advisor…

This project is a work-in-progress that can be used to provide direction and to explore solution options.

It is not perfect or comprehensive but can help recommend solution architectures based on chosen requirements and solution criteria.

The second page uses bookmarks to navigate through the requirement category slicers and display candidate solution architectures.

Right-click a solution architecture “tile” to drill through to components and help links.

On the final page:

  • The relative complexity of the chosen solution is estimated, based on selected components.
  • Select any combination of components to see related help topics and links to articles & resources.

[Slide 26]

Again, I need to credit my friends at CSG Pro in the Portland area, for teaming up to build this tool.  It was an entry in a recent Power BI Hackathon.  CSG Pro hosts our monthly Power BI User Group meetings on the 4th Wednesday evening of the month in Beaverton, OR.

You can learn more about their consulting and development services at CSGPro.com

If you would like to download a copy of the presentation slide deck, it’s here: https://sqlserverbiblog.files.wordpress.com/2018/02/nine-realms-of-power-bi.pdf.  Feel free to use it as long as you keep all content intact, including my contact information and copyright info.  As always, your comments and questions are welcome.