Next Level Power BI – M Language Basics

This is a series of posts and exercises from a recent full-day workshop called Next Level Power BI. Check back for additional lessons and freely post your questions and feedback in the comments section after this post.

Power Query is a modern data acquisition and transformation technology and it is the “Get Data” component of Power BI and many other Microsoft tools. The query script language of Power Query is known as “M”, or the Microsoft Data Mashup language. The menu options and selections in the Power Query Editor interface build this “M” script for you, but to use Power Query at an expert level, you must understand fundamental concepts of the language. This post will step you through some essential mechanics of the language to build the necessary fundamental skills.

What you learn here can be applied to the Power Query implementation of other Microsoft products, which currently include:

  • Power BI Desktop Query Editor
  • Power BI Data Flows
  • Excel “Get & Transform” on the Data ribbon
  • SQL Server Analysis Services (SSAS) Tabular projects in SSDT/Visual Studio, or Azure Analysis Services.
  • Azure Data Factory: Power Query task

By the time you read this, there are likely to be more implementations of Power Query.

Basic Syntax

A quick review of the characters used to define or encapsulate groups of objects will help you better understand and recognize what you are seeing in the following code examples.

Parentheses, square brackets and set braces all have different meanings. For reference, here they are in the same order:

( ) – Parentheses are used to pass parameters to a function and to control mathematical order of operations.
[ ] – Square brackets encapsulate a record: a set of field names and values.
{ } – Set braces encapsulate a list of values.

Most functions in M are part of an object class (such as Date, Table or Text) and the syntax is something like this:

  • ObjectClass.Function( )
  • ObjectClass.Function( Parameter )
  • ObjectClass.Function( Parameter1, Parameter2 )

Examples:

  • Date.AddYears ( <date value>, <number of years> )
  • Table.FromList ( <list object> )
  • Text.End ( <text string>, 2 ) – returns last two characters from right of string

A few older functions that were part of the initial release of Power Query are not prefixed with an object class name. For example, the #date( ) function has three parameters: Year number, Month number and Day number.
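To illustrate, a literal date value can be constructed like this (the date itself is just an example):

#date(2019, 11, 25)  // November 25, 2019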

Any word may be used as a variable. Common examples include step names like Source. Variables are often multiple words containing no spaces, such as ServerName. You are strongly encouraged to rename steps with short but meaningful phrases to make them self-documenting and easy to understand.

Are you with me so far?

The Mighty #shared Variable

This little keyword is literally the key to the entire kingdom. Technically, #shared is an intrinsic variable rather than a function: it resolves to a record containing every Power Query object defined in your file, along with every available M object class and function in the entire language, complete with documentation and code examples. This is the best kept secret in Power Query.

Add a new blank query and type the following into the formula bar, and watch while magic things happen:

= #shared

The result is returned as a record of objects and functions. Convert that record to a table and you can view the definition of several hundred objects and functions. The M language contains nearly 900 functions and more are frequently added in new product versions.
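If you prefer to do the conversion in code rather than with the To Table button, Record.ToTable does the same thing from a blank query (a small sketch of the same idea):

let
    Source = #shared,
    SharedAsTable = Record.ToTable(Source)  // one row per object or function, with Name and Value columns
in
    SharedAsTable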

Click the cell next to the link for any function to peek at the definition or click the link to explore related details.

Getting Started

The following examples are hand-written queries entered into the Advanced Editor in Power Query. None of these queries have an actual data source. Values are generated within the code itself but all of these examples can be applied to real queries based on any data source.

Each of the following queries adds a little more to the previous query to explore progressively more advanced objects and code techniques in the M language.

If you create a new blank query in Power Query, it will generate code like this:

let
Source = ""
in
Source

This is the fundamental pattern to get you started.

Values

Let’s modify the script to give the Source variable a text value of “This is a value” (creative, I know).

Use the Advanced Editor to view the generated M script.

Here’s just the script outside of the Advanced Editor:

//——————————————————-
//#1: Object Types: Value
//——————————————————-
let
Source = "This is a value"
in
Source

Use the Done button to close the Advanced Editor to see the result.

Unless they are explicitly data-typed, variable object and data types are automatically assigned, which you will see in the following examples.
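If you ever want to confirm the type that was assigned, the Value.Type function can be used from a blank query (a quick sketch, separate from the generated scripts below):

let
    Source = "This is a value",
    SourceType = Value.Type(Source)  // returns type text for this implicitly typed value
in
    SourceType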

The Source variable represents a value type object that is implicitly typed as text. You can see the resulting value in the query designer:

Records

In M, records are encapsulated in square brackets. This script generates a single record with two fields.

//——————————————————-
//#2: Object Types: Record
//——————————————————-
let
Source = [Record1 = "Value 1", Record2 = "Value 2"]
in
Source

The result is one record with two fields:

Lists

The List object in M is denoted using Set braces, sometimes called “squiggly brackets”. The following script creates a list of four text-type values:

//——————————————————-
//#3: Object Types: List of Values
//——————————————————-
let
Source =
{ "Value 1", "Value 2", "Value 3", "Value 4" }
in
Source

The list is shown below. Note the column header in the preview grid reminds us that this is a List object. Also note that the designer shows the List Tools ribbon with options to convert the List to a Table, because that’s what we normally do in Power Query to be able to consume List data.

Now, we can start to combine these concepts and techniques. The following query script creates a list of two records, each containing two fields with corresponding values:

//——————————————————-
//#4: Object Types: List of Records
//——————————————————-
let
Source =
{
[Column1 = "Value 1", Column2 = "Value 2"],
[Column1 = "Value 3", Column2 = "Value 4"]
}
in
Source

The designer shows the two records as links because record objects cannot be displayed directly in the preview grid. To use them, you can either convert the list to a table or click a link to expand a single record and view its values.

To peek inside a record without adding a new query step, you can click in a cell but not on the link and the value will be displayed in the preview area at the bottom of the screen.

The first time you see a link presented in this way, you are likely to click on it to see what happens. No damage done, but this will generate a step that produces a single record and you will lose sight of the entire list. To go back, just delete the generated step in the Applied Steps list.

Tables

If you use the Convert to Table button on the toolbar, that will add the steps you see in this query.

The Table.FromList function is a good example of the many conversion functions in the language. As you look through the functions within the different object classes, you will notice several “From” and “To” functions that convert an object from one type to another. These are easy to use, and their purpose is easy to identify and understand. This one should be obvious: as the FromList function in the Table class, it converts a List type object to a Table. Another example, ExpandRecordColumn in the Table class, expands a column of record values into separate columns.

//——————————————————-
//#5: Object Types: Table
//——————————————————-
let
Source =
Table.FromList(
{
[Column1 = "Value 1", Column2 = "Value 2"],
[Column1 = "Value 3", Column2 = "Value 4"]
},
Splitter.SplitByNothing(), null, null, ExtraValues.Error
),
#"Expanded Column1" = Table.ExpandRecordColumn(Source, "Column1", {"Column1", "Column2"}, {"Column1", "Column2"})
in
#"Expanded Column1"

These steps produce the results you see here. We began with a list of two records, each containing two fields. After the conversion steps, we have a Table object with two rows and two columns, complete with headers.

Transforming Each Row

When you use the Transform ribbon or menu options to perform almost any transformation on a column, you are in essence saying “I want to perform this action on the specified column in each row of the table.” The key word in this statement is “each”, and it is an important distinction in the following M code, which was generated when extracting the last character of the column named Column2.

An example of this generated query is shown in Query #6 below.

//——————————————————-
//#6: Object Types: Transform each row
//——————————————————-
let
Source =
Table.FromList(
{
[Column1 = "Value 1", Column2 = "Value 2"],
[Column1 = "Value 3", Column2 = "Value 4"]
},
Splitter.SplitByNothing(), null, null, ExtraValues.Error
),
#"Expanded Column1" = Table.ExpandRecordColumn(Source, "Column1", {"Column1", "Column2"}, {"Column1", "Column2"}),
#"Extracted Last Characters" = Table.AddColumn(#"Expanded Column1", "Right Char", each Text.End([Column2], 1), type text)
in
#"Extracted Last Characters"

Let’s isolate just that step and examine the M code in the last line of the query:

#"Extracted Last Characters" = Table.AddColumn(#"Expanded Column1", "Right Char", each Text.End([Column2], 1), type text)

The #"Extracted Last Characters" variable is assigned the outcome of the Table.AddColumn function, so we know the result is a Table type object. The AddColumn function takes three required parameters (plus an optional column type):

  • A table type object (usually the name of the previous step variable)
  • Name of the new column as text
  • An expression that returns a value for the new column (an optional fourth argument sets the new column’s data type)

The third argument in this case is the output of the Text.End function. You might guess that this is going to be a text type value. The End function, similar to the RIGHT function in Visual Basic, returns a specified number of characters from the end of a text string. But, since Column2 exists in every record in the table returned by the #"Expanded Column1" variable, how do we specify that we want to do this for each record, one at a time? …you got it. We use “each”.

Note the Right Char column in the following results. It is the right-most character from Column2:

Transforming Columns In-place

Rather than creating a new column containing a value extracted from a previously created column, you can also modify an existing column. This is done using the TransformColumns function of the Table class. Again, we’re looking at the last step of the following query that was generated by using the Extract menu on the Transform ribbon or menu:

//——————————————————-
//#7: Object Types: Transform each row col in place
//——————————————————-
let
Source =
Table.FromList(
{
[Column1 = "Value 1", Column2 = "Value 2"],
[Column1 = "Value 3", Column2 = "Value 4"]
},
Splitter.SplitByNothing(), null, null, ExtraValues.Error
),
#"Expanded Column1" = Table.ExpandRecordColumn(Source, "Column1", {"Column1", "Column2"}, {"Column1", "Column2"}),
#"Extracted Last Characters" = Table.TransformColumns(#"Expanded Column1", {{"Column2", each Text.End(_, 1), type text}})
in
#"Extracted Last Characters"

Here is the last step isolated:

#"Extracted Last Characters" = Table.TransformColumns(#"Expanded Column1", {{"Column2", each Text.End(_, 1), type text}})

Instead of the column name being passed to the Text.End function, we see an underscore. This is shorthand for the value of the column currently being transformed.
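In fact, each and the underscore are just shorthand for a single-parameter function; these two expressions are equivalent (shown only for illustration):

each Text.End(_, 1)
(value) => Text.End(value, 1)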

Creating a Table from Records

Now back to completely hand-written code… This one’s easy.
By default, new columns are data-typed as “Any”, which is essentially an unclassified data type, and that is not a best practice. Columns should be strongly typed so that grouping, aggregation and DAX functions work correctly in the data model.

In the following query, we apply the TransformColumnTypes function from the Table class. The function requires a table and at least one column name and data type reference. You can pass any number of column/type pairs to this function, each pair encapsulated in set braces, with another set of braces around the whole list.
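A hand-written query along these lines does the job (a sketch, using product records like the ones in the next examples):

let
    Source = Table.FromRecords(
        {
            [#"Product ID" = 123, #"Product Name" = "Oranges"],
            [#"Product ID" = 124, #"Product Name" = "Lemons"],
            [#"Product ID" = 125, #"Product Name" = "Apples"]
        }
    ),
    Typed = Table.TransformColumnTypes(Source, {{"Product ID", Int64.Type}, {"Product Name", type text}})
in
    Typed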

Here, you can see the results: one table with headers, two columns and three records. This all looks good.

Creating a Table from a Set of Records

Now, we’re going to start putting things together by using object type variables. I think you’ll agree that the code is much easier to read.
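The query behind this step likely resembles the following sketch, with each record assigned to its own variable before they are combined:

let
    Rec1 = [#"Product ID" = 123, #"Product Name" = "Oranges"],
    Rec2 = [#"Product ID" = 124, #"Product Name" = "Lemons"],
    Rec3 = [#"Product ID" = 125, #"Product Name" = "Apples"],
    MySet = { Rec1, Rec2, Rec3 },
    MyTable = Table.FromRecords( MySet )
in
    MyTable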

And we see similar results as before but with a much cleaner approach and readable code.

Extracting a Record from a Table

As a final assignment, I need to get only one record, so I essentially need to unwind the objects that were used to assemble this table. I want lemons, so I need to figure out the ordinal record number within the table’s record collection.

Since there are three records, an ordinal (index) number is automatically assigned to each. If you don’t already know whether the first record starts at zero or one, sometimes you just have to guess. It turns out that all of these collections are zero-based, so the second record (Lemons) in the table is at index 1.

Here’s the syntax:

let
Rec1 = [#"Product ID" = 123, #"Product Name" = "Oranges"],
Rec2 = [#"Product ID" = 124, #"Product Name" = "Lemons"],
Rec3 = [#"Product ID" = 125, #"Product Name" = "Apples"],

MySet = { Rec1, Rec2, Rec3 },

MyTable = Table.FromRecords( MySet ),
MyLemons = MyTable{1}

in
MyLemons

So what’s all the fuss… what does this actually do?

I’m glad you asked. Let’s take a look.

Isn’t she a beauty?

It’s the best looking lemon I’ve seen all day.

Now you’ve seen the mechanics of how M language functions can be used to perform very simple value manipulations and transformations. More importantly, you have learned about the relationships between important objects like Values, Records, Lists of Values and Lists of Records, oh – and let’s not forget – Tables. That’s the one that actually does the work in a data model.

I hope this has helped you to broaden your horizons with Power Query and the M language. Please comment with your feedback.

Chat with Donald Farmer at PASS Summit 2019

Enabling the modern data analyst by becoming shopkeepers rather than gatekeepers of information

I appreciated the opportunity to sit down and talk with one of my heroes in the industry. Donald Farmer has been a recognized leader in the Business Intelligence and Analytics community for the past two decades and his thought leadership is even more relevant today. He played a key role in shaping the Microsoft BI toolset as a Principal Program Manager from 2001, when the products we use today were in their infancy. He has also worked outside of Microsoft circles, which gives him a broad and unique industry perspective. As the Vice President of Innovation and Design at Qlik Technologies, he brought Qlik Sense to market. He remains a friend and advocate for the community. Donald holds multiple patents and has published multiple books. Today he is doing his own thing as the Principal of TreeHive Strategy.

The chat was 9:41 long and is worth watching every second. I have added captions to highlight a few key points which are also listed below. Please enjoy.

Key Points

Route Knowledge is a view of information at a micro level, like turn-by-turn directions. This might be important when you only need to answer a single question without a broad understanding of the entire landscape.
By contrast, Survey Knowledge is a view of information in the context of the bigger picture. Someone with Survey Knowledge knows “how the whole thing hangs together“.

Serving the needs of the modern Business Data Analyst requires a change in attitude. We should become provisioners of information rather than protectors of information. In other words, become shopkeepers rather than gatekeepers.

Actionable data is not always detail data, and consuming a summation can be not only easier but more useful than getting lost in the details and the accuracy or precision of a number. Drilling to detail in context is a natural part of the exploration process. He talked about a pattern in the process to provide actionable business information, which includes these stages:

  1. Orientation
  2. Glimpsing
  3. Examination
  4. Acquisition
  5. Action

I appreciate Donald taking the time to both speak at PASS Summit and to chat with me for this blog series. Please watch my blog for additional interviews and insights from PASS Summit 2019.

Next Level Power BI – SVG Databars

Power BI table and matrix visuals can render SVG object elements in a column or measure to display in-line web content. You can build dynamic content like KPI indicators, databars and sparklines by altering SVG element content using DAX expressions. In this post, I will show you how to create databars that show negative and positive values with different color fills based on category values. Of course, Power BI has built-in databars in the conditional formatting options for numeric columns but you cannot alter the color and other attributes dynamically – at least not yet. For the time being, this technique provides more flexibility than the built-in databar feature.

Read This First

In this post from earlier this week, I introduced a simple technique to produce symbols and graphics using columns or a measure to render SVG images in a table visual. I recommend that you read that post to understand the simple implementation of this technique.

Dynamic Databars

Instead of a graphic symbol, the databar really just consists of a rectangle that has a variable start and end position. I want to change the fill color layer based on whether the value is negative or positive, and based on a category value in my data. You could apply any imaginable business logic to change the color or other attributes. I also want to layer formatted text over the top of the databar rectangle. Check out the example below showing a series of records. My simplified demo dataset causes the same value to repeat across the four Flintstones characters but this Databar measure could be applied to any set of data. Notice that negative values are red and the bar “starts” at the center and is proportionally longer as it extends to the left of center for larger negative values. The same is true for positive values that start in the center of the cell and move to the right.

Before we examine the code to generate the dynamic databar, let’s look at the rendered SVG element for a couple of these cells. If I hover over the cell for ID 1 and “Barney”, the SVG definition is displayed in the tooltip.

One fun thing about writing this blog post and showing you fragments of HTML script is that it is a little challenging to copy and paste the script into this post without it being interpreted and rendered as a graphic. This is one reason I’ve elected to use screen capture images.

Examine the above script and you will see the “<rect” element opening tag followed by attributes that specify the x axis, y axis and height values. These attributes draw the rectangle starting at 79 pixels from the left-most edge of the viewBox (left side of the cell), 80 pixels tall and 181 pixels wide. Keep in mind that the graphic does get rescaled with the table so these are only relative measurements. The fill property of the style attribute fills the rectangle with the color Red.

There is also a “<text” element with similar attributes that place centered text (anchored to the middle of the viewBox) at the 300-pixel mid-point of the 600-pixel viewBox width. For simplicity, there are a few attributes I’m not mentioning but it’s fairly simple to recognize them and figure out what’s going on.
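Putting those pieces together, the markup behind the ID 1 “Barney” cell looks roughly like this (a reconstruction for illustration only; the vertical positions and the exact attribute order are assumptions):

<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 600 200'>
  <rect x='79' y='60' width='181' height='80' style='fill:Red;fill-opacity:0.5' />
  <text x='300' y='130' text-anchor='middle' style='font:normal 70px sans-serif'>(formatted value)</text>
</svg>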

Hovering over the cell for ID 9 and “Barney” with the value 1495, the x and y axis values cause a DodgerBlue filled rectangle to start at 260 pixels and to be 236 pixels wide.

Show Me the DAX

The DAX code for the DataBar measure is below. The actual expression code is fairly simple but it took a little trial and error to work out the measurements and scaling logic. Rather than building the expression in-line, I’ve invested work up-front to put all the values into variables. First, we get the minimum value for all the selected ID rows (using the MINX iterator function with the ALLSELECTED filtering function) and store that in the MinValue variable. The MaxValue variable is similar with the added condition to handle negative or positive values. This may all seem a bit verbose but it’s really just math.

There is a disconnected table in my sample data model named Flintstone Characters with character names that I place onto the Columns axis of the matrix visual. In a real-world solution, this could be a field like product categories, geographic regions or anything else. The BarFill variable is set to a color value (which could be either a web color name or Hexadecimal value) based on the selected character name. When grouping a visual on a value, the SELECTEDVALUE function will return a single value. After resolving all the variable values, this measure emits the SVG markup similar to the examples shown previously.

DataBar =
VAR MinValue = MINX( ALLSELECTED( 'Values'[ID] ), CALCULATE( SUM( 'Values'[Value] ) ) )
VAR MaxValue = MAXX( ALLSELECTED( 'Values'[ID] ), CALCULATE( SUM( 'Values'[Value] ) ) ) + IF( MinValue < 0, ABS( MinValue ), 0 )
VAR BarValue = SUM( 'Values'[Value] )
VAR FormattedValue = FORMAT( BarValue, "#,##0" ) // alternative formats: "#,##0.00", "#,##0.0 %"
VAR svgWidth = 600
VAR svgHeight = 200
VAR svgFontStyle = "normal 70px sans-serif"
VAR BarHeight = 80
VAR BarRange = ABS( MinValue ) + MaxValue
VAR BarValueFactor = DIVIDE( svgWidth, BarRange )
VAR BarWidth = INT( ABS( BarValue ) * BarValueFactor )
VAR BarZero = INT( DIVIDE( ABS( MinValue ), MaxValue ) * svgWidth )
VAR BarStart = IF( BarValue < 0, BarZero - BarWidth, BarZero )
VAR BarOpacity = "0.5"
VAR BarFill = IF( BarValue < 0, "Red",
SWITCH( SELECTEDVALUE( 'Flintstone Characters'[Character] ),
"Fred", "DarkOrange",
"Wilma", "Purple",
"Barney", "DodgerBlue",
"Betty", "DarkGreen"
)
)
VAR svg_start = "data:image/svg+xml;utf8,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 " & svgWidth & " " & svgHeight & "'>"
VAR svg_end = "</svg>"
// The rect and text markup strings below are an approximate reconstruction (see the note above
// about pasting SVG fragments into this post); adjust the y positions to taste.
VAR svg_rect = "<rect x='" & BarStart & "' y='60' width='" & BarWidth & "' height='" & BarHeight & "' style='fill:" & BarFill & ";fill-opacity:" & BarOpacity & "' />"
VAR svg_text = "<text x='300' y='130' text-anchor='middle' style='font:" & svgFontStyle & "'>" & FormattedValue & "</text>"
// x = 300 (half of svgWidth) when the text is centered
RETURN
svg_start & svg_rect & svg_text & svg_end

download this code

Like in the Symbols example in the earlier post, you must select the measure and set the Data Category in the Modeling ribbon to Image URL. After that, the measure can be dropped into a matrix visual grouped on the ID and Character fields, to produce the databar visuals I showed in the first example above.

It Takes a Village

A few experts in the community have created SVG sample projects with various techniques. David Eldersveld has blogged about generating SVG images in this series. When I started working out the code for the DataBar technique, I reached out to some friends in the MVP community and Darren Gosbell was kind enough to help out. In the process, Darren discovered a bug (or an undesired behavior) in Power BI that ignores the aspect ratio directive in the SVG element and draws a square image even when you don’t want it to. He posted a suggestion on the Power BI Ideas forum and I encourage you to go vote to help get this fix prioritized by the Power BI product team. Until this is corrected, the technique is useful but will take up more vertical space than you might prefer.

Creating in-line Sparklines

On a related topic, Brian Grant has developed a method to produce SVG trend line charts, commonly known as “sparklines”, using a creative string replacement technique. You can find a tutorial for his solution here.

download sample PBIX file

Next Level Power BI – SVG Symbols

With so many different methods to visualize data – just within Power BI – this can be a daunting choice. Let’s see… there are a plethora of standard visuals that can be used in creative ways. A software developer can create a custom visual if they have the time and chops for that kind of thing. You can use a custom visual generator tool like Charticulator or the Chart designer from PowerBI.Tips. And, you can render unique visuals for statistical analysis using R or Python. The technique I’m going to demonstrate in this series doesn’t require a lot of work or special skills so it is relatively simple and easy to use.

In this first post in a series called “Next Level Power BI”, I demonstrate how to create a simple graphical symbol as a Scalable Vector Graphic – just to get started. In a later post, we will expand the technique to create dynamic databars. The same technique can be used to create icons, sparklines and practically any imaginable graphic with a calculated column or measure.

I’ll start with a simple example. Let’s say that you need to display a set of symbols to correspond to a data condition of some kind. For simplicity, let’s just start with the graphics. Here are examples of three symbols displayed in a table visual on a Power BI report page:

Finding the SVG markup for these symbols can be as simple as a web search. You can also create your own using vector graphic editing tools like Adobe Illustrator. The markup for these three symbols is shown here in Power Query. In different Power BI solutions, I have stored and exported the SVG markup in Excel. It’s just text so you can store it any way you prefer. All three symbols are defined as a series of spatial points in the path element, but you can draw shapes, text and a variety of things. There are several good SVG references on the web but I found this collection from Jakob Jenkov to be particularly useful: http://tutorials.jenkov.com/svg/index.html

In this solution, I just used the Enter Data feature and then copied and pasted the SVG markup directly into the query – one row per image.

We’re not quite done because Power BI requires the following header element to be added to the beginning of each graphic:

data:image/svg+xml;utf8,<?xml version="1.0" encoding="iso-8859-1">

You can approach this a few different ways but I added a custom column named Definition and then removed the original column using this expression:
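(A reconstructed sketch – the source column name, SVG Markup, is an assumption, and the inner quotes are doubled because this is an M text literal.)

= "data:image/svg+xml;utf8,<?xml version=""1.0"" encoding=""iso-8859-1"">" & [SVG Markup]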

After closing and applying the query, the Definition column must be categorized as an Image URL before the SVG image can be rendered in a report. In the report designer, select the column in the Fields list and then choose the Data Category on the Modeling ribbon.

The column is now usable as an image in a table or matrix visual, which should appear like the first screen image above.

To display only a single image, you can create a measure like this one that returns only one value from the column:

Spreadsheet Symbol =
CALCULATE( MIN( 'Images'[Definition] ),
'Images'[Image Name] = "Spreadsheet"
)

The measure must also be categorized as an Image URL and then it can be used as an individual symbol or image in a report within a table visual.

That covers the bare bones basics of SVG image columns and measures. There is so much that you can do, expanding on this technique by injecting dynamic logic into the measure code. I’ll cover that in a subsequent post.


So, I wrote a bedtime book now?!

Is the 2010, first edition of my SQL Server Reporting Services Recipe Book so snooze-worthy that it is now being used as a bedtime story? Huh uh!

Au contraire! This book contains so much of what real life has to offer. So much insight and deep, deep learning opportunities that instruct, educate, train and prepare an industry professional to take their report designs to new heights, to inspire and to elevate the reader by using report design techniques never attempted before. All on a rock-solid foundation of tried-and-true report design best practice techniques that will totally rock your world and change the way you think about reporting! These are the sort of insights contained within the pages of this exciting book about SSRS report recipes.

So, what is this business about Paginated Report Bear wanting to read my world-changing book as a bedtime story? Really… well, that’s not going to work!

Oh, Chris (Finlan), my friend… why oh why do you torment me like this?

Matthew, we need to talk about your Dad and his wayward little stuffed friend!

Power Platform World Tour Vancouver Presentations

These are copies of my presentation decks from the sessions today at the Power Platform World Tour event in Vancouver, BC. I appreciate all who attended.

The Charticulator session was originally presented in July at the Power BI Symposium in Portland and here is the recording of the session:

Please feel free to contact me with questions or feedback using comments in this post.

Power BI Project Good and Best Practices

Paul Turley
Power & Data Platform MVP, Principal Consultant, Pragmatic Works

Preface: This post is a set of guidelines that address many aspects of new projects. Although I feel strongly that many of my recommendations are correct and applicable in most use cases, other experienced practitioners have valuable ideas and I welcome their input. Please post your comments, links to other resources and ideas that may be different from mine. This will be a living document with ongoing revisions, with the goal of providing a comprehensive best practices guide as the platform continues to mature and as experts continue to use it.

Revision: 8/25/2019 – cleaned-up & added checklist
Revision: 9/7/2019 – added Security section & corrected Last Refresh Date/Time

I find that there are so many things to remember when starting a project that a checklist is handy. I’ve been collecting the following as notes for some time. Bear with me as I work on consolidating this article into a concise checklist.

The purpose of this article is to outline a set of guidelines and recommended practices for managing Microsoft Power BI projects. This guide is primarily focused on the work performed by the IT-managed BI Solution Developers for formally-managed BI projects.

Why “Good” and “Best”?

Power BI is a tool and platform that serves two masters: IT developers and business data users. As such, some practices are absolute best practices and should be followed with rare exception. Other recommendations are conditional guidelines, with possible exceptions. The flexibility of Power BI makes it possible for industry practitioners to develop their own solutions and apply design patterns in the absence of a specific product feature (like source control, for example).

This guide is largely based on my experience and discussions with other members of the MVP community, clients and fellow consultants. Most of the information in this guide is based on the consensus of the parties with the greatest experience with the Power BI platform. These parties include:

  • Members of the Microsoft MVP Program (Data platform and Business Applications focus) who participate in daily interactions with the Power BI product development and leadership teams at Microsoft.
  • Microsoft certified solution partners who offer consulting, training and value-added services to Microsoft product customers.

Power BI has come a long way in a short time since the Power BI service and Power BI Desktop were introduced in 2015. At that time, the emphasis was to enable self-service business users to analyze and visualize data in small-scale ad hoc reports. As Power BI grew into the enterprise space, it has been adopted by IT-managed data governance and solution development teams. Power BI is displacing – and being used in concert with – conventional IT-centric Business Intelligence product technologies that have been evolving for decades (such as SQL Server Analysis Services, Reporting Services and Integration Services). With this history and the rapid adoption of the platform, standard design practices will continue to evolve.

Two discrete audiences and roles should be considered for Power BI:

  • Business Data Analyst
  • IT-managed BI Solution Developers

Consider these roles, with respect to both the solution developer(s) and users, as you begin to plan a new project.

Business Data Analyst

Members of this role are part of a rapidly growing segment of the industry. They understand how to use software tools and understand data manipulation and reporting within the confines of their area of business focus. Traditionally, data analysts were data consumers who used financial spreadsheets; now they use advanced data preparation tools to perform reporting and deeper data analytics. Business Data Analysts generally fall into two subcategories:

  • Advanced Data Consumers
    These users explore data using spreadsheets, pivot tables and chart-style reporting tools that are served by existing multidimensional cubes and data models. They do not write code; they work with IT/BI Developers, Data Modelers or Advanced Data Analysts to collect and model the data they use to create their reports.
  • Data Modelers, Advanced Data Analysts and Data Scientists
    These advanced users perform their own data mashups, produce data models and write limited code to create solutions for their department or business area (using Power Query, DAX, R, Python, etc.) They do not architect and develop governed and certified datasets for enterprise-level business use.

IT-managed BI Solution Developers

Enterprise-scale BI solutions are typically created by a team of specialists who may include:

  • Business Systems Analysts
  • BI Solution Architects
  • Project Managers
  • Data Transformation (ETL) Developers
  • Enterprise Data Modelers
  • Measure code experts (using DAX)
  • Report Developers
  • Operations & Deployment Engineers

Power BI may be used alone or with other products to create BI data transformation queries, data models, reports and dashboards. IT-managed BI scenarios include:

  • Complete BI solution design, development and deployment, consumed by end users.
  • Data models created and deployed to support user ad hoc report design.
  • Support Advanced Data Analyst data transformation, data modeling & report design
  • Migrate advanced user created data models to governed/certified datasets, transferring ownership to IT & operations

Project Types – simplified

For simplicity, this guide refers to Power BI projects as either “Formal” or “Informal” using the following definitions:

  • Formal projects are scoped, funded, staffed and executed with the collaboration of a business champion and stakeholders; and IT Business Intelligence developers and data managers. These projects promote business and IT-governed datasets and certified reports.
  • Informal projects are executed by business users and are considered ad hoc in nature. Datasets are generally not IT governed, and reports are typically not certified.
  • Hybrid projects can be anything in-between. They might be a user-authored report that uses a published, certified dataset for self-service reporting. Informal, self-service datasets can be migrated to governed datasets in collaborative IT/business projects.

Again, this guide is primarily focused on the work performed by the IT-managed BI Solution Developers for formally-managed BI projects. Topics included in this guide:

  • Solution Architecture
  • Managing Power BI Desktop Files
  • Datasets and Reports
  • Version Control & Lifecycle Management
  • Workspace and App Management
  • Data Model and Power Query Design Guidelines
  • Dimensional Design
  • Query Optimization
  • Managing Dataset Size with Parameters
  • Implicit and Explicit Measures
  • If Users Need Excel, Give them Excel
  • Certified & Shared Datasets
  • Enterprise Scale Options

Solution Architecture

All Business Intelligence projects involve the same essential components including:

  • Source Queries
  • Data transformation steps
  • Semantic data model
  • Calculations (typically measures)
  • Data visualizations that reside on:
    • reports
    • report pages
    • dashboards

These components can be built using different products and development tools. Lengthy discussions can be had to debate the nuances and merits of different approaches, but a short summary of recommendations follows. To keep things simple, first consider the following options:

  1. Single PBIX File

    Build all queries, transformations, the data model, measures, reports and pages
    with Power BI Desktop in a single PBIX file and deploy to the Power BI cloud
    service; to be shared with users.

Recommendation/Cost/Benefit:
This option is most suitable for Business Data Analysts, erring on the side of simple design, typically optimal for small, informal self-service projects where a desktop business user needs to perform their own data mashups and quick, ad hoc analysis. Risks include unknown data integrity and the use of uncertified dataset results. Informal projects can effectively serve as prototypes for formally developed and managed dataset development. Single file projects can be migrated to dataset/report two file projects with some rework required for the reports, depending on the report complexity.

Separate Dataset & Report PBIX

In formal projects, use Power BI Desktop to author a separate PBIX “dataset” file containing the following: queries, transformations, the data model and measures. These are in turn deployed to the Power BI cloud service. Develop one or more separate PBIX “report” files that connect to the published dataset in the service. All report pages and visuals reside in the “report” file. Some additional measures can be created in the report file.

Recommendation/Cost/Benefit:
This pattern is recommended for most projects where data governance is important and where new report features are likely to be requested in the future. By separating the dataset from report development, the data model design can be locked down and managed by a data steward. This approach gives report designers freedom to add visuals and make report enhancements without risking the stability and integrity of the data model that may be serving data to other reports.
Development and management of a two file project is a little more cumbersome because the dataset must be redeployed before changes are available in reports.
Self-service report users who need to explore and visualize data, can use a governed and certified data model. Changes and enhancement should be coordinated between the report designer and the data steward or IT resource managing the data model.

SSAS with Power BI

Taking the “traditional route” in formal projects – building the data model in SSAS, hosted either on-prem or in Azure Analysis Services, and then creating reports and visuals with Power BI – was a necessity until enterprise-level features were added to the Power BI platform in 2017-2018. Although there are still some good reasons to use SSAS rather than Power BI to build the data model (called a “dataset” in Power BI), Power BI can be the more cost-effective choice today. Power BI and SSAS both offer unique capabilities, which doesn’t clearly make one option always better than the other.

You can develop the source queries, transformations, data model and measures in an Analysis Services (SSAS) project using Visual Studio (also called SQL Server Data Tools or SSDT) and in turn deploy the SSAS database/semantic model to an on-premises SSAS instance or Azure Analysis Services (AAS) cloud service. Connect new Power BI reports to the deployed SSAS model using a direct connection. SSAS/AAS can handle data models with more than the 1 GB, 3 GB & 12 GB compressed data size limits of Power BI. The SSAS development environment can be more difficult to navigate and the model designer can be unstable. When the VS/SSDT designer crashes, developers can lose work even if they are routinely saving their work.

Recommendation/Cost/Benefit:
Favor using Power BI. It typically takes considerably longer to develop data models in Visual Studio/SSDT than in Power BI Desktop, and this project architecture is no longer the primary recommendation for BI solution development. Microsoft is directing most of its development and support resources to Power BI rather than SSAS. Although SSAS still has some advanced capabilities not yet present in Power BI, Power BI can handle large data volumes in certain cases. Microsoft and Microsoft partners recommend Power BI as the tool for developing most new BI solutions.

  • DirectQuery vs Import/In-memory

    Power BI Desktop can use DirectQuery mode, rather than import mode, to reference tables in an underlying data mart or data warehouse relational database. No data is stored in the Power BI or SSAS data model; live SQL queries are generated when navigating report visuals.

    Recommendation/Cost/Benefit:
    Avoid this option unless it is needed to satisfy unique requirements. Power BI and SSAS are both optimized to use import mode, in-memory tabular data models for all but rare and exceptional scenarios. DirectQuery significantly limits the DAX calculations supported in the data model, and query speed is typically many times slower than import, in-memory mode. The advantage of DirectQuery mode is that queries are executed directly against source tables in real-time. Consider building composite models with both import and DirectQuery mode to get good performance and drill-down to live transactional details. Composite models are typically complex and more time-consuming to develop, test and deploy.

Managing Power BI Desktop Files

Power BI Desktop files (PBIX and PBIT) should be stored in a centrally managed, network-accessible folder, where they can be accessed by all members of a BI report development team. The storage folder should support automatic backup and recovery in the case of storage loss. Options depend on currently supported services and organization preferences within the following guidelines:

File Size and Folder Synchronization

  • Report and dataset developers must open files from the Windows file system and not from a web browser interface. Files must either reside in or be synchronized with the Windows file system.
  • Files containing imported data typically range in size from 100 to 600 MB. Any shared folder synchronization or disaster recovery system should be designed to effectively handle multiple files of this size.

Options:

  • OneDrive For Business (shared by team, with folder synchronization).
  • SharePoint or SharePoint Online (with folder synchronization).
  • GitHub and/or VSTS with local repository & folder synchronization. If used, Git must be configured for large file storage (LFS) if PBIX files are to be stored in the repository.

File Naming

Give files intuitive and friendly names that will not be confused with similar report projects. In default view, only the first 20 or so characters are visible for each report. Use self-evident names that can identify datasets and reports from the beginning of long names. Workspace and report names have limited space in the Power BI portal navigation bar. Use short and concise names when possible, but try to use names that differentiate like-named items in the first 20 characters. The full name is visible in the hover-over tooltip or by resizing the panel.


Datasets and Reports

Formal report projects with certified datasets should be developed with reports and datasets divided into separate files. By default, all Power BI objects, including reports and datasets, are stored in a single file. When the PBIX file is published from Power BI Desktop to a workspace; the PBIX file is separated into separate dataset and report objects. This default behavior is convenient for informal, small-scale, self-service analysis; but not optimal for managing formal solutions with governed datasets, managed by teams of developers or report authors.

Separating the dataset and report into different PBIX files allows them to be managed and versioned separately. This approach also promotes a division of labor and ownership, particularly for certified datasets that must undergo scrupulous validation.

Like the way SSAS projects are used to develop multidimensional cubes and tabular models, deploying a separate Power BI dataset provides an uncomplicated black box solution for ad hoc report design and data exploration. Power BI can now be used to develop large data models containing tens of gigabytes of compressed data. Using these development tools takes far less time, possibly without additional licensing fees.

Version Control & Lifecycle Management

At the present time, Power BI has no integrated version control mechanism. In addition, Power BI is incompatible with most version control systems that perform branching and differencing operations on code and script files that are common in application development projects. Power BI (PBIX and PBIT) report and dataset files are stored as binary archive files with internal checksum metadata. Change and differencing operations performed by most code and version management systems will corrupt these files and render them inoperable.

Reality: Integrating changes made to multiple PBIX files is difficult to manage and challenging to perform technically. Even with differencing tools and effective version-control, just avoid multi-developer work on Power BI dataset files.

The Power BI ALM Toolkit (based on the older BISM Normalizer tool for SSAS Tabular) will make it easier to compare differences. Until (or if) specialized tooling becomes available, you should avoid relying on this method to manage project work.

Power BI Desktop PBIX files that contain imported data can be large; typically, in the range of 100-600 MB. However, they can be checked into a version control repository for the purpose of backing up and restoring files.

PBIX files can be reduced to small files by saving them as a Power BI Template (PBIT) file. This removes all data and may be an effective way to make occasional backups for version archive purposes and disaster recovery. The template can then be used to generate a new PBIX file, much like using a template to generate a new Word, Excel or PowerPoint document.
Saving a PBIX to a template for backup purposes takes minimal effort, but rehydrating a PBIX from a template is more labor-intensive and time-consuming. The reason for this is that a new file must be created and then queries must be reprocessed to populate tables with new data.

Recommendations:

File versions

Store the dataset and report PBIX files separately, using a version number postfix for the file name in the following format: Major.Minor.Revision. For example:

Manufacturing Cost Analysis Dataset v1.5.3.PBIX

Keep only the previous file versions that are useful for historical reference and recovery. Report files are small and don’t take up much storage; but, dataset files can be large. If you need to archive older dataset files, resave them as Power BI Template (PBIT) files to reduce the file size. However, a developer must continue to work with the current PBIX file. Recovering a dataset from a template is time-consuming so developers cannot actively develop using only template files.

Remove the version number from file names prior to production deployment.

Dataset Version History

By keeping version history information in an internally generated table in the data model, durable revision history notes and version information are self-contained and not dependent on external data sources.

Add a Version History table to the data model by using the “Enter Data” feature in Power Query to create a table containing these columns:

  • Version Number
  • Revision Date/Time
  • Developer Name
  • Revision Notes

Add a custom column named Refresh Date/Time that will be used to capture the date & time every time the queries and model are refreshed. This custom column could be added to any table, but it is best to use a table with few rows because the value will be the same in each row. The column expression is simply:

DateTime.LocalNow()
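Putting the two steps together, the resulting query can look something like this sketch (the literal row is simply whatever you enter in the Enter Data dialog):

let
    Source = #table(
        type table [#"Version Number" = text, #"Revision Date/Time" = datetime, #"Developer Name" = text, #"Revision Notes" = text],
        {
            { "1.0.0", #datetime(2019, 9, 7, 0, 0, 0), "Paul", "Initial release" }
        }
    ),
    #"Added Refresh Date/Time" = Table.AddColumn(Source, "Refresh Date/Time", each DateTime.LocalNow(), type datetime)
in
    #"Added Refresh Date/Time"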

Add the following measures to the model:

Last Refresh Date/Time = MAX ( 'Version History'[Refresh Date/Time] )
Current Version = MAX ( 'Version History'[Version Number] )

Create a Dataset Information page in the dataset PBIX file and in any report that references the dataset.

Every time a revision is made to the data model, edit the Enter Data
step and add a row to the table with a new version number. Use the following guidelines to increment components of the version number:

Guidelines for each component of the version number:

  • Major Release Number – Increment only when a newly scoped solution is released to PROD. Major releases have a specific set of planned features; a new major release is created only for a new set of planned features or significant enhancements as part of a new scope of work.
  • Minor Version Number – As a team, decide when changes are significant enough to warrant an incremental minor version release; typically, when changes are made to QA or PROD releases within a major release. Generally, this is for minor feature enhancements. For a dataset, minor version changes may break or affect a report’s design. For a report, minor version changes add or change functionality aside from fixes.
  • Fix or Revision Number – Increment for every bug fix or design change that will be deployed to the DEV environment or visible to other developers. Generally, this is for bug fixes but not for adding features. May be incremented by the developer. Reset to zero with a new minor version number.

For example, the following version number represents Major Release 1, Minor Version 5 and Fix/Revision Number 3:

v1.5.3

Datasets and reports should be versioned separately. The major version number for reports and datasets may be synchronized for major workspace app releases.

Be mindful that the purpose of version numbering is to provide a simple standard to keep track of changes. The process can easily be over-thought and over-engineered. Keep it simple, discuss changes as a team and make process adjustments to focus on priority tasks.

Release Management, DevOps and Automation

(topic draft – expect updates – please comment)

This is an area where the Power BI platform currently lacks a formal process or mechanism. We know that release management tools are being developed, but they are in the early stages. We may or may not see a release management tool from Microsoft that blends with other tools that you might have in use, such as VSTS and Azure DevOps. In my experience, automated Power BI release management isn’t really being practiced in the industry yet. That said, there are techniques available. PowerShell scripts may be used to publish datasets and reports, and to change dataset bindings. It is possible to either publish to a production workspace or to effectively move assets from one workspace to another. This approach is discussed briefly in the Power BI Enterprise Deployment Guide. Other approaches are discussed here:

Power BI release management

PLEASE ADD REFERENCES

https://www.msbiblog.com/2018/09/12/power-bi-release-management/

Workspace and App Management

To share reports with users who do not have a Power BI Pro license (each license has an additional cost), the reports must be in a Premium capacity workspace and the workspace must be published as an app.

Create a separate workspace to share with Free licensed users

Create a separate workspace for formal projects that have a report, collection of reports and/or dashboard that should be distributed to multiple view-only users. A workspace must be published as an app to be shared with users who do not have Power BI Pro licenses assigned to them. Once the workspace is published as an app, it can be shared with any group of “free” users or all users in the organization. The visible text for workspace names is limited; name workspaces so they can be uniquely identified in about 20 characters or less.

For a formal project, create the following workspaces:

DEV Workspace

Only development team members need Contributor access to this workspace. This workspace does not need to have Premium capacity unless developers need to unit test incremental refresh or other Premium features.

QA Workspace

All testers must have View access for testing and Contributor access for report authoring. Should be in Premium capacity to test incremental refresh.

PROD Workspace

Omit the “PROD” designation in the name. This workspace name will become the name of the published app that users will see in their Apps, Home and Favorites pages, so use a name that is simple and sensible. It must have Premium capacity to share the app with non-Pro licensed users.


Use a common workspace for informal reports not shared with Free licensed users

Multiple self-service and informal report files can share a common workspace without converting the workspace to an app. All users accessing any of those reports, must have a Power BI Pro license assigned to them. Informal or self-service Power BI projects typically don’t require 
separate Development or QA workspaces.

Data Model and Power Query Design Guidelines

The topics in this section are not an exhaustive guide to Power BI query and data model design. Following are common issues and recommendations often encountered in projects.

Dimensional Design

Always design data models for optimal performance, even when it “doesn’t matter” in simple projects. As data volumes grow and as data models are enhanced to address new requirements, increased complexity leads to performance challenges. There are many books on this subject dealing with the intricacies of best practice data model design. In brief:

  • Build star schemas – wherever possible, reshape data into fact and dimension tables with single-key, one-to-many relationships from dimensions to facts.
  • Enforce dimension key uniqueness – Just because a key value “should” be unique, there is no guarantee that it will be unless enforced at the data source. Perform grouping and duplicate reduction in the data source views or Power Query queries to guarantee uniqueness. Duplicate record count checks and other mechanisms can be applied to audit source data for integrity but do not allow the data model to violate these rules.
  • Avoid bi-directional filters and unnecessary bridging tables – These data modelling patterns adversely affect performance.
  • Calculated columns – Should be applied in Power Query and not in DAX calculated columns wherever possible. This maintains a consistent design pattern for maintainability.
  • Annotate code – Use in-line comments and annotations in all code including SQL, M and DAX; to explain calculation logic and provide author and revision information.
  • Hide all fields not used directly by users. These include: primary and foreign key columns, numeric columns used to create measures, and columns used to specify the sort order of other fields.
  • Set to Do Not Summarize – Any non-hidden numeric columns that are not intended to roll-up or summarize values should be set to “Do Not Summarize” in the Modeling ribbon in Power BI Desktop. Columns set to summarize are indicated with a Sigma icon.

Support Query Folding with Views

Data source queries in Power BI are authored using the Power Query editor; which generates queries in the Microsoft Data Mashup “M” universal query language. A critical optimization technique employed by Power Query is called “Query Folding”; where the “M” query definition is translated into a native query that can be executed by the source database server. If query folding works, SQL Server executes queries and returns results to the Power BI service through the on-premises data gateway. If query folding can’t be used, all raw data is streamed through the gateway, held in memory in the service and then processed inefficiently, row-by-row.

Query folding works reliably when queries authored in Power Query reference relational database tables or views. Avoid using SQL statements in Power Query and avoid using stored procedures as query data source objects. This can be a difficult transition for database professionals who are accustomed to writing SQL queries but Power Query can optimize its queries only when referencing tables or views.
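For example, referencing a view keeps folding intact, while a pasted SQL statement may prevent it (the server, database and view names here are placeholders):

let
    Source = Sql.Database("MySqlServer", "EDW"),                                   // connect to the source database
    Sales = Source{[Schema = "dbo", Item = "vwFactSales"]}[Data],                  // navigate to a view; folding is preserved
    Filtered = Table.SelectRows(Sales, each [OrderDate] >= #date(2019, 1, 1))      // typically folds into a WHERE clause
in
    Filtered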

In rare cases to avoid extremely long-running queries that might time-out during data refresh, it may be necessary to materialize source table results in a table stored in the EDW or data mart database. Using parameters with an incremental refresh policy can also reduce or eliminate query time-out issues.

Managing Dataset Size with Parameters

Use Power Query parameters to filter large tables (typically only fact tables). The incremental refresh feature of Power BI Premium capacity will automatically generate partitions and only process new or partial data to avoid reloading large tables.

Even if you don’t intend to implement incremental refresh policies, data refresh parameters allow the working dataset size to be reduced in the development environment and a smaller dataset file to be deployed, which can then be refreshed with a full load of data in the service.

Recommendation:
Design large tables with a parameterized range filter that conforms to the requirements supporting incremental refresh policies.

When a dataset PBIX file approaches 400 MB in size, parameters should be used to filter large tables and reduce the working set of records for development.

Datasets approaching 1 GB in size should have fact tables configured with incremental refresh policies and must be in a Premium capacity workspace.

First, create two Date/Time type parameters named RangeStart and RangeEnd. Next, create a date range filter in the early steps of each fact table query on a Date/Time type column in the table (see the sketch below). Incremental refresh policy requirements are detailed in this article: https://docs.microsoft.com/en-us/power-bi/service-premium-incremental-refresh
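
As a hedged sketch of that filter step (the view name vFactSales and the column name OrderDateTime are placeholders for your own fact source and Date/Time column), the fact table query filters on the two parameters like this:

let
    Source = Sql.Database("MyServer", "MyEDW"),
    FactSales = Source{[Schema="dbo", Item="vFactSales"]}[Data],
    // Incremental refresh expects one inclusive and one exclusive boundary so partitions don't overlap
    FilteredByRange = Table.SelectRows(FactSales, each [OrderDateTime] >= RangeStart and [OrderDateTime] < RangeEnd)
in
    FilteredByRange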

Measures & Numeric Columns

Format all DAX code for easy readability. Not only does it look better, but it will help avoid mistakes and support debugging and maintenance. As a guide, use DAXFormatter.com to “beautify” measures and follow the same code formatting pattern going forward.

Numeric Measures

All measures should return the correct numeric type values and every measure should be correctly formatted. This seemingly trivial subject is important and an issue in most informal and ill-managed projects.  Requirements should be defined for data types, precision and formatting.  Common items include:

  • Measures are designed to summarize and total over any filter, slicer or grouping condition. Apply the appropriate roll-up or conditional logic if a simple aggregation is not appropriate.
  • Percent and ratio measures should be fractional decimals. For example, “45 percent” is represented as the decimal value .45. When formatted, it appears as 45%, or 45.00% with two decimals for better precision.
  • All whole or large number measures should include a thousand separator.
  • Use the DAX DIVIDE function instead of the forward slash operator to handle blank and zero division errors (see the sketch after this list).
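
A minimal sketch of the DIVIDE pattern; the Total Profit and Total Sales Amount measures are assumed to already exist in your model:

Gross Margin % =
-- Returns BLANK instead of an error when [Total Sales Amount] is zero or blank
DIVIDE ( [Total Profit], [Total Sales Amount] )

Format the measure as a percentage so the fractional result displays as 45.00% rather than 0.45.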

Text Measures

  • Do not conditionally return text values to handle exceptions (like “not applicable”) in a measure that should return a numeric value. 
    Create a separate text-type measure to support titles & labels, if needed.

Implicit and Explicit Measures

For simple and informal projects, Power BI supports implicit measures: numeric columns in each table are automatically set to summarize data for ad hoc reporting. This behavior is convenient but can also confuse ad hoc report users. In formal projects, avoid implicit measures for the following reasons. Instead, create an explicit measure for any numeric column that should be aggregated for reporting (see the sketch after this list), and then hide the redundant numeric columns.

  • Implicit measures don’t work in Excel pivot tables and expressions
    As when working with cubes and connected SSAS models, only explicit measures are available in Excel reports.
  • Two choices confuse users
    All formal data models will include explicit measures written in DAX, so it is confusing for users to understand when to use measures and when to use numeric columns. In a conventional Excel pivot table (which connects to Power BI as if it were a multidimensional cube), measures and calculation objects are contained in measure groups, and all columns (whether numeric or any other data type) are dimension attributes. The introduction of Power Pivot, SSAS Tabular and Power BI confuses many users with these inconsistencies.
  • Measures are flexible
    Inevitably, some implicit measures will eventually be replaced with explicit measures when they are required to support conditional logic or alternate aggregation behavior. Starting with explicit measures avoids retraining users and introducing inconsistent behavior.
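
A minimal sketch of an explicit measure, assuming a Sales table with a SalesAmount column; the measure replaces the implicit summarization and the base column can then be hidden:

Total Sales Amount =
SUM ( Sales[SalesAmount] )

Later, if conditional logic or an alternate aggregation is required, the measure can be redefined without changing any report that references it.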

Promote Self-service Reporting

The true value of the Power BI platform is realized when business users can use the interactive reporting features of Power BI to get answers to business questions. This can be as simple as starting with an existing report and changing a slicer or filter, or adding a visual to a report page to see a selected set of data. Novice report authors and users can easily be taught to drag-and-drop fields onto a new report page to group and slice data from an IT-certified dataset. Starting with IT/business-certified datasets published to the Power BI service, users with a range of interests and skills can modify or author reports using their web browser without altering governed data or protected reports.

For user training and support purposes, consider educating users with skills in the following roles:

Novice Report Users

  • Navigate to an existing published app or report
  • Understand how to use dashboard tiles, report pages and visuals
  • Use filters and slicers
  • Use visual interactions to select and slice data
  • Use report visual drill-through navigation
  • Use Q&A to ask and answer data questions

Casual Data Analysts

  • Edit a report
  • Change a report visual to a different type of visual
  • Add a report page
  • Add visuals and fields to a report page
  • Add and use slicers to filter and select data
  • Use Analyze in Excel to create a connected pivot table
  • Use Excel to add fields, groups, totals, filters & slicers to a pivot table
  • Set alerts & create a report subscription
  • Use bookmarks to create custom filter sets and navigation

Intermediate Report Authors/Analysts

  • Using a published certified dataset, create a new report
  • Add pages, visuals, slicers and filters
  • Add custom measures to a report using DAX
  • Create drill-through actions & navigation
  • Create a custom dashboard by pinning tiles

Use SSRS/Paginated Reports for Operational Reporting

Power BI is not a replacement for paginated, operational reporting. For static, multi-page, printable reports, use SQL Server Reporting Services (SSRS) instead of Power BI. SSRS (now called “Paginated Reports”) is integrated into the Power BI service with Premium capacity licensing and can be integrated with interactive Power BI reports and Power BI datasets.

To a limited degree, some operational reports can be reproduced using Power BI reports, and SSRS can be used, to a limited degree, to create interactive reports.

Recommendation:
Use the right tool for the job and consider the strengths and limitations of Power BI, SSRS/Paginated Reports and Excel to author and deliver reports for different audiences and use cases. 

If Users Need Excel, Give them Excel

Financial users often assume that they must adapt to visual dashboard reporting when what they really need are spreadsheet reports with reliable, current data. Power BI was borne from Excel and from SQL Server Analysis Services (traditionally “cubes”); therefore, it can meet the needs of both self-service users and IT-managed enterprise BI solutions. In many ways, Power BI and Excel are a perfect pair. If users need spreadsheets, don’t give them reports that look like spreadsheets. Give them connected spreadsheets; then train them to use Excel from a Power BI report and certified dataset.

Analyze in Excel is a gem that is often not promoted by IT-based report authors, typically because report requirements are specified before users learn about this capability. It really is a best-of-both-worlds solution because it keeps governed data in a published dataset while allowing users to access it via a secure live connection from Excel.

When selecting Analyze in Excel from a published report or dataset in the service, Excel opens with a new connection (using an ODC file) directed at the published dataset. The Analyze in Excel ribbon (installed to Excel when downloaded) can also be used to create a new connection from Excel. By default, a pivot table is generated in a new worksheet, but advanced Excel users can use cube functions to place live Power BI dataset data anywhere in a workbook.

https://sqlserverbi.blog/2016/05/10/power-bi-and-excel-the-dynamic-duo/

Recommendation:
Rather than exporting data from Power BI (and creating off-line copies of data), promote the use of live, connected reporting with Excel. 

Enable Analyze in Excel in selected reports and datasets; then train a limited audience of advanced users to promote its use. Establish best practices for each business use case. General information about this feature is available here: https://docs.microsoft.com/en-us/power-bi/service-analyze-in-excel

Implementing Security

There are many good references and tutorials about data and report security, so I won’t cover them all here, but I will summarize the recommended approaches. Data-level security and report/content-level security are separate concepts that should be coordinated.

Report/Content-Level Security

There are multiple options for sharing report content with users.  Users cannot view a dashboard or report unless they have been granted access to a workspace or have been invited to share a report.

If you have a Premium capacity Power BI tenant, you can package the content of a workspace (dashboards and reports) as an app, which can be shared with any users in your organization; these read-only users do not need individual Power BI Pro licenses to view content. They will not be able to edit or author additional reports.

Users with Pro licenses assigned through the Office 365 Portal can have a workspace or report shared with them, which they can (unless otherwise restricted) edit, save, copy or share with others.

Data-Level Security

Once a user can access a report, they will see the results of all data in the underlying dataset for that report. Users with report-level access can read all dataset data unless role-based and/or row-level security has been implemented. Role-based and row-level security work essentially the same way in SSAS and in data models developed with Power BI Desktop.

Role-based Security

Role-based security is active when at least one role is defined for the dataset, filter expressions are added to the role, and then user or group members are added to the role. Conditional filtering is performed for each table in the data model using a DAX expression.
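
For example, a role that limits a territory dimension to one region might use a simple static filter expression on that table. This is a hedged sketch; the Territory table and Territory Group column are placeholders for your own model objects:

'Territory'[Territory Group] = "North America"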

Row-level Security

RLS is an extension of role-based security. In addition to creating one or more roles with filter expressions and adding role members, filtering may be conditional based on the authenticated user. The typical approach is to populate and maintain a mapping table of users and key values for each table in the data model that should have restricted access. All related data will be filtered; for example, if a user can only see data for a particular sales territory, they should only see sales totals for that territory.

The user/mapping tables must be maintained either manually or through automation if the user/object permission information is available in an existing data source. Don’t over-engineer this. The user/mapping information can be managed in a spreadsheet and securely stored in SharePoint or OneDrive, or driven by a PowerApp or custom form. This requirement is often missed in large projects.
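
A hedged sketch of a dynamic RLS filter, assuming a mapping table named 'User Territory Map' with UserPrincipalName and TerritoryKey columns (both names are placeholders); the expression is defined on the Territory table in the role and keeps only the territories mapped to the signed-in user:

'Territory'[TerritoryKey]
    IN CALCULATETABLE (
        VALUES ( 'User Territory Map'[TerritoryKey] ),
        'User Territory Map'[UserPrincipalName] = USERPRINCIPALNAME ()
    )

This pattern supports users who are mapped to more than one territory, and related fact rows are filtered automatically through the model relationships.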

Recommendation:
Add users to Office or Azure Active Directory groups and then use groups to share report content. If you are using role-based or row-level security, use groups to manage role membership.

If using Premium, use workspace apps to share and distribute reports with  read-only users and reduce Pro license requirements.

Create a strategy for managing RLS user/entity mapping tables. Either manage manually or automate if the data is available.

Certified & Shared Datasets

To promote reuse and data governance, datasets can be used by reports residing in different workspaces (including a personal workspace). Datasets can be endorsed to designate those that are promoted and certified by data owners and the organization. The Power BI service includes a platform and workflow to manage promoted and certified datasets. When the feature is enabled, a dataset author can mark a dataset for promotion or certification, and designated users can test and optionally certify specific datasets that are known to be reliable and conform to organizational governance standards.

Dataset promotion and certification markings require the coordinated effort of the solution developer, Power BI tenant admin, data steward and business stakeholders. The actual process is simple, but the certification workflow should be defined and managed by solution business owners and service administrators.

The following images illustrate the aforementioned process.

clip_image002

clip_image004

Information about this capability: https://docs.microsoft.com/en-us/power-bi/service-datasets-certify

Enterprise Scale Options

In many ways, Power BI has now surpassed the capabilities of SQL Server Analysis Services. Microsoft is investing in the enterprise capabilities of the Power BI platform by enhancing Power BI Premium capacity, adding Paginated Reports and adding features to support massive-scale, specialized use cases. Consider the present and planned capabilities of the Power BI platform before choosing another data modeling tool such as SSAS.

Resources:

https://sqlserverbi.blog/2018/07/27/power-bi-for-grownups/ 

https://sqlserverbi.blog/2018/12/13/data-model-options-for-power-bi-solutions/

Project Preparation Checklist

[_] Categorize the solution by identifying the author & user roles related to the project:

[_] Author role: Business Data Analyst

[_] Author role: Skilled Data Modeler, Analyst, Data Scientist

[_] Author role: IT BI Developer

[_] Users’ role: Report/Dashboard Consumer

[_] Users’ role: Self-service Report Author

[_] Users’ role: Advanced Data Analyst

[_] Develop & Document Support & training plan for users

[_] Identify the Solution Type for the project. This will guide other project management designs:

[_] Design a single PBIX file for a small-group, departmental project authored by one developer for a limited group of users

[_] Design & deploy a separate dataset PBIX file – from report file(s) – when the dataset should be branded as a Certified dataset

[_] Design separate dataset and report PBIX files for formal projects with more than one dataset & report developer, to coordinate work

[_] Use SSAS/AAS as a data modeling option when those databases exist or where IT operations insist on managing development and maintenance through integrated source control (e.g. Visual Studio Team Services & Azure DevOps)

[_] Identify the Project Type & related Solution Architecture:

[_] Project type: Formal project

[_] Project type: Informal project

[_] Project type: Hybrid project

[_] Architectural approach: Single PBIX

[_] Architectural approach: Separate dataset and report PBIX

[_] Architectural approach: Report PBIX connected to SSAS or AAS

[_] Understand DirectQuery model trade-offs and special use cases. Avoid if possible.

[_] Create storage locations and folder structure for Development file management:

[_] Development file storage

[_] Team member collaboration environment & processes

[_] Folder synchronization

[_] Define File naming standards

[_] Decide on dataset and report names

[_] Define the Version Control & Lifecycle Management

[_] Postfix files with 3-part version number

[_] Remove version number from published files in QA and PROD

[_] Create Version History table in Power Query

[_] Increment version numbers in data model

[_] Backup PBIT files for archive

[_] Create measures: Last Refresh Date/Time

[_] Create measure: Current Version

[_] Add data model info page to report

[_] Define your Release Management, DevOps & Automation strategy (if any – Might be OK to deploy files manually) (to automate or not to automate)

[_] Decide on Workspace and App Management, workspace & app name, etc.

[_] Create PROD workspace (omit PRD from name), assign dedicated capacity

[_] Create QA workspace (post-fix name with QA), assign dedicated capacity

[_] Create DEV workspace (post-fix name with DEV), dedicated capacity not required

[_] Assign Pro licenses to all developers, admins and report author users (QA?)

[_] Assign membership and access to workspaces

[_] Create fact date range filter parameters: RangeStart & RangeEnd

[_] Filter large fact tables with range filters, consider incremental refresh policies if slow and/or over 800 MB compressed.

[_] Design source queries (T-SQL?) to reshape source data into conformed dimension & fact tables

[_] Create views in database for each dimension and fact

[_] Enforce key uniqueness to remove all duplicate keys from all dimension tables

[_] Query Date dim/lookup table at source if it exists

[_] If not available, generate Date dim/lookup table in Power Query

[_] Avoid bi-directional relationship wherever possible

[_] Include code annotations & comment blocks in: views, M queries, measures

[_] Hide all key columns and other columns not used directly by users

[_] Use parameters to filter and reduce dataset PBIX size to ~400 MB or less

[_] Create incremental refresh policies for datasets larger than ~500 MB

[_] Use Premium capacity workspaces for datasets approaching 900 MB

[_] Use Premium capacity workspace when you need to share reports with non-Pro licensed users

[_] For non-Pro licensed users, convert the QA (and then PRD) workspace to an app and distribute the app to users via AD group membership

[_] Create explicit measures for all aggregate-able column values

[_] Hide all measure base numeric columns

[_] Set all non-aggregate-able numeric column default summarization to “Do Not Summarize”

[_] Format all whole numbers with thousand separators

[_] Format all currency & decimal measures to defined standard (perhaps 2 decimal, thousand separator)

[_] Format all ratios and percentages with appropriate decimal positions

[_] Use the DIVIDE function for all division operations to avoid div by zero

[_] Name all measures and user-visible fields with mixed-case, short, friendly sentence-like terms

[_] Assign measures to fact table where they logically belong – or to an empty measure group table.

[_] Measures that don’t have an obvious home (fact) table should be assigned a measure group table.

[_] Precede measure group table names with “_” so they sort to the top of the list. (display folders are not ready for use in Power BI Desktop)

As a team decide:

[_] When to increment the revision/fix number

[_] When to increment the minor version number

[_] Who owns the migration to QA process? How is it implemented?

[_] Who owns migration from QA to PROD? How is it implemented?

[_] How is user/stakeholder sign-off obtained?

[_] How are data source requirements defined, presented & documented?

[_] Who owns & maintains the requirements log?

[_] How are measure & KPI calculation requirements documented & maintained?

[_] How are the report design, layout & visualization requirements documented & maintained?

[_] What is the organization & project styling & branding standard?

[_] Does the organization have a standard brand image? Where is it documented? Who owns that standard & can answer questions?

[_] Should reports be designed for mobile consumption (with mobile layouts)?

Power BI Product Suggestion: Show Measure Dependencies

I was just working on a client project where we have several complex measures that reference other measures. Sometimes it can be hard to keep all these dependencies in mind. As a feature, it would be great for Power BI Desktop to show these dependencies and allow navigating between them. After a quick search in the Power BI Community Ideas site, I posted a suggestion. Afterward, I found an older suggestion made by Avi Singh here: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/16783000-measure-dependency-view-similar-to-the-query-depe

Here’s my version in the mockup below. I think there’s a lot of potential to pile-on and add to this idea but I think that having a simple tree view of dependent measures would be a great start. If you agree, please vote this idea up on the Power BI community forum.

Show measure dependencies in the field list

Measures that reference and depend on other measures can be difficult to trace and debug.  To make this easier, provide the option in the field list to show the measures that are referenced by other measures. This should be optional and performed individually for each measure to avoid unnecessary performance overhead.  Here’s a conceptual mockup:

measure dependency

Power BI Licensing

Power BI licensing options are a mystery to many. It’s not that it is complicated; in fact, on the contrary, Power BI pricing is actually quite simple once you understand the basics. It is just that there are so many options and different features and capabilities. This short video blog post walks you through the costs and licensing options for Microsoft Power BI and the Power BI platform including free, Pro and Premium capacity. I also talk about alternative licensing options for accessing Premium capacity features with the ability to scale cost and capacity.

This post is part of the Azure Everyday series from Pragmatic Works. See the entire series of video blog posts at: https://blog.pragmaticworks.com/topic/azure-every-day

Power Platform Virtual Conference

I am happy to join several world-class speakers on July 23 & 24 for the Power Platform Virtual Conference. I will be speaking about how to create “no-code” Power BI custom visuals along with Mike Carlo from PowerBI.tips.

Join me on Tuesday, July 23 at 5 p.m. EDT for my session on Creating Power BI Custom Visuals with No Code at the Power Platform Virtual Conference. This 2-day online conference dedicated to the #PowerPlatform is only $49. Register today! #PowerPlatformCon #PowerBI #PowerApps #MicrosoftFlow http://powerplatformconference.com/

paul_turley-01

Looking to enhance your Power BI, PowerApps and Flow skills but can’t attend an in-person conference? Check out this virtual (on-line, live) conference featuring several headlining, international conference presenters. Join Pragmatic Works for the only 100% virtual conference dedicated to the #PowerPlatform on July 23rd and 24th! http://powerplatformconference.com/ #powerplatformcon #PowerBI #PowerApps #MicrosoftFlow

power-platform-virtual-conference-v04_twitter

Links to Pragmatic Works’ Social Accounts

Twitter: https://twitter.com/PragmaticWorks

LinkedIn: https://www.linkedin.com/company/PragmaticWorks

Facebook: https://facebook.com/PragmaticWorks

Drillthrough Navigation Between Power BI and SSRS Paginated Reports

Report navigation is, in my opinion, the essence of true data exploration and discovery. It lets business users see and understand important information in different forms; in summary and in detail, within context.  SSRS and Power BI are truly a dynamic duo for reporting and data discovery. I wanted to post a follow up to these two articles because they are apparently very popular topics. Posted two days apart on my blog, they received 2,744 and 3,356 views. Woo!

image

Nine years ago we released the Reporting Services Recipes book from Wrox Press, which was a very popular collection of design patterns and techniques, frequently referenced by Microsoft leaders at industry conferences. In the book, we showcased different ways to move between SSRS reports by passing parameters and dynamic URLs to create interactive reporting solutions. I am working with my former co-author and a team of talented BI and report design experts on the second edition of the Recipe book, which will include not only updated examples of SSRS patterns for on-prem report solutions but also new patterns for Power BI Report Server and Paginated Reports in the Power BI service.

Today we have Power BI in addition to SSRS (now called “Paginated Reports”). These tools are different and often serve distinctly different purposes for creating reporting solutions. But, using Power BI and SSRS/Paginated Reports together opens doors to create incredibly rich integrated report and data discovery solutions.

I recently published two blog posts to share some of my work-in-progress for the recipe book: Drillthrough from Power BI to an SSRS Paginated Report, and Drillthrough from Paginated Report to Power BI. Both of these posts demonstrate navigation from one of the report tools to the other, essentially allowing users to work with these two capable tools in a full circle. As the newer and more modern data analysis tool, Power BI fills an important gap, but it is not a replacement for the type of multi-page, grouped and repeated style of reporting that SSRS is optimized for. Likewise, Power BI excels as an interactive tool for data discovery and self-service analysis. SSRS and Power BI were borne from the same platform and have common architectures, but also many differences. Used together, the realm of possibilities is expansive.

It is important to understand that the techniques used to navigate and pass filter context between these report tools are limited to the capabilities of web query string parameterization. In my experience, this is rarely a show-stopper, but I do occasionally encounter folks pushing the limits – both practically and technically.

Both SSRS and Power BI have their own drillthrough capabilities using internal mechanisms that do not rely on the web browser and report server to handle web URLs. Moving between the different report types, however, relies on web URLs. Although powerful and flexible, there are some limitations. Each web browser has its own maximum URL length limit, ranging from roughly 2048 to 2083 characters, and the server-hosted web services have their own limits. For simplicity, let’s call the limit about 2000 characters, which includes the base domain and folder path URL, and the query string directives, parameters and values. Keep all of this within the ~2000 character limit and you’re good. Also be mindful that all of this is visible in the browser’s address bar. Requests and responses sent to the Power BI service are encrypted, so there is little risk of exposure outside the session. Your on-prem report server connections can also be protected using certificate-based encryption. Just be mindful about what you pass in sight of the user and web browser after the stream is decrypted on the client.

The report navigation recipes posted here on my blog are early drafts. I welcome your feedback, questions and ideas for more report recipes. Sample files and other downloadable content that are referenced in the recipes will be available when the book is published – and no, that date has not been set, but I will make an announcement when we know. Thank you in advance for your questions and comments.

Drillthrough from Paginated Report to Power BI

— report recipe
also see: Drillthrough from Power BI to an SSRS Paginated Report

In the recipe titled “Drillthrough from Power BI to Paginated Report” I demonstrate how to use report parameters and a dynamic URL address to navigate between reports and pass parameter values in the URL from a Power BI report to detailed data in an SSRS paginated report. Using a similar technique, we can navigate from a paginated report to a Power BI report.

Power BI is very flexible and does not require you to define parameters like you would in a paginated report. You can literally filter a report using any field in the dataset.

Product Versions

SSRS/Paginated Reports (one of the following):

· SQL Server Reporting Services 2016 or newer

· Power BI Report Server

· Paginated Reports in the Power BI service

Power BI Desktop:

· Power BI Desktop May 2019 or newer

What You’ll Need

· A Power BI report connected to a data source using import, DirectQuery or SSAS direct connect mode.

· Paginated Report server: SQL Server Reporting Services (Standard or Enterprise Edition), Power BI Report Server or a Power BI subscription with Premium capacity

· Moderate level Visual Basic.NET coding skills

Difficulty level: 300

The sample database and files will be available in the forthcoming book: SQL Server Reporting Services Paginated Report Recipes, 2nd Edition (working title).
These instructions are provided as an example but refer to files that will be available when the book is published. Please contact me in the comments with questions and feedback.

Preparing the target Power BI Report

You don’t have to do anything special in the Power BI report design to enable parameterized filtering; you simply need to know the tables and field names you will need to use for filtering. Most of the effort is building the dynamic URL in the paginated report.

The first order of business is to make note of each field or measure to use for filtering. I suggest jotting them down in a text editor using the notation: TableName/FieldName.

Note: Filtering is applied at the report level, so navigating to a report using URL filters will affect all the pages in that report.

In the following examples, I will demonstrate my experience with a report deployed to my Power BI tenant. In the exercises a little later, you will deploy a sample report and work through these steps in your own Power BI environment.

Addressing Power BI Objects

There are several URL-addressable objects in the Power BI service to which you can navigate using a URL. Each of these objects is assigned a globally unique identifier that can be addressed in a URL. Reports can be filtered, and dashboards cannot, but other objects that are related to reports can be filtered and will respond to a filter directive in the URL. For each object type listed in the following table, navigate to the object in your web browser and copy the URL from the address bar.

Object Type – Accepts Filters – Description
Dashboard – No – Dashboards don’t accept filters
Report – Yes – Shows default report page with report-level filters
Report page – Yes – Applies report-level filters to the page
Bookmark – Yes – Bookmark and filter directives can be combined in the URL query string using “&” (rather than “?”) to append directives
Workspace app – Depends – Default page in app navigation must be a report rather than a dashboard
Workspace app report page – Yes – Same as a report. Use the path to the app or a report within the app
Workspace app bookmark – Yes – Same as report bookmarks. Filter directive may be added to the query string using “&” rather than “?”

Filter Directives & Syntax

The URL query string filtering syntax for Power BI is borrowed from the OData standard, which offers a great deal of flexibility. For example, depending on data types, you can specify that a field is equal to, less than or greater than a value, contains a value, falls in a range, or is included in a list of values. The filter directive is simply tacked onto the end of the URL specifying an object path, in the form of:

?filter=Table/Field eq value

Text and date type values are passed in single quotes:

?filter=Table/Field eq 'value'

Multiple filters may be appended together using the “and” operator. This example uses real table and field names:

?filter=Product/Brand eq 'Contoso' and Sales/Quantity lt 5

This will filter all pages on the report to include only records where the Brand field in the Product table is “Contoso” and the Quantity field in the Sales table is less than 5.

All of the available filter operators are listed in the following table:

Operator – Description
eq – equal to
ne – not equal to
ge – greater than or equal to
gt – greater than
le – less than or equal to
lt – less than
in – in list (item1, item2, item3)

Before I can add the filter directive, I need to get the report URL. If I go to my Power BI service, navigate to a workspace and then click to open a report, I can capture the URL from the address bar. As an example, here’s the address I captured for a report published in my Power BI tenant:

https://app.powerbi.com/groups/cdae7f3c-b818-493a-af56-55308d41798f/reports/0dcf5a4d-1bf2-438f-b13b-95eb9adadd91/ReportSection

You can see that each object type (“groups”, “reports”, etc.) is followed by a globally unique identifier, which is a long string of characters generated by the service. These identifiers don’t change after the object is deployed or created in the service.

Notice that the ReportSection element at the end of the address has no value, which means that I have not specified a report page. If I use this address to navigate to the report, Power BI shows me the default page, which is the page that was visible when the report was last saved. If I navigate to another page and then capture the URL, the object ID of that page is appended to the “ReportSection” text at the end of the address:

https://app.powerbi.com/groups/cdae7f3c-b818-493a-af56-55308d41798f/reports/0dcf5a4d-1bf2-438f-b13b-95eb9adadd91/ReportSection3b93b5700b15a516ed98

Note: The syntax for the page or ReportSection is a little different than other objects and does not use a forward slash before the object ID.

The common terms we use for some of these objects are a little different than the attribute names used in the URL query string, as noted here. I will show you some additional object names after this first example.

Object Type – Query String Attribute – Object ID
Workspace – groups – cdae7f3c-b818-493a-af56-55308d41798f
Report – reports – 0dcf5a4d-1bf2-438f-b13b-95eb9adadd91
Page – ReportSection – 3b93b5700b15a516ed98

Now, to add the filter directive to the report URL… When I append the filter directive from the earlier example to the report URL like this…

https://app.powerbi.com/groups/cdae7f3c-b818-493a-af56-55308d41798f/reports/0dcf5a4d-1bf2-438f-b13b-95eb9adadd91/ReportSection?filter=Product/Brand eq 'Contoso' and Sales/Quantity lt 5

…the report opens with only data where the Brand is Contoso and Sales records have a quantity less than 5. But, my browser seems to have mangled the URL making it difficult to decipher:

https://app.powerbi.com/groups/cdae7f3c-b818-493a-af56-55308d41798f/reports/0dcf5a4d-1bf2-438f-b13b-95eb9adadd91/ReportSection?filter=Product~2FBrand%20eq%20%27Contoso%27%20and%20Sales~2FQuantity%20lt%205

This encoding, called “escaping,” is a necessary practice when passing certain information in a URL. There are certain characters that either have special meaning to a web server or simply can’t be passed in a web URL without special encoding. Spaces and forward slashes are the most common examples that must be converted to a special string of characters so that a web server (in this case, the Power BI service) understands what a web client (in this case, my web browser) is asking for in the web address and query string directives. Technicalities aside, we must substitute these characters with special “escape character” strings. Cryptic as it might seem, this is simply a requirement.

Report URL on Power BI Report Server

If you are using Power BI Report Server rather than the Power BI service, you can get the report URL in the same way. Simply navigate to the report and capture the URL in the address bar. You will not see object IDs like those generated by the Power BI service but just the named folder and report. For example, here is the path to a report named Sales Analysis in a folder named Report Recipes on my local report server:

http://localhost/reportspbrs/powerbi/Report%20Recipes/Sales%20Analysis

The pattern for adding filters to a report published to an on-premises report server is the same as the Power BI service. Append the filter directive query string parameter to the URL using the same syntax (beginning with “?filter=”).

Handling Special Characters

If any object names contain literal spaces or other special-purpose characters (like % or @, for example), you must explicitly replace these with properly escaped notation. In the previous example, you saw that the web browser replaces forward slashes with “~2F” and spaces with “%20” within the URL.

This is where things might get a little confusing, but it’s really not complicated. The HTTP escape character encoding that your web browser applies automatically uses a different syntax than the Power BI service or Power BI Report Server. For object names (tables or fields) in the filter directive, Power BI employs a different syntax called Unicode URL escape codes: a seven-character string beginning with “_x”, ending with “_” and with a four-character hexadecimal number in the middle. To represent a space, the code is “_x0020_”. Use the following table for reference:

Character – HTTP Escape Code – Unicode URL Escape Code
Space – %20 – _x0020_
$ – %24 – _x0024_
& – %26 – _x0026_
` – %60 – _x0060_
: – %3A – _x003A_
< – %3C – _x003C_
> – %3E – _x003E_
[ – %5B – _x005B_
] – %5D – _x005D_
{ – %7B – _x007B_
} – %7D – _x007D_
" – %22 – _x0022_
+ – %2B – _x002B_
# – %23 – _x0023_
% – %25 – _x0025_
@ – %40 – _x0040_
/ – %2F – _x002F_
; – %3B – _x003B_
= – %3D – _x003D_
? – %3F – _x003F_
\ – %5C – _x005C_
^ – %5E – _x005E_
| – %7C – _x007C_
~ – %7E – _x007E_
' – %27 – _x0027_
, – %2C – _x002C_

Let’s say that I need to specify a filter for a field named Product Category in the Product table, and I only want to see data for two specific categories: Accessories and Bikes. Before adding escape codes, the filter string would look like this:

?filter=Product/Product Category in ('Accessories','Bikes')

If I replace the necessary characters (within object names) with Unicode URL escape codes, the filter string becomes:

?filter=Product~2FProduct_x0020_Category in ('Accessories','Bikes')

Additionally, my web browser automatically adds HTTP escape codes for the spaces and quotation characters outside of the object names, which produces this string:

?filter=Product~2FProduct_x0020_Category%20in%20(%27Accessories%27,%27Bikes%27)

In summary, you do not need to apply the HTTP escape codes because your browser will do that for you. If you capture the URL with the escaped characters, that’s fine. However, you must replace spaces and special characters within the filter directive object names with the appropriate Unicode escape codes.

Now that you understand the syntax of the filter directive and the need for escape codes to replace certain characters and spaces, you are equipped to build a fully-qualified URL for the report and folder.

Build the Report Navigation Solution

In this solution, you will make enhancements to a fairly standard SSRS paginated report. You will use a custom Visual Basic function and expression to encode a dynamic report URL with parameters to filter a Power BI report.

Design the Source Paginated Report

A starting report is provided for you. This is a simple SSRS paginated report connected to the SSRS2019Recipes sample database.

1. Make a copy of the Sales Summary-Starter.rdl file and rename it Sales Summary.rdl.

2. Open Sales Summary.rdl in Report Builder (shown in Figure 1).

clip_image002

Figure 1

3. Open the data source and make sure you can connect to the SSRS2019Recipes database on your SQL Server. Change the server name in the connection if necessary and test the connection.

4. Leave Report Builder open to resume report design work after the following steps.

Open and Publish the Power BI Target Report

1. Open Power BI Desktop and then open the Sales Analysis (Drillthrough Target).pbix report file.

2. Publish the Sales Analysis (Drillthrough Target) report to a workspace in your Power BI tenant or to a folder in your Power BI Report Server.

Note: If you are using Power BI Report Server rather than the Power BI Service, you must use the version of Power BI Desktop that installs with Power BI Report Server. To publish a report in Power BI Desktop for report server, choose Save As… from the File menu. To publish a report to the Power BI service, choose Publish from the File menu.

3. Navigate to the published Sales Analysis (Drillthrough Target) report in your web browser, shown in Figure 2.

4. Select the entire URL in the address bar and copy it to the Windows clipboard.

clip_image004

Figure 2

Continue Designing the Source Paginated Report

Now that you have the complete URL for the published Sales Analysis Power BI report in the clipboard, you can return to Report Builder where you were previously editing the Sales Summary SSRS paginated report.

1. In Report Builder, select the report parameter named PBIReportURL.

2. Right-click the selected parameter and choose Edit Parameter from the menu.

1. Select the Default Values page.

2. Click to select the Specify values radio button.

3. Click the Add button.

4. Paste the URL for the Power BI report into the Value box as shown in Figure 3.

5. Click OK to accept the parameter changes.

clip_image006

Figure 3

Create a Custom Function to Encode Object Name Text

1. In Report Builder, right-click the report designer background outside of the report body (dark gray area) and select Report Properties from the menu.

2. In the Report Properties dialog, select the Code page.

3. In the Code box, enter the following Visual Basic.NET code.

Note: The custom code editor has no syntax validation or debugging features, so it can be challenging to debug and correct errors in your code. I recommend that you challenge yourself by hand-entering this code but if you have any trouble, a copy of this code is provided in the Code.txt file for your convenience.

Function EncodeObjectNames ( ObjectNames As String ) As String

    Dim sText As String = ObjectNames

    '-- Replace Spaces --
    sText = REPLACE( sText, " ", "_x0020_")
    '-- Replace $ --
    sText = REPLACE( sText, "$", "_x0024_")
    '-- Replace & --
    sText = REPLACE( sText, "&", "_x0026_")
    '-- Replace % --
    sText = REPLACE( sText, "%", "_x0025_")
    '-- Replace ' --
    sText = REPLACE( sText, "'", "_x0027_")

    Return sText

End Function

The completed function is shown in Figure 4.

4. Click OK to accept the code changes and close the Report Properties dialog.

clip_image008

Figure 4

Add Link Icon

1. Select the Insert tab in the main Report Builder application.

2. From the Insert ribbon, click the Image button.

The mouse pointer changes to an Image icon.

3. In the table, click the right-most cell in the Detail column to drop the new image.

The Image Properties dialog opens.

4. On the Image Properties dialog, click the Import button.

5. Drop down the file extension filter list which is currently set to JPEG files (*.jpg).

6. Select the option to show all file types: All files (*.*).

7. Browse to the correct folder and select the Link Icon.png file.

clip_image010

Figure 5

Add Navigation Action

1. Select the Action page in the Image Properties dialog.

2. Under Enable as an action:, click to select the radio button labelled Go to URL.

3. Click the “fx” button to the right of the Select URL: drop-down list.

clip_image012

Figure 6

4. In the Expression dialog, enter the following code.

Note: Be particularly mindful of the literal single quotes surrounding the ProductSubcategory field reference at the end of the expression.

=Parameters!PBIReportURL.Value
& "?filter=" & Code.EncodeObjectNames( "Product/Product Subcategory" )
& " eq '" & Fields!ProductSubcategory.Value & "' and Date/Year eq " & Fields!Year.Value

5. Click OK to accept the new image settings.

6. Save your changes to the Sales Summary report.

7. Switch to Preview and test the drillthrough navigation link by clicking the link icon on any row.

clip_image014

Figure 7

The target report should show only data for the selected Subcategory and Year.

8. Deploy the Sales Summary paginated report to your report server or Premium Power BI tenant.

9. Test the deployed Sales Summary report to verify that it works as it did in the designer.

Drillthrough from Power BI to an SSRS Paginated Report

— report recipe
also see: Drillthrough from Paginated Report to Power BI

Navigating from a Power BI report to an SSRS Paginated report with filters and parameters.

Product Versions
SSRS/Paginated Reports (one of the following):
– SQL Server Reporting Services 2016 or newer
– Power BI Report Server
– Paginated Reports in the Power BI service
Power BI Desktop:
– Power BI Desktop May 2019 or newer
All versions of Reporting Services accept parameters in a similar fashion, but subtle changes were made to URL parameter formats in SSRS 2016.  Some adjustments may be required if you want to apply these techniques to SSRS versions prior to 2016.
What You’ll Need
– A Power BI report connected to a data source using import, DirectQuery or SSAS direct connect mode.
Report server:
– Power BI Report Server or a Power BI subscription with Premium capacity (minimum: P1 or A4)
– Moderate level DAX coding skills  
Difficulty level: 200

Designing the Power BI Report

This recipe primarily involves Power BI report design techniques. I’m not going to get into the details of Power BI report design but will cover the basics with a partially-completed report to get you started. If you are less-experienced with Power BI you can use this as an example for future report projects.

The sample database and files will be available in the forthcoming book: SQL Server Reporting Services Paginated Report Recipes, 2nd Edition (working title).
These instructions are provided as an example but refer to files that will be available when the book is published. Please contact me in the comments with questions and feedback.

Solution

Figure 1 shows the Sales Summary Power BI report, which is deployed to my local Power BI Report Server along with Sales Detail, a paginated (SSRS) report deployed to the same server.

clip_image002

Figure 1

We will start by looking at the target report. This is just a standard SSRS paginated report. A completed copy of this is provided for you to configure and deploy to your report server or Premium Power BI service tenant. The sample report is very simple and could easily be enhanced to include more information in greater detail.

1. Open the Sales Detail.rdl report in Report Builder

There are four report parameters in this report, two of which are used to connect to the SQL Server database. The other two parameters are used to pass filter values to the underlying query.

2. Edit the ServerName parameter and set the Default property to the name of your server or instance

The default value for this parameter is LocalHost, which connects to my local default instance of SQL Server. You can leave this setting if your SQL Server instance is configured the same as mine.

3. Close and save any changes to the ServerName parameter

4. Preview the report and make sure it connects and displays data from the sample database

5. Select different values for the Year and Country parameters using the parameter dropdown lists and click the View Report button to see values in the report change, and to confirm that the parameters are working correctly.

6. Use the Save As… option on the File menu to save the report to a folder on your report server or a workspace in your Premium Power BI tenant

Note: Depending on the Report Builder menu option and deployment target, the terms “save” and “publish” are used interchangeably, meaning that the report has been deployed to a server or service where it can be run and viewed by users.

Get the address for the published paginated report

1. Open Notepad or your favorite text editor

2. In a web browser, enter the address for your report portal or Power BI portal where you published the Sales Detail paginated report

Tip: If you don’t know the address for your web portal or report server, you can use Report Server Configuration Manager to look up the Web Service URL and Web Portal URL. By default, the web portal address is http://SERVERNAME/Reports and the report server address is http://SERVERNAME/ReportServer (where SERVERNAME is the name of the server where Reporting Services is installed), but this is configurable and might be different in your environment.

3. Navigate to the folder or workspace and run the newly published report, verify that the report runs and returns data

4. Use Ctrl-C to copy the full address for the report from the browser address bar to the Windows clipboard

5. Paste the address into the text editor

The next step can seem a little tricky but it’s not really complicated. You need to get the path to the report on the report server rather than the report portal.

If you published the paginated report to your on-premises report server, do the following:

1. In your text editor, copy and paste the report address to a new line

2. In the new address line, locate the text “Reports” and change it to “ReportServer”

3. Copy the modified address line to the address bar in your browser and press Enter

The browser should navigate to a text menu page similar to Figure 2.

clip_image004

Figure 2

The address for the report is actually not yet in the correct format but it was sufficient to navigate to the Home folder on the report server. The following steps are used to obtain the correctly-formatted report address.

1. Use the links displayed on the report server menu page to navigate to the folder to which you published the Sales Detail report and click the report name link to run the report

2. When the report is displayed, copy the complete address from the address bar and paste it to a new line in your text editor

This is the correct path for this report on the report server. You can now add parameter values to the end of the address.

3. Append parameters and values to the address in the address bar like this. Each parameter is preceded with an ampersand character:

&Year=2018&Country=Canada

4. Press Enter and verify that the parameter is changed in the report

Open the starting report in Power BI Desktop and add drillthrough navigation for the paginated report

1. Make a copy of the Sales Summary-Starter.pbix file and rename it Sales Summary.pbix

2. Open Sales Summary.pbix in Power BI Desktop

If you will be using Power BI Report Server to deploy reports on premises, make sure that you use the version of Power BI Desktop designed for your server version. If in doubt, you can download and install Power BI Desktop from the download menu in your report web portal.

3. On the Home ribbon, click the Edit Query button to open the Power Query editor

4. You will see three query parameters. Two of the parameters are used to manage database connection information in a similar manner to the Sales Detail paginated report.

Tip: Using parameters in the Power Query editor can help manage variable information like connection information, file paths and addresses.

In the starter report file, there are three text type query parameters. Modify these parameters, using the following instructions:

Parameter Name – Description
ServerName – SQL Server address or server name. You can use “LocalHost” if it is a local instance.
DatabaseName – SSRS2019Recipes
ReportURL – Path to the deployed SSRS/paginated report

5. If necessary, change the ServerName parameter current value to connect to your SQL Server

6. Copy the report server address for the deployed Sales Detail report from your text editor and paste it into the Current Value box for the ReportURL parameter.

Build the dynamic target report address link as a measure in Power BI

The ReportURL Parameter value is exposed as a column in a table named Constants. You will use this to build a dynamic link in the report.

1. On the Home ribbon in the Power Query editor, click Close & Apply to process queries and return to the report designer.

2. Figure 3 shows the report designer in Power BI Desktop. The objective is to add links to each item displayed in the table visual, allowing a user to see a paginated report with details related to a selected year and country.

3. Expand the Sales Summary Subcategory Country table in the field list on the right

clip_image006

Figure 3

4. Click the ellipsis (…) to the right of table name to see the context menu and select New Measure

5. The formula bar is displayed above the report canvas with placeholder text: Measure =

Note: You can use the down arrow on the right side of the formula bar to expand the formula editor window, and the up arrow in the same location to collapse the formula editor

6. Replace the placeholder text with the following new measure:

Detail Report Link =
VAR SelectedYear = SELECTEDVALUE( 'Sales Summary Subcategory Country'[Year] )
VAR SelectedCountry = SELECTEDVALUE( 'Sales Summary Subcategory Country'[Country] )
VAR RptURL = SELECTEDVALUE( 'Constants'[Report URL] )

RETURN
RptURL & "&Year=" & FORMAT( SelectedYear, "0000") & "&Country=" & SelectedCountry

The purpose of this measure is to build the address text used for our link to the detail paginated report. The SelectedYear and SelectedCountry variables get those respective values, and the RptURL variable gets the report path from the parameter. The DAX expression then assembles the URL in the appropriate format.

As you enter the text, use the Intellisense and autocompletion features offered by the editor to build the DAX code. These features can help with proper code syntax and save you from typing errors.

7. Press Enter to validate the code and save the measure

8. With the measure selected in the field list on the right, switch to the Modeling ribbon

9. Use the Data Category drop-down to select Web URL (see Figure 4)

clip_image008

Figure 4

Add report links to the report

With the Detail Report Link measure added to the data model and categorized as a web URL, you can add a link to the table that will dynamically generate a context-specific link for report navigation.

1. Before you go any further, click the floppy disk icon on the toolbar to save your work

2. Select the table on the right side of the report canvas that shows sales by product category grouped by years

3. Drag the new Detail Report Link measure from the field list to the Values field well below the SalesAmount field. This adds a new column to the table visual showing a long URL in blue “link” text (see Figure 5)

clip_image010

Figure 5

Tip: It is easy to deselect a visual, so before you begin changing properties, make sure that you still have the table visual selected in the report canvas

4. In the left-most blade in the panel to the right of the report canvas, click the paint roller icon to see the properties for the table visual, shown in Figure 6

clip_image012

Figure 6

5. Expand the Values group

6. Locate the URL icon property and switch the state of the property to On

7. Verify that the web addresses displayed in the last column of the table are replaced with a small blue link icon, which you can see in Figure 7

clip_image014

Figure 7

Test and publish the completed report solution

At this point, navigation should work from the report in Power BI Desktop, just as it will after the report is deployed.

1. Hover over the link icon on a few different rows to see the parameters at the end of the address in the tooltip

2. Test the report navigation by clicking a link after noting the corresponding year and country.

3. Publish or save the paginated report to your report server or Premium Power BI tenant

4. Test drillthrough navigation in the deployed reports

5. Navigate to the deployed copy of the Power BI report

6. Locate a row with values for a specific year and country and click the corresponding link

The paginated report opens in the browser

7. Verify that the paginated report parameters reflect the selected year and country

clip_image016

Figure 8

Final Thoughts

Report navigation is the essence of data exploration and gives users the ability to understand their data in proper context. Dashboards and summary reports help business users see the big picture, but having access to details lets them understand the low-level data behind the numbers. Power BI is an ideal tool for summary-level, interactive reporting, and paginated reports are great for consuming detail-level data. Using this technique allows you to have both of these capabilities. Power BI reports can be published to your on-premises Power BI Report Server or to the cloud Power BI service. Paginated reports do not have to be in the same location and can be deployed on-prem to a standard SSRS report server or Power BI Report Server, or to your Premium capacity Power BI service tenant.

A couple of things to keep in mind when using this technique: there are string size limits when passing parameters in a web URL, and you should avoid passing sensitive information in clear-text parameters. When navigating between reports located on different servers or services, users may be prompted for their credentials to sign in to a service.

Power BI Performance Analysis

How can you find the cause of slow performance in Power BI? This post demonstrates some of the advanced tools you can use to test slowly performing queries and to do deep troubleshooting and performance analysis in your data models. Using the new Performance Analyzer in the Power BI Desktop May 2019 release, finding the cause of slowly-performing reports is easy.

The Performance Analyzer allows you to see the time it takes to run DAX queries for each visual and slicer on a report page. It also shows how long it takes to render the visual and for “other” activities such as web service calls and waiting for other blocking visual queries. After viewing the timings for each visual, you can export all the results to a JSON file for comparison and trend analysis.

The Performance Analyzer will be my first step for analyzing report performance. DAX Studio has several advanced features that are useful for performing deep analysis, troubleshooting and DAX measure development. It is a tool that requires more investment and some advanced knowledge to get beyond the surface of simple issues, but it has more advanced query tracing, development and problem-solving features.

Tell me about your experience. Please post your comments about your experience using the Performance Analyzer. What other techniques and tools are you using to find and troubleshoot performance?

Advanced DAX Training from Marco Russo in Portland: June 2019

It’s very rare to have the opportunity to learn DAX from the world’s leading expert but we will have that opportunity in Portland on June 19-21! The one-and-only Marco Russo is coming to Portland to teach The Italians’ Mastering DAX course. No one knows DAX like Marco and his ability to teach others about simple, advanced and uniquely insightful concepts is truly remarkable. DAX is simple but it’s not always easy. You can create some amazing calculations and analytical expressions with DAX, but many advanced concepts elude most users without some help learning the challenging parts. If you need to learn DAX to use in Power BI, SSAS Tabular or Excel Power Pivot – or to take your DAX skills to the next level – do not miss this training event.

While in town, Marco will also be the keynote speaker at the 2019 Pacific Northwest Power BI Symposium on June 18th. Last year’s event was a big success. We enjoyed hosting great speakers and a large number of local attendees last year and this event will be just as great.

Getting Started with Power BI Report Design level 101 – What Every New User Should Know

Let’s step through a few very simple things that every new Power BI user needs to know. Power BI is a great tool and it’s fairly easy to get started, but there are a few gotchas that everybody experiences. It’s easy to get frustrated with minor things until you master a few simple features in the designer. I’m going to walk you through a few of these items. It doesn’t matter how simple or how complex the project or your data model is, or what your data looks like. What I’m going to show you are some things that everyone experiences.

The first thing is that Power BI, like most business intelligence tools, really exists for the purpose of grouping and aggregating data. Now, that’s not to say that we can’t use it for detail-level reporting – and we can work with a lot of data – but by default, it wants to group and it wants to aggregate. Sometimes that’s not what we want it to do.

I have a table here that I’ve just created, and this is just to simplify things, but I’m going to work with a single table. You would experience the same thing if you were working with multiple tables in a data model. As you can see, I have a name column with the names of some of my favorite cartoon characters. I have a location and I have a numeric column that I want to report on. We’re going to call that my measure or my implicit measure column. It’s just a column of numbers. Just keep in mind that we’ve got three records for Fred, two records for Barney, two for Wilma, etc., and notice that those are in combination with different locations. Now I’m going to switch back to a blank report page. I would like to see the number of shells that are sold to a customer at a location, so I’m going to drag and drop the name column onto my report canvas and that’s going to create this table of names. The first thing that I notice is the text is very small so I want to bump up the text size. I’m looking up here… looking at all of my ribbons for a place to bump up the text and I can’t figure out how to do that. The first thing is that you need to make sure that you have this visual selected. This is a table visual – there you can see that it’s a table and I can move that around. It can get a little confusing because when the table is already selected, and I go to grab it, it doesn’t always move.

If you click off of the table and then hover over it, you can grab it with the left mouse button and move it around, but that might seem a little inconsistent. I can also resize it, but with this white table over the white background… it’s hard to see where that table is. I want to bump up the text so I’m going to go to the paint roller. You can see that this is the format option. It’s simple – once you’ve done it a few times. With this table visual selected, when I choose the paint roller, I get to the formatting options. I’m going to go down to “Grid” and here you can see that I can bump up the text size. So, I can just spin that up to, let’s say, 14 points. That’s a whole lot easier to read. Because this is showing the unique values in that column, I’m only seeing four of them right now. I’m going to add location. Now, I can drag and drop into the visual.

With the visual selected, if I’m looking at the field list, I can drag and drop into what we call the “field wells”. This is the field list (on the right of the panel)… these are the field wells for the selected visual (on the left of the panel). Now you’re seeing the unique combination of name and location. I’ll drag the Shells field into the field wells and notice that for Fred, I only see two rows. If I’m looking at the rows for Fred, how do I know if these are detail rows or if they’re being rolled up? We can go back and look at the table and figure that out pretty easily, but if I’m just coming into this data for the first time, I may not be able to tell how many rows or records there are in the underlying table behind this visual. How can I figure that out so that I can validate this data and make sure that my data is accurate? Well, the best thing that you can do is get a row count. There are a couple different ways to get a row count. If I were to drag and drop one of these fields that I’ve already used into the visual, I can use it to count rows. I drop this down and I can say “let’s count that”. I don’t want to count it distinctly… I just want to count it, and that gives me a row count because it is the count of the number of occurrences of location. Here you can see that this row in the table visual actually represents two records in the underlying table, and that kind of solves this puzzle for me by letting me know that 104.08 is actually the sum of the shells for two underlying records where the name is Fred and the location is Bedrock. There are two records that represent that one row in the table visual. Okay… so that solves mystery number one.

Now another thing that I can do is drag and drop location onto the report canvas. Then I can switch this to Count and that just gives me an overall count for the entire report. I’ll switch that to a card visual, which makes it a little bit bigger, so now I can reconcile the number of rows that I see here (which is six) against the number of records in my table (which is 8), knowing that there’s at least one row here that consists of two or more underlying records in the table. That’s an easy way to resolve that.
Alright, item number two is how do I know where these visuals are on the page? I see that I’ve resized this visual and there’s whitespace all over the place and that’s not very intuitive. What I like to do in new reports is to choose the background of the report canvas and go to the paint roller (make sure that we’re looking at page information), and you should see that I have this property group called “Page Background”. Again, that’s because I just clicked on the background of my page. I’m going to go to the page background and I’m going to set the background color to a light shade of gray. I notice that it doesn’t change and that’s because, by default, the background of a page has 100% transparency. Why? I don’t know, but if I change that, you can see that it is now a very light shade of gray. You see that my table also is transparent, so I’m going to go to the paint roller and I’m going to turn the background on, which does show me a slightly different background color. The default background is going to be white. I’m just going to double check that it is white. I’ll set the background transparency to 0%. I prefer to see solid-colored visuals sitting on a light gray background because now I can see where things are and it’s much easier for me to differentiate where those visuals start and stop. That goes a long way toward helping my users understand what they’re seeing on this report.

What if I didn’t want to roll up the name and location to a single row on this table? How can I see my detail records regardless of whether the name and location are the same or not? The fact of the matter is that Power BI is always going to group on any fields that have the same value – so you have to have a unique column. You have to have a unique value that you can group on and that is the purpose of this Unique ID column. Before I show you that I can use that Unique ID to show detail rows, I want to show you something that happens with any numeric column by default in Power BI. Numeric columns are always automatically set to summarize. That’s what the little Sigma icon just to the left of the field means. We do want the Shells column to be rolled up and summed, but not the Unique ID column; we want to be able to group on this Unique ID and use it to make sure that we get our detail rows. I need to come over here to the Modeling tab, drop down the “Default Summarization” list and set this to “Do not summarize”. That gets rid of that little Sigma icon and now, with my table selected, I drag and drop that field… (oh, look at that, it’s going to give me a count by default so it’s even overriding my summarization). I have to tell it not to summarize right here in the field list. Now I’m actually getting one row in the table visual for every record in the underlying table and that’s exactly what I wanted. Unfortunately, there’s no way to hide or get rid of this Unique ID column. If I didn’t want to see it, I could just make that column wide and resize the table to hide the header text.

Hopefully, this helps dispel some very simple quandaries that a lot of new users struggle with just a little bit. Best of luck as you’re getting started with Power BI!

Data Geeks and Data Jocks Meetup in Vancouver, Washington

Our little “Data Geeks” Meetup group in Vancouver, Washington is growing with a line-up of heavy-hitters and bigger crowds.

This month, on April 16th, we are honored to welcome Reid Havens, Microsoft MVP and Business Intelligence superstar from the Seattle area. Reid will be presenting “Demystifying Chart Types & Report Design Principles”. Reid is the Founder of Havens Consulting Inc. He has a formal background in technology and organizational management, having obtained separate Master’s Degrees in Organizational Development & Business Analytics. He has experience working with many of the Fortune 50, 100, & 500 companies. Additionally he guest lectures at the University of Washington and other universities, teaching Business Intelligence, Reporting, & Data Visualization.

Please RSVP on the user group Meetup page for the event. Thanks to Lexicon Solutions for sponsoring this month’s meeting. Please visit their site if you would like to upgrade your career or are seeking technical talent.

We’re seeing considerable cross-pollination of complementary user groups in the greater Portland area, which gives professionals in the region several quality, free learning and networking opportunities every month. These groups include the Oregon SQL PASS Chapter, which meets in downtown Portland on the second Wednesday evening, and the Portland Power BI User Group, which meets in Beaverton on the fourth Wednesday evening of each month. With the Vancouver Data Geeks meeting on the third Wednesday evening, if you attend these data-centric user groups three out of four Wednesday evenings each month, you will be smarter than you can imagine and should have immense career opportunities!

In February, we had international trainer and well-known conference speaker Peter Meyers share his Power BI insights with a crowd so large that we had to find a larger venue. Indeed, our Southwest Washington technical community is growing. Little did we know that if you search the Meetup site for ‘Data Geeks’ in Vancouver, you might find the “Vancouver Data Geeks” group in that other Vancouver way up north of the 49th parallel. We didn’t plan that, nor did they, but it has been a minor point of confusion. We may be the smaller of the Vancouvers but we are making a name for ourselves, even in the shadow of Portland, Oregon; our big sister city to the south, just over the river.

Thank you all for participating in our growing community and we look forward to seeing you on April 16th for Reid’s session at the Interject Data Systems Office in Salmon Creek, right off I-5, Exit 7A.

Azure Everyday Roundup

Short (2-4 minute) video posts to help you learn about the Microsoft Azure analytics and data platform.

In November of last year, I joined the Pragmatic Works consulting team as Principal Consultant where I work alongside several great professionals, many of whom I have known and worked with over the years. One of the many things I appreciate is the deep expertise and collaboration we share across this diverse team of professionals. In early 2018, we began sharing a short video post every single day in a series called “Azure Everyday”. I’ve contributed several of these video posts in the past couple of months that I’d like to share with you here. I encourage you to check out the complete collection of Azure Everyday recordings using the link at the end of this post.

Each video post is a short insight or learning about a unique capability in Microsoft Azure. These short videos are a great way to become familiar with all of the Azure services and new features in the Microsoft cloud data platform.

New Features That Help Manage Large Volumes of Data

Data Governance with Power BI Premium

Azure Data Lake Storage to the Rescue!

Modeling Options for Enterprise Scale BI Solutions

Paginated Reports Integrated into the Power BI Service

To see the entire collection of Azure Everyday video posts, visit this page on the Pragmatic Works blog site.

Power BI at Enterprise Scale

Last night I was scheduled to present this topic at the Power BI User Group in Portland, except the meeting was cancelled because of a snowstorm; except the snowstorm didn’t happen… not even a little. In fact, the roads were so clear and traffic was so light (because everyone stayed home from work to avoid the alleged storm) that I nearly got a speeding ticket on the freeway during what would have normally been rush hour traffic. I made the 60 mile round trip from my house to the venue and back in under an hour.

Since I had prepped up my presentation, I decided to record and post it here on my blog. Please like it on YouTube but post your questions here so other community members can see them. I’ll add the transcription below.

This presentation demonstrates how to implement incremental data refresh and discusses a recent reference project that required versioning and managing high-volume data. We discuss: composite data models, dual-mode tables, and partitions & parameters in Power Query.

We’re going to talk about new features in Power BI specifically designed for managing large-scale, serious IT-driven and managed solutions. This presentation was actually prepared for a live session at our local Power BI user group meeting which was cancelled due to a snowstorm that didn’t occur. The pop-music references were there to incent participation and there were prizes for correctly guessing the artists of the songs that I referenced – just for fun.

What’s this all about?
We’re going to talk about BI models… the way that we did serious data modeling in the past, today and where that’s going in the future. We’ll talk about some new and emerging, and some existing, Power BI enterprise features, which includes different modeling choices, how the design tool features are merging and where that’s heading over time. We’ll talk about working with data in high volume and massive scale; and then I will show you a reference project which is a real project that I worked on recently. I’m actually using screenshots of that project simply because there was a lot of sensitive data. A live demo of a sample project will follow.

We’ll talk about composite models and then I’ll demonstrate using parameterised queries – which you can use even if you’re just in a regular shared capacity tenant without premium features. And then incremental refresh policies, which are a premium feature, to take that to the next level.

The Way We Were
So let’s think about the song title first and we’re going to talk about the way that serious BI solutions used to be and then we’ll talk about where that’s going in the wake of Power BI. So, The Way We Were of course… that’s a reference to a song sung by Barbra Streisand for a movie that she co-starred in with Robert Redford of the same title.

Serious BI solutions have in the past used SQL Server Analysis Services for data modeling and storing that data, and of course we’re talking about analytical datasets that are used for analytical trend reporting – the kinds of things that we typically use tools like Power BI for. In the past, Analysis Services was installed with SQL Server onto an on-premises server. You pay for that server; you not only have to buy the hardware and manage, optimize and tune it, but there are also licensing costs associated with SQL Server outside of Power BI. Typically, the person who’s going to design and manage those data models is a person like this. They’re a very technical person. It’s a fairly complicated thing. It’s time-consuming and these people generally work for an IT organization. That’s the traditional approach.

Times, They Are A-Changin’
What’s the song reference? …of course that was a song performed by Bob Dylan in the 60s.
Visual Studio has long been the design tool of choice for serious semantic data model design and, until as recently as a year ago, that was my recommendation. If you’ve got serious, large-volume data, you need to protect that data; it needs to be a governed dataset that is managed seriously within the organization. That’s typically going to be owned by IT; they’re going to design it, deploy it and manage it, and then users can connect to it – or IT will build reports that users can use. That’s the way things have been, but things are changing.

Come Together
Tool sets are coming together and of course “Come Together” is a reference to a song by the Fab Four, the Beatles.
What’s coming together? …the features of Visual Studio SQL Server Data Tools (SSDT) for Analysis Services tabular and Power BI Desktop are coming together. Now it’s a bit of an evolution. We’re seeing features added to each of the tools but they’re definitely converging. We would typically use the SQL Server Data Tools add-in for Visual Studio to create tabular data models like this, but now we have these same capabilities – and even more capabilities – in Power BI Desktop. Power BI Desktop is really kind of becoming the preferred tool.

That Was Then But This Is Now
You’d have to have grown up in the early 80s to get this song title. This was a song, and the name of an album, from a pop/dance group in the 80s called “ABC”. They did some great songs like Poison Arrow and When Smokey Sings, but things are changing and we’re now at a place where Power BI is moving at a very rapid pace. The development of that product is very quick and the product team will focus on adding features to the product that they own and manage; and that is Power BI Desktop. In order for this to be a cohesive development experience, and for Power BI Desktop to be a tool that is aligned with the Power BI service, Power BI Desktop is really becoming the recommended enterprise BI development tool. Now, that’s not to say that there aren’t features and capabilities of SSDT in Visual Studio that may be preferable in some cases, but today Power BI Desktop is the tool that I typically go to first before falling back on Visual Studio. So, think about that experience: we author in Desktop. We manage all of our development in Desktop; our queries, data sources, security, the data model, calculations and the report visuals. All of that is done in Desktop. We deploy those assets up to the cloud, into workspaces in our Power BI tenant.

Peeking behind that cloud out there… that’s not the Sun that you see; that’s actually the Analysis Services tabular engine. That’s right: Power BI in the service, up in Azure, actually has an instance of Analysis Services tabular that manages our data models. This is the same analytic database engine that’s available when you install Analysis Services on your own server with the SQL Server install bits. The enterprise features of SSAS just have to be lit up through the Power BI tools that we’re using.

What enterprise features are we talking about? Well, we’re talking about DirectQuery, which is a capability that allows you to not import data into your data model but store metadata about tables, so that when users interact with reports, queries are generated in the native query language of the data source – whether that’s SQL Server, Teradata, Oracle or whatever – and those queries are actually processed by that data platform and results are sent back to your Power BI reports. There are trade-offs but it is still viable with some important features.

Composite data models give us the ability to use the capabilities of DirectQuery along with the capabilities of imported/cached tables as well. There are performance trade-offs with DirectQuery, and when we’re working with data at scale – very large volumes of data – that’s going to impact performance. The impact is most drastic with DirectQuery, but it can also affect imported/cached tables.

Aggregations give us a mechanism for speeding up those queries in our reports by actually storing aggregate values at different levels of grain. I’ll show you what that looks like. Incremental data refresh policies then allow us to automatically generate partitions over our tables for a couple of reasons: first, so we don’t have to run huge queries to reload high volumes of data, and second, so that we can break down the data stored for those tables and don’t have to reload unchanged historical datasets. We can refresh only the new, and perhaps more recent, data for our large tables.

The XMLA endpoint… as of this recording, this is still a forthcoming feature. We know that the product team is actively working on the XMLA endpoint for Power BI, which will give us the same capabilities we have for Analysis Services today, so that third parties can develop tools, we can write PowerShell scripts, we can do automation, and we can generate scripts. We can do all of those things against Power BI that we have access to with Analysis Services but haven’t had access to in the past. This capability will really open up the gate for serious automation, DevOps, build management and version control. Those things are going to become much easier when we have this programmatic and metadata layer access to data models.

If you’ve attended a Microsoft event or a conference where the Microsoft product team folks have presented about Power BI recently, you’ve probably seen this demonstration.
How is this possible that in a Power BI report we can surface six trillion rows of data?

Most of us don’t work with datasets this large but this is an excellent example of how these new features can enable such capabilities and we’re going to talk about how that works. Before we talk about enterprise scale, let’s talk about the limitations of different levels of Power BI and Analysis Services.

First of all, it’s important to understand that when working with Power BI data models and Analysis Services tabular data models (which technically are the same thing), the data is always compressed. The compression ratio (the degree to which the size of your data model will be reduced) depends greatly on characteristics of your data. Correctly modeled data will allow you to achieve compression ratios of typically up to about 20x. We generally expect to see 5x, 10x and 20x compression, but I’ve seen tables that compress much smaller: let’s say 50x or even 100x, although that’s not very usual. This depends on characteristics like the sparsity of the data, duplicate column values, the data types and granularity of key values. The more sparse the data, the more duplication or empty values there are in columns, and the less granular specific values are, the more that data is going to compress.
Model Size Limitations

Thinking about compressed data sizes, a standard shared capacity Power BI tenant will allow you to upload up to a one gigabyte PBIX file (that’s again, compressed data). We want to reduce the size of these files as much as possible and especially reduce the size of the file that you actually upload. If we can perform a data refresh that will then expand the size of the data, making all necessary data available after it has been refreshed, that’s better than having to refresh your Desktop file before you upload it.

What about if we are dealing with more data than one gigabyte?
In Power BI Premium dedicated capacity, a premium tenant will allow you to upload and store datasets of three to twelve gigabytes. This depends on the level of the premium capacity subscription that you pay for each month. The largest capacity will allow you to upload a file that is up to ten gigabytes in size; when you refresh it, it can expand to up to twelve gigabytes. That’s actually a lot of data for most solutions. If you need to exceed that, though, you can use Azure Analysis Services in the cloud (so no hardware to buy and nothing on-prem). It is in Azure, which means that Azure Analysis Services is a neighbor to Power BI – possibly in the same data center – so connections between that data are very fast. Depending on the service level, you can store up to 400 gigabytes of compressed data in Azure Analysis Services.

On premises, if you have Analysis Services installed on your own server: if you use SQL Server standard licensing, you are limited to 16 gigabytes per database.

If you use enterprise licensing, there is no stated limit. The product itself can deal with any size dataset but you’re limited just by your hardware. Think about modern server hardware, where we can have very large shared storage. Of course, memory is a limitation for the tabular engine, but modern, well-equipped servers can have 1 to 4 terabytes (or more) of RAM. That’s a lot of memory that can manage a lot of data… so really, there is virtually no limit to the size of the dataset that you can manage in SQL Server Analysis Services with enterprise licensing.
Let’s talk about how to optimize that experience. Another song reference…

Blurred Lines
Of course, that refers to the dance song from Robin Thicke and Pharrell Williams, who were famously sued by the Marvin Gaye family for copyright infringement.
What do I mean by “blurred lines”? Well, the distinction between Analysis Services and Power BI is becoming blurred because Power BI uses the Analysis Services tabular modeling engine. In the past, Power BI had a subset of those features and now it has a superset of many of them, but Analysis Services also has features that Power BI doesn’t have. The two really are converging, feature-by-feature, as features are added to each of those services and tools. There is definitely a shift from Analysis Services to the Power BI toolset and the Power BI service in terms of the preferred tool to use for developing solutions.

I want to cite a reference solution that I recently finished up for a client. I’m not going to show any sensitive data here but I did want to show the row counts of various tables. These table sizes have actually increased since I captured this screen but you can see that we have some relatively large tables here… just shy of five million appointments; just over six million charge capture records (which is an extension of patient encounters), about 20 million test results. The largest table in this model today has about 30 million records. The “person-measures” table in the bottom row had 16 million at the time that I captured this but that’s up to about 30 million now.
In this reference solution, how did I design for scale? There were several high-volume tables that I filtered using parameters. That was to reduce the working dataset size and the size of the PBIX file: 1) to speed up development, and 2) so that it didn’t take forever every time I deployed a new version of that dataset file.

I’ll point out that in new Power BI projects I separate the dataset and report files. I create two different PBIX files. There are some minor challenges in doing that so actually I start with one PBIX where we manage the queries, the data model, measures (calculations) and the reports. But then when we get ready to transition to a true version management & ALM model, I create a separate copy of the PBIX file where the reports are connected to a separate, published dataset. That copy of the PBIX file only has the reports in it. You can actually copy and paste visuals from one Desktop file to another now. In the end we only have the queries and data modeling calculations in one file, and the reports in another file. That makes it a whole lot easier for users to do their own reporting as well. The dataset PBIX file can be large and I use parameters to reduce the size of that file. Then we can fill it up with data when we do a full data refresh, once it’s been deployed to the service.

How are parameters used to manage the dataset size? Well, we define two parameters. If you look at the documentation for the new incremental data refresh policy feature, you define two parameters called RangeStart and RangeEnd. They have to have these case-sensitive names and they have to be date/time datatypes. Even if you’re working with date-only data, they must be date/time types in order for this feature to work correctly. You then give them default values. Here, I used a range of one year: January 1, 2018 through December 31, 2018. Then, I imported the tables from views.
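For reference, the two parameter queries end up as simple M expressions. The following is a minimal sketch using the dates from the example above; the meta record is roughly what the Manage Parameters dialog generates for a required date/time parameter:

    // RangeStart – must be a date/time value, even when the data is date-only
    #datetime(2018, 1, 1, 0, 0, 0)
        meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]

    // RangeEnd
    #datetime(2018, 12, 31, 0, 0, 0)
        meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]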

This is key: database professionals tend to try to solve problems in SQL, which is fine because that’s an effective tool for joining tables together, creating aliases, doing groupings and aggregations. All of those things can be done in SQL. However, for this and many other features to work correctly, Power BI prefers – and in some cases only works with – single database objects that it perceives to be tables. That could be a table, a view, or a table-valued, user-defined function. As you can see here, once I’ve connected to the data source, I have this list of views (I actually wrote a view for every single table that I wanted to import) and then I just select those views rather than using inline SQL. You need to select database objects from the source and not write any inline SQL queries. Again, I created views to consolidate database queries on the server and we actually created a separate database just to contain the views. This was so we didn’t have to put those views into a vendor database. Next, we create a row filter step. That’s very simple: you go to the column that is a date/time datatype (this is our partition key that we want to filter on to create partitions) and just create a date filter on that column.
The row filter must use this logic on the column: is after or equal to RangeStart, and is before RangeEnd. Again, the column has to be date/time just as the parameters are date/time – even if you’re just dealing with dates, it has to be date/time. You want to perform that filter as early in the query as possible (by “query”, I’m referring to the Power Query query in Power BI Desktop) and the reason that we do that is that we want the filter to get folded into a native query. This only works with data sources that support query folding, and there are many data connectors that do. Query folding means that the “M” language query in Power Query gets translated into the native query language of the data source. In this case (since we’re using SQL Server), the native query language is T-SQL. If it were Oracle, it would be PL/SQL and if it were Teradata, it would use the Teradata SQL dialect, etc. Here, you can see the predicates down at the bottom of this native query that I’m viewing. It says WHERE encounter timestamp is greater than or equal to… and less than… the parameter values. Power Query generates the folded native query with this filter, and if you have steps later in your query that break query folding with some complex transformations, things still work as they should. It’s OK because the query has already been folded at this stage and the results are cached in memory for later steps to be applied. Power Query works against that cached result set produced by this folded, parameterised filter query.
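To illustrate, here is a minimal M sketch of a query shaped this way. The server, database, view and column names are placeholders (not the actual names from the client project), but the filter step follows the pattern described above and folds into the WHERE clause of the native query:

    let
        Source = Sql.Database("MyServer", "ReportingViews"),
        // Reference a view instead of writing inline SQL so the query can fold
        Encounters = Source{[Schema = "rpt", Item = "vwPatientEncounter"]}[Data],
        // Filter as early as possible; this step folds into the native query
        FilteredRows = Table.SelectRows(Encounters,
            each [EncounterTimestamp] >= RangeStart and [EncounterTimestamp] < RangeEnd)
    in
        FilteredRows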
Within the data model, we define an incremental refresh policy. That’s done within the data model designer and not in Power Query. The feature will only light up and let you define an incremental refresh policy when you have this filter in place referencing this pair of parameters. You define the period number and period type. In this case, I’m saying five years. If I were to say sixty months, it would generate sixty partitions instead of five partitions. I actually had a table in this project where the SQL Server query plan was so poor (because so many tables were being joined together and there were some indexing issues in SQL Server) that the queries were very slow and the one-year partitions would time out. By using months instead of years to partition, the Power BI service generated a series of smaller queries that didn’t time out, and that solved the data refresh problem. You can optionally detect changes using the check box you see down at the bottom of this dialog. If you select that option, it prompts you to select a column in the table with the last update date. This would typically be a timestamp-type column that a database designer would use for ETL maintenance. This will detect changes and automatically refresh those changes.

To Cache Or Not To Cache
Even though the spelling is different, this is a reference to the late great Johnny Cash, just for fun.
Imported, in-memory data tables provide high performance. Okay, so this is an important learning about using the VertiPaq Analysis Services engine: in its most native mode, where you import data into the model and it is held in memory, you can expect high performance, but the trade-off is that you have to have a place in memory to put all of that data. DirectQuery tables, by contrast, offer scalability; if you have massively large data at the data source, it may not be feasible to import all of that data. There’s a lot of goodness that you get if it is possible to import all the data into an in-memory data model. It makes things simple, development is faster, performance is going to be good and your projects are uncomplicated. All good news, but there are cases where there is a bonafide reason to use DirectQuery to deal with data at scale, perhaps using unique data sources that deal efficiently with large-volume data. There’s a lot of goodness in this option as well, but there’s a real trade-off between those two different options.
Now, we have composite data models. These are the best of both worlds with what we call “dual mode” tables and optionally predefined aggregations.
So, what does that look like? In the past, the only option was to use one or the other – either DirectQuery or import. You could not combine tables in both modes into a single data model. Today we have the ability to create composite models where tables can either be DirectQuery or imported. It’s important to note that you can’t take an import table and convert it to a DirectQuery table, but you can take a DirectQuery table and convert it to an import table. Composite models provide performance for cached in-memory tables but also provide some flexibility, supporting high volume and near real-time access from data sources. DirectQuery has some performance trade-offs and some calculation restrictions; there are certain DAX functions that will either not be supported and not work at all, or will just perform very poorly because of the way that they have to handle a lot of data at aggregate. There are reasons that DirectQuery has been kind of a difficult pill to swallow in the past. Aggregations allow us to mitigate some of these performance concerns. The idea behind aggregations is that we define a virtual table in the data model that is the aggregate of values grouped by a set of keys. As you can see, sitting between the Date and Geography tables we have this aggregate table that we’re calling Sales Aggs, and if the grain of the Date table is one row per day, and the grain of the Geography table is City, then we’re going to have one row in the Sales Aggs table for each combination of Date and City with grouped and pre-aggregated values. When a user navigates to a visual or uses a report that is grouped on Date and City, they’re going to get that pre-aggregated value rather than having to go back to a fact table and perform the aggregation on-the-fly. If that fact table were DirectQuery, then it’s not going to have to go out and run a GROUP BY SQL query against the data source. So, this resolves performance issues by storing the cached aggregate results, and the aggregate tables are already rolled up at a certain level of grain. You can have multiple aggregate tables. This all requires some planning and advanced design skills. Typically, you’re going to use aggregations to solve performance problems. Perhaps, after we have years of experience with this feature, we’ll be able to look at a model design and say “OK, we ought to put an aggregate table there”, but today, we see that there’s a performance issue and say “let’s add some aggregations to take care of that performance problem.” It’s a brand new feature and it does work well. We just don’t have much experience with it yet.
As you can see in this image, we specify the columns for a table that we want to group on. This is much like a lot of user experience interfaces that we already have in Power Query and in query building tools. We just say “I want to group by the Order Date, the Customer and the Product Subcategory and then I want to aggregate the Sales Amount and the Unit Price using sums.”
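For context, an aggregate table like this is just another table in the model; the mapping to the detail table is configured in the Manage aggregations dialog rather than in code. If you wanted to build the rolled-up table on the Power Query side, a rough sketch (with hypothetical table and column names against the Contoso DW sample source) might look like this:

    let
        Source = Sql.Database("MyServer", "ContosoDW"),
        Sales = Source{[Schema = "dbo", Item = "FactOnlineSales"]}[Data],
        // Group by the keys that define the grain of the aggregate table,
        // summing the measures we want pre-aggregated
        SalesAggs = Table.Group(Sales,
            {"OrderDateKey", "CustomerKey", "ProductSubcategoryKey"},
            {{"SalesAmount", each List.Sum([SalesAmount]), type number},
             {"UnitPrice", each List.Sum([UnitPrice]), type number}})
    in
        SalesAggs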
You can create multiple aggregate tables to support different levels of detail or grain. Since the aggregate values are stored in the model, this does increase the model size, but if designed well, the storage cost shouldn’t be significant. The purpose is to improve performance and mitigate the performance handicap of DirectQuery tables. You’re not limited just to DirectQuery, but that’s a very common use case.
Something to keep in mind is that these aggregates are stored in the model and, of course, DirectQuery results are not stored in the model; therefore there is an opportunity for the DirectQuery tables and the aggregate tables to become out of sync, so refresh often and manage expectations around that. Just know that this is a trade-off of using aggregations with DirectQuery.

Show Me The Way
What’s the song reference? I’m gonna show you a picture of the group but I’m not going to tell you who they are yet. What are we talking about? Well, these are our demos, so I’m going to demonstrate a couple of things. I’m going to show you query parameterization, which is not limited to premium capacity tenants. Then, I’m going to show you incremental refresh policies, which are kind of an extension of query parameterization. Incremental refresh policies are a premium feature.
Show Me The Way was a great song in the 80s from the group Styx. Peter Frampton also had a hit with a song by the same name in the mid 70s. Alright, let’s take a look at our demos. This is a Power BI project that I’ve created with data from an on-premises SQL Server database… the Contoso DW database, with retail sales information here in Power Query…
(demos)
Thanks for watching. Contact information is on the last slide.

It’s about solving business problems & meeting people… Allen White interview

I’ve been remiss about blogging for a little while and have some great stuff to share, starting with this interview with Allen White at the PASS Summit conference in November. Allen is a 15-year PASS veteran and one of the most consistent SQL Server experts in the industry. He’s a powerhouse of knowledge, experience and wisdom. He’s a long-time SQL Server and Microsoft Data Platform MVP and owner of DataPerf Professionals Consulting, mentor, coach, trainer, co-founder and leader of the Ohio North SQL Server user group. He’s served on the PASS Board of Directors as the Program Director.

Allen shared some pearls of wisdom in this sit-down interview during PASS Summit 2018 in Seattle. He knows a lot about SQL Server tools and his first bit of advice for industry newcomers… “it’s not about tools.”

P: Allen, as an experienced leader in this industry what advice would you give to someone breaking in the industry or really thinking about planning their career around the Microsoft data platform?

A: What people just starting out in the industry tend to do is focus on the tools and the technologies – that’s the exciting thing. The problem is that the focus really needs to be on the business problems that need to be solved. The tools are there to solve business problems; not in and of themselves. So, focus on what it is about the business that makes the tools useful to solve those business problems.

P: OK, and… so for a student or someone who has not yet been immersed in facing business users and business leaders in their career; where does a person go to get that experience? What do they look for?

A: It’s a function of studying what it is about a particular line of business that… poses the challenge. So, a transportation industry example… You’ve got trucks and they’ve got to get from point A to point B and you’ve got to measure that. What tools out there can measure that? Do they have built-in GPS? Are they able to send those GPS signals to some centralized device that can report on and track the progress of those vehicles, etc.? …those kinds of things. Think about the business and the information needs that will help that business succeed.

P: Okay, good. What are you looking forward to most about PASS Summit this week?

A: Meeting the people. People make all the difference. My first Summit was 15 years ago and I’ve been to every one since and it’s always about the people.

Learn more about Allen and the PASS organization here.

Interviews from PASS Summit 2018: Matthew Roche, Matt Masson, Adam Saxton & Patrick LeBlanc

I had a chance to catch up with two of the three Matts from the Power BI product team at PASS Summit 2018.  Speaking with Matthew Roche and Matt Masson about the new Power BI Dataflows announcement, they told me that Power Query is becoming a truly enterprise-class data transformation technology. Matt Roche explained that not only is Dataflows a cloud-based implementation of Power Query with a browser-based query designer but it allows you to enforce business and data quality rules using strongly-typed entities in the Common Data Service.

Power BI Dataflows & Self-service Data Warehousing

Matt Masson told me that the expansion of Power Query into multiple products and services demonstrates how truly powerful the Power Query platform has become. He said that behind the simple-to-use user interface is an extremely flexible data transformation language, and underneath the M language is a great, flexible dataflow engine.  He’s currently working on the Intellisense features of the Power Query M language advanced editor. Matt Masson also said that he is expecting more third-party vendors to create custom data connectors for Dataflows and the Common Data Service.

I also caught up with Adam Saxton and Patrick LeBlanc from the Power BI Customer Advisory Team (CAT).  You may recognize Adam and Patrick from the “Guy In A Cube” video series.  Patrick described Power BI Dataflows as “self-service data preparation for the masses”. Adam added that Dataflows takes Power Query to the next level by writing data to tables in Azure Data Lake Storage rather than directly to a VertiPaq data model.  This capability enables the Power BI platform to create a truly enterprise-scale data warehouse and Business Intelligence solution.

I asked Adam what skills are most important to someone growing a career in IT-based Business Intelligence.  He said that the most important thing is understanding proper data modelling in data warehouse design.  Power BI by itself is a great platform downstream from data sources or an existing data warehouse.  Dataflows and the CDS fill an important gap in self-service BI reporting & data analysis.  These technologies provide a common platform for data preparation, data warehouse design and standardized data entity management for self-service and BI reporting.  Dataflows shows a lot of promise for serious, next generation data warehouse solutions.  Stay tuned for more on this as I explore and prepare demos with Dataflows, Azure Data Lake Gen2 Storage and the Common Data Service.

Data Model Options for Power BI Solutions

At the heart of every Business Intelligence reporting solution is a data model to optimize queries and enable ad hoc report interactions.  Data modelling technology has evolved quite a lot over the past 20 years or so.  You have several modelling technology choices, with options depending on the simplicity or formality of your project and factors like data size and security.  In the past, choices were simpler.  Power BI was the choice for smallish, “good enough” projects; and when data quality, high volume and exactness were the priority, Analysis Services (SSAS) was the better choice.  Now, using Power BI for modelling larger data sets is even advantageous, with new features like hybrid models, aggregations and incremental data refresh.  Just in the past year or so, enhancements have been added – or are in the process of being added – to the Power BI platform, so the list of options and considerations continues to grow.  Here’s a quick (and I mean quick) summarized list of the modelling options for Power BI solutions:

  • Power BI Desktop model published to the Power BI service – limited to 1 GB model size, no partitioning.  Pay per user with Power BI Pro licenses.
  • Power BI Desktop model published to a Premium Capacity Power BI Service – 12 GB size limit, dataflows & numerous “grown-up” features.  Unlimited read-only domain users.
  • Power BI Desktop model published to on-prem Power BI Report Server – limited to 2 GB model size, no partitioning.  Prior version & feature restrictions.
  • SSAS on-prem or AAS using report published to PBIRS – no model size limit.  Prior version & feature restrictions.
  • SSAS tabular or multidimensional on prem connected to Power BI report in service via the data gateway.  No size limits.  License server cores for SSAS.  License Power BI using Pro or Premium.
  • Azure Analysis Services = SSAS tabular.  No size limit, pay as you go and scale as needed.
  • Power BI reports in the service or PBIRS using DirectQuery to connect to data source with no cached model.  Report interaction queries may be slow, with some calculation limits.  *Aggregations & hybrid model design can optimize & reduce query latency, with moderately complex model development overhead (Premium features).

There are numerous advantages and disadvantages with each option, so the best choice isn’t always clear, but we can certainly compare the pros and cons.  I’ve been on a quest to find the best medium to break these choices down into simple terms but it truly is a journey and not a destination.  Along this journey, earlier this year, I presented a session called “The Nine Realms of Power BI” where I enumerated different categories of solution architectures for Power BI solutions; and they are numerous.

One important consideration is Microsoft’s commitment to support a product with new features in the future.  They have made it very clear that the Power BI platform is their primary focus and that they will continue to invest in enterprise-scale capabilities in the cloud service.  Never plan on a crucial feature being added later to a product but give serious consideration to  where a product is headed.

Making a side-by-side comparison of features between products and options is a little like comparing oranges, apples, grapes and bananas.  As a best effort, I started with the feature lists for SQL Server Analysis Services and added the Power BI variations.  Admittedly, this type of comparison doesn’t fit perfectly into this type of pivot format but I think it serves the purpose.  This post will likely evolve a bit with time.  Your feedback and input are welcome.

Some additional resources from trusted colleagues: Matthew Roche from Microsoft recently posted this article.  He refers back to this blog post and mentions an excellent session at the recent PASS Summit from Gabi Münster and Oliver Engels.  On Matt’s post, he links to the session slide deck that Gabi and Oliver shared.  An official whitepaper written by Chris Webb and Melissa Coates covers Power BI enterprise deployment options in 105 pages.  Keep in mind that many new features have been added since this paper was written, but the whitepaper will likely be updated in the future.  Matt also references another recent post on this topic by James Fancke where he contrasts features and costs.  This is, no doubt, a timely and important topic.

Guy-In-A-Cube’s Adam Saxton and Patrick LeBlanc have posted numerous short videos highlighting specific features of Power BI with Analysis Services and here are a few to get you started.  

Side-by-Side Feature Comparison

In the following table I compare capabilities, starting with SSAS Tabular and SSAS Multidimensional, with Power BI Pro and Premium capacity.  My blog theme doesn’t currently support wide tables like this (apologies for the overlapping text) so you can open or download the Excel file here.  I hope this is helpful.  I’ll work on updates to optimize this comparison chart as features are added and time allows.

| Feature | Enterprise/Developer | Standard | Azure Analysis Services | Power BI Standard | PBI Report Server | Power BI Premium | Comments |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Max model size (compressed) | No limit* | 16 GB | No limit* | 1 GB | 2 GB | 10-12 GB* | *Premium supports 10 GB upload & 12 GB refresh |
| Hybrid models (DirectQuery & cached mode) | No | No | No | No | Yes | Yes | |
| M/Power Query sources & transformations | Yes | Yes | Yes | Yes | Yes | Yes* | *Query extensions in the service using dataflows |
| Integrated application lifecycle management (ALM) tools | Yes | Yes | Yes | No | No | No | |
| Integrated version control tools | Yes | Yes | Yes | No | No | No | |
| Tabular Models | | | | | | | |
| Programmability (AMO, ADOMD.Net, OLEDB, XML/A, ASSL, TMSL) | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Third-party tool support, new XMLA endpoint for Power BI |
| Hierarchies | Yes | Yes | Yes** | Yes* | Yes* | Yes* | *Simple hierarchies, **AS supports HideMemberIfBlank |
| KPIs | Yes | Yes | Yes | No | No | No | |
| Perspectives | Yes | | Yes | No | No | No | |
| Translations | Yes | Yes | Yes | No | No | No | |
| DAX calculations, DAX queries, MDX queries | Yes | Yes | Yes | Yes | Yes | Yes | |
| Row-level security | Yes | Yes | Yes | Yes | Yes | Yes | |
| Multiple partitions | Yes | | Yes | No | No | Yes* | *Incremental refresh builds partitions |
| In-memory storage mode | Yes | Yes | Yes | Yes | Yes | Yes | |
| DirectQuery storage mode | Yes | | Yes | Yes | Yes | Yes | |
| Multidimensional Models | | | | | | | |
| Semi-additive measures | Yes | No 1 | Yes* | Yes* | Yes* | Yes* | *Using DAX code, Effort: moderate |
| Hierarchies | Yes | Yes | Yes** | Yes* | Yes* | Yes* | *Simple hierarchies, **AS supports HideMemberIfBlank |
| KPIs | Yes | Yes | Yes | No | No | No | |
| Perspectives | Yes | | Yes | No | No | No | |
| Actions | Yes | Yes | *Using 3rd party tool | *Possible, limited | *Possible, limited | *Possible, limited | |
| Account intelligence | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Using DAX code, Effort: high |
| Time intelligence | Yes | Yes | Yes | Yes | Yes | Yes | |
| Custom rollups | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Using DAX code, Effort: moderate |
| Writeback cube | Yes | Yes | No | No | No | No | |
| Writeback dimensions | Yes | | No | No | No | No | |
| Writeback cells | Yes | Yes | No | No | No | No | |
| Drillthrough | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Multiple techniques |
| Advanced hierarchy types (parent-child and ragged hierarchies) | Yes | Yes | Yes* | No | No | No | *Supports HideMemberIfBlank |
| Advanced dimensions (reference dimensions, many-to-many dimensions) | Yes | Yes | Yes | Yes | Yes | Yes | |
| Linked measures and dimensions | Yes | Yes 2 | No* | No* | No* | No* | *Equivalent functionality |
| Translations | Yes | Yes | No | No | No | No | |
| Aggregations | Yes | Yes | Yes | Yes | Yes | Yes | |
| Multiple partitions | Yes | Yes, up to 3 | Yes | No | No | Yes* | *Incremental refresh builds partitions |
| Proactive caching | Yes | | * | * | * | * | *In-memory model is always cached |
| Custom assemblies (stored procedures) | Yes | Yes | No | No | No | No | |
| MDX queries and scripts | Yes | Yes | * | * | * | * | *Supports MDX queries & DAX scripts |
| DAX queries | Yes | Yes | Yes | Yes | Yes | Yes | |
| Role-based security model | Yes | Yes | Yes | Yes | Yes | Yes | |
| Dimension and cell-level security | Yes | Yes | No* | No* | No* | No* | *Equivalent functionality for measures |
| Scalable string storage | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Equivalent functionality |
| MOLAP, ROLAP, and HOLAP storage models | Yes | Yes | Yes* | Yes* | *Cached or SSAS direct | Yes* | *DirectQuery & hybrid models are equivalent or superior |
| Binary and compressed XML transport | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *VertiPaq in-memory compression on all data types |
| Push-mode processing | Yes | | Yes | Yes* | Yes* | Yes* | *Multiple & different processing techniques supported |
| Direct writeback | Yes | | No | No | No | No | |
| Measure expressions | Yes | | Yes | Yes | Yes | Yes | |

Interviews with Microsoft Data Community Speakers and Leaders

What is the key to advancing your career in the Microsoft data platform?  Here is some advice from some of the most successful people in the industry…

Every year we have some big community events that bring together community leaders from all over – international user group and community leaders who write books and speak at conferences. We had our local Oregon SQL Saturday, the “SQL Train” (a chartered train – some coaches that one of our speakers chartered to bring all of our speakers and attendees up to the PASS Summit after Oregon SQL Saturday), and then the big PASS Summit conference up in Seattle. I had a chance to sit down with a number of our speakers, community leaders and attendees and just ask questions about what brought them there, and the advice that they would give people in the industry about how to get the most value out of that experience… and this is what they said:

Well-run monthly user group meetings and big annual local events like SQL Saturday don’t just happen by themselves. It takes a lot of planning, a lot of volunteers and a lot of coordination to make these events successful. Part of that effort is the annual leadership planning meetings that we have during the week of PASS Summit. Here are some short clips from those meetings where several hundred local leaders from all over the world got together to share ideas, to exchange notes and to coordinate to make these events successful. Leaders cross-pollinate, exchange ideas and work together to make this a great community. Why? …because somebody did that for us when we were getting started and we want to give back to the community. So, get involved; join the leadership committees at your local user groups, volunteer at SQL Saturday. Volunteer to do a short speaking engagement. Just get up and talk to some of your peers. Get started by volunteering in the community so that you can be part of the ongoing great community we have around the Microsoft data platform.

How to Assign Pro Licenses to a Power BI Tenant

This is a question that comes up all the time.  Power BI licensing is not complicated but a common challenge is that the person who sets up a new Power BI subscription and tenant within an organization is often not the same person who manages Office 365 or Azure service licensing for the organization.  I’ve consulted on projects for several organizations where folks just didn’t know who to talk to or how to proceed after testing the water with Power BI.  After setting up a new subscription, IT professionals and business data analysts often don’t know how to license Power BI for company use and share reports and datasets with others.

This post will show you how licenses are assigned to users and, more importantly, what to request from your support desk or administrators who may be unfamiliar with Power BI and Office 365 user licensing.  Keep reading for background information about why this is important and necessary.

You can use Power BI in one of three modes:

1) If you’re a one-person organization or don’t need to securely share content online with anyone, you can just use Power BI for free.  Yep, free.  No feature limits.

2) If your organization has a few hundred users or less, you will need to buy a Power BI Pro license for every user to author or view reports.  Unless you publish report content publicly to the web, every user must have a Pro license – period.

3) If you have several hundred users or you have so much data that you need to manage it with dedicated capacity, it may be cost-effective to purchase a Premium capacity tenant for a few thousand dollars a month.  Pro licenses are still required to publish reports, but anyone in the organization can view published reports online or on an on-premises Power BI Report Server.

Power BI Subscription Basics

Let’s say that I work for XYZ company and my work address is Paul@xyz.com.  Assuming that a Power BI subscription doesn’t yet exist, if I go to PowerBI.com and setup an account using my email address, I have created a Power BI subscription for my company that is a tenant within the Power BI service.  I could be the janitor for a multinational corporation but I am now the administrator of the tenant.

By way of definitions; the Power BI Service is the entire Power BI offering within the Microsoft Azure cloud.  At any time, it could encompass hundreds of virtual machines geolocated in data centers throughout the world.  When you subscribe to the service, you are – in effect – renting some space within the service.  The “space” that you rent and manage for your subscription is called a tenant.  It’s sort of like renting some office space or an apartment in a large building.  You don’t really own it but you are paying for the right to live there.  You can read about these and other administrative concepts here.

After setting up a new Power BI subscription, you really have one of two options:

1) If you have the authority to purchase licensing and manage services on behalf of your organization, proceed to purchase and assign licenses for report developers and users.

2) Make a service request or contact the appropriate administrator within your organization to add and assign licenses.  This might be your help desk, systems or operations admin or Office/cloud services administrator.

The Admin Take-Over

After a “less privileged” user sets up the first Power BI subscription for the organization, no one else can do the same.  This can be a little confusing if some person in Finance sets up a trial account and then a BI developer tries to do the same thing.  If the organization plans to move forward with a governed Power BI tenant, they can perform an “Admin Take-Over”.  Contrary to what the name suggests, this is not the same as a government coup with troopers storming the building in Kevlar vests and automatic weapons.  It simply means that an administrator assumes control of the new tenant and transfers admin rights from the person who established the subscription to the appropriate person(s).  Adam Saxton describes how this works in this Guy-In-A-Cube installment.