Next Level Power BI – M Language Basics

This is a series of posts and exercises from a recent full-day workshop called Next Level Power BI. Check back for additional lessons and freely post your questions and feedback in the comments section after this post.

Power Query is a modern data acquisition and transformation technology and it is the “Get Data” component of Power BI and many other Microsoft tools. The query script language of Power Query is known as “M” or the Microsoft Data Mashup language. The menu options and selections in the Power Query Editor interface build this “M” script for you but to use Power Query at an expert level, you must understand fundamental concepts of the language. The post will step you through some essential mechanics of the language to build the necessary fundamental skills.

What you learn here can be applied to the Power Query implementation of other Microsoft products, which currently include:

  • Power BI Desktop Query Editor
  • Power BI Data Flows
  • Excel “Get & Transform” on the Data ribbon
  • SQL Server Analysis Services (SSAS) Tabular projects in SSDT/Visual Studio, or Azure Analysis Services.
  • Azure Data Factory: Power Query task

By the time you read this, there are likely to be more implementations of Power Query.

Basic Syntax

A quick review of the characters that are used to define or encapsulate groups of objects will help to better understand and recognize what you are seeing in the following code examples.

Parentheses, square brackets and Set braces all have different meanings. For references, here they are in the same order:

( ) – Parentheses are used to pass parameters to a function and to control mathematical order of operations.
[ ] – Square brackets encapsulate a set of records.
{ } – Set braces encapsulate a list of values.

Most functions in M are part of an object class (such as Date, Table or Text) and the syntax is something like this:

  • ObjectClass.Function( ),
  • ObjectClass. Function( Parameter ),
  • ObjectClass. Function( Parameter1, Parameter2 )

Examples:

  • Date.AddYear ( <date value> )
  • Table.FromList ( <list object> )
  • Text.End ( <text string>, 2 ) – returns last two characters from right of string

A few older functions that were part of the initial release of Power Query are not prefixed with the object class name. For example the #date( ) function has three parameters: Year number, Month number and Day number.

Any word may be used as a variable. Common examples include step names like Source. Variables are often multiple words containing no spaces, such as ServerName. You are strongly encouraged to rename steps with short but meaningful phrases to make them self-documenting and easy to understand.

Are you with me so far?

The Mighty #shared Function

This little function is literally the key to the entire kingdom and will return every Power Query object defined in your file along with every available M object class and function in the entire language along with complete documentation and code examples. This is the best kept secret in Power Query.

Add a new blank query and type the following into the formula bar, and watch while magic things happen:

=#shared()

The results will be returned as a list of objects and functions. Convert that to a table and you can view the definition of several hundred objects and functions. The M laguange contains nearly 900 functions and more are frequently added to new product versions.

Click the cell next to the link for any function to peek at the definition or click the link to explore related details.

Getting Started

The following examples are hand-written queries entered into the Advanced Editor in Power Query. None of these queries have an actual data source. Values are generated within the code itself but all of these examples can be applied to real queries based on any data source.

Each of the following queries adds a little more to the previous query to explore progressively more advanced objects and code techniques in the M language.

If you create a new blank query in Power Query, it will generate code like this:

let
Source = “”
in
Source

This is the fundamental pattern to get you started.

Values

Let’s modify the script to give the Source variable a text value of “This is a value” (creative, I know).

Use the Advanced Editor to view the generated M script.

Here’s just the script outside of the Advanced Editor:

//——————————————————-
//#1: Object Types: Value
//——————————————————-
let
Source = “This is a value”
in
Source

Use the Done button to close the Advanced Editor to see the result.

Unless they are explicitly data-typed, variable object and data types are automatically assigned, which you will see in the following examples.

The Source variable represents a value type object that is implicitly typed as text. You can see the resulting value in the query designer:

Records

In M, records are encapsulated in square brackets. This script generates two single column records.

//——————————————————-
//#2: Object Types: Record
//——————————————————-
let
Source = [Record1 = “Value 1”, Record2 = “Value 2”]
in
Source

The result is two single column records:

Lists

The List object in M is denoted using Set braces, sometimes called “squiggly brackets”. The following script creates a list of four text-type values:

//——————————————————-
//#3: Object Types: List of Values
//——————————————————-
let
Source =
{ “Value 1”, “Value 2”, “Value 3”, “Value 4” }
in
Source

The list is shown below. Note the column header in the preview grid reminds us that this is a List object. Also note that the designer shows the List Tools ribbon with options to convert the List to a Table, because that’s what we normally do in Power Query to be able to consume List data.

Now, we can start to combine these concepts and techniques. The following query script creates two records, each containing two columns with corresponding values:

//——————————————————-
//#4: Object Types: List of Records
//——————————————————-
let
Source =
{
[Column1 = “Value 1”, Column2 = “Value 2”],
[Column1 = “Value 3”, Column2 = “Value 4”]
}
in
Source

The designer shows two records as links. This is because these are not consumable objects. To use them, you can either convert them to a table or expand one record by clicking the link to view the values.

To peek inside a record without adding a new query step, you can click in a cell but not on the link and the value will be displayed in the preview area at the bottom of the screen.

Tables

If you use the Convert to Table button on the toolbar, that will add the steps you see in this query.

The Table.FromList function is a good example of several similar functions that serve a similar purpose. As you look through all the functions within the different object classes, you will notice several different “From” and “To” functions that convert an object from one type to another type. These are both easy to use and they are easy to identify and to understand their purpose. This one should be obvious: As the FromList function in the Table class, it converts a List type object to a Table. Another example, the ExpandRecordColumn in the Table class expands a column containing delimited values into separate columns.

//——————————————————-
//#5: Object Types: Table
//——————————————————-
let
Source =
Table.FromList(
{
[Column1 = “Value 1”, Column2 = “Value 2”],
[Column1 = “Value 3”, Column2 = “Value 4”]
},
Splitter.SplitByNothing(), null, null, ExtraValues.Error
),
#”Expanded Column1″ = Table.ExpandRecordColumn(Source, “Column1”, {“Column1”, “Column2”}, {“Column1”, “Column2”})
in
#”Expanded Column1″

The result of these steps produces the results you see here. We began with a list of two records, each having comma-separated values. After the conversion steps, we now have a Table object with two rows and two columns with values and headers.

Transforming Each Row

When you use the Transform ribbon or menu options to perform most any transformation on a column, you are in essence saying “I want to perform this action on the specified column in each row of the table.” The key word in this statement is “each” and this is an important differentiation in the following M code that was generated when extracting the last character of the column named Column2.

An example of this generated query is shown in Query #6 below.

//——————————————————-
//#6: Object Types: Transform each row
//——————————————————-
let
Source =
Table.FromList(
{
[Column1 = “Value 1”, Column2 = “Value 2”],
[Column1 = “Value 3”, Column2 = “Value 4”]
},
Splitter.SplitByNothing(), null, null, ExtraValues.Error
),
#”Expanded Column1″ = Table.ExpandRecordColumn(Source, “Column1”, {“Column1”, “Column2”}, {“Column1”, “Column2”}),
#”Extracted Last Characters” = Table.AddColumn(#”Expanded Column1″, “Right Char”, each Text.End([Column2], 1), type text)
in
#”Extracted Last Characters”

Let’s isolate just that step and examine the M code in the last line of the query:

#”Extracted Last Characters” = Table.AddColumn(#”Expanded Column1″, “Right Char”, each Text.End([Column2], 1), type text)

The ‘#”Extracted Last Characters”‘ variable is being assigned the outcome of The AddColumn function. We know that the return object type is a table because of the object class Table always returns a Table type object. The AddColumn function takes three parameters:

  • A table type object (usually the name of the previous step variable)
  • Name of the new column as text
  • An expression that returns a value for the new column

The last parameter argument in this case is the output of the Text.End function. You might guess that this is going to be a text type value. The End function, similar to the RIGHT function in Visual Basic, returns a specified number of characters from the end of a text string. But, since Column2 exists in every record in the table returned by the ‘ #”Expanded Column1″ ‘ variable, how do we specify that we want to do this for each record, one-at-a-time? …go got it. We use “each“.

Note to the Right Char column in the following results. It is the right-most character from Column2:

Transforming Columns In-place

Rather than creating a new column containing a value extracted from a previously created column, you can also modify an existing column. This is done using the TransformColumns function of the Table class. Again, we’re looking at the last step of the following query that was generated by using the Extract menu on the Transform ribbon or menu:

//——————————————————-
//#7: Object Types: Transform each row col in place
//——————————————————-
let
Source =
Table.FromList(
{
[Column1 = “Value 1”, Column2 = “Value 2”],
[Column1 = “Value 3”, Column2 = “Value 4”]
},
Splitter.SplitByNothing(), null, null, ExtraValues.Error
),
#”Expanded Column1″ = Table.ExpandRecordColumn(Source, “Column1”, {“Column1”, “Column2”}, {“Column1”, “Column2”}),
#”Extracted Last Characters” = Table.TransformColumns(#”Expanded Column1″, {{“Column2”, each Text.End(_, 1), type text}})
in
#”Extracted Last Characters”

Here is the last step isolated:

#”Extracted Last Characters” = Table.TransformColumns(#”Expanded Column1″, {{“Column2”, each Text.End(_, 1), type text}})

Instead of the column name being passed to the Text.End function, we see an underscore. This is just a shorthand reference that means “use the column currently being transformed”.

Creating a Table from Records

Now back to completely hand-written code… This one’s easy:
By default, new columns are data-typed as “Any” which is essentially an unclassified data type and this is not typically a best practice. Columns should be strongly typed so that grouping, aggregation and DAX functions work correctly in the data model.

In the following query, we apply the TransformColumnTypes function from the Table class. The function requires a table and at least one column name and data type reference. You can pass any number of columns to this function, each encapsulated with set braces, and with another set of braces around the whole thing.

Here, you can see the results: one table with headers, two columns and three records. This all looks good.

Creating a Table from a Set of Records

Now, we’re going to start putting things together by using object type variables. I think you’ll agree that the code is much easier to read.

And we see similar results as before but with a much cleaner approach and readable code.

Extracting a Record from a Table

As a final assignment, I need to get only one record so I essentially need to unwind the objects that were used to assemble this table and get only one record from it. I want lemons so I need to figure out what the ordinal record number is within the table’s record collection.

Since there are three records, there is automatically an ordinal (index) number assigned to each. Sometimes you just have to guess if you don’t already know, if the first record starts at zero or one. Turns out that all these collections are zero-based so if you want to identify the second record (for Lemons) in the table, it’s going to be ID = 1.

Here’s the syntax:

let
Rec1 = [Product ID = 123, Product Name = “Oranges”],
Rec2 = [Product ID = 124, Product Name = “Lemons”],
Rec3 = [Product ID = 125, Product Name = “Apples”],

MySet = { Rec1, Rec2, Rec3 },

MyTable = Table.FromRecords( MySet ),
MyLemons = MyTable{1}

in
MyLemons

SO what’s all the fuss… what does this actually do?

I’m glad you asked. Let’s take a look.

Isn’t she a beauty?

It’s the best looking lemon I’ve seen all day.

Now you’ve seen the mechanics of how M language functions can be used to perform very simple value manipulations and transformations. More importantly, you have learned about the relationship between important objects like Values, Records, Lists of Values and Lists of Records, oh – and let’s not forget – Tables. That’s the one that actually does stuff in a data model,

I hope this has helped you to broaden your horizons with Power Query and the M language. Please comment with you feedback.

Chat with Donald Farmer at PASS Summit 2019

Enabling the modern data analyst by becoming shopkeepers rather than gatekeepers of information

I appreciated the opportunity to sit down and talk with one of my heros in the industry. Donald Farmer has been a recognized leader in the Business Intelligence and Analytics community for the past two decades and his thought leadership is even more relevant today. He played a key role in shaping the Microsoft BI toolset as a Principal Program Manager from 2001 when the products we use today were in their infancy. He’s also been outside of Microsoft circles which gives him a broad and unique industry perspective. As the Vice President of Innovation and Design at Qlik Technologies, he brought Qlick Sense to market. He remains a friend and advocate for the community. Donald holds multiple patents and has published multiple books. Today he is doing his own thing as the Principal of TreeHive Strategy.

The chat was 9:41 long and is worth watching every second. I have added captions to highlight a few key points which are also listed below. Please enjoy.

Key Points

Route Knowlege is a view of information at a micro level, like turn-by-turn directions. This might be important when you only need to answer a single question without a broad understanding of the entire landscape.
By contrast, Survey Knowlege is a view of information in the context of the bigger picture. Someone with Survey Knowlege knows “how the whole thing hangs together“.

Serving the needs of the modern Business Data Analyst requires a change in attitude. We should become provisioners of information rather than protectors of information. In other words, become shopkeepers rather than gatekeepers.

Actionable data is not always detail data and consuming a summation can not only be easier but more useful than getting lost in the details and the accuracy or precision of a number. Drilling to detail in context is a natural part of the exploration process. He talked about a pattern in the process to provide actionable business information, which includes these stages:

  1. Orientation
  2. Glimsing
  3. Examination
  4. Acquisition
  5. Action

I appreciate Donald taking the time to both speak at PASS Summit and to chat with me for this blog series. Please watch my blog for additional interviews and insights from PASS Summit 2019

Next Level Power BI – SVG Databars

Power BI table and matrix visuals can render SVG object elements in a column or measure to display in-line web content. You can build dynamic content like KPI indicators, databars and sparklines by altering SVG element content using DAX expressions. In this post, I will show you how to create databars that show negative and positive values with different color fills based on category values. Of course, Power BI has built-in databars in the conditional formatting options for numeric columns but you cannot alter the color and other attributes dynamically – at least not yet. For the time being, this technique provides more flexibility than the built-in databar feature.

Read This First

In this post from earlier this week, I introduced a simple technique to produce symbols and graphics using columns or a measure to render SVG images in a table visual. I recommend that you read that post to understand the simple implementation of this technique.

Dynamic Databars

Instead of a graphic symbol, the databar really just consists of a rectangle that has a variable start and end position. I want to change the fill color layer based on whether the value is negative or positive, and based on a category value in my data. You could apply any imaginable business logic to change the color or other attributes. I also want to layer formatted text over the top of the databar rectangle. Check out the example below showing a series of records. My simplified demo dataset causes the same value to repeat across the four Flintstones characters but this Databar measure could be applied to any set of data. Notice that negative values are red and the bar “starts” at the center and is proportionally longer as it extends to the left of center for larger negative values. The same is true for positive values that start in the center of the cell and move to the right.

Before we examine the code to generate the dynamic databar, let’s look at the rendered SVG element for a couple of these cells. If I hover over the cell for ID 1 and “Barney”, the SVG definition is displayed in the tooltip.

One fun thing about writing this blog post and showing you fragments of HTML script is that it is a little challenging to copy and paste the script into this post without it being interpreted and rendered as a graphic. This is one reason I’ve elected to use screen capture images.

Examine the above script and you will see the “<rect” element opening tag followed by attributes that specify the x axis, y axis and height values. These attributes draw the rectangle starting at 79 pixels from the left-most edge of the viewBox (left side of the cell), 80 pixels tall and 181 pixels wide. Keep in mind that the graphic does get rescaled with the table so these are only relative measurements. The fill property of the style attribute fills the rectangle with the color Red.

There is also a “<text” element with similar attributes that place centered (anchored to the middle of the viewBox) text at the mid-point of the viewBox width of 300 pixels. For simplicity, there are a few attributes I’m not mentioning but it’s fairly simple to recognize them and figure out what’s going on.

Hovering over the cell for ID 9 and “Barney” with the value 1495, the x and y axis values cause a DodgerBlue filled rectangle to start at 260 pixels and to be 236 pixels wide.

Show Me the DAX

The DAX code for the DataBar measure is below. The actual expression code is fairly simple but it took a little trail-and-error to work-out the measurements and scaling logic. Rather than building the expression in-line, I’ve invested work up-front to put all the values into variables. First, we get the minimum value for all the selected ID rows (using the MINX iterator function with the ALLSELECTED filtering function) and store that in the MinValue variable. The MaxValue variable is similar with the added condition to handle negative or positive values. This may all seem a bit verbose but its really just math.

There is a disconnected table in my sample data model named Flintstone Characters with character names that I place onto the Columns axis of the matrix visual. In a real-world solution, this could be a field like product categories, geographic regions or anything else. The BarFill variable is set to a color value (which could be either a web color name or Hexadecimal value) based on the selected character name. When grouping a visual on a value, the SELECTEDVALUE function will return a single value. After resolving all the variable values, this measure emits the SVG markup similar to the examples shown previously.

DataBar =
VAR MinValue = MINX(ALLSELECTED(‘Values'[ID]), calculate(SUM(‘Values'[Value])))
VAR MaxValue = MAXX(ALLSELECTED(‘Values'[ID]), calculate(SUM(‘Values'[Value]))) + IF(MinValue<0, ABS(MinValue), 0)
VAR BarValue = SUM(‘Values'[Value])
VAR FormattedValue = FORMAT( BarValue, “#,##0” ) –“#,##0.00” –“#,##0.0 %”
VAR svgWidth = 600
VAR svgHeight = 200
VAR svgFontStyle = “normal 70px sans-serif”
VAR BarHeight = 80
Var BarRange = ABS(MinValue) + MaxValue
VAR BarValueFactor = DIVIDE(svgWidth, BarRange)
VAR BarWidth = INT( ABS(BarValue) * BarValueFactor )
VAR BarZero = INT( DIVIDE(ABS(MinValue), MaxValue) * svgWidth )
VAR BarStart = IF(BarValue < 0, BarZero – BarWidth, BarZero)
VAR BarOpacity = “0.5”
VAR BarFill = IF( BarValue < 0, “Red”,
SWITCH( SELECTEDVALUE(‘Flintstone Characters'[Character]),
“Fred”, “DarkOrange”,
“Wilma”, “Purple”,
“Barney”, “DodgerBlue”,
“Betty”, “DarkGreen”
)
)
VAR svg_start = “data:image/svg+xml;utf8,” VAR svg_end = “”
VAR svg_rect = “”
VAR svg_text = “” & FormattedValue & “”
// x=(half of width width when text is centered)
RETURN
svg_start & svg_rect & svg_text & svg_end

download this code

Like in the Symbols example in the earlier post, you must select the measure and set the Data Category in the Modeling ribbon to Image URL. After that, the measure can be dropped into a matrix visual grouped on the ID and Character fields, to produce the databar visuals I showed in the first example above.

It Takes a Village

A few experts in the community have created SVG sample projects with various techniques. David Eversveld has blogged about generating SVG images in this series. When I started working out the code for the DataBar technique, I reached out to some friends in the MVP community and Darren Gosbell was kind enough to help-out. In the process Darren discovered a bug (or an undesired behavior) in Power BI that ignores the aspect ratio directive in the SVG element and draws a square image even when you don’t want it to. He posted a suggestion the Power BI Ideas forum and I encourage you to go vote to help get this fix prioritized by the Power BI product team. Before this is corrected, the technique is useful but will take-up more vertical space than you might prefer.

Creating in-line Sparklines

On a related topic, Brian Grant has developed a method to produce SVG trend line charts, commonly known as “sparklines”, using a creative string replacement technique. You can find a tutorial for his solution here.

download sample PBIX file

Next Level Power BI – SVG Symbols

With so many different methods to visualize data – just within Power BI – this can be a daunting choice. Let’s see… there are a plethora of standard visuals that can be used in creative ways. A software developer can create a custom visual if they have the time and chops for that kind of thing. You can use a custom visual generator tool like Charticulator or the Chart designer from PowerBI.Tips. And, you can render unique visuals for statistical analysis using R or Python. The technique I’m going to demonstrate in this series doesn’t require a lot of work or special skills so it is relatively simple and easy to use.

In this first post in a series called “Next Level Power BI”, I demonstrate how to create a simple graphical symbol as a Scalable Vector Graphic – just to get started. In a later post, we will expand the technique to create dynamic databars. The same technique can be used to create icons, sparklines and practically any imaginable graphic can be produced with calculated column or measure.

I’ll start with a simple example. Let’s say that you need to display a set of symbols to correspond to a data condition of some kind. For simplicity, let’s just start with the graphics. Here are examples of three symbols displayed in a table visual on a Power BI report page:

Finding the SVG markup for these symbols can be as simple as a web search. You can also create your own using vector graphic editing tools like Adobe Illustrator. The markup for these three symbols is shown here in Power Query. In different Power BI solutions, I have stored and exported the SVG markup in Excel. It’s just text so you can store it anyway you prefer. All three symbols are defined as a series of spatial points in the path element but you can draw shapes, text and a variety of things. There are several good SVG references on the web but I found this collection from Jakob Jenkov to be particularly useful: http://tutorials.jenkov.com/svg/index.html

In this solution, I just used the Enter Data feature and then copied and pasted the SVG markup directly into the query – one row per image.

We’re not quite done because Power BI requires the following header element to be appended to the beginning of each graphic:

data:image/svg+xml;utf8,<?xml version=”1.0″ encoding=”iso-8859-1″>

You can approach this a few different ways but I added a custom column named Definition and then removed the original column using this expression:

After closing and applying the query, to render the SVG image which exists in the Definition column within a report; it must be categorized as an Image URL. In the report designer, select the column from the Field list and then choose the Data Category on the Modeling ribbon.

The column is now usable as an image in a table or matrix visual, which should appear like the first screen image above.

To create a measure to display only a single image, you can create a measure like this one that returns only one value from the column:

Spreadsheet Symbol =
CALCULATE( MIN( ‘Images'[Definition] ),
‘Images'[Image Name] = “Spreadsheet”
)

The measure must also be categorized as an Image URL and then it can be used as an individual symbol or image in a report within a table visual.

That covers the bare bones basics of SVG image columns and measures. There is so much that you can do, expanding on this technique by injecting dynamic logic into the measure code. I’ll cover that in a subsequent post.

Next-Level Power BI – SVG DataBars

So, I wrote a bedtime book now?!

Is the 2010, first edition of my SQL Server Reporting Services Recipe Book so snooze-worthy that it is now being used as a bedtime story? Huh uh!

O’ contraire, This book contains so much of what real life has to offer. So much insight and deep, deep learning opportunities that instruct, educate, train and prepare an industry professional to take their report designs to new heights, to inspire and to elevate the reader by using report design techniques never attempted before. All on a rock-solid foundation of tried-and-true report design best practice techniques that will totally rock your world and change the way you think about reporting! These are the sort of insights contained within the pages of this exciting book about SSRS report recipes.

So, what is this business about Paginated Report Bear wanting to read my world-changing book as a bed time story? Really… well, that’s not going to work!

Oh, Chris (Finlan), my friend… why oh why do you torment me like this?

Matthew, we need to talk about your Dad and his wayward little stuffed friend!

Power Platform World Tour Vancouver Presentations

These are copies of my presentation decks from the sessions today at the Power Platform World Tour event in Vancouver, BC. I appreciate all who attended.

The Charticulator session was originally presented in July at the Power BI Symposium in Portland and here is the recording of the session:

Please feel free to contact me with questions or feedback using comments in this post.

Power BI Project Good and Best Practices

Paul Turley
Power & Data Platform MVP, Principal Consultant, Pragmatic Works

Preface: This post is set set of guidelines that address many aspects of new projects. Although I feel strongly that many of my recommendations are correct and applicable in most uses cases, other experienced practitioners have valuable ideas and I welcome their input. Please post your comments, links to other resources and ideas that may be different than mine. This will be a living document with ongoing revisions with the goal to provide a comprehensive, best practices guide as the platform continues to mature and as experts continue to use it.

Revision: 8/25/2019 – cleaned-up & added checklist
Revision: 9/7/2019 – added Security section & corrected Last Refresh Date/Time

I find there there are so many things to remember when starting a project that a checklist is handy. I’ve been collecting the following as notes for some time. Bare with me as I work on consolidating this article into a concise checklist.

The purpose of this article is to outline a set of guidelines and recommended practices for managing Microsoft Power BI projects. This guide is primarily focused on the work performed by the IT-managed BI Solution Developers for formally-managed BI projects.

Why “Good” and “Best”?

Power BI is a tool and platform that serves two masters: IT developers and business data users. As such, some practices are absolutely best-practice and should be followed with rare exception. Other recommendations are conditional guidelines, with possible exceptions. The flexibility of Power BI makes it possible for industry practitioners to developed their own solutions to apply design patterns in the absence of a specific product feature (like source control, for example).

This guide is largely based on my experience, discussions with other members of the MVP community, clients and fellow consultants. Most of the information in this recommendation guide are based on the consensus of parties with the greatest experience with the Power BI platform. These parties include:

  • Members of the Microsoft MVP Program (Data platform and Business Applications focus) who participate in daily interactions with the Power BI product development and leadership teams at Microsoft.
  • Microsoft certified solution partners who offer consulting, training and value-added services to Microsoft product customers.

Power BI has come a long way in a short time; since the Power BI service and Power BI Desktop were introduced in 2015. At that time, the emphasis was to enable self-service business users to analyze and visualize data in small-scale ad hoc reports. As Power BI grew into the enterprise space, it has been adopted by IT-managed data governance and solution development teams. Power BI is displacing – and being used in concert with – conventional IT-centric Business Intelligence product technologies that have been evolving for decades (such as SQL Server Analysis Services, Reporting Services and Integration Services). With this history and the rapid adoption of the platform, standard design practices will continue to evolve.

Two discreet audiences and roles should be considered for Power BI:

  • Business Data Analyst
  • IT-managed BI Solution Developers

Consider these roles, with respect to both the solution developer(s) and users, as you begin to plan a new project.

Business Data Analyst

Members of this role are part of a rapidly growing segment of the industry. They understand how to use software tools and understand data manipulation and reporting within the confines of area of business focus. Traditionally, data analysts were data consumers who use financial spreadsheets but now they use advanced data preparation tools to perform reporting and deeper data analytics. Business Data Analysts generally fall into two subcategories:

  • Advanced Data Consumers
    These users explore data using spreadsheets, pivot tables and chart-style reporting tools that are served by existing multidimensional cubes and data models. They do not write code and work with IT/BI Developers, Data Modelers or Advanced Data Analysts to collect and model the data they use to create their reports.
  • Data Modelers, Advanced Data Analysts and Data Scientists
    These advanced users perform their own data mashups, produce data models and write limited code to create solutions for their department or business area (using Power Query, DAX, R, Python, etc.) They do no architect and develop governed and certified datasets for enterprise-level business use.

IT-managed BI Solution Developers

Enterprise-scale BI solutions are typically created by a team of specialists who may include:

  • Business Systems Analysts
  • BI Solution Architects
  • Project Managers
  • Data Transformation (ETL) Developers
  • Enterprise Data Modelers
  • Measure code experts (using DAX)
  • Report Developers
  • Operations & Deployment Engineers

Power BI may be used alone or with other products to create BI data transformation queries, data models, reports and dashboards. IT-BI managed scenarios:

  • Complete BI solution design, development and deployment, consumed by end users.
  • Data models created and deployed to support user ad hoc report design.
  • Support Advanced Data Analyst data transformation, data modeling & report design
  • Migrate advanced user created data models to governed/certified datasets, transferring ownership to IT & operations

Project Types – simplified

For simplicity, this guide refers to Power BI projects as either “Formal” or “Informal” using the following definitions:

  • Formal projects are scoped, funded, staffed and executed with the collaboration of a business champion and stakeholders; and IT Business Intelligence developers and data managers. These projects promote business and IT-governed datasets and certified reports.
  • Informal projects are executed by business users and are considered ad hoc in nature. Datasets are generally not IT governed, and reports are typically not certified.
  • Hybrid projects can be anything in-between. They might be a user-authored report using published, certified dataset used for self-service reporting. Informal, self-service datasets can be migrated to governed datasets in collaborative IT/business projects.

Again, this guide is primarily focused on the work performed by the IT-managed BI Solution Developers for formally-managed BI projects. Topics included in this guide:

  • Solution Architecture
  • Managing Power BI Desktop Files
  • Datasets and Reports
  • Version Control & Lifecycle Management
  • Workspace and App Management
  • Data Model and Power Query Design Guidelines
  • Dimensional Design
  • Query Optimization
  • Managing Dataset Size with Parameters
  • Implicit and Explicit Measures
  • If Users Need Excel, Give them Excel
  • Certified & Shared Datasets
  • Enterprise Scale Options

Solution Architecture

All Business Intelligence projects involve the same essential components including:

  • Source Queries
  • Data transformation steps
  • Semantic data model
  • Calculations (typically measures)
  • Data visualizations that reside on:
    • reports
    • report pages
    • dashboards

These components can be built using different products and development tools. Lengthy discussions can be had to debate the nuances and merits of different options; but a short summary of recommendations follows. To summarize these options in simple form, first consider the following options:

  1. Single PBIX File

    Build all queries, transformations, the data model, measures, reports and pages
    with Power BI Desktop in a single PBIX file and deploy to the Power BI cloud
    service; to be shared with users.

Recommendation/Cost/Benefit:
This option is most suitable for Business
Data Analysts, erring on the side of simple design, typically optimal for
small, informal self-service projects where a desktop business user needs to perform their own data mashups and quick, ad hoc analysis. Risks include unknown data integrity and the use of uncertified dataset results. Informal projects can effectively serve as prototypes for formally developed and managed dataset development. Single file projects can be migrated to dataset/report two file projects with some rework required for the reports, depending on the report complexity.

Separate Dataset & Report PBIX

In formal projects, use Power BI Desktop to author a separate PBIX “dataset” file containing the following: queries, transformations, the data model and measures. These are in turn deployed to the Power BI cloud service. Develop one or more separate PBIX “report” files that connects to the published dataset file in the service. All report pages and visuals reside in the “report” file. Some additional measures can be created in the report
file.

Recommendation/Cost/Benefit:
This pattern is recommended for most projects where data governance is important and where new report features are likely to be requested in the future. By separating the dataset from report development; the data model design can be locked-down and managed by a data steward. This approach provides report designers freedom to add visuals and make report enhancements without risking the stability and integrity of the data model that may be serving data to other reports.
Development and management of a two file project is a little more cumbersome because the dataset must be redeployed before changes are available in reports.
Self-service report users who need to explore and visualize data, can use a governed and certified data model. Changes and enhancement should be coordinated between the report designer and the data steward or IT resource managing the data model.

SSAS with Power BI

Taking the “traditional route” in formal projects to build the data model in SSAS, hosted either on-prem or in Azure Analysis Services, and then creating reports and visuals with Power BI was a necessity under enterprise-level feature were added to the Power BI platform in 2017-2018. Although there are still some good reasons to use SSAS rather than Power BI to build the data model (called a “dataset” in Power BI), Power BI can be the more cost-effective choice today. Power BI and SSAS both offer unique capabilities which doesn’t clearly make one option always better than the other.

You can develop the source queries, transformations, data model and measures in an Analysis Services (SSAS) project using Visual Studio (also called SQL Server Data Tools or SSDT) and in turn deploy the SSAS database/semantic model to an on-premises SSAS instance or Azure Analysis Services (AAS) cloud service. Connect new Power BI reports to the deployed SSAS model using a direct connection. SSAS/AAS can handle data models with more than the 1 GB, 3 GB & 12 GB compressed data size limits of Power BI. The SSAS development environment can be more difficult to navigate and the model designer can be unstable. When the VS/SSDT designer crashes, developers can lose work even if they are routinely saving their work.

Recommendation/Cost/Benefit:
Favor using Power BI. It typically takes considerably longer to develop data models in Visual Studio/SSDT than using Power BI Desktop. This project architecture is no longer the primary recommendation for BI solution development. Microsoft is directing most of their development and support resources to Power BI rather than SSAS. Although SSAS still has some advanced capabilities not yet present in Power BI; Power BI can handle larger data volumes in certain cases. Microsoft and Microsoft partners recommend using Power BI is the tool for developing most new BI solutions.

  • DirectQuery vs Import/In-memory

    Power BI desktop using DirectQuery mode, rather than import mode to reference tables in an underlying data mart or data warehouse relational database. No data is stored in the Power BI or SSAS data model; live SQL queries are generated when navigating report visuals.

    Recommendation/Cost/Benefit:
  • Avoid this option, unless needed to satisfy unique requirements. Power BI and SSAS both are favored and are optimized to use: import mode, in-memory tabular data models for all but rare and exceptional scenarios. DirectQuery significantly limits the DAX Calculations supported in the data
  • model. Query speed is typically many times slower than import, in-memory mode. The advantage of DirectQuery mode is that queries are executed directly against source tables in real-time. Consider building composite models with both import and DirectQuery mode to get good performance and drill-down to live transactional details. Composite models are typically complex and more time-consuming to develop, test and deploy.

Managing Power BI Desktop Files

Power BI Desktop files (PBIX and PBIT) should be stored in a centrally managed network-assessable folder, where they can be accessed by all
members of a BI report development team. The storage folder should support automatic backup and recovery in the case of storage loss. Options depend on currently supported services and organization preferences within the following guidelines:

File Size and Folder Synchronization

  • Report and dataset developers must open files from the Windows file system and not from a web browser interface. Files must either reside in or be synchronized with the Windows file system.
  • Files containing imported data typically range in size from 100 to 600 MB. Any shared folder synchronization or disaster recovery system should be designed to effectively handle multiple files of this size.

Options:

  • OneDrive For Business (shared by team, with folder synchronization).
  • SharePoint or SharePoint Online (with folder synchronization).
  • GitHub and/or VSTS with local repository & folder synchronization. If used, Git must be configured for large file storage (LFS) if PBIX files are to be stored in the repository.

File Naming

Give files an intuitive and friendly name, that will result in them not being confused with similar report projects. In default view, only the first 20 or so characters are visible for each report. Use self-evident names that can be used to identify datasets and reports, using the beginning of long names. Workspace and report names have limited space in the Power BI portal navigation bar. Use short and concise names when possible; but try to use names that differentiate like-named items in the first 20 characters. The full name is visible in the hover-over tooltip or by resizing the panel.

image

Datasets and Reports

Formal report projects with certified datasets should be developed with reports and datasets divided into separate files. By default, all Power BI objects, including reports and datasets, are stored in a single file. When the PBIX file is published from Power BI Desktop to a workspace; the PBIX file is separated into separate dataset and report objects. This default behavior is convenient for informal, small-scale, self-service analysis; but not optimal for managing formal solutions with governed datasets, managed by teams of developers or report authors.

Separating the dataset and report into different PBIX files; allows them to be managed and versioned separately. This approach also promotes a division of labor and ownership, particularly for certified datasets that must undergo scrupulous validation.

Like the way SSAS projects are used to develop multidimensional cubes and tabular models; deploying a separate Power BI dataset provides an uncomplicated black box solution for ad hoc report design and data exploration. Power BI can now be used to develop large data models containing 10s of gigabytes of compressed data. Using these development tools take far less time and possibly without additional licensing fees.

Version Control & Lifecycle Management

At the present time, Power BI has no integrated version control mechanism. In addition, Power BI is incompatible with most version control systems that perform branching and differencing operations on code and script files that are common in application development projects. Power BI (PBIX and PBIT) report and dataset files are stored as binary archive files with internal checksum metadata. Change and differencing operations performed by most code and version management systems will corrupt these files and render them inoperable.

Reality: Integrating changes made to multiple PBIX files is difficult to manage and challenging to perform technically. Even with differencing tools and effective version-control, just avoid multi-developer work on Power BI dataset files.

The Power BI ALM Toolkit (based on the older BISM Normalizer tool for SSAS Tabular) will make it easier to compare differences. Until (or if) specialized tooling becomes available, you should avoid replying in this method to manage project work.

Power BI Desktop PBIX files that contain imported data can be large; typically, in the range of 100-600 MB. However, they can be checked into a version control repository for the purpose of backing up and restoring files.

PBIX files can be reduced to small files by saving them to a Power BI Template (PBIT) file. This removes all data and may be an effective to make occasional backups for version archive purposes and disaster recovery. The template can then be used to generate a new PBIX file; much like using a template to generate a new Word, Excel or PowerPoint document.
Saving a PBIX to a template for backup purposes takes minimal effort; but rehydrating a PBIX from a template is more labor-intensive and time-consuming. The reason for this is that a new file must be created and then queries must reprocessed to populate tables with new data.

Recommendations:

File versions

Store the dataset and report PBIX files separately using a version number postfix for the file name in the following the format: Major.Minor.Revision. For example:

Manufacturing Cost Analysis Dataset v1.5.3.PBIX

Keep only the previous file versions that are useful for historical reference and recovery. Report files are small and don’t take up much storage; but, dataset files can be large. If you need to archive older dataset files, resave them as Power BI Template (PBIT) files to reduce the file size. However, a developer must continue to work with the current PBIX file. Recovering a dataset from a template is time-consuming so developers cannot actively develop using only template files.

Remove the version number from file names prior to production deployment.

Dataset Version History

By keeping version history information in an internally-generated table in the data mode; durable revision history notes and version information is self-contained and not dependent on external data sources.

Add a Version History table to the data model by using the “Enter Data” feature in Power Query to create a table containing these columns:

  • Version Number
  • Revision Date/Time
  • Developer Name
  • Revision Notes

Add a custom column named Refresh Date/Time that will be used to capture the date & time every time the queries & model is refreshed. This custom column could be added to any table but it is best to use a table with few rows because the value will be the same in each row.

DateTime.LocalNow()

Add the following measures to the model:

Measure Name Definition
Last Refresh Date/Time = MAX( ‘Version History'[Refresh Date/Time] )
Current Version = MAX( ‘Version History'[Version Number] )

Create a Dataset Information page in the dataset PBIX file and in any report that references the dataset.

Every time a revision is made to the data model, edit the Enter Data
step and add a row to the table with a new version number. Use the following guidelines to increment components of the version number:

Component 
Major releaseIncrement only when a newly scoped solution is released to PROD.
Major releases have a specific set of planning features. A new major release is created only for a new set of planned features or significant enhancements as part of a new scope of work.
Minor Version NumberAs a team, decide when changes are significant enough to warrant an incremental minor version release; typically, when changes are made to QA or PROD releases within a major release. Generally, for minor feature
enhancements: For a dataset, minor version changes may break or affect a report’s design.
For a report, minor version changes add or change functionality aside from fixes.
Fix or Revision NumberIncrement for every bug fix or design change, that will be deployed to the DEV environment or visible to other developers. Generally, for bug
fixes but not for adding features. May be incremented by the developer. Reset
to zero with a new minor version number.

For example; the following version number represents Major Release Version 1Minor Version 5 and Fix/Revision Number 3:

v1.5.3

Datasets and reports should be versioned separately. The major version number for reports and datasets may be synchronized for major workspace app releases.

Be mindful, that the purpose for version numbering is to provide at simple standard to keep track of changes. The process can easily be over-thought and over-engineered.  Keep it simple, discuss changes as a team and make process adjustments to focus on priority tasks.

Release Management, DevOps and Automation

(topic draft – expect updates – please comment)

This is an area where the Power BI platform currently lacks a formal process or mechanism. We know that release management tools are being developers but in the early stages. We may or may not see a release management too from Microsoft that blends with other tools that you might have in use, such as VSTS and AzureDevops. In my experience, automated Power BI release management isn’t really being practiced in the industry yet. That said, there are techniques available. PowerShell script may be used to publish datasets and reports, and to change dataset bindings. It is possible to either publish to a production workspace or to effectively move assets from one workspace to another. This approach is discussed briefly in the Power BI Enterprise Deployment Guide. Other approaches are discussed here:

Power BI release management

PLEASE ADD REFERENCES

https://www.msbiblog.com/2018/09/12/power-bi-release-management/

Workspace and App Management

To share reports to users who do not have a Power BI Pro license (each license has an additional cost) must be in a Premium capacity workspace and the workspace must be published as an app.

Create a separate workspace to share with Free licensed users

Create a separate workspace for formal projects that have a report, collection of reports and/or dashboard that should be distributed to multiple view-only users. A workspace must be converted to an app to be shared with users who do not have Power BI Pro licenses assigned to them. Once the workspace is published as an app; it can be shared with any group of “free” users or all users in the organization. The visible text for workspace names is limited; name workspaces so they be uniquely identified in about 20 characters or less.

For a formal project, create the following workspaces:

DEV Workspace

Only development team members need Contributor access to this workspace. This workspace does not need to have Premium capacity; unless, developers need to unit test incremental refresh or other Premium features.

QA Workspace

All testers must have View access for testing and Contributor access for report authoring. Should be in Premium capacity to test incremental refresh.

PROD Workspace

Omit the “PROD” designation in the name. This workspace will be the
name of the published app that users will see in their Apps, Home and Favorite pages so use a name that is simple and sensible. Must have Premium capacity to share the app with non-Pro licensed users.

image

Use a common workspace for informal reports not shared with Free
licensed users

Multiple self-service and informal report files can share a common workspace without converting the workspace to an app. All users accessing any of those reports, must have a Power BI Pro license assigned to them. Informal or self-service Power BI projects typically don’t require 
separate Development or QA workspaces.

Data Model and Power Query Design Guidelines

The topics in this section are not an exhaustive guide to Power BI query and data model design. Following are common issues and recommendations often encountered in projects.

Dimensional Design

Always design data models for optimal performance, even when it “doesn’t matter” in simple projects. As data volumes grow and as data models are enhanced to address new requirements, increased complexity leads to performance challenges. There are many books on this subject dealing with the intricacies of best practice data model design. In brief:

  • Build star schemas – wherever possible, reshape data into fact a dimension tables with single key, one-to-many relationships from dimensions to fact.
  • Enforce dimension key uniqueness – Just because a key value “should” be unique, there is no guarantee that it will be unless enforced at the data source. Perform grouping and duplicate reduction in the data source views or Power Query queries to guarantee uniqueness. Duplicate record count checks and other mechanisms can be applied to audit source data for integrity but do not allow the data model to violate these rules.
  • Avoid bi-directional filters and unnecessary bridging tables – These data modelling patterns adversely affect performance.
  • Calculated columns – Should be applied in Power Query and not in DAX calculated columns wherever possible. This maintains a consistent design pattern for maintainability.
  • Annotate code – Use in-line comments and annotations in all code including SQL, M and DAX; to explain calculation logic and provide author and revision information.
  • Hide all fields not used directly by users. These include: primary and foreign key columns, numeric columns used to create measures, and columns used to specify the sort order of other fields.
  • Set to Do Not Summarize – Any non-hidden numeric columns that are not intended to roll-up or summarize values should be set to “Do Not Summarize” in the Modeling ribbon in Power BI Desktop. Columns set to summarize are indicated with a Sigma icon.

Support Query Folding with Views

Data source queries in Power BI are authored using the Power Query editor; which generates queries in the Microsoft Data Mashup “M” universal query language. A critical optimization technique employed by Power Query is called “Query Folding”; where the “M” query definition is translated into a native query that can be executed by the source database server. If query folding works, SQL Server executes queries and returns results to the Power BI service through the on-premises data gateway. If query folding can’t be used, all raw data is streamed through the gateway, held in memory in the service and then processed inefficiently, row-by-row.

Query folding works reliably when queries authored in Power Query reference relational database tables or views. Avoid using SQL statements in Power Query and avoid using stored procedures as query data source objects. This can be a difficult transition for database professionals who are accustomed to writing SQL queries but Power Query can optimize its queries only when referencing tables or views.

In rare cases to avoid extremely long-running queries that might time-out during data refresh, it may be necessary to materialize source table results in a table stored in the EDW or data mart database. Using parameters with an incremental refresh policy can also reduce or eliminate query time-out issues.

Managing Dataset Size with Parameters

Use Power Query parameters to filter large tables (typically only fact tables). The incremental refresh feature of Power BI Premium capacity will automatically generate partitions and only process new or partial data to avoid reloading large tables.

Even if you don’t intend to implement incremental refresh policies; data refresh parameters allow the working dataset size to be reduced in the development environment and to deploy a smaller dataset file that can then be refreshed with a full load of data in the service.

Recommendation:
Design large tables with a parameterized range filter that conforms to the requirements supporting incremental refresh policies.

When a dataset PBIX file approaches 400 MB in size, parameters should be used to filter large tables and reduce the working set of records for development.

Dataset approaching 1 GB in size should have fact tables configured with incremental refresh policies and must be in a Premium capacity workspace.

First, create two Date/Time type parameters named RangeStart and RangeEnd. Next create a date range filter in the early steps of each fact table on a Date/Time type column in the table. Incremental refresh policy requirements are detailed in this article: https://docs.microsoft.com/en-us/power-bi/service-premium-incremental-refresh

Measures & Numeric Columns

Format all DAX code for easy readability. not only does it look better but will help avoid mistakes and support debugging and maintenance. As a guide, use DAXFormatter.com to “beautify” measures and follow the same code formatting pattern going forward.

Numeric Measures

All measures should return the correct numeric type values and every measure should be correctly formatted. This seemingly trivial subject is important and an issue in most informal and ill-managed projects.  Requirements should be defined for data types, precision and formatting.  Common items include:

  • Measures are designed to summarize and total over any filter, slicer or grouping condition. Apply the appropriate roll-up or conditional level logic if it is not 
  • Percent and ratio measures should be fractional decimals. For example, “45 percent” is represented as a decimal value .45. When formatted, it appears as 45%, or 45.00% with two decimals for better precision.
  • All whole or large number measures should include a thousand separator.
  • Use the DAX DIVIDE function instead of the forward slash operator to handle blank and zero division errors.

Text Measures

  • Do not conditionally return text values to handle exceptions (like “not applicable”) in a measure that should return a numeric value. 
    Create a separate text-type measure to support titles & labels, if needed.

Implicit and Explicit Measures

For simple and informal projects, Power BI supports implicit measures. Numeric columns in each table are automatically set to summarize data for ad hoc reporting. This behavior is both convenient and confusing for ad hoc report users. In formal projects, avoid using implicit measures for the following reasons. Instead, create a measure for any numeric column values that should be aggregated for reporting, and then hide all redundant numeric columns.

  • Implicit measures don’t work in Excel pivot table and expressions
    Like when working with cubes and connected SSAS models, only explicit measures are available in Excel report.
  • Two choices confuse users
    Since all data models will include explicit measures, written using DAX. It is confusing for users to understand when to use measures and when to use numeric columns. In conventional Excel pivot tables (which connects to Power BIas i f it were a multidimensional cube); measures and calculation objects are contained in a measure groups and all columns (whether numeric or any other data type) are dimension attributes. The introduction of PowerPivot, SSAS tabular and Power BI confuses most users with inconsistencies.
  • Measures are Flexible
    Inevitably, some implicit measure will eventually be replaced with explicit measures when they are required to support some conditional logic or alternate aggregation behavior.  Starting with explicit measures, avoids retraining users and introducing inconsistent behavior.

Promote Self-service Reporting

The true value of the Power BI platform is when business users can use the interactive reporting features of Power BI to get answers to business questions. This can be as simple as starting with an existing report and changing a slicer or filter; or adding a visual to a report page to see a selected set of data. Novice report author/users can easily be taught to drag-and-drop fields into a new report page to group and slice data from an IT & certified dataset. Starting with IT/business certified datasets published to the Power BI service, users with a range of interests and skills can modify or author reports using their web browser without altering governed data or protected reports. 

For user training and support purposes, consider educating users with skills in the following roles:

Novice Report Users

  • Navigate to an existing published app or report
  • Understand how to dashboard tiles, report pages and visuals
  • Use filters and slicers
  • Use visual interactions to select and slice data
  • Use report visual drill-through navigation
  • Use Q&A to ask and answer data questions

Casual Data Analysts

  • Edit a report
  • Change a report visual to a different type of visual
  • Add a report page
  • Add visuals and fields to a report page
  • Add and use slicers to filter and select data
  • Use Analyze in Excel to create a connected pivot table
  • Use Excel to add fields, groups, totals, filters & slicers to a pivot table
  • Set alerts & create a report subscription
  • User bookmarks to create custom filter sets and navigation

Intermediate Report Authors/Analysts

  • Using a published certified dataset, create a new report
  • Add pages, visuals, slicers and filters
  • Add custom measures to a report using DAX
  • Create drill-through actions & navigation
  • Create a custom dashboard by pinning tiles

Use SSRS/Paginated Reports for Operational Reporting

Power BI is not a replacement for paginated, operational reporting. For static, multi-page, printable reports; use SQL Server Reporting Services (SSRS) instead of Power BI. SSRS (now called “Paginated Reports”) is integrated into the Power BI service with Premium capacity licensing, and can be integrated with interactive Power BI reports and Power BI data datasets.

To a limited degree, some operational reports can be reproduced using Power BI reports and SSRS can be used, some a limited degree, to create interactive reports.

Recommendation:
Use the right tool for the job and consider the strengths and limitations of Power BI, SSRS/Paginated Reports and Excel to author and deliver reports for different audiences and use cases. 

If Users Need Excel, Give them Excel

Financial users often misunderstand that they must adapt to visual dashboard reporting when they really need spreadsheet reports with reliable, current data. Power BI was borne from Excel and from SQL Server Analysis Services (traditionally “cubes”). Therefore, it can meet the needs of both self-service users and IT-managed enterprise BI solutions. In many ways, Excel are a perfect pair. If users need spreadsheets, don’t give them reports that look like spreadsheets. Give them connected spreadsheets; then train them to use Excel from a Power BI report and certified dataset.

Analyze in Excel is a gem that is often not promoted by IT-based report authors; typically, because report requirements are usually specified before users lean about this capability. It really is a best-of-both-worlds solution; because it keeps governed data in a published dataset which allows users to access it via a secure live connection from Excel.

When selecting Analyze in Excel from a published report or dataset in the service, Excel opens with a new connection (using an OCD file) directed at the published dataset. The Analyze in Excel ribbon (installed to Excel when downloaded) can also be used to create a new connection from Excel.
By default, a pivot table is generated in a new worksheet; but advanced Excel users can use cube functions to place live Power BI dataset data anywhere in a workbook.

https://sqlserverbi.blog/2016/05/10/power-bi-and-excel-the-dynamic-duo/

Recommendation:
Rather than exporting data from Power BI (and creating off-line copies of data), promote the use of live, connected reporting with Excel. 

Enable Analyze in Excel in selected reports and datasets; then train a limited audience of advanced users to promote its use. Establish best practices for each business use case. Generally information about this feature is available here: https://docs.microsoft.com/en-us/power-bi/service-analyze-in-excel

Implementing Security

There are many good references and tutorials about data and report security that I won’t cover here but will summarize the recommended approaches. Data level security and report/content level security are separate concepts that should be coordinated.

Report/Content-Level Security

There are multiple options for sharing report content with users.  Users cannot view a dashboard or report unless they have been granted access to a workspace or have been invited to share a report.

If you have a Premium capacity Power BI tenant, you can package the content of a workspace [dashboard(s) and report(s)] as an app which can be shared with any users in your organization who do not need individual Power BI Pro licenses to read and view content. These users will not be able to edit or author additional reports.

Users with Pro licenses assigned through the Office 365 Portal can have a workspace or report shared with them, which they can (unless otherwise restricted) edit, save, copy or share with others.

Data-Level Security

Once a user can access a report, they will see the results of all data in the underlying dataset for that report. Users with report level access can read all dataset data unless role-based and/or row-level security has been implemented. Role/RLS security work essentially the same way in SSAS and with data models developed with Power BI Desktop.

Role-based Security

Role-based Security is active when at least one role is defined for the dataset, filter expressions are added to the role and them user or group members are added to the role. Conditional filtering is performed for each table in the data model using a DAX expression.

Row-level Security

RLS is an extension of role-based security. In addition to creating one or more roles with filter expressions and adding role members, filtering may be conditional based on the authenticated user.  The typical approach is to populate and maintain a mapping table of users and key values for each table in the data model that should have restricted access. All related data will be filtered. For example, if a user can only see data for a particular sales territory, they should only see sales totals for the territory.

The user/mapping tables must be maintained either manually or through automation if the user/object permission information is available in an existing data source. Don’t over-engineer this. The user/mapping information can be managed in a spreadsheet and securely stored in SharePoint or OneDrive, or driven by a PowerApp or custom form. *This requirement is often missed in large projects.

Recommendation:
Add users to Office or Azure Active Directory groups and then use groups to share report content. If you are using role-based or row-level security, use groups to manage role membership.

If using Premium, use workspace apps to share and distribute reports with  read-only users and reduce Pro license requirements.

Create a strategy for managing RLS user/entity mapping tables. Either manage manually or automate if the data is available.

Certified & Shared Datasets

To promote reuse and data governance, datasets can be used in reports residing in different workspaces (including a personal workspace). Datasets can be endorsed to designate those that are certified and promoted by data owners and the organization. The Power BI service includes a platform and workflow to manage promoted and certified datasets. When enabled, a dataset author can enable a dataset to be marked for promotion and certification. Designated users can test and optionally certify specific datasets that are known to be reliable and conform to organizational
governance standards.

Dataset promotional and certification markings require the coordinated effort of the solution developer, Power BI tenant admin, data steward or business stakeholders. The actual process is simple; but the certification process should be defined and managed by solution business owners and service administrators.

The following image illustrates the
aforementioned process.

clip_image002

clip_image004

Information about this capability: https://docs.microsoft.com/en-us/power-bi/service-datasets-certify

Enterprise Scale Options

In many ways, Power BI has now surpassed the capabilities of SQL Server Analysis Services. Microsoft are investing in the enterprise capabilities of the Power BI platform, by enhancing Power BI Premium Capacity, adding Paginated Report and features to support massive scale specialized use cases. Consider the present and planned capabilities of the Power BI platform; before, choosing another data modeling tool such as SSAS.

Resources:

https://sqlserverbi.blog/2018/07/27/power-bi-for-grownups/ 

https://sqlserverbi.blog/2018/12/13/data-model-options-for-power-bi-solutions/

Project Preparation Checklist

[_] Categorize the solution by identifying the author & user roles related to the project:

[_] Author role: Business Data Analyst

[_] Author role: Skilled Data Modeler, Analyst, Data Scientist

[_] Author role: IT BI Developer

[_] Users’ role: Report/Dashboard Consumer

[_] Users’ role: Self-service Report Author

[_] Users’ role: Advanced Data Analyst

[_] Develop & Document Support & training plan for users

[_] Identify the Solution Type for the project. This will guide other project management designs:

[_] Design single PBIX file for small geoup, departmental project authored by one developer for a limited group of users

[_] Design & deploy a separate dataset PBIX file – from report file(s) – when the dataset should be branded as a Certified dataset

[_] Design separate dataset and report PBIX fiels for formal projects with more than one dataset & report developer, to coordinate work

[_] Use SSAS/AAS as a data modeling option when those databases exist or where IT operations insist o management development and maintenance through integrated source control (e.g. Visual Studio Team Services & Azure DevOps)

[_] Identify the Project Type & related Soluton Architecture:

[_] Project type: Formal project

[_] Project type: Informal project

[_] Project type: Hybrid project

[_] Architectural approach: Single PBIX

[_] Architectural approach: Separate dataset and report PBIX

[_] Architectural approach: Report PBIX connected to SSAS or AAS

[_] Understand DirectQuery model trade-offs and special use cases. Avoid if possible.

[_] Create storage locations and folder structure for Development file management:

[_] Development file storage

[_] Team member collaboration environment & processes

[_] Folder synchronization

[_] Define File naming standards

[_] Decide on dataset and report names

[_] Define the Version Control & Lifecycle Management

[_] Postfix files with 3-part version number

[_] Remove ver number from published files in QA and PROD

[_] Create Version History table in Power Query

[_] Increment version numbers in data model

[_] Backup PBIT files for archive

[_] Create measures: Last Refresh Date/Time

[_] Create measure: Current Version

[_] Add data model info page to report

[_] Define your Release Management, DevOps & Automation strategy (if any – Might be OK to deploy files manually) (to automate or not to automate)

[_] Decide on Workspace and App Management, workspace & app name, etc.

[_] Create PROD workspace (omit PRD from name), assign dedicated capacity

[_] Create QA workspace (post-fix name with QA), assign dedicated capacity

[_] Create DEV workspace (post-fixo name with DEV), dedicated capacity not required

[_] Assign Pro licenses to all developers, admins and report author users (QA?)

[_] Assign membership and access to workspaces

[_] Create fact date range filter parameters: RangeStart & RangeEnd

[_] Filter large fact tables with range filters, consider incremental refresh policies if slow and/or over 800 MB compressed.

[_] Design source queries (T-SQL?) to reshape source data into conformed dimension & fact tables

[_] Create views in database for each dimension and fact

[_] Enforce key uniqueness to remove all duplicate keys from all dimension tables

[_] Query Date dim/lookup table at source if it exists

[_] If not available, generate Date dim/lookup table in Power Query

[_] Avoid bi-directional relationship wherever possible

[_] Include code annotations & comment blocks in: views, M queries,measures

[_] Hide all key columns and other columns not used directly by users

[_] Use parameters to filter and reduce dataset PBIX size to ~400 MB or less

[_] Create incremental refresh policies for dataset larger than ~500 MB

[_] Use Premium capacity workspaces for datasets approaching 900 MB

[_] Use Premium capacity workspace when you need to share reports with non-Pro licensed user

[_] For non-Pro licensed user, convert QA (and then PRD) workspace to app and distribute the app to users via AD group membership

[_] Create explicit measures for all aggregate-able column values

[_] Hide all measure base numeric columns

[_] Set all non-aggregate-able numeric column default summarization to “Do Not Summarize”

[_] Format add whole numbers with thousand separators

[_] Format all currency & decimal measures to defined standard (perhaps 2 decimal, thousand separator)

[_] Format all ratios and percentage with appropriate decimal positions

[_] Use the DIVIDE function for all division operations to avoid div by zero

[_] Name all measures and user-visible fields with mixed-case, short,

[_] friendly sentenc-like terms

[_] Assign measures to fact table where they logically belong – or to an empty measure group table.

[_] Measures that don’t have an obvious home (fact) table should be assigned a measure group table.

[_] Precede measure group table names with “_” so they sort to the top of the list. (display folders are not ready for use in Power BI Desktop)
As a team decide:

[_] When to increment the revision/fix number

[_] When to increment the minor version number

[_] Who owns the migration to QA process? How is it implemented?

[_] Who own migration from QA to PROD? How is it implemented?

[_] How is user/stakeholder sign-off obtained?

[_] How are data source requirements defined, presented & documented?

[_] Who owns & maintains the requirements log?

[_] How are measure & KPI calculation requirements documented & maintained?

[_] How are the report design, layout & visualization requirements documented & maintained?

[_] What is the organization & project styling & branding standard?

[_] Does the organization have a standard brand image? Where is it documented? How owns that standard & can answer questions?

[_] Should report be designed for mobile consumption (with mobile layouts)?