Mastering Power Query in a Day – Full-day training in Seattle

Following the Power BI World Tour, Seattle event on Oct 30, please join me for a full-day of deep learning.  That’s right… it’s on Oct 31st so put on your Wonder Woman or Captain America costume and get ready to exercise your super powers with Power Query and Power BI!  You will learn to master Power Query extensively from Beginner to Advanced.  The other session taught at the same time by Brian Grant is “Power BI: Enhance Your Data Model with DAX” but ya gotta pick one.  You can learn more about the Power BI World Tour and the Academy by following these events on Twitter and LinkedIn using the links at the bottom of this post, or search these hashtags:
#PowerBIUG | #PowerBI | @pbiusergroup | #PowerBIUGAcademy | #PBIWorldTour

Registration and course description:

Academy – Mastering Power Query in a Day (Seattle, WA)

The foundations of a Business Intelligence solution are data transformations, data wrangling, data cleansing and ETL. A well-crafted Power BI project rests on Power Query and the queries that define the data model, calculations and report visuals. This full-day session will teach you how to lay the foundation for a Power BI solution with simple and advanced Power Query techniques.

Learn from Paul Turley, ten-year Microsoft Data Platform MVP and veteran BI Solution Architect. You will learn best practice design patterns, tricks, shortcuts and proven techniques to improve your skills and add immediate value to your projects. Power Query is everywhere – and growing.

The skills and techniques taught in this workshop apply to Power BI Desktop, the “Get Data” feature in Excel 2016+, SQL Server Analysis Services 2017+ (SSAS), Azure Analysis Services (AAS) and Data Flows in the Power BI Common Data Service (CDS). You will learn through exercises and instructor-led hands-on demos. Bring your laptop with the latest version of Power BI Desktop installed. The rest will be provided. We will cover material from basics through advanced. Each exercise is separate so you can absorb only what you need to learn, based on your prior experience, needs and skill level.

Power Query Basics
Quick tour of the Power Query interface & essentials
Creating and managing queries
Adding and editing steps
Recovery and project management
Essential best practices
Managing data sources
Working with folder paths, web URIs & database connections
Referencing & Duplicating queries
Consolidating queries, building base queries & dependency chains
Loading queries into data model tables
Basic error handling & debugging

Data Sources & Structures
Flat CSV files
Irregular text files (headings & totals)
JSON (simple)
JSON (complex, with nested & ragged hierarchies)
SQL server
Excel (single sheet/table, multiple sheets/tables)
Folders & file collections
Web pages a page tables
Web APIs & web service endpoints

Essential Query Techniques
Managing data types
Applying correct naming conventions
Working with Date & Time values
Splitting & formatting columns
De-duplicating & grouping
Pivot, Unpivot & Transpose
Custom columns & expression basics
Extracting tables from a data sources to supporting essential modeling for Power BI report design:

  • Fact tables
  • Dimension tables
  • Bridge tables
  • Slicer & calculation-driver tables
  • Role-playing dimensions

Advanced Power Query Techniques
Working with M: The Data Mashup language
M function essentials
Prioritized learning (what’s most important)
Using & managing parameters
Using the #shared object for internal documentation, examples & code syntax
Understanding M objects (values, tables, lists & records)
Number, Date, Time & Text manipulation M functions
Create a Date lookup/dimension table using M & Power Query
Create a Time series lookup/dimension table using M & Power Query

  • Why do I need a Date dimension in Power BI?
  • Standard date parts & hierarchies
  • Columns to support time-intelligence calculations
  • Working with fiscal & special-purpose calendars (e.g. 4-4-5, ISO)
  • Working with query functions
  • Parameterized queries, API endpoints & user-defined functions

Putting it Together
Examples:

  • Queries to support data model construction
  • Queries used to support report visuals
  • Deploy a report, configure the on-premises gateway
  • Use query parameters to schedule refresh in a deployed Power BI solution

Follow the events at:

https://www.linkedin.com/groups/13580234

https://twitter.com/pbiusergroup?lang=en

How to Configure the Power BI Gateway to use Dataset Connection Parameters

A service provider or vendor might want to publish multiple copies of a report that should connect to different database servers or databases.  In a true multitenant service solution, we would have a singe database with row-level user mapping tables that filter data by the logged in user.  True multitenant solutions require quite a lot of planning and development work to implement.  In smaller-scale or interim solutions, copies of a report can be deployed to different workspaces and then queries can be parameterized to use different database connections.

In this post, I’ll demonstrate deploying and configuring such a solution where the server name and database name have been parameterized and setup to use the on-premises gateway to connect and refresh data.  I’ll also setup scheduled refresh.  The full video walk-through is below but I’ll do a quick review to set the stage.

This is the Power Query Editor in Power BI Desktop.  I have two parameters that are used to specify the ServerName and DatabaseName for each SQL Server query:

Power BI Gateway with Parameters (Time 0_00_57;24)

Once deployed to a workspace in the service, the gateway must be configured with a data source for every possible server name and database combination.  In my example, I can connect to my local server using the NetBIOS name, IP address or LocalHost.  These all are acceptable methods but a data source must be added to the gateway configuration for each so the that the connection strings match exactly.  Remember that the connection is from the on-prem gateway to the database server so names like LocalHost or an internal IP address will work just fine.  In my example, I’m using the IP address of my local loopback adaptor on the local machine to connect to a local instance of SQL Server over the TCP connection.

Power BI Gateway with Parameters (Time 0_04_31;15)

In the workspace, the dataset is bound to the gateway.  Click the ellipsis and choose Settings.

Power BI Gateway with Parameters (Time 0_02_44;22)

To bind the gateway to the dataset, click to select the radio button next to the gateway.  This flips the switch titled “Use a data gateway”.  Apply the setting and then you can refresh the data or schedule refresh.

Power BI Gateway with Parameters (Time 0_05_47;00)Power BI Gateway with Parameters (Time 0_06_09;27)Finally, the parameters can be updated right here in the dataset settings.

Power BI Gateway with Parameters (Time 0_06_46;21)

 

Video Demonstration

 

Pareto, Burn-down & Accumulating Trend Charts in Power BI

I’m managing an Agile team project using Microsoft Teams – the new project management platform integrated with Office 365.  Teams is a simple and useful project management tool but it’s new and light on features.  Using Power BI, we want to show the hourly task burn-down for each two-week sprint.  In JIRA and some other more mature project management platforms, the burn-down chart is a standard feature in the tool that shows the number of hours or story points remaining, compared to the estimated number for the sprint.  Just as I began working on that, a client asked for some help creating a Pareto chart and it occurred to me that burn-down and Pareto charts are very similar variations of the same type of chart presentation.  These are not so much chart types as they are a set of calculations and techniques for displaying a desired result.

Project Hours Burn-Down Chart

Here’s the Burn-down chart showing days of the sprint on the X-axis and the hours for all resources as columns.  The burn-down line represents the number of estimated hours remaining for the duration of the sprint as a percentage of the total estimated hours for all resources.

imageHere’s another variation of the burn-down chart using stacked columns for each project resource, in my example; developers named Marta, Rob and Vivek.  Again, the burn-down line shows the daily percentage of estimated hours remaining compared to the total hours that have been “burned” for all days up to and including the current day.

image

The columns in the chart are simply the sum of hours reported.  I didn’t even create a measure for this value.  It’s just a summable column in the ‘Project Hours’ table named “Hours”.  Here’s a look at the data, including the measures used in the chart:

image

…and finally, here is the DAX measure code used for the line part of the column/line combination chart:

Project Hours Remaining % =
VAR DayAccumHours =
CALCULATE( SUM( ‘Project Hours'[Hours] ),
FILTER( ALLSELECTED( ‘Day’ ),
‘Day'[Day Number] <= MIN( ‘Day'[Day Number] )
)
)
VAR ProjectHoursRemaining = SUM( ‘Sprint Estimates'[Est Hours] ) – DayAccumHours

RETURN
DIVIDE( ProjectHoursRemaining, SUM( ‘Sprint Estimates'[Est Hours] ) )

Pareto Chart

This style of chart is very similar but the key differences are that the columns are ordered by the contribution value in descending order.   You can see that my sample dataset just uses numbers for the categories (e.g. “8”, “2”, “4”, etc.) but these could just as easily be names of resources, sales people, customers or products.  The columns are in descending order of aggregate value rather a time series of the axis field value.  To order by measure or value (my aggregate field is literally called “Value”), set the Type property for the X-axis to “Categorical”.

SNAGHTMLc49dd9d

Here’s the Pareto chart.  The contribution line shows the percentage of the total contribution that a category item and all of it’s predecessors in ranked descending order have made to that point.  For example, the first four top-ranking categories (8, 2, 4 and 5) account for about 50% of the total.

image

The DAX measure code for the accumulating percentage value used for the line, is as follows:

Accum Value % All =
VAR AccumValue =
CALCULATE( SUM(‘Pareto Series'[Value] ),
FILTER( ALLSELECTED( ‘Pareto Series’ ), ‘Pareto Series'[Value] >= MIN( ‘Pareto Series'[Value] ) )
)
VAR AllSeriesTotal = CALCULATE( SUM( ‘Pareto Series'[Value] ), ALLSELECTED( ‘Pareto Series’ ) )

RETURN
DIVIDE( AccumValue, AllSeriesTotal )

Accumulating Trend Series or Trend Chart

For good measure, I have one more example.  I actually started experimenting with calculation techniques using this chart.  It is essentially the burn-down chart example with the line presenting the opposite of the burn-down, so maybe this is a “burn-up chart” – I don’t know.  This may be useful when you want to visualize a time or other continuous series on the X-axis and the accumulating contribution as a line along with the actual contribution values as columns.

image

…and the measure DAX code used for the trend line:

Accum Series Value % All =
VAR AccumSeriesValue =
CALCULATE( SUM(‘Pareto Series'[Value] ),
FILTER( ALLSELECTED( ‘Pareto Series'[Period] ), ‘Pareto Series'[Period] <= MIN( ‘Pareto Series'[Period] ) )
)
VAR AllSeriesTotal = CALCULATE( SUM(‘Pareto Series'[Value] ), ALLSELECTED(‘Pareto Series’ ) )

RETURN
DIVIDE( AccumSeriesValue, AllSeriesTotal )

I’m hopeful that this will be useful and save you some time and energy.  As always, please leave comments and reach-out if you need some help.

You can download the example Power BI Desktop file here: Simple Pareto & Burn-down Chart Example.zip