Interviews from PASS Summit 2018: Matthew Roche, Matt Masson, Adam Saxton & Patrick LeBlanc

I had a chance to catch up with two of the three Matts from the Power BI product team at PASS Summit 2018.  When I spoke with Matthew Roche and Matt Masson about the new Power BI Dataflows announcement, they told me that Power Query is becoming a truly enterprise-class data transformation technology. Matthew Roche explained that Dataflows is not only a cloud-based implementation of Power Query with a browser-based query designer, but that it also lets you enforce business and data quality rules using strongly-typed entities in the Common Data Service.

Power BI Dataflows & Self-service Data Warehousing

Matt Masson told me that the expansion of Power Query into multiple products and services demonstrates how truly powerful the Power Query platform has become. He said that behind the simple-to-use user interface is an extremely flexible data transformation language, and underneath the M language is a great, flexible dataflow engine.  He’s currently working on the Intellisense features of the Power Query M language advanced editor. Matt Masson also said that he is expecting more third-party vendors to create custom data connectors for Dataflows and the Common Data Service.

I also caught up with Adam Saxton and Patrick LeBlanc from the Power BI Customer Advisory Team (CAT).  You may recognize Adam and Patrick from the “Guy In A Cube” video series.  Patrick described Power BI Dataflows as “self-service data preparation for the masses”. Adam added that Dataflows takes Power Query to the next level by writing data to tables in Azure Data Lake Storage rather than directly to a VertiPaq data model.  This capability enables the Power BI platform to support a truly enterprise-scale data warehouse and Business Intelligence solution.

I asked Adam what skills are most important to someone growing a career in IT-based Business Intelligence.  He said that the most important thing is understanding proper data modelling in data warehouse design.  Power BI, by itself, is a great platform downstream from data sources or an existing data warehouse.  Dataflows and the CDS fill an important gap in self-service BI reporting and data analysis.  These technologies provide a common platform for data preparation, data warehouse design and standardized data entity management for self-service and BI reporting.  Dataflows shows a lot of promise for serious, next-generation data warehouse solutions.  Stay tuned for more on this as I explore and prepare demos with Dataflows, Azure Data Lake Gen2 Storage and the Common Data Service.

Data Model Options for Power BI Solutions

At the heart of every Business Intelligence reporting solution is a data model to optimize queries and enable ad hoc report interactions.  Data modelling technology has evolved quite a lot over the past 20 years or so.  You have several modelling technology choices, with options depending on the simplicity or formality of your project and factors like data size and security.  In the past, choices were simpler: Power BI was the choice for smallish, “good enough” projects, and when data quality, high volume and exactness were the priority, Analysis Services (SSAS) was the better choice.  Now, with new features like hybrid models, aggregations and incremental data refresh, Power BI can even be advantageous for modelling larger data sets.  So many enhancements have been added to the Power BI platform in just the past year or so – or are in the process of being added – that the list of options and considerations continues to grow.  Here’s a quick (and I mean quick) summarized list of the modelling options for Power BI solutions:

  • Power BI Desktop model published to the Power BI service – limited to 1 GB model size, no partitioning.  Pay per user with Power BI Pro licenses.
  • Power BI Desktop model published to a Premium Capacity Power BI Service – 12 GB size limit, dataflows & numerous “grown-up” features.  Unlimited read-only domain users.
  • Power BI Desktop model published to on-prem Power BI Report Server – limited to 2 GB model size, no partitioning.  Prior version & feature restrictions.
  • SSAS on-prem or Azure Analysis Services (AAS) with reports published to PBIRS – no model size limit.  Prior version & feature restrictions.
  • SSAS tabular or multidimensional on prem connected to Power BI report in service via the data gateway.  No size limits.  License server cores for SSAS.  License Power BI using Pro or Premium.
  • Azure Analysis Services = SSAS tabular.  No size limit, pay as you go and scale as needed.
  • Power BI reports in the service or PBIRS using DirectQuery to connect to data source with no cached model.  Report interaction queries may be slow, with some calculation limits.  *Aggregations & hybrid model design can optimize & reduce query latency, with moderately complex model development overhead (Premium features).

There are numerous advantages and disadvantages to each option, so the best choice isn’t always clear, but we can certainly compare the pros and cons.  I’ve been on a quest to find the best medium to break these choices down into simple terms, but it truly is a journey and not a destination.  Along this journey, earlier this year, I presented a session called “The Nine Realms of Power BI” where I enumerated the different categories of solution architectures for Power BI solutions; and they are numerous.

One important consideration is Microsoft’s commitment to support a product with new features in the future.  They have made it very clear that the Power BI platform is their primary focus and that they will continue to invest in enterprise-scale capabilities in the cloud service.  Never plan on a crucial feature being added to a product later, but give serious consideration to where a product is headed.

Making a side-by-side comparison of features between products and options is a little like comparing oranges, apples, grapes and bananas.  As a best effort, I started with the feature lists for SQL Server Analysis Services and added the Power BI variations.  Admittedly, this comparison doesn’t fit perfectly into a pivot format, but I think it serves the purpose.  This post will likely evolve a bit with time.  Your feedback and input are welcome.

Some additional resources from trusted colleagues: Matthew Roche from Microsoft recently posted this article.  He refers back to this blog post and mentions an excellent session at the recent PASS Summit from Gabi Münster and Oliver Engels.  On Matt’s post, he links to the session slide deck that Gabi and Oliver shared.  An official whitepaper written by Chris Webb and Melissa Coates covers Power BI enterprise deployment options in 105 pages.  Keep in mind that many new features have been added since this paper was written, but the whitepaper will likely be updated in the future.  Matt also references another recent post on this topic by James Fancke, where he contrasts features and costs.  This is, no doubt, a timely and important topic.

Guy-In-A-Cube’s Adam Saxton and Patrick LeBlanc have posted numerous short videos highlighting specific features of Power BI with Analysis Services and here are a few to get you started.  

Side-by-Side Feature Comparison

In the following table I compare capabilities, starting with SSAS Tabular and SSAS Multidimensional, with Power BI Pro and Premium capacity.  My blog theme doesn’t currently support wide tables like this (apologies for the overlapping text), so you can open or download the Excel file here.  I hope this is helpful.  I’ll work on updates to optimize this comparison chart as features are added and time allows.

| Feature | Enterprise/Developer | Standard | Azure Analysis Services | Power BI Standard | PBI Report Server | Power BI Premium | Comments |
|---|---|---|---|---|---|---|---|
| Max model size (compressed) | No limit* | 16 GB | No limit* | 1 GB | 2 GB | 10-12 GB* | *Premium supports 10 GB upload & 12 GB refresh |
| Hybrid models (DirectQuery & cached mode) | No | No | No | No | Yes | Yes | |
| M/Power Query sources & transformations | Yes | Yes | Yes | Yes | Yes | Yes* | *Query extensions in the service using dataflows |
| Integrated application lifecycle management (ALM) tools | Yes | Yes | Yes | No | No | No | |
| Integrated version control tools | Yes | Yes | Yes | No | No | No | |
| **Tabular Models** | | | | | | | |
| Programmability (AMO, ADOMD.Net, OLEDB, XML/A, ASSL, TMSL) | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Third-party tool support, new XMLA endpoint for Power BI |
| Hierarchies | Yes | Yes | Yes** | Yes* | Yes* | Yes* | *Simple hierarchies; **AS supports HideMemberIfBlank |
| KPIs | Yes | Yes | Yes | No | No | No | |
| Perspectives | Yes | | Yes | No | No | No | |
| Translations | Yes | Yes | Yes | No | No | No | |
| DAX calculations, DAX queries, MDX queries | Yes | Yes | Yes | Yes | Yes | Yes | |
| Row-level security | Yes | Yes | Yes | Yes | Yes | Yes | |
| Multiple partitions | Yes | | Yes | No | No | Yes* | *Incremental refresh builds partitions |
| In-memory storage mode | Yes | Yes | Yes | Yes | Yes | Yes | |
| DirectQuery storage mode | Yes | | Yes | Yes | Yes | Yes | |
| **Multidimensional Models** | | | | | | | |
| Semi-additive measures | Yes | No 1 | Yes* | Yes* | Yes* | Yes* | *Using DAX code; effort: moderate |
| Hierarchies | Yes | Yes | Yes** | Yes* | Yes* | Yes* | *Simple hierarchies; **AS supports HideMemberIfBlank |
| KPIs | Yes | Yes | Yes | No | No | No | |
| Perspectives | Yes | | Yes | No | No | No | |
| Actions | Yes | Yes | *Using 3rd party tool | *Possible, limited | *Possible, limited | *Possible, limited | |
| Account intelligence | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Using DAX code; effort: high |
| Time intelligence | Yes | Yes | Yes | Yes | Yes | Yes | |
| Custom rollups | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Using DAX code; effort: moderate |
| Writeback cube | Yes | Yes | No | No | No | No | |
| Writeback dimensions | Yes | | No | No | No | No | |
| Writeback cells | Yes | Yes | No | No | No | No | |
| Drillthrough | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Multiple techniques |
| Advanced hierarchy types (parent-child and ragged hierarchies) | Yes | Yes | Yes* | No | No | No | *Supports HideMemberIfBlank |
| Advanced dimensions (reference dimensions, many-to-many dimensions) | Yes | Yes | Yes | Yes | Yes | Yes | |
| Linked measures and dimensions | Yes | Yes 2 | No* | No* | No* | No* | *Equivalent functionality |
| Translations | Yes | Yes | No | No | No | No | |
| Aggregations | Yes | Yes | Yes | Yes | Yes | Yes | |
| Multiple partitions | Yes | Yes, up to 3 | Yes | No | No | Yes* | *Incremental refresh builds partitions |
| Proactive caching | Yes | | * | * | * | * | *In-memory model is always cached |
| Custom assemblies (stored procedures) | Yes | Yes | No | No | No | No | |
| MDX queries and scripts | Yes | Yes | * | * | * | * | *Supports MDX queries & DAX scripts |
| DAX queries | Yes | Yes | Yes | Yes | Yes | Yes | |
| Role-based security model | Yes | Yes | Yes | Yes | Yes | Yes | |
| Dimension and cell-level security | Yes | Yes | No* | No* | No* | No* | *Equivalent functionality for measures |
| Scalable string storage | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *Equivalent functionality |
| MOLAP, ROLAP, and HOLAP storage models | Yes | Yes | Yes* | Yes* | *Cached or SSAS direct | Yes* | *DirectQuery & hybrid models are equivalent or superior |
| Binary and compressed XML transport | Yes | Yes | Yes* | Yes* | Yes* | Yes* | *VertiPaq in-memory compression on all data types |
| Push-mode processing | Yes | | Yes | Yes* | Yes* | Yes* | *Multiple & different processing techniques supported |
| Direct writeback | Yes | | No | No | No | No | |
| Measure expressions | Yes | | Yes | Yes | Yes | Yes | |
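Several of the asterisked entries above note that a multidimensional capability is reproduced in tabular models with DAX code rather than a built-in cube property.  As a minimal sketch of what that effort looks like (the Balances fact table and Date dimension here are hypothetical, for illustration only), a semi-additive closing-balance measure and a time-intelligence rollup might be written like this:

```dax
-- Hypothetical model: a 'Balances' fact table (Account, Date, Amount)
-- related to a 'Date' dimension table marked as the date table.

-- Semi-additive measure: sums across accounts, but returns the
-- last non-blank balance within the selected time period rather
-- than adding balances across dates.
Closing Balance :=
CALCULATE (
    SUM ( Balances[Amount] ),
    LASTNONBLANK ( 'Date'[Date], CALCULATE ( SUM ( Balances[Amount] ) ) )
)

-- Time intelligence: a year-to-date rollup of the same fact column.
Amount YTD :=
TOTALYTD ( SUM ( Balances[Amount] ), 'Date'[Date] )
```

The effort ratings in the table reflect exactly this trade-off: the behavior is achievable and performs well, but it is hand-written measure code in the model rather than a checkbox setting as in SSAS Multidimensional.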

Interviews with Microsoft Data Community Speakers and Leaders

What is the key to advancing your career in the Microsoft data platform?  Here is some advice from some of the most successful people in the industry…

Every year we have some big community events that bring together community leaders from all over – international user group and community leaders who write books and speak at conferences. We had our local Oregon SQL Saturday, then the “SQL Train” – chartered coaches that one of our speakers organized to bring our speakers and attendees up to the PASS Summit after Oregon SQL Saturday – and then the big PASS Summit conference itself in Seattle. I had a chance to sit down with a number of our speakers, community leaders and attendees and ask what brought them there, and what advice they would give people in the industry about how to get the most value out of the experience …and this is what they said:

Well-run monthly user group meetings and big annual local events like SQL Saturday don’t just happen by themselves. It takes a lot of planning, a lot of volunteers and a lot of coordination to make these events successful. Part of that effort is the annual leadership planning meetings held during the week of PASS Summit. Here are some short clips from those meetings, where several hundred local leaders from all over the world got together to share ideas, exchange notes and coordinate to make these events successful. Leaders cross-pollinate, exchange ideas and work together to make this a great community. Why? …because somebody did that for us when we were getting started, and we want to give back to the community. So, get involved: join the leadership committees at your local user groups, volunteer at SQL Saturday, volunteer to do a short speaking engagement. Just get up and talk to some of your peers. Get started by volunteering in the community so that you can be part of the ongoing great community we have around the Microsoft data platform.