Doing Power BI the Right Way: 5. Data Modeling Essentials & Best Practices (2 of 2)

Part of the series: Doing Power BI the Right Way (link)

Data modeling 201-301 – continued from part 1

So far, you’ve seen that the essential components of a data model are tables related to each other. Filtering records in one table propagates the filter to the related table(s), causing records in the related table to be filtered. This dimensional model schema is the foundation of the reporting solution. Once the fact and dimension tables are in place, you can create more advanced solutions by working outside of this standard relationship pattern.

Using Disconnected Tables

Disconnected tables are useful in a variety of ways, allowing a user to select a value that drives the behavior of a dynamic measure, or by grouping values and performing some kind of dynamic calculation. Let’s focus on one of the disconnected tables shown in this figure, the Selected Measure Period table:

This table was created using the Enter Data feature and includes only one column named Period, which includes the following values:

Current
MTD
YTD
Same Per LY
PY

The exact text for these values isn’t really important because I have written specific DAX code to work with them.

The purpose of this post is to demonstrate effective data modeling, not DAX fundamentals, but one cannot exist without the other. For our purposes, the model in this example contains five measures that calculate the Store Sales Qty with different time-series variations. The Store Sales Qty measure simply returns the sum of the Quantity field, and Store Sales Qty MTD calculates the month-to-date value for a given date. I’ve created another measure named Store Sales Qty Selected Period, which is shown below. When a value in this table is selected, this measure returns the result of one of the other measures.
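The measure definition appears as a screenshot in the original post. As a minimal sketch of the pattern, assuming companion measure names like Store Sales Qty Same Per LY and Store Sales Qty PY (the exact names are my assumption), it might look like this:

```DAX
Store Sales Qty Selected Period =
SWITCH (
    -- Read the single selected Period value; default to "Current" if none
    SELECTEDVALUE ( 'Selected Measure Period'[Period], "Current" ),
    "Current", [Store Sales Qty],
    "MTD", [Store Sales Qty MTD],
    "YTD", [Store Sales Qty YTD],
    "Same Per LY", [Store Sales Qty Same Per LY],
    "PY", [Store Sales Qty PY]
)
```

Note that SELECTEDVALUE resolves per filter context, which is why the same measure also works when the Period field is placed on the matrix Column group: each column evaluates with exactly one Period value in context.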

The following three images show cases where the Store Sales Qty Selected Period measure is used as the Value property of a matrix visual. Each row represents a date from the Order Date table. The slicer above the matrix allows users to select one or more Period values from the Selected Measure Period table. By selecting only the value Current, the measure returns the Store Sales Qty measure value. By selecting YTD, the Store Sales Qty YTD measure value is returned.

The third example adds the Period field to the matrix Column group. For all values chosen in the slicer, a Period value is displayed in the column heading and the corresponding measure value is displayed in that column.

Here’s another example used to calculate the daily Orders Pending Delivery. The column chart below shows the daily number of orders where an order has been placed but not yet delivered. The dates on the X axis of the chart are from a disconnected table named Select Effective Date.

The logic for the Online Orders Pending Delivery measure depicted by each chart column is the number of orders where the Order Date is less than or equal to the selected date and the Delivery Date is greater than the selected date, or a delivery date doesn’t yet exist. Here is the measure definition:
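The original definition is shown as a screenshot. A sketch of the same logic, assuming column names like 'Online Sales'[Order Date], 'Online Sales'[Delivery Date] and 'Select Effective Date'[Date] (the exact names are my assumption), could be written as:

```DAX
Online Orders Pending Delivery =
-- Capture the date chosen from the disconnected Select Effective Date table
VAR SelectedDate = SELECTEDVALUE ( 'Select Effective Date'[Date] )
RETURN
    COUNTROWS (
        FILTER (
            'Online Sales',
            -- Order was placed on or before the selected date...
            'Online Sales'[Order Date] <= SelectedDate
                -- ...and has not yet been delivered as of that date
                && ( 'Online Sales'[Delivery Date] > SelectedDate
                    || ISBLANK ( 'Online Sales'[Delivery Date] ) )
        )
    )
```

Because Select Effective Date has no relationship to the fact table, the dates on the X axis don’t filter Online Sales at all; the measure does all the work by comparing columns directly.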

Once again, the disconnected table helps us work around the normal constraints of related tables. The theme here is to use relationships to build the foundation and use them for primary reporting functionality. Then, use disconnected tables to take the solution to the next level with dynamic calculations and creative problem-solving.

The Answer to Life, the Universe and Everything: 42

Every data model should be designed to fulfill specific reporting requirements. This can be challenging when creating larger and more complex models. This isn’t to say that we can’t design utility models to answer a lot of different business questions, but it can be difficult to create the utopian data model that can answer every question. In years past, I’ve seen attempts to create something like a “master cube” that contains nearly every table and field in the data warehouse, in an attempt to let users get insight about anything they could possibly imagine. There is a balance here, which you will see in the following section.

Relationships Don’t Always Work Out

The filter propagation from any one table in the data model to any other table can only use one set of relationships. The problem can be illustrated in the question: How do we determine the geographic location for Online Sales orders? Take a look at the following model diagram: Customers have a geographic location as evidenced by the GeographyKey in the Customer table, which is related to the Geography table using a column with the same name. Easy answer.

What about Store Sales? The Store table has a GeographyKey but there is currently no relationship from Geography to Store. That should be easy to resolve, right? Let’s just create a relationship from Geography to Store by dragging the GeographyKey column…

What the crap, man! Why can’t we create the relationship? The warning message says “because this would create ambiguity between tables Geography and Online Sales”. Yep, sure enough: if that relationship were active, the data model would have two different paths between Geography and Online Sales and wouldn’t be able to decide whether to follow the filter through the Customer table or the Store table. This is a little like Google Maps choosing between multiple routes between two locations. It might tell you that you can take an alternate route and add 15 minutes to your drive, but you can only take one route or the other.

So, how do we deal with this problem? …through the magic of DAX, of course. We don’t need no stinking relationships! Rather than relying on natural filter propagation through a relationship that we can’t implement, we create a measure that implements a pseudo relationship using the DAX TREATAS function. Here, the Store Sales Qty measure is explicitly taking all the selected GeographyKey values in the Geography table and creating an in-memory table using the VALUES function. The TREATAS function is saying “take this table of values and treat it as if it were the GeographyKey in the Store table”, effectively filtering Store records using the selected Geography records.
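The measure shown in the screenshot follows this TREATAS pattern. A sketch, assuming the fact column is 'Store Sales'[Quantity] and both the Geography and Store tables carry a GeographyKey column (as described above):

```DAX
Store Sales Qty =
CALCULATE (
    SUM ( 'Store Sales'[Quantity] ),
    -- Take the currently selected GeographyKey values...
    TREATAS (
        VALUES ( 'Geography'[GeographyKey] ),
        -- ...and apply them as if they were a filter on Store[GeographyKey]
        'Store'[GeographyKey]
    )
)
```

The filter applied to Store then propagates through the existing Store-to-Store Sales relationship, so the measure behaves as if the missing Geography-to-Store relationship existed.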

The result is simply that Store Sales Qty can be grouped and filtered by any field in the Geography table.

I have a unique appreciation for the TREATAS function, which was introduced late, after we began to embrace DAX in Power BI, SSAS Tabular and Power Pivot. I was in the Community Zone at the PASS Summit in Seattle some five years ago. Someone asked for help with a similar problem, which was a challenging but interesting puzzle. Chris Webb happened to be passing by so I dragged him into the conversation. We spent well over an hour hammering out DAX code on a laptop before he presented a solution consisting of CROSSJOIN, FILTER and SUMMARIZE functions. I don’t remember all the details but it was complicated. Chris has avoided me ever since (I’m kidding… Chris is a super nice guy, enjoys a challenge and is always willing to help).

The underpinnings of many DAX functions are a sea of internal logic and code meant to solve specific business problems. Spending time with experts like Chris, Marco Russo and Jeffrey Wang (who architected the DAX language) has helped me appreciate the immense power yet simplicity that this language offers.

Who Needs Relationships Anyway?

One more example. This one is just for fun and the point is to underscore that although a data model with properly-formed relationships is very important, this should not be the limiting factor in a project design. Take a quick look at the following Power BI report page. What do you see?

There are several report visuals that group the Online Sales Qty measure and time-series variations of this measure by different dimension attributes. You would expect there to be a rich data model with several related tables to support all this functionality, right?

Here is the diagram view of this data model. It contains only one relationship and even that could have been eliminated to further make the point.

How is this possible? Isn’t the whole point of a data model to make this type of data visualization and analysis possible? What’s going on here? The Online Sales Qty measure applies the same DAX code pattern that I showed you before, using the TREATAS function to build virtual relationships on the fly:
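The measure itself is shown as a screenshot. The same TREATAS technique simply stacks one filter per would-be relationship; a sketch, with key column names that are my assumption:

```DAX
Online Sales Qty =
CALCULATE (
    SUM ( 'Online Sales'[Quantity] ),
    -- Each TREATAS stands in for one missing relationship:
    TREATAS ( VALUES ( 'Customer'[CustomerKey] ), 'Online Sales'[CustomerKey] ),
    TREATAS ( VALUES ( 'Product'[ProductKey] ), 'Online Sales'[ProductKey] ),
    TREATAS ( VALUES ( 'Order Date'[DateKey] ), 'Online Sales'[OrderDateKey] )
)
```

Every dimension in the diagram can group and filter this measure even though no relationship lines exist in the model.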

How does this measure perform compared with a conventional measure in a “proper” data model? With a few hundred thousand fact records, it performs just fine, but in a more complex data model with a larger volume of data, this might not be the best approach. The primary reason that performance could degrade is that the measure code causes queries to use the tabular formula engine rather than the storage engine, which is typically much faster. My point with this demonstration is that you can use similar techniques to do just about anything you can imagine.

Conclusion

Start with a well-crafted dimensional data model built using best practice design patterns and rules. Use exceptions when needed to address specific requirements – in the form of flattened tables and master-child relationships. Then you can graduate to using special cases like no relationships at all, to address specific use cases and reporting needs. Start with the basics, adhere to the rules and then make exceptions as needed.

As usual, I had a few more things in-mind to cover when I started this post that will be deferred to later posts, to be covered in greater depth. These include:

  • The current state and future of Power BI data model design
  • DirectQuery and Gen 1 composite models
  • DirectQuery over Power BI datasets & SSAS models (Gen 2 composite models)
  • Using External Tools for advanced data model design
  • When and how to use calculation groups to standardize measure variations

Supporting Resources

Star Schemas
https://www.sqlbi.com/articles/the-importance-of-star-schemas-in-power-bi/
https://docs.microsoft.com/en-us/power-bi/guidance/star-schema

Many to Many Relationships
https://docs.microsoft.com/en-us/power-bi/guidance/relationships-many-to-many

Model Optimization
https://docs.microsoft.com/en-us/power-bi/guidance/import-modeling-data-reduction

DirectQuery Guidance
https://docs.microsoft.com/en-us/power-bi/guidance/directquery-model-guidance

Doing Power BI the Right Way: 5. Data Modeling Essentials & Best Practices (1 of 2)

Part of the series: Doing Power BI the Right Way (link)

Data Modeling 101: part 1 (more advanced techniques in part 2)

A data model is the foundation of analytic reporting. It provides structure and order over information that might otherwise be chaotic and untrustworthy. Loading data into a properly designed model ensures that it conforms to some essential rules that provide better performance, reliability and accuracy.

Different Schemas for Different Needs

One of the most important lessons I have learned about data modeling over the past 20+ years is that there isn’t one model that fits all business needs. However, there are simple patterns we can follow to model data according to different business needs. Each pattern or schema has a different set of rules. At a very high level, we can think of these schemas in three different categories.

When reporting requirements are simple and unsophisticated, sometimes the easiest way to model a set of data is to transform it into a single, flat table. There are several trade-offs and limitations with this approach, though. A flat table is typically the goal when bringing information into a spreadsheet so you can simply add up a column of values, perhaps filtered by other columns. It might seem natural to use the same approach when making the transition from Excel to Power BI. Here is an example of retail order data organized into a flat table:

Look over this sheet before moving to the next figure. If this data were stored in an Excel sheet, how could we use it for reporting? You could filter by the Order Date column and add up the Quantity, Unit Cost and Unit Price. You could also apply filters to the Company Name, Category, Subcategory or Product Name columns. For simple reporting, this could meet our needs.

The master/detail pattern is typically used with transactional data. Master records, like orders or invoice headers, are related to detail records such as the order details or invoice line items. The master/detail pattern can be repeated with multiple tables to create a normalized database. Use this pattern sparingly to support drill-to-detail reports. Including unnecessary detail in an analytic data model uses a lot of memory and storage, and can impact report performance.

When relational database professionals are introduced to Power BI, they often see DirectQuery as a simple means to leverage a familiar, existing relational database or data warehouse, only to be disappointed by poor performance and other limitations. However, master/detail schemas are where DirectQuery can provide a best-of-both-worlds solution, so long as the analytic needs over the detail data are light. If we need to drill down to detail records from an interactive report, without performing calculations over the entire detail record set and without sucking billions of records into memory, DirectQuery might be the ticket.

Dimensional Modeling 101

The rules of dimensional modeling are a lot like the laws of physics: they were discovered rather than invented. Apply dimensional design patterns and data will naturally behave and reports will perform well. Dimensional rules describe the way data behaves when stored in the most optimal form for reporting and analysis.

The Kimball Group is credited as the pioneer of dimensional modeling, and although the books and papers that Ralph Kimball and his colleagues authored are 20-30 years old, they are as relevant today as ever. I’ll cover some of the basics, but you can find a comprehensive reference guide to dimensional modeling patterns and techniques on the Kimball Group site here: Dimensional Modeling Techniques – Kimball Group. There have been a few recent publications employing modern examples, such as Star Schema – The Complete Reference and Agile Data Warehouse Design. Both are excellent books, but unless you are undertaking a huge data warehouse project with complicated data problems, understanding the essential concepts is sufficient.

By organizing information into fact and dimension tables, we store as little data as possible while maintaining consistency and data integrity. In data design, we often refer to “entities” and “attributes”. An entity is something we keep track of (like a Customer or Product). Entities have attributes which describe specific properties. In the analytic data world, entities are usually materialized as dimension tables and each attribute is a column or field.

A fact table contains the numeric columns used to summarize and aggregate measure values, along with the key columns that are related to the dimension tables. Dimensions contain the attributes used to group and filter the business facts. Fact records share the same level of grain across all dimensions. For example, if domestic sales orders and international sales orders each have the same level of detail for dimensions like Customers, Products and Order Date, these records could be stored in the same fact table. But, if Sales Targets are applied at the month level rather than the date, they must be stored in a separate fact table.

The essence of a dimensional model is the star schema, simplified here to show one fact table related to a small number of dimensions.

A real application of the star schema design is shown in the following image. The Online Sales fact table contains key columns that are used to relate this table to each of the dimensions. The fact table also contains numeric type columns used to define measures which aggregate and total numeric values such as Net Price, Quantity, Unit Cost, Unit Discount and Unit Price. You’ll notice that the relationship from each of the dimension tables to the fact table is one-to-many and filters records in one direction, depicted by the arrow on the relationship line. For example, the relationship from the Customer table to Online Sales is based on the CustomerKey column in both tables. The relationship filter direction indicates that if the Customer table were filtered on a report, Online Sales would be filtered accordingly. However, if the Online Sales table were filtered in a report, this would not cause the Customer table to be filtered.

According to the Kimball Method, there are four stages to define fact and dimension tables:

  1. Select the business process
  2. Declare the grain
  3. Identify the dimensions
  4. Identify the facts

Seems pretty simple, right? It can be at a macro level, until you need to get into the weeds. The trick is to maintain a high-level business perspective as you work through the details to solve data problems. I find this to be a useful whiteboard exercise when starting a new project. Using only business tools and business language, diagram data entities at a high level using this approach. Next, break the process down using these steps:

  1. Load detailed atomic data into dimensional structures
  2. Structure dimensional models around business processes
  3. Ensure that every fact table has an associated Date dimension table
  4. Ensure that all facts in a single fact table are at the same grain or level of detail
  5. Resolve many-to-many relationships in fact tables
  6. Resolve many-to-one relationships in dimension tables
  7. Store report labels and filter domain values in dimension tables
  8. Make certain that dimension tables use a surrogate key
  9. Create conformed dimensions to integrate data across the enterprise
  10. Continuously balance requirements and realities to deliver a DW/BI solution that’s accepted by business users and that supports their decision-making

Something to keep in mind is that these rules and steps apply equally to formal, IT-based model design and to self-service BI projects. However, the tools have been modernized over the past decade or so, along with the degree of strict adherence to the rules. Simply put, in the new age of self-service BI, everything is negotiable. But something else to keep in mind is that bending or breaking the rules nearly always comes at a cost. For example, step 8 prescribes creating surrogate keys to identify dimension records. You can use the original business keys throughout the model, but you run the risk of encountering key collisions or other data quality issues.

In some cases, the lines that separate facts and dimensions can become blurred, which may need to be considered in your model design. For example, I recently developed a BI solution where we counted employee hiring events, and a row in the Hires fact table was simply a record of the event when an employee was hired, along with information like the hiring date and position. There were no numeric columns to sum up, so we simply needed to count these records. This is often referred to as a “factless fact”. Another exception to the typical star pattern are cases where a fact table contains a value we could use for grouping or filtering, and it might be pointless to extract it into a separate dimension table. This is known as a “degenerate dimension”. These are examples of cases where you could arguably apply the dimensional design rules strictly out of principle or err on the side of simplicity. The right answer for each case should be driven by the business reporting requirements while considering the reliability and maintainability of the overall solution.

Let’s apply the process to the flat table of sales orders. Here it is again with some of the dimensions and facts identified:

In this example, there are three separate date columns that record the date an order was placed, the date it was due and the date that the order was delivered. Customer information includes the Company Name and a unique CustomerKey which could be used to lookup detailed customer information. There are three numeric columns that can be summarized or aggregated. These columns could also be used to perform more complex calculations like ratios, running totals and time-series comparisons.

Dimensional Concepts

Now, let’s review some common practical dimensional modeling applications.

Role Playing Dimensions

Dimensional entities can play more than one role in reporting. In the flat table, the three date columns have completely different purposes but all store the same type of value: a date. But dates can be used to group and aggregate data in date part groups like month, quarter, year or fiscal periods. They can be used to perform time-series calculations like month-to-date or same period in the previous year. To do this, you absolutely must have a date dimension* table containing a contiguous set of date records over a period of time. For our purposes, we might need all dates for the past five years through the end of the current year. Date dimensions are the most common example of a role-playing dimension, but let’s look at another scenario before looking at more date examples.

*Power BI Desktop includes features that will automatically generate a date dimension table, and there are DAX functions you can use to generate date tables in code. Although useful in certain cases, I prefer to manage date tables the old-fashioned way so I have one consistent method for importing and managing all tables. Compared to other data challenges, building date dimension tables is a relatively simple task.
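If you do choose to generate a date table in DAX rather than importing one, a minimal calculated-table sketch looks like this (the date range and column choices here are illustrative assumptions):

```DAX
Date =
ADDCOLUMNS (
    -- Contiguous range of dates; adjust the boundaries to your data
    CALENDAR ( DATE ( 2016, 1, 1 ), DATE ( 2021, 12, 31 ) ),
    "Year", YEAR ( [Date] ),
    "Month Number", MONTH ( [Date] ),
    "Month", FORMAT ( [Date], "MMM yyyy" ),
    "Quarter", "Q" & ROUNDUP ( MONTH ( [Date] ) / 3, 0 )
)
```

Whichever method you use, remember to mark the result as a date table so time-intelligence functions behave correctly.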

The following image shows a fact table named Flights, with two role-playing Airport dimensions. An airport is, well it’s just an airport… it has a three-letter code and some information about the geographical location. But with respect to any given flight, an airport can serve two functions: to depart from or to arrive to. In this example, it is necessary to materialize the Airport table twice: the Departure Airport and the Arrival Airport.

With two role-playing Airport dimensions, a report user could ask for all the flights from Japan to Australia on a given date.

Back to the retail sales example, let’s say that our Sales fact table has two date keys: OrderDate and ShipDate. A single Date dimension can only have one active relationship. The following image shows the result of the autodetected relationships between Sales and Date, where one of the relationships is inactive and effectively useless for most practical purposes.
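If you keep the single Date table, the inactive relationship isn’t entirely useless: it can be activated within an individual measure using USERELATIONSHIP. A sketch, assuming a base [Sales Qty] measure and these column names:

```DAX
Sales Qty by Ship Date =
CALCULATE (
    [Sales Qty],
    -- Activate the normally inactive ShipDate relationship for this measure only
    USERELATIONSHIP ( 'Sales'[ShipDate], 'Date'[Date] )
)
```

This works for a handful of measures, but it doesn’t give report users the ability to slice any arbitrary measure by ship date, which is why the role-playing table approach below is often more flexible.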

This image shows three separate instances of the Date dimension table imported using referenced queries in Power Query. The three tables: Order Date, Due Date and Delivery Date use up a tiny bit more memory than a single instance of the table but give report users more flexibility to answer business questions.

A Table for Every Date Column?

An attendee in a data modeling session recently asked this question. She said: “I have 14 date columns in my fact table used to track a variety of different events. Do I need to create 14 different Date dimensions?” My answer was “certainly not.” This is an extreme example but a very good question. You only need separate date dimension tables when you need to have the flexibility to use time-series functions in DAX or to perform comparisons using date part fields (like Year, Quarter or Month). Operations like calculating the difference between two date type columns or counting rows with future dates can be performed using fields in the fact table. I’ll show you an example in part 2 of this blog series. This is one of those “it depends” situations where you must make a judgement call depending on your business reporting requirements, balancing simplicity and maintainability with complexity and flexibility.

Many-to-Many relationships and Bi-directional Filters

File this under “don’t do this unless you have to.” Many data modeling decisions are a trade-off between performance and functionality; and using iterative design, you will often find a better way to solve problems as you go. There are a few different ways to design for many-to-many relationships. The conventional approach is to use a bridging table that contains all of the combinations of keys that relate two tables together. In the following example, the Customer and Product dimension tables normally would have a single-direction filter from the “one-side” to the “many-side” of the relationship. If reporting requirements call for filtering products based on selected customers who purchased them, we can use the Sales fact table as a bridge table and change the relationship between Product and Sales to use a bi-directional filter.

Depending on the cardinality of the relationship, using a bi-directional filter could introduce a performance penalty. If we only have 100 products and fewer than a million Sales records, this is probably not a big deal. If we have 10,000 products and 100 million Sales records, this relationship will likely slow things down considerably.
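When the requirement is limited to a few specific calculations, an alternative to a model-wide bi-directional relationship is to enable bi-directional filtering only inside a measure using CROSSFILTER. A sketch, with key column names that are my assumption:

```DAX
Products Purchased =
CALCULATE (
    DISTINCTCOUNT ( 'Product'[ProductKey] ),
    -- Temporarily make the Product-to-Sales relationship filter both ways,
    -- only for this calculation, leaving the model relationship single-direction
    CROSSFILTER ( 'Sales'[ProductKey], 'Product'[ProductKey], BOTH )
)
```

This confines the performance cost to the visuals that actually use the measure, rather than paying it on every query that touches the relationship.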

Here is another example: Given the requirement to show accounts and transactions for a selected customer, the following model doesn’t work with out-of-the-box relationships. To understand why, follow the flow of filtered records through the relationships. From Customer to AccountCustomer, the arrow on the relationship line indicates that the filter flows in the right direction. Once the AccountCustomer table has been filtered, the Account table will not be filtered as we try to swim upstream, because that relationship doesn’t naturally flow from the many-side to the one-side.

Changing the Account-AccountCustomer relationship filter to bi-directional fixes the problem and meets the reporting requirement.

As I mentioned, the bi-directional filter introduces a potential performance penalty. However, if the cardinality is low and the memory footprint of the data model is low and lean (using efficient key values and removing unnecessary columns), this design may perform just fine and you are good-to-go. I’ll demonstrate more creative techniques in part two of this two-part topic, as we explore more advanced data modeling design techniques.

Continued in Part 2

Composite Models Gen 2 and DirectQuery over Power BI Datasets

My wife bought me a nice telescope for Christmas. I’ve wanted one since I was a kid. We opened gifts early because we’re going to take the RV out in the desert to watch the great convergence of Jupiter and Saturn next week. But, what am I doing now, after opening my gift? …updating Power BI Desktop to check out the new Composite Model feature! This new feature of Power BI has been in the works for several years and we have been anxiously waiting for this announcement. It is truly the Utopian dream; the Holy Grail!

It is hard to articulate the impact that this capability can have for businesses. The new composite model gen 2 feature is a monumental breakthrough that will significantly raise the bar for Power BI model development and could change the way we think about designing enterprise BI solutions – but customers will need guidance to use it correctly. If you haven’t been sitting on the edge of your seat waiting for this announcement, read the announcement and understand how significant this is.

Power BI December 2020 Feature Summary | Microsoft Power BI Blog | Microsoft Power BI

There have been many attempts by Microsoft and other vendors to create a data modeling architecture that provides fast access to cached data, direct access to live data, and scaled-out connections to established data models. Remember ROLAP and HOLAP storage in multidimensional cubes? These were great concepts with significant trade-off limitations. No other vendor has anything like this. Way back in the day, Microsoft jumped on the Ralph Kimball bandwagon to promote the idea that a company should have “one version of the truth” exposed through its data warehouse and cubes or semantic data models. Then, to meet customer demand for bringing data together from multiple sources, Microsoft gave us a BI tool that makes it easy to create a lot of data silos. Arguably, there are design patterns to minimize data duplication, but to use governed datasets, self-service report designers are limited to connecting to large, central models that might only be authored and managed by IT. This new feature can restore balance to the force and bring us back to “one version of the truth” again.

To put this into perspective, here’s a quote from Alberto Ferrari’s recent article: “It is not easy to find a name that accurately describes the massive step forward that is this feature in the evolution of Business Intelligence solutions. It would fully deserve a hyperbolic name, but the marketing departments already overused so many terms over the years: “universe” was already used in the past century; you can imagine what happened after that. We like “composite models” because it is a name that concisely describes the purpose of this new feature.”

Alberto has a great video emphasizing the magnitude of this release: New composite models in Power BI: A milestone in Business Intelligence – SQLBI

Those of us who live and breathe BI will continue to have thoughts and opinions, and eventually will have war stories and prescriptive guidance to offer about using this new capability most effectively.

Upon initial review, it seems to work exactly as expected. In future posts, I will demonstrate how to design and implement a next-gen composite model and share my experience and thoughts as we move down this new road.

Power BI Expert Resources

Where to go for best practice advice for Power BI: The Data Gods

One of the most common questions I hear from customers and students is “what are your recommended resources for learning and staying current with Power BI?”. I find some people get stuck learning Power BI because they are simply overwhelmed with information and choices. Google search will often raise more questions than answers because there is a lot of noise on the Internet. Expert advice can be found only after dismissing the myriad posts asking questions and sharing opinions, and sifting through outdated and confusing information. Much of the information found in discussion forums is outdated because the product changes so often. So, what are the most reliable sources for Power BI expert advice and best practices?

I follow several industry experts and I own many great books. Chris Wagner (KratosBi.com | @KratosBi), Microsoft MVP, has compiled the most comprehensive go-to list of Microsoft data experts, called the Data Gods (you can see the complete list here). When I was added to the collective of featured “Data Gods” last week, I was hesitant to brag about it. I mean, there’s no modest way to post this picture. The title of “Data God” is a lot to live up to, but it is truly an honor and I want to thank Chris and others for the recognition.

More importantly, this is a concise list of reliable industry experts, which I have pared down to those who blog and post regularly about their experiences and advice. I know and follow nearly everyone on this list, but there are a few folks with whom I have not yet had the pleasure to meet. There are also others featured in the Data Gods collection who make significant contributions in other ways, aside from frequent blogs and articles. I’ve also added a few additional notable community contributors.

Free Best Practice Resources

Following is my selection of members of the Data Gods counsel with links to their free resources. Most of these are blog sites or web sites that include blogs and articles available for free without requiring a sign-in or subscription:

Alberto Ferrari & Marco Russo – SQLBI.com | @marcorus | @FerrariAlberto

For most of us who work day-to-day with Power BI and other Microsoft BI tools, SQLBI.com is the first stop for expert advice. Alberto and Marco have set the gold standard for Microsoft BI best practices and design patterns. The breadth and depth of their expertise is expansive, and they are known for going deep into the nuts and bolts that make Power BI work. These guys are literally an extension of the Microsoft product development teams, guiding Microsoft leadership and evangelizing the platform. Their books on DAX and Tabular model design are considered by the most serious Power BI developers to be the most comprehensive reference books available.

Patrick Leblanc & Adam Saxton – GuyInACube.com | @GuyInACube | @awsaxton | @patrickdba

You either know these guys or you’ve never searched the web for Power BI help. Adam started the Guy In A Cube YouTube channel several years ago as a side project while doing BI technical support work for Microsoft. Patrick came from a BI and SQL Server consulting background before joining Microsoft and teaming with Adam. Today they share design techniques and feature highlights in weekly, high-quality short videos and a Saturday morning live streamed Q&A session. Adam and Patrick are members of the Customer Advisory Team (CAT) focused on supporting Power BI enterprise customers. Their weekly Power BI community Roundup video features the most current Power BI blog posts and release notes.

Chris Webb – blog.crossjoin.co.uk | @cwebb_bi

Chris is one of the most prolific, enduring and consistent bloggers in the Microsoft BI industry. His posts and articles go deeper than most, prepared with thought, effort and insight. Before joining the Microsoft Customer Advisory Team (CAT), he ran a successful training and consulting firm specializing in the Microsoft BI platform. Over the years he was, and continues to be, the go-to expert on SQL Server Analysis Services, MDX, Power Query and DAX, and he continues to share his knowledge and advice through his blog.

Matt Allington – Excelerator Blog | @ExceleratorBI

Matt’s main focus is business-centered self-service BI and financial analysis. He is a tried-and-true expert in the Excel and Power BI Desktop space. His training, presentation and writing style is clear, authoritative and to-the-point. He has the best collection of intro and intermediate level books on Power BI and DAX; each is short, concise and easy to understand. His books, Supercharge Power BI and Learn to Write DAX, are an excellent way to get started.

Seth Bauer, Mike Carlo & Steve Campbell – PowerBI.Tips | @PowerBITips | @Seth_C_Bauer | @Mike_R_Carlo | @PowerBISteve

PowerBI.tips is an expansive collection of free resources and utilities for Power BI, with a focus on visual themes, layout and color selection. This dynamic trio covers the important bases: Seth on visual design, Mike on enterprise solutions and Steve on application development. They have an array of external tools for things like connection management and project documentation.

Melissa Coates – CoatesDataStrategies.com | @SqlChick

Melissa is one of the deep thinkers in the industry, always asking the right questions and challenging the status quo. She’s also one of the most pleasant people I’ve ever met. Her blog posts are thorough and insightful. Melissa has co-authored multiple whitepapers for Microsoft on data governance, enterprise deployment and migration.

Leila Etaati & Reza Rad – Radacad.com | @leila_etaati | @Rad_Reza

Leila and Reza are known as one of the power couples in the industry. With nearly limitless energy, they have traveled from New Zealand over the years to present at more conference events than I can count. Leila is one of the BI industry’s thought leaders in the machine learning and AI space, with education and credentials behind her experience. Reza’s expertise with Power Query and Power BI are impressive. They freely share their knowledge through their blogs, articles, training classes and have published numerous books.

Kasper de Jonge – KasperOnBI.com | @Kjonge

Kasper is a Principal Program Manager on the Power BI product team and based in The Netherlands. A regular conference session headliner, Kasper is very tuned into customers’ real-world enterprise BI reporting problems. He has been blogging for years and has recently launched a YouTube channel.

Alice Drummond – DiscoverEI.com | @AliceDiscoverEI

With an emphasis on environmental data analysis, one of the things I appreciate about Alice’s active blog series is that with each post, she shares a comprehensive collection of blog sites and training resources to help readers find supporting information and expand their knowledge.

David Eldersveld – DataVeld.com | @DataVeld

If David can’t figure it out, it probably can’t be done. Over the years, he has pushed the limits, discovering and developing some very unique and interesting techniques to integrate things like graphics and geospatial data with Power BI. David is very active in the community and shares different ways to take Power BI beyond the out-of-the-box features to do things that many of us had no idea were even possible.

Brian Grant – CSGPro.com | @BrianGrantBI

Brian has a unique intellect and a passion for problem-solving and sharing knowledge, particularly related to Power Query and DAX. I’ve had the pleasure of collaborating with Brian on multiple occasions in conference presentations and training events. He and his colleagues have developed an interesting method to understand the mechanics of the DAX language that he shares in a video series called the Elements of DAX.

Reid Havens – HavensConsulting.net | @HavensBI

In a few short years, Reid has risen to all-star status by tirelessly designing and sharing data visualization techniques. He is a master of spit-and-polish presentation and report design.

Meagan Longoria – DataSavvy.me | @MMarie

Meagan has broad expertise but her focus is designing reports for accessibility and accurate data presentation. Her posts and conference presentations are very insightful. She challenges the status quo and shares specific techniques to provide accessible report designs.

Matthew Roche – BIPolar.com | @SQLAllFather

Matthew is a member of the Microsoft Power BI Customer Advisory Team (CAT). He has considerable experience with several Microsoft development initiatives that have culminated in leading products and Azure services used to create enterprise data solutions. He is an industry thought leader who is very active in the online community, sharing his insights, leadership and expertise through his blog and industry conference events, most notably related to modern data governance and Power BI dataflows.

Gil Raviv – DataChant.com | @GilRa

Gil is the author of Collect, Combine & Transform Data in Excel using Power Query and Power BI, an excellent reference book, and he maintains an active blog.

Paul Turley – SqlServerBi.com | @Paul_Turley

I add my own blog to this resource list, where I share my thoughts about best-practice design and development approaches in this series: Doing Power BI The Right Way.

Others

Gilbert “Q” Quevauvilliers – FourMoo.com | @GilbertQue

From his vast experience consulting with IT and business customers, Gilbert shares relevant examples and lessons learned. He is an active blogger and community supporter.

Imke Feldmann – TheBiAccountant.com | @TheBIccountant

Imke shares deep, business-focused, creative problem-solving techniques using Power Query/M in Power BI and Excel.

Ruth Pozuelo Martinez – Curbal.com | @CurbalEN

Ruth puts tremendous energy and creativity into her video presentations, which she posts frequently. She covers everything from intro to moderately advanced Power BI design with complete and thorough examples, all presented in a stunning greenscreen format.

Parker Stevens – BiElite.com | @PowerBIElite

Parker’s blog site is an extension of his “BI Elite” YouTube channel, with numerous short tutorials and excellent “how-to” examples.

James Serra – JamesSerra.com

Thorough and consistent, James shares announcements and guidance about the Microsoft data platform and data warehouse architecture.

Books

Beginner
Intermediate/Advanced