Developing Large Power BI Datasets – Part 2 – Partitioning

Developing Large Power BI Datasets – Part 1

Partitioning Large Tables

Table partitioning has long been an important task in semantic model design using SQL Server Analysis Services. Since SSAS Tabular and Power BI models are built on top of the SSAS architecture, the pattern of partitioning remains the same as it has been for twenty years. However, the specific methods for implementation have been fine-tuned and improved. The reasons for partitioning large fact tables mainly include:

Improve refresh speed
Prevent reloading historical records
Capture updated history
Reduce database resource load

You don’t have to have massive tables to benefit from partitioning. Even tables with a few hundred thousand records can benefit from partitioning, to improve data refresh performance and to detect source data changes. There is little maintenance overhead, so the benefits usually outweigh the cost, in terms of effort and management.

Crossing the Bridge to Enterprise Power BI Development

One thing to keep in mind is that once a dataset has been partitioned, you can no longer download it as a PBIX file from the service. There are actually a number of model properties that have this effect. Any changes you make to a deployed model using the XMLA endpoint, will disable downloading the file. Honestly, if you are managing an enterprise-class Power BI solution, you should be keeping your development files in a source code repository, so this shouldn’t be a restriction.

Why Partition?

Partitions are completely hidden from report and data model users. They only see the entire table and don’t see that it is comprised of multiple partitions. For administrators, partitions enable them to process data for only single time period (e.g. year, month, week or day). Each partition can use a different storage mode. For example, the past five years of sales history can be stored using Import mode and current records through the previous day can use DirectQuery so we always see real-time data for that brief period of time.

The following image is the TOM (Tabular Object Model Explorer) in Tabular Editor, showing a set of table partitions automatically generated from an Incremental Refresh policy in the Power BI service:

Here is an example of the XMLA script used to create a quarterly partition on the Fact Online Sales table for Q1 of 2022:

You might notice that this “XMLA” definition is actually in JSON format. The product team kept the old name from the days when we used XML formatted scripts to define objects but moved to the more modern JSON format back in 2016. It’s actually called TMSL, but it’s part of the XMLA standard. Regardless, this is how you create objects using the XMLA webservice endpoint when connected to a Power BI dataset in a Premium capacity workspace.

Back when we had to create and manage semantic model objects using script, semantic model developers would spend weeks generating and modifying scripts like this, and then automate their partition management, creating, dropping and merging partitions together, using tools like SSIS. Those days are behind us for the most part, thanks to the magic of Incremental Refresh policies. The script in the previous example was generated using Incremental Refresh in the Power BI service. An administrator could use this script as a starting point if they needed to manage the process manually, but this is unnecessary in most implementations today.

Implementing Incremental Refresh

This is a feature of the Power BI service that automatically generates and manages partitions based on a simple policy designed in Power BI Desktop. To enable this feature, follow the steps in the previous post to enable parameterization in Power Query. Create RangeStart & RangeEnd parameters and add a date range filter to fact table.

Create Incremental Refresh policy from the ellipsis menu for a fact table in Power BI Desktop. If you have setup the parameters and date range filter correctly, you will see the following dialog:

The settings show here will generate partitions for five years of data. The last two months prior to the refresh date will be stored in monthly partitions and the prior records will be stored in yearly partitions. Additionally, the policy will check all the partition queries to see if any records have been changed since the last refresh. If so, it will process the partition(s) containing those records.

At the 2022 PASS Data Community Summit in Seattle last week, Shabnam Watson covered this topic in great detail in this presentation: A look at automatic partition maintenance by Power BI Service for datasets with Incremental Refresh – Shabnam Watson’s Blog (wpcomstaging.com)

One of the things I learned in her session was that it takes two refresh cycles for partition merging to take place. This would mean that at the end of the month, the last monthly partition wouldn’t be merged into the first yearly partition until the next refresh cycle. The service manages to keep the data consistent throughout the process, but it takes time for the service to clean-up after itself internally. In other words: don’t worry about this unless you are orchestrating unusually complex automation over the partitions outside of using scheduled refresh.

Incremental Refresh works with Pro and Premium licensing, but datasets cannot exceed 1 GB of compressed data unless the workspace is configured with Premium capacity or a Premium Per User license.

Developing Large Power BI Datasets – Part 3 – Detail tables

Power BI Copilot, Data Agent Optimization & Performance

Amid the AI frenzy, there is a lot of conversation about how business users will use agentic chat to answer business questions rather than interactive, dashboard-style reports. Is there truly a shift in the industry, and is agentic analytics going to change the way most business users consume data?

Just how viable is the whole “chat with your data” option, and is it really a replacement for conventional reporting? I recently heard a VP-level leader at a large consulting firm say something to the effect of “we need to stop investing in dashboard-building skills and focus on creating AI-driven data analysis solutions for our consulting customers.” I’m paraphrasing from memory, but that was the sentiment. Are all business leaders across the industry giving up their dashboards, interactive visual reports and scorecards in exchange for AI chat? No. Of course they aren’t — but conversational analysis is a new way to consume business data.

FabCon & SQLCon: What I expect to learn in Atlanta this year

What do I want to learn at FabCon this year? I have high expectations for the FabCon conference during the week of March 16-20, 2026. With over 30,000 organizations actively using Fabric in production, I expect to hear much more about what we’ve learned, how to plan, optimize and monitor Fabric solutions rather than marketing rhetoric about new and upcoming features. There will be announcements and I’m interested in the long-term product vision, but I am anxious to learn more from others who are in the trenches and using Fabric day-to-day.

Visit Me in the Community Lounge

I will be in the community lounge expert’s area from 9 to 11 on Thursday answering questions about Fabric certifications. I have a new O’Reilly book on the PL-300 exam and recently delivered Fabric certification prep sessions for the DP-600 and DP-700 exams.

Power BI Fabric Patterns: Designing Semantic Models for Humans & Robots (AI)

Back in September 2025, about four months ago, I made this earlier post, as we were just gaining early experience…

3 thoughts on “Developing Large Power BI Datasets – Part 2 – Partitioning”

Pingback: Partitioning Data in Power BI – Curated SQL

Great post thanks. Is there any planned future developments which will make incremental refresh more effective for data that changes over a long period of time? Or is there any way to set our policy up using tabular editor to overcome the scenario below?

In our case we have data updating from 2 years ago, for example, meaning that our incremental refresh window needs to extend this far back.

In addition, we have data updating in almost every partition (not many rows, only a handful) meaning that an incremental refresh takes basically as long as a regular refresh, because every partition detects changes, even though it’s a small number of rows that have changed in each partition. For a scenario like this, it would make more sense to use a PK to look against source, check for any updates on the last_modified column and only update the rows that have changed.

Pingback: Developing Large Power BI Datasets – Part 1 – Paul Turley's SQL Server BI Blog

Pingback: Partitioning Data in Power BI – Curated SQL
Oliver MIlls says:

November 24, 2022 at 9:07 am

Great post thanks. Is there any planned future developments which will make incremental refresh more effective for data that changes over a long period of time? Or is there any way to set our policy up using tabular editor to overcome the scenario below?

In our case we have data updating from 2 years ago, for example, meaning that our incremental refresh window needs to extend this far back.

In addition, we have data updating in almost every partition (not many rows, only a handful) meaning that an incremental refresh takes basically as long as a regular refresh, because every partition detects changes, even though it’s a small number of rows that have changed in each partition. For a scenario like this, it would make more sense to use a PK to look against source, check for any updates on the last_modified column and only update the rows that have changed.

Loading...

Pingback: Developing Large Power BI Datasets – Part 1 – Paul Turley's SQL Server BI Blog

Developing Large Power BI Datasets – Part 2 – Partitioning

Partitioning Large Tables

Crossing the Bridge to Enterprise Power BI Development

Why Partition?

Implementing Incremental Refresh

Like this:

Paul Turley

3 thoughts on “Developing Large Power BI Datasets – Part 2 – Partitioning”

Leave a ReplyCancel reply

Partitioning Large Tables

Crossing the Bridge to Enterprise Power BI Development

Why Partition?

Implementing Incremental Refresh

Share this:

Like this:

Paul Turley

Related Posts

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

3 thoughts on “Developing Large Power BI Datasets – Part 2 – Partitioning”

Leave a ReplyCancel reply

Discover more from Paul Turley's SQL Server BI Blog