At any given time, our consulting practice is managing dozens of Power BI projects for customers. These range in scale from small, desktop report development to large-volume, shared data models with hundreds of reports owned by IT and business teams. This provides a unique perspective into the sort of things that cause projects to go well – or to not go well. Aside from using common design patterns like dimensional modeling, three practices are near the top of the list of successful project characteristics: team file sharing, version control and deployment management. With increased frequency, customers are asking if we can implement DevOps and CI/CD (Continuous Integration/Continuous Delivery) for their Power BI projects. But what, exactly, does this mean?
If you are a DevOps purist, you might argue that my recommendations for the entry-level solution aren’t really true DevOps and CI/CD. To that I will respond that they are the essence of DevOps and have the same goals at the most fundamental level. And, if you need your projects to evolve to support more advanced build and deployment standards, following these practices will get you started. Most importantly, your projects will be more successful and risk-resistant. Your Developers and Project Managers should sleep better at night.
DevOps isn’t difficult to implement for small and medium-scale projects, and simple things like managing version control in a code repository can save hours of lost time. Organizations that are accustomed to managing large application development initiatives might expect to have a fully automated build and deployment process in concert with an Agile delivery process, managed with specialized tools like Jira, GitHub and Azure DevOps.
By most estimates, far more than 80% of all Power BI projects are small and performed by one Data Analyst or Developer. We know that Power BI is also used to develop high-volume datasets, models and business reports in full-scale deployment scenarios where DevOps principles are taken very seriously. So, if only a significant minority of Power BI projects are large enough for someone to even think about fundamental concepts like version control or team development, what exactly does DevOps for Power BI mean, given that one size doesn’t fit every project?
Before you go plopping your PBIX files into GitHub and trying to split and merge features into a branch and then build and deploy your Power BI objects (which won’t go well, I assure you – speaking on behalf of all of us who have tried), consider the following important questions:
- Why are BI projects and AppDev projects so different?
- For version control, can’t we just download the latest PBIX file that someone previously deployed to the Power BI service?
- PBIX files are big. How can I get files to fit in the code repository?
- Will Power BI deployment pipelines integrate with automated DevOps?
Rather than expounding on the nebulous spectrum of every possible BI project type, consider that there are two major categories of projects:
If your project scale fits into the left “Small/Mid-scale Class” category, the good news is that your needs are probably light, so you can develop and manage projects by following a few simple guidelines outlined in this article. If your project fits into the “Enterprise Class” category, that’s also good news, but you need to understand how BI projects are different from AppDev and database projects, and then use the right tools and practices to enable true CI/CD for Power BI.
Let’s break it down using this Power BI Project DevOps Maturity pyramid…
At the bottom of the pyramid, you can see that every serious Power BI project should have the project files managed in shared storage. This is DevOps 101: Use a source code repository so files can be recovered and shared with other project team members. It doesn’t have to be sophisticated – again, depending on your needs.
If you are using SharePoint, Teams, OneDrive for Business or GitHub, you automatically get versioning that works with very little effort.
The further you move up the maturity pyramid, the further you progress from simple team development to DevOps in its purest form, and CI/CD. I’m covering this in four parts, with the first one (the two items at the bottom of the pyramid) covered in this post and the following 13-minute video:
Later posts and subsequent videos will cover parts 2, 3 and 4, moving up the maturity pyramid.
Let’s start with why Power BI Desktop files don’t naturally work with code management tools.
Back in the day, the very same objects (for the most part) we create today with Power BI Desktop were defined as individual XML files that were all developed as a Visual Studio solution. Keep in mind that 10-20 years ago, BI solutions were all pretty much created by IT developers, not by the data-savvy business users who build many of them today. All of the object definitions eventually got wrapped up into a big-ole gob of XML and JSON stored in a large, single file called Model.BIM. The reports were stored separately as XML files. Managing the whole thing was “delicate” and complicated, to say the least. To make all of this easier for non-developers, our friends at Microsoft took the modern version of all these little files and zipped them up into a single file with a PBIX extension, which is the default file created by Power BI Desktop. If you were to extract a PBIX file, you would see folders full of JSON, XML and binary files.
“Easy”, says the uber geek from IT, “we’ll just unzip them into files. Let developers have their way, and then zip them all back up when we’re done.” Many have tried, and a couple of people succeeded for a short while until the file format changed or some dependency didn’t match up. It is still a delicate matter to put Humpty Dumpty back together again, over and over.
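If you are curious what that uber geek would be dealing with, a read-only peek is harmless. The following Python sketch assumes only what is described above, that a PBIX file is a zip archive. The file name SalesReport.pbix is hypothetical, and the script lists the parts without extracting or changing anything.

```python
import zipfile
from pathlib import Path


def list_pbix_parts(pbix_path):
    """Return (name, size_in_bytes) for every entry inside a PBIX archive."""
    # The archive is opened read-only, so the PBIX file is never modified.
    with zipfile.ZipFile(pbix_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


if __name__ == "__main__":
    # Hypothetical file name; point this at one of your own PBIX files.
    path = Path("SalesReport.pbix")
    if path.exists():
        for name, size in list_pbix_parts(path):
            print(f"{size:>12,}  {name}")
    else:
        print(f"{path} not found")
```

The entry names you see will depend on the version of Power BI Desktop that saved the file, which is exactly why unzipping and re-zipping by hand tends to break.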
There are a handful of Power BI extension tools specifically designed to reliably distill a PBIX file into parts and to perform differencing and merging on multiple data model versions while coping with the nuances of the complex file structures that standard AppDev differencing tools aren’t equipped to handle. Under some conditions, you can continue to manage development in the streamlined PBIX file structure, compatible with Power BI Desktop. But once you cross a certain line, you must continue to use more sophisticated development tools and you can’t go back to using Power BI Desktop. That is not the end of the world in big IT-scale projects, but it is an important reality to consider.
Getting back to small and mid-scale projects… the bottom line is that simple team development, versioning and lightweight DevOps all work quite well with Power BI Desktop files. But, if you choose to swim to the deep end of the pool, you might need to use file formats that no longer support Desktop and require additional software tools. We’ll explore using those tools in a later follow-up post after we cover the basics in this one.
The first and greatest challenge to overcome is finding a way to keep the PBIX files small on the developer’s desktop. We also need an easy way to control the data volume for testing and post-deployment data refresh. This can easily be done using query parameters, which I cover in this post: Developing Large Power BI Datasets – Part 1 – Paul Turley’s SQL Server BI Blog.
With that bump out of the way, you can move on to managing your development files in shared file storage of some kind. Fortunately, SharePoint, Teams and OneDrive for Business provide an elegant solution. After creating a shared folder online, just sync it to your local computer and then make sure you save and close your PBIX files after adding critical features – and at the end of every business day. In the video, I demonstrate using Teams and SharePoint Online to sync PBIX files with your computer’s file storage and sync up with cloud storage after you make changes.
If you need to take shared storage up a notch in an IT setting, GitHub is your friend. Like any other project, create a code repository shared with other team members and clone it to your local drive. Perform a commit and then either “push” or make a “pull request” after making changes that you can’t afford to lose. This creates a new minor version that can be merged into the main code branch.
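As a rough illustration of that commit-and-push routine, here is a minimal Python sketch that wraps the standard git add, commit and push commands. The helper names, the remote name origin and the default branch name main are assumptions for the example, not anything prescribed by Power BI or GitHub. By default it runs in dry-run mode and only returns the commands it would execute.

```python
import subprocess


def build_git_commands(pbix_file, message, branch="main"):
    """Return the git commands (as argument lists) that save one PBIX change."""
    return [
        ["git", "add", "--", pbix_file],
        ["git", "commit", "-m", message],
        # Assumes the remote is named "origin"; adjust for your repository.
        ["git", "push", "origin", branch],
    ]


def commit_and_push(repo_dir, pbix_file, message, branch="main", dry_run=True):
    """Run the add/commit/push sequence, or just return it when dry_run is True."""
    commands = build_git_commands(pbix_file, message, branch)
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError at the first failing step.
            subprocess.run(command, cwd=repo_dir, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical file, message and branch names, shown as a dry run.
    for cmd in commit_and_push(".", "SalesReport.pbix", "Add margin measure",
                               branch="feature/margin"):
        print(" ".join(cmd))
```

Pushing to a feature branch rather than main leaves room for the pull request step. Keep in mind that Git treats a PBIX file as a single binary file, so each commit records a whole new copy rather than a line-by-line difference.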