Are you in the California Bay area? Do you want to learn to do Power BI and Fabric the right…
The enterprise Business Intelligence and IT community have been anxiously awaiting this capability for many years and it has finally arrived – at least in Preview form for testing and prototyping real Power BI CI/CD DevOps solutions.
Microsoft Fabric (#MicrosoftFabric), the new Unified Analytics platform, was announced this morning, literally moments before the Build Conference. Satya made the official announcement in the keynote address. Why is this big news and why is it important? In a few words, this is the most significant expansion to the Power BI and the analytics platform since the product was introduced. Satya said that Fabric is the most significant data platform innovation since SQL Server.
The Lakehouse is the evolution of the earlier cloud data platform in many pieces that came with “some assembly required”. All of the components are modern, mature and capable but complicated and require specialized skills. Imagine that the new version is easier to assemble with instructions that are only a few pages with stick figures, and it comes with an Allen wrench.
We’re seeing consulting customers putting Lakehouse and BI solutions on-line in just a few weeks. Then they iterate to scale-up their modern data warehouse/BI platform as they train their Center of Excellence champions and Data Governance organizations as they progress.
In a recent blog post, Paul discussed that DevOps, and specifically CI/CD, principles are essential for modern software development, and Power BI is no exception. Power BI CI/CD for enterprise class projects can be achieved using Azure DevOps, with steps including source control, automated testing, and build and release pipelines.
Business users often need to know how fresh the data is that they see on a Power BI report. This requirement can be interpreted to mean either “when was the last date & time that the source data was updated?” or “when was the last date & time that the data model was refreshed?”
I’ve used a few different approaches to record the actual date and time, but all generally using the same technique.
There is no secret about this. If you do any legitimate research about Power BI (reading blogs, books or training from reliable sources), you will quickly learn that a lot of basic functionality requires a dimensional model, aka “Star Schema”. This is a hard fact that every expert promotes, and self-taught data analysts either have learned or will learn through experience. So, if everyone agrees on this point, why do so many resist this advice?
Perspective is everything. I didn’t understand why getting to the star schema was so out of reach so often until I was able to see it from another perspective. There are a few common scenarios that draw source data into different directions than an ideal dimensional model.
DevOps isn’t difficult to implement for small and medium-scale projects, and simple things like managing version control in a code repository can save hours of lost time. Organization who are accustomed to managing large application development initiatives might expect to have a fully automated build and deployment process in concert with an Agile delivery process, managed with specialized tools like Jira, GitHub and Azure DevOps.
Power BI is architected to consume data in a dimensional model, with narrow fact tables and related dimensions. Introducing a big, wide table in a tabular model is extremely inefficient. It takes up space and memory resources, impacts performance, and complicates measure coding. Flattening records into a flat table is one of the worst things you can do in Power BI and a common mistake made by novice Power BI users.
Table partitioning has long been an important task in semantic model design using SQL Server Analysis Services. Since SSAS Tabular and Power BI models are built on top of the SSAS architecture, the pattern of partitioning remains the same as it has been for twenty years. However, the specific methods for implementation have been fine-tuned and improved. The reasons for partitioning large fact tables mainly include:
Improve refresh speed,
Prevent reloading historical records,
Capture updated history,
Reduce database resource load
You don’t have to have massive tables to benefit from partitioning. Even tables with a few hundred thousand records can benefit from partitioning, to improve data refresh performance and to detect source data changes. There is little maintenance overhead, so the benefits usually outweigh the cost, in terms of effort and management.
When planning a Power BI solution, how can we plan for scale and growth? Like any technology, Power BI has limits but Power BI can manage a surprising large value of analytic data. We also have tools and companion technologies capable of handling workloads of data that alongside Power BI.
The Right Tool for the Job
You wouldn’t use a screwdriver to hammer a nail and you wouldn’t use a chainsaw to assemble an Ikea cabinet; so why would you use Power BI Desktop to create a detailed shipping manifest? It’s not the right tool for the job. The Power BI platform contains tools that are appropriate to use for different reporting scenarios and using them together will yield far better results than forcing tools to behave differently than they were intended.
Colleagues: if I have sent you to this post, it is because I respect you and want to make sure…
Even for small, informal BI projects, shaping the data into a dimensional model alleviates complexity, speeds up slow calculations and reduces the data model storage size. I conclude this post by reviewing seven data architectures and the data shaping methods with different degrees of scale.
If you are in the Chicago area and haven’t already registered for the Data Insights Summit, please join me. I…
A Gantt chart is an excellent example of where Paginated Report & SSRS were an ideal choice for the purpose. It is a running list of activities with the duration for each displayed as a horizontal bar depicting the beginning and ending day along a horizontal scale. The challenge is that this is not a standard chart type in either Power BI or SSRS/Paginated Reports. Furthermore, project planners may prefer to see activities as rows in the format of a printed page.