About three years ago when the data model development engineers from the Power BI product team told me they were working on the ability for tabular data models to share other published data models, that sounded amazing and almost too good to be true. They said it would be a lot of work to make this a reality. The Utopian dream of distributed datasets was publicly announced on the Wave 2 Roadmap last year and then the feature went to Preview in December with this announcement: Using DirectQuery for datasets and Azure Analysis Services (preview) – Power BI | Microsoft Docs. We’ve seen the new feature demonstrated and explained several times but now I’m exploring this exciting new feature for myself.
For this test, I have two different data models in separate PBIX files. One published dataset contains Internet Sales information from the ContosoDW Azure SQL database and the other contains Store Sales information. These models contain many of the same tables and a few unique tables. Notably, each data model contains a different fact table. Here are the two published data models in the Contoso Sales workspace:
After making sure that I had upgraded to the December 2020 version of Power BI Desktop, I open the Options dialog and enable the preview feature: DirectQuery for Power BI datasets and Analysis Services.
After restarting Power BI Desktop, I use Get Data and choose Power BI datasets. After connecting to the workspace and selecting the ContosoDW Internet Sales published dataset, the local model is now connected using a live connection.
After connecting to the remote model, all the tables are listed in the Fields list. At this point, I’m using a capability that has been available for a long time: one Power BI report file connected to a separate, published dataset. Previously I would not be able to add additional dataset connections or import from other sources into a live connected data model.
The status bar shows a new message with a link to enable changes to the model. On the Model ribbon, a new button is also displayed. Using either of these options will modify the local model metadata to allow multiple live dataset connections.
When I click the link Make changes to this model, a dialog box allows me to confirm that I want to switch the live connection to a DirectQuery connection and prepare the local model to allow additional DirectQuery connections.
What happens if I start with an existing local model?
The first important lesson I learned was this if I were to create a new PBIX report file containing existing tables and then connect to a published dataset, this does not enable the ability to make changes to the local model.
With an imported table in the data model before using Get Data to select and connect to a published dataset, the status bar shows this message:
I found this to be unintuitive: The text “(click to change)” is not a clickable link. At this point there is no way to update the local model with the necessary changes. I had to start with an empty PBIX to add the first live connection. After that, I could use Get Data to import additional tables from others sources, but not the other way around.
Adding More Dataset Connections
Next, I use Get Data to connect to the ContosoDW Store Sales published dataset. After the local model had been converted, there was nothing particularly unique about the experience. I could just connect to an additional dataset and all of the tables will simply show up in the data model diagram view. The tables from each model are displayed with a different colored heading, in this case they are blue and red.
I had several duplicate table names after connecting the second data model. The second set of tables were renamed and postfixed with the number 2. I renamed the tables that I wanted to keep visible. The redundant tables that I don’t need are displayed on the right side of this diagram layout. Tables in a remote model cannot be deleted from the local model. I used the Properties pane to set the unused tables to be hidden.
If I were connecting to a large remote model that had a lot of tables just for the sake of bringing one table into the local model, it would be cumbersome to hide all of the the unneeded tables – so I hope there will be an easier way to manage this in the future.
I opened Tabular Editor from the External Tools ribbon and was able to use it just as if I were designing a normal import model data model, to rename tables and set properties with multiple objects selected.
Adding relationships between tables from each remote model was pretty much the same process as it would have been in a local model. I can use the Manage Relationships dialog and use the drag-and-drop method in the diagram view. The latter method opens the Manage Relationships dialog rather then simply creating the relationship.
Finally, I added a new Combined Sales Qty measure to one of the remote tables in my local model that references an existing measure from each of the remote models. For demonstration, the table visual shown below includes the Order Date field which is related to each of the two fact tables (Online Sales and Store Sales), the Online Sales Qty measure from the Internet Sales remote model, the Store Sales Qty measure from the Store Sales remote model, and the new Combined Sales Qty measure which I added to the local model:
After publishing the new composite data model and report, the Lineage View shows that the Combined Sales dataset has connections and dependencies on the two remote datasets that show up as Analysis Services DirectQuery connections, apparently using the XMLA endpoint in the Power BI service.
Once I worked through the design process a couple of times, the data model behavior is seems to be exactly the same as always, and designing reports is no different at all. I’m just scratching the surface and I’m anxious to dig deeper and compare notes with others in the community. I have no doubt that behaviors, limits and issues will arise with this brand new and very compelling capability but I’m very encouraged so far.
9 thoughts on “A First Look at Gen 2 Composite Models with Live Power BI Datasets”
nice article. Thanks for providing this. Wanted to check, can the dynamic security be implemented on the power BI data set that has been pulled?
Hi Paul, thank you for the article, very informative on the functionality. I have question reg the performance impact of chaining a large AAS model with a small AAS model. From experience, If the relationship is set in PowerBI Dataset, will there be a performance hit and which side would be impacted, the large dataset or the smaller one?
Rahul, this feature is so new that we have relatively little experience to understand the limits. I expect that any design elements that would normally affect performance will be amplified with remote datasets. Because high cardinality typically has a negative performance impact, I would expect tables with higher row count to be slower; but I don’t think that the size of the overall dataset is as much of a factor. The best way to find out is to test it.
Very interesting thanks for your work. Are you able to use a compound model in embedding? I am trying to figure out how to make it works.
Were you able to get an answer to your question? Had I seen this, my answer would have been “I think so but please test it and see”, since it is still in preview and relies on the new Power BI Gen2 architecture, I’m not sure if there are limitations for embedding. Please let me know what you have found.
Very Good Article! This blog is really informative and useful. Thanks for sharing such informative content. Will help me a lot.
None of my users can view visuals that use multiple Power BI models as Direct Query sources. Have you run into this issue?
Interesting capability. I hope that MS makes it a bit smoother to do what you did however.
Not always that smooth, when it’s getting more complex stacked…