About three years ago when the data model development engineers from the Power BI product team told me they were working on the ability for tabular data models to share other published data models, that sounded amazing and almost too good to be true. They said it would be a lot of work to make this a reality. The Utopian dream of distributed datasets was publicly announced on the Wave 2 Roadmap last year and then the feature went to Preview in December with this announcement: Using DirectQuery for datasets and Azure Analysis Services (preview) – Power BI | Microsoft Docs. We’ve seen the new feature demonstrated and explained several times but now I’m exploring this exciting new feature for myself.
For this test, I have two different data models in separate PBIX files. One published dataset contains Internet Sales information from the ContosoDW Azure SQL database and the other contains Store Sales information. These models contain many of the same tables and a few unique tables. Notably, each data model contains a different fact table. Here are the two published data models in the Contoso Sales workspace:
After making sure that I had upgraded to the December 2020 version of Power BI Desktop, I open the Options dialog and enable the preview feature: DirectQuery for Power BI datasets and Analysis Services.
After restarting Power BI Desktop, I use Get Data and choose Power BI datasets. After connecting to the workspace and selecting the ContosoDW Internet Sales published dataset, the local model is now connected using a live connection.
After connecting to the remote model, all the tables are listed in the Fields list. At this point, I’m using a capability that has been available for a long time: one Power BI report file connected to a separate, published dataset. Previously I would not be able to add additional dataset connections or import from other sources into a live connected data model.
The status bar shows a new message with a link to enable changes to the model. On the Model ribbon, a new button is also displayed. Using either of these options will modify the local model metadata to allow multiple live dataset connections.
When I click the link Make changes to this model, a dialog box allows me to confirm that I want to switch the live connection to a DirectQuery connection and prepare the local model to allow additional DirectQuery connections.
What happens if I start with an existing local model?
The first important lesson I learned was this if I were to create a new PBIX report file containing existing tables and then connect to a published dataset, this does not enable the ability to make changes to the local model.
With an imported table in the data model before using Get Data to select and connect to a published dataset, the status bar shows this message:
I found this to be unintuitive: The text “(click to change)” is not a clickable link. At this point there is no way to update the local model with the necessary changes. I had to start with an empty PBIX to add the first live connection. After that, I could use Get Data to import additional tables from others sources, but not the other way around.
Adding More Dataset Connections
Next, I use Get Data to connect to the ContosoDW Store Sales published dataset. After the local model had been converted, there was nothing particularly unique about the experience. I could just connect to an additional dataset and all of the tables will simply show up in the data model diagram view. The tables from each model are displayed with a different colored heading, in this case they are blue and red.
I had several duplicate table names after connecting the second data model. The second set of tables were renamed and postfixed with the number 2. I renamed the tables that I wanted to keep visible. The redundant tables that I don’t need are displayed on the right side of this diagram layout. Tables in a remote model cannot be deleted from the local model. I used the Properties pane to set the unused tables to be hidden.
If I were connecting to a large remote model that had a lot of tables just for the sake of bringing one table into the local model, it would be cumbersome to hide all of the the unneeded tables – so I hope there will be an easier way to manage this in the future.
I opened Tabular Editor from the External Tools ribbon and was able to use it just as if I were designing a normal import model data model, to rename tables and set properties with multiple objects selected.
Adding relationships between tables from each remote model was pretty much the same process as it would have been in a local model. I can use the Manage Relationships dialog and use the drag-and-drop method in the diagram view. The latter method opens the Manage Relationships dialog rather then simply creating the relationship.
Finally, I added a new Combined Sales Qty measure to one of the remote tables in my local model that references an existing measure from each of the remote models. For demonstration, the table visual shown below includes the Order Date field which is related to each of the two fact tables (Online Sales and Store Sales), the Online Sales Qty measure from the Internet Sales remote model, the Store Sales Qty measure from the Store Sales remote model, and the new Combined Sales Qty measure which I added to the local model:
After publishing the new composite data model and report, the Lineage View shows that the Combined Sales dataset has connections and dependencies on the two remote datasets that show up as Analysis Services DirectQuery connections, apparently using the XMLA endpoint in the Power BI service.
Once I worked through the design process a couple of times, the data model behavior is seems to be exactly the same as always, and designing reports is no different at all. I’m just scratching the surface and I’m anxious to dig deeper and compare notes with others in the community. I have no doubt that behaviors, limits and issues will arise with this brand new and very compelling capability but I’m very encouraged so far.