Power BIte: Creating dataflows with linked and computed entities

Important: This post was written and published in 2019, and the content below may no longer represent the current capabilities of Power BI. Please consider this post to be an historical record and not a technical resource. All content on this site is the personal output of the author and not an official resource from Microsoft.

This week’s Power BIte is the second in a series of videos[1] that present different ways to create new Power BI dataflows, and the results of each approach.

When creating a dataflow by defining new entities, the final dataflow will have the following characteristics:

Attribute Value
Data ingress path Ingress via the mashup engine hosted in the Power BI service, using source data that is also managed by the Power BI service, taking advantage of locality of data.
Data location Data stored in the CDM folder defined for the dataflow for computed entities. Data for linked entities remains in source dataflow and is not moved or copied.
Data refresh The dataflow is refreshed based on the schedule and policies defined in the workspace.

Let’s look at the dataflow’s model.json metadata to see some of the details.

2019-11-04-07-00-30-025--Code

At the top of the file we can see the mashup definition, including the query names and load settings on lines 11 through 35 and the Power Query code for all of the entities on line 37. This will look awfully familiar from the last Power BIte post.

Things start to get interesting and different when we look at the entity definitions:

2019-11-04-07-04-23-519--Code

On line 80 we can see that the Product entity is defined as a ReferenceEntity, which is how the CDM metadata format describes what Power BI calles linked entities. Rather than having its attribute metadata defined in the current dataflow’s model.json file, it instead identifies the source entity it references, and the CDM folder in which the source entity is stored, similar to what we saw in the last example. Each modelId value for a linked entity references the id value in the referenceModels section as we’ll see below.

The Customers with Addresses entity, defined starting on line 93, is the calculated entity built in the video demo. This entity is a LocalEntity, meaning that its data is stored in the current CDM folder, and its metadata includes both the location, and its full list of attributes.

The end of the model.json file highlights the rest of the differences between local and linked entities.

2019-11-04-07-16-41-335--Code

At line 184 we can see the partitions for the Customers with Addresses entity, including the URL for the data file backing this entity. Because the other entities are linked entities, their partitions are not defined in the current model.json.

Instead, the CDM folders where their data does reside are identified in the referenceModels section starting at line 193. The id values in this section match the modelId values for the model.json file, above, and the location values are the URLs to the model.json files that define the source CDM folders for the linked entities.

If this information doesn’t make sense yet, please hold on. We’ll have different values for the same attributes for other dataflow creation methods, and then we can compare and contrast them.

I guarantee[2] it will make as much sense as anything on this blog.


[1] New videos every Monday morning!

[2] Or your money back.

4 thoughts on “Power BIte: Creating dataflows with linked and computed entities

  1. Pingback: Power BIte: Creating dataflows by importing model.json – BI Polar

  2. Mark

    Love the site and content on dataflows is THE BEST. Videos are a great addition.

    For the enhanced compute engine to be used, must you create a computed entity from linked entities? Or does any computed entity leverage the enhanced compute engine (i.e. a computed entity that is referencing entities loaded to same dataflow)?

    If the latter, what are the considerations for loading entities in different dataflows vs. all in one dataflow (other than refresh schedule or general organization of entities)?

    Like

  3. Pingback: Dataflows enhanced compute engine – will it fold? – BI Polar

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s