This week’s Power BIte is the second in a series of videos that present different ways to create new Power BI dataflows, and the results of each approach.
When creating a dataflow by defining new entities, the final dataflow will have the following characteristics:
|Data ingress path
||Ingress via the mashup engine hosted in the Power BI service, using source data that is also managed by the Power BI service, taking advantage of locality of data.
||Data stored in the CDM folder defined for the dataflow for computed entities. Data for linked entities remains in source dataflow and is not moved or copied.
||The dataflow is refreshed based on the schedule and policies defined in the workspace.
Let’s look at the dataflow’s model.json metadata to see some of the details.
At the top of the file we can see the mashup definition, including the query names and load settings on lines 11 through 35 and the Power Query code for all of the entities on line 37. This will look awfully familiar from the last Power BIte post.
Things start to get interesting and different when we look at the entity definitions:
On line 80 we can see that the Product entity is defined as a ReferenceEntity, which is how the CDM metadata format describes what Power BI calles linked entities. Rather than having its attribute metadata defined in the current dataflow’s model.json file, it instead identifies the source entity it references, and the CDM folder in which the source entity is stored, similar to what we saw in the last example. Each modelId value for a linked entity references the id value in the referenceModels section as we’ll see below.
The Customers with Addresses entity, defined starting on line 93, is the calculated entity built in the video demo. This entity is a LocalEntity, meaning that its data is stored in the current CDM folder, and its metadata includes both the location, and its full list of attributes.
The end of the model.json file highlights the rest of the differences between local and linked entities.
At line 184 we can see the partitions for the Customers with Addresses entity, including the URL for the data file backing this entity. Because the other entities are linked entities, their partitions are not defined in the current model.json.
Instead, the CDM folders where their data does reside are identified in the referenceModels section starting at line 193. The id values in this section match the modelId values for the model.json file, above, and the location values are the URLs to the model.json files that define the source CDM folders for the linked entities.
If this information doesn’t make sense yet, please hold on. We’ll have different values for the same attributes for other dataflow creation methods, and then we can compare and contrast them.
I guarantee it will make as much sense as anything on this blog.
 New videos every Monday morning!
 Or your money back.