Important: This post was written and published in 2020, and the content below may no longer represent the current capabilities of Power BI. Please consider this post to be more of an historical record and less of a technical resource. All content on this site is the personal output of the author and not an official resource from Microsoft.
tl;dr: If you want to refresh a dataflow without refreshing any downstream dataflows that depend on it, just clear the “Enable Load” setting from any linked entities that reference it. This will remove the dependencies that the Power BI service looks for, without breaking any of the downstream dataflow logic. Win!
At the end of a long day I got an email asking for help[1]. The email included a screen shot from the lineage view of a Power BI workspace, some context about working to troubleshoot a problem, and the question “We want to refresh this dataflow, and not have it refresh the downstream dataflows. Is this possible?”
I almost said no, and then I remembered this post and realized the answer was “yes, sort of.”
Composable ETL with linked and computed entities ensures that when an upstream dataflow is refreshed, any downstream dataflow will automatically refresh as well, all without any explicit configuration required. The dataflows engine in the Power BI service looks at the dependencies and everything just works.[2]
This is great, until you need it to work differently, and the “no explicit configuration” also means “no real way to turn it off.” So long as there are linked entities referencing entities in the current dataflow, refreshing the current dataflow will cause the dataflows containing those linked entities to refresh as well.
Fortunately, that 2018-era blog post illustrates that clearing the “Enable load” setting for a linked entity also clears the metadata that the dataflows engine uses to build the dependency graph for composable ETL.
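To make the idea concrete, here is a toy model of that dependency graph in Python. It is only a sketch of the behavior described above, not how the Power BI service is actually implemented: the class names, the `enable_load` field, and the `refresh_chain` function are all hypothetical, and the only assumption baked in is that a linked entity counts as a dependency while its load is enabled.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class LinkedEntity:
    """Hypothetical stand-in for a linked entity and its "Enable load" flag."""
    source_dataflow: str
    enable_load: bool = True


@dataclass
class Dataflow:
    """Hypothetical stand-in for a dataflow and the linked entities it contains."""
    name: str
    linked_entities: list[LinkedEntity] = field(default_factory=list)


def refresh_chain(start: str, dataflows: dict[str, Dataflow]) -> list[str]:
    """Return the dataflows that refresh, in order, when `start` is refreshed.

    Assumption: only linked entities with enable_load=True create a dependency.
    """
    refreshed = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for df in dataflows.values():
            if df.name in refreshed:
                continue
            depends_on_current = any(
                entity.source_dataflow == current and entity.enable_load
                for entity in df.linked_entities
            )
            if depends_on_current:
                refreshed.append(df.name)
                queue.append(df.name)
    return refreshed


# Three chained dataflows: First <- Second <- Third.
dataflows = {
    "First": Dataflow("First"),
    "Second": Dataflow("Second", [LinkedEntity("First")]),
    "Third": Dataflow("Third", [LinkedEntity("Second")]),
}

print(refresh_chain("First", dataflows))

# Clear "Enable load" on the linked entity in the second dataflow.
dataflows["Second"].linked_entities[0].enable_load = False
print(refresh_chain("First", dataflows))
```

In this model the first call walks the whole chain, and the second call stops at the first dataflow because the only edge pointing at it has been switched off. Setting `enable_load` back to `True` restores the original chain.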
So I sent off a quick reply, “Try this – I haven’t tested it end to end, but it should work,” and gave it a quick test because at this point I was curious.
This was my starting point: three dataflows, all related.
When I refresh the first dataflow in the chain, the next two refresh as expected.
To break that refresh chain I can just edit the second dataflow.
The only change I need to make is this one: clearing that “Enable load” check box.
Once this is done, the lineage view looks a little more like this, and I can refresh the upstream dataflow in isolation.
Once the troubleshooting is done, all I need to do to get back to work is to re-enable load on those temporarily disabled linked entities. Boom!
I doubt this technique will get used all that often, but it looks like it worked today. As I was finishing up this post I got an email confirmation that this solved the problem it needed to solve, and the person who had emailed asking for help is now unblocked.
Life is good.
Look, he’s emailing again, probably offering to buy me a coffee or a nice new sword to say thank you. Oh, he wants to show me other problems he’s run into, not offering to buy me anything at all.
Oh.
[1] Typically I respond to emails asking for help with a perfunctory redirection to another location, with a statement about how my schedule and responsibilities don’t allow me to scale by replying to one-off email support requests. The person sending this particular mail got a pass because they have consistently gone out of their way to help the Power BI community, and because they asked an interesting question with all of the information I needed to respond. It also helped that the email came in at the end of the day when my brain was too burned to start the next tasks on my list. I would definitely not recommend emailing me to ask for help. Definitely not.
[2] Because of cloud magic, I think. Maybe metadata too, but isn’t metadata a kind of magic as well?
“Look, he’s emailing again, probably offering to buy me a coffee or a nice new sword to say thank you. Oh, he wants to show me other problems he’s run into, not offering to buy me anything at all.
Oh.”
Well, that is diplomatically rude of him. I’d buy something for you for all the trouble that I’ve had.
There was nothing but love in the whole exchange. I’m sure he’ll offer to buy me a coffee (but probably not a sword) when we’re able to get together in person. I was just trying to end the post on a humorous note. 😉
Hi Matthew, I’m trying to find out if unchecking “enable load” on a Linked Entity will also prevent the Enhanced Compute Engine from being used? I do want to leverage it, I have two Linked Entities in my dataflow, but I don’t want this refresh chaining behavior from the upstream dataflow in the same Workspace (Workspace is in Premium Capacity). Thank you.
I’m also facing the same issue. I have 10 1st-layer dataflows that get their data directly from the source, and they are appended in a 2nd-layer dataflow. My 1st-layer dataflows are scheduled at different times. Every time one of them refreshes, it automatically triggers a refresh of the 2nd layer, so it takes a very long time to finish refreshing every 1st-layer dataflow. I also want to leverage the ECE without the automatic refresh chaining. Looking forward to the solution. Thank you!
I’d recommend also sharing this request at ideas.powerbi.com.