Power BIte: Creating dataflows with Power Query Online

This week’s Power BIte is the first in a series of videos[1] that present different ways to create new Power BI dataflows, and the results of each approach.

When creating a dataflow by defining new entities in Power Query Online, the final dataflow will have the following characteristics:

Attribute | Value
Data ingress path | Ingress via the mashup engine hosted in the Power BI service
Data location | Data stored in the CDM folder defined for the newly created dataflow
Data refresh | The dataflow is refreshed based on the schedule and policies defined in the workspace

Let’s look at the dataflow’s model.json metadata to see some of the details.

[Screenshot: the top of the dataflow’s model.json file]

At the top of the file we can see the mashup definition, including the query names and load settings on lines 11 through 19 and the Power Query code for all of the entities on line 22.

[Screenshot: the bottom of the dataflow’s model.json file]

At the bottom of the file we can see information about the refresh and storage.[2] Line 26 identifies the entity as a LocalEntity, which means that the entity’s data is physically stored in the current CDM folder.

Line 30 shows that the entity is fully refreshed rather than incrementally refreshed, and line 31 shows the file name where the entity data is stored. Lines 97 through 99 identify the single partition where the data for the current version of the entity is stored, including the full URI for the data file. If this entity used incremental refresh, there would be multiple partitions to match the incremental refresh policy.
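If you'd rather pull these attributes out of a model.json file with code than read them off a screenshot, here is a minimal sketch. It is built on assumptions: the sample JSON is a hand-written, heavily abbreviated stand-in rather than the file pictured above, and the property names ("$type", "pbi:refreshPolicy", "location", "partitions") are my reading of the CDM folder format, so check them against your own file. The entity name, partition name, and URI are hypothetical.

```python
import json

# Hand-written, abbreviated stand-in for a dataflow's model.json.
# Property names are assumptions; the entity name and URI are hypothetical.
SAMPLE_MODEL_JSON = """
{
  "name": "Sample dataflow",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "ExampleEntity",
      "pbi:refreshPolicy": {
        "$type": "FullRefreshPolicy",
        "location": "ExampleEntity.csv"
      },
      "partitions": [
        {
          "name": "Part001",
          "location": "https://example.invalid/cdm/ExampleEntity.csv"
        }
      ]
    }
  ]
}
"""


def summarize_entities(model):
    """Return one summary dict per entity: type, refresh policy, storage."""
    summaries = []
    for entity in model.get("entities", []):
        policy = entity.get("pbi:refreshPolicy", {})
        partitions = entity.get("partitions", [])
        summaries.append({
            "name": entity.get("name"),
            # "LocalEntity" means the data is stored in this CDM folder.
            "entity_type": entity.get("$type"),
            # Full refresh versus incremental refresh.
            "refresh_policy": policy.get("$type"),
            # File name where the entity data is stored.
            "data_file": policy.get("location"),
            # One URI per partition; incremental refresh would yield several.
            "partition_locations": [p.get("location") for p in partitions],
        })
    return summaries


model = json.loads(SAMPLE_MODEL_JSON)
for summary in summarize_entities(model):
    print(
        summary["name"],
        summary["entity_type"],
        summary["refresh_policy"],
        len(summary["partition_locations"]),
    )
```

To try it on a real dataflow, replace the sample string with the contents of your own model.json. Running it against dataflows created in different ways should make the differences in entity type and partitions easy to spot.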

If this information doesn’t all make sense just yet, please hold on. We’ll have different values for the same attributes for other dataflow creation methods, and then we can compare and contrast them.

I guarantee[3] it will make as much sense as anything on this blog.


[1] New videos every Monday morning!

[2] The same information is also included starting on line 103 for the Promotions entity, but is not pictured here.

[3] Or your money back.

It all comes down to culture

I talk about data culture a lot, and in my presentations I often emphasize how the most important success factor when adopting a tool like Power BI[1] is the culture of the organization, not the tool itself.

I talk about this a lot, but I think Caitie McCaffrey may have just had the final word.[2]

[Screenshot: Caitie McCaffrey’s comment]

I don’t think that Caitie was talking about the enterprise adoption of self-service business intelligence, but she could have been.

In my day job I get to talk to leaders from large companies around the world, and to see how they’re adopting and using Power BI and Azure. Before today I didn’t think of Moby Dick – I thought of Leo Tolstoy’s classic Anna Karenina, which starts with this famous line:

All happy families are alike; each unhappy family is unhappy in its own way.

Although the details vary, large companies that have successfully adopted managed self-service BI at scale have cultures with important aspects in common:

  • Leaders empower business users to work with data
  • Leaders trust business users to use data to make better decisions
  • IT supports business users with platforms and tools and with curated data sources
  • Business users work with the tools from IT and the guidance from leaders, and work within the guardrails and guidelines given to them for this use
  • Business and IT collaborate to deliver responsive solutions and mature/stable solutions, with clearly defined responsibilities between them

Companies that are successful with managed self-service BI do these things. Companies that are not successful do not. The details vary, but the pattern holds up again and again.

How do these roles and responsibilities relate to culture?

In many ways a culture is defined by the behaviors it rewards, the behaviors it allows, and the behaviors it punishes. A culture isn’t what you say – it’s what you do.

In the context of BI, the key is a culture with shared goals that enables business and IT to work together with support from company leaders. If you have this culture, you can be successful with any tool. Some tools may be more helpful than others, and the culture will enable the selection of better tools over time, but the tool is not the most important factor. The culture – not the tool – inevitably determines success.

This is not to say that BI tools should not improve to be a bigger part of the solution. But to paraphrase Caitie… maybe you should let that white whale swim past.

 


[1] But definitely not only Power BI.

[2] He says unironically, before writing many more words.

Using and reusing Power BI dataflows

I use this diagram a lot[1]:

[Image: dataflows diagram]

This diagram neatly summarizes a canonical use case for Power BI dataflows, with source data being ingested and processed as part of an end-to-end BI application. It showcases the Lego-like composition that’s possible with dataflows. But it also has drawbacks – its simplicity omits common scenarios for using and reusing dataflows.

So, let’s look at what’s shown – and at what’s not shown – in my favorite diagram. Let’s look at some of the ways these dataflows and their entities can be used.

  1. Use the final entities as-is: This is the scenario implied by the diagram. The entities in the “Final Business View” dataflow represent a star schema, and are loaded as-is into a dataset.
  2. Use the final entities with modification: The entities in the “Final Business View” dataflow are loaded into a dataset, but with additional transformation or filtering applied in the dataset’s queries.
  3. Use the final entities with mashup: The entities in the “Final Business View” dataflow are loaded into a dataset, but with additional data from other sources added via the dataset’s queries.
  4. Use upstream entities: The entities in other dataflows are loaded into a dataset, likely with transformations and filtering applied, and with data from other sources added via the dataset’s queries.

Please understand that this list is not exhaustive. There are likely dozens of variations on these themes that I have not called out explicitly. Use this list as a starting point and see where dataflows will take you. I’ll keep the diagram simple, but you can build solutions as complex as you need them to be.
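For readers who think in code, the first three scenarios can be sketched as below. This is an analogy only, not Power Query: in a real solution this logic lives in the dataset's queries. The entity name, rows, and region targets are all hypothetical, and each function is simply named after the scenario it stands in for.

```python
# Hypothetical rows standing in for an entity in the "Final Business View" dataflow.
final_business_view = {
    "Sales": [
        {"product_id": 1, "region": "East", "amount": 100.0},
        {"product_id": 2, "region": "West", "amount": 250.0},
        {"product_id": 1, "region": "West", "amount": 75.0},
    ],
}

# Hypothetical data from another source, such as a spreadsheet of targets.
region_targets = {"East": 120.0, "West": 300.0}


def load_as_is(entity_rows):
    """Scenario 1: load the entity into the dataset unchanged."""
    return [dict(row) for row in entity_rows]


def load_with_modification(entity_rows, region):
    """Scenario 2: apply additional filtering in the dataset's query."""
    return [dict(row) for row in entity_rows if row["region"] == region]


def load_with_mashup(entity_rows, targets):
    """Scenario 3: add data from another source in the dataset's query."""
    return [
        {**row, "region_target": targets.get(row["region"])}
        for row in entity_rows
    ]


sales = final_business_view["Sales"]
as_is = load_as_is(sales)
west_only = load_with_modification(sales, "West")
with_targets = load_with_mashup(sales, region_targets)
print(len(as_is), len(west_only), with_targets[0]["region_target"])
```

The fourth scenario is the same pattern pointed at a different starting point: swap the final entities for entities from an upstream dataflow, then filter and enrich as needed.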


[1] This is my diagram. There are many like it, but this one is mine.

 

Power BIte: Sharing and reuse with dataflows and datasets

Last week I kicked off the new BI Polar YouTube channel with a video and blog post comparing and contrasting Power BI dataflows and datasets. In the days that followed, I continued to hear questions that led me to believe I hadn’t done a great job answering one vital question:

When would I use dataflows, and when would I use shared datasets?

Here’s the short answer: [embedded video]

And here’s the long answer: Lego Bricks and the Spectrum of Data Enrichment and Reuse.

The video focuses on – and demonstrates – sharing and reuse with both dataflows and datasets. It’s short and to the point[1], and it sticks to this one question.

The blog post takes a more conceptual view, using Lego bricks as a metaphor for dataflows and datasets and the types of reuse they enable.

If you’ve watched the videos and read the posts and you still have questions, please let me know.


[1] As short and to the point as anything you’re likely to get from me, anyway. Brevity has never been my forte.

Fiore’s Virtues of Business Intelligence

In the late 1300s and early 1400s, Fiore de’i Liberi was a knight, a diplomat, and a fencing master. He also wrote one of the most comprehensive treatises on medieval combat, his Flower of Battle, of which four copies survive in museums and private collections today. Fiore started – or was a significant evolutionary step in – one of the most important and long-lasting traditions in armed and unarmed combat.

In addition to detailed instruction on fighting with dagger, longsword, spear, and other weapons, Fiore’s manuscript included a preface with information about the virtues that any fencer[1] would need to be successful in combat.

[Image: MS_Ludwig_XV_13_32r.jpg]

In the image above, Fiore pictures the seven blows of the sword, and his four virtues, each represented by a different animal[2][3]:

This Master with these swords signifies the seven blows of the sword. And the four animals signify four virtues, that is prudence, celerity, fortitude, and audacity. And whoever wants to be good in this art should have part in these virtues.

Fiore then goes on to describe each virtue in turn:

Prudence
No creature sees better than me, the Lynx.
And I always set things in order with compass and measure.

Celerity
I, the tiger, am so swift to run and to wheel
That even the bolt from the sky cannot overtake me.

Audacity
None carries a more ardent heart than me, the lion,
But to everyone I make an invitation to battle.

Fortitude
I am the elephant and I carry a castle as cargo,
And I do not kneel nor lose my footing.[4]

Step back and read this again: “And whoever wants to be good in this art should have part in these virtues.”

That’s right – Fiore was documenting best practices, 600+ years ago. And although I suspect that Fiore wasn’t thinking about business intelligence projects at the time, I do believe that these virtues are just as relevant to the slicing and dicing[5] we’re still doing today. Let me explain.

Prudence – “…I always set things in order with compass and measure”: A successful BI practitioner knows what needs to be done before a project can begin, and when additional work is required before they can get started. Initiating a project requires careful setup and planning, and moving before the prerequisites for success are in place can be disastrous.[6]

Celerity – “I… am so swift to run and to wheel that even the bolt from the sky cannot overtake me”: Business requirements change day to day and hour to hour. To succeed, a BI practitioner must be prepared to move quickly and decisively, engaging without delay when an opportunity presents itself – and also be prepared to change direction as the needs of the project change.

Audacity – “…to everyone I make an invitation to battle”: Any project declined presents an opening for another practitioner, another team, another tool, and this is likely to reduce opportunities over time. Saying yes to difficult projects – and succeeding in their execution – is necessary to ensure that future projects don’t pass you by.

Fortitude – “And I do not kneel nor lose my footing”: When Fiore speaks of fortitude, he does not speak of the strength that comes from big muscles. He speaks of the strength that comes from structure and balance. His “elephant with a castle on its back” is a perfect metaphor for a BI solution delivered quickly and confidently because of the solid and stable platform on which it is built. Success doesn’t come from the extra effort put in when delivering a solution – it comes from the care and planning that went into the overall data estate.

You may look at these virtues and see contradiction – how can you have prudence and audacity and celerity? The answer for BI is the same as it is for the sword: practice, training, and preparation. In both situations, whether you’re battling with an armed foe or battling with a difficult client, you need to apply the right virtues at the right times, and to understand both the big picture and the day-to-day steps that produce larger successes. In both situations you’re also facing complex and dynamic challenges where you need to quickly take advantage of opportunities as they arise, and create opportunities when they don’t appear on their own[7]. Fortunately, as BI practitioners we can rely on the strengths of our teams – it’s not always a solo battle.

You may also look at these virtues and see Matthew stretching to make the most tenuous of analogies work, just because he loves swords as much as he loves BI. While this may be true, I honestly believe that these virtues apply here. Over the past 20-25 years I have seen many projects succeed because these virtues were embodied by the people and teams involved, and I’ve seen many projects fail where these virtues were absent. This isn’t the only way to look at success factors… but at the moment it’s my favorite.

In closing, I’d like to mention that this post marks one year since I started this blog. In the past year I’ve published almost 90 posts, and have had roughly 50,000 visitors and 100,000 page views. Here’s hoping that by applying Fiore’s virtues I’ll be able to make the next year even more productive and more successful than the year that has passed.

Thanks to all of you who read what I write, and who provide feedback here and on Twitter – I couldn’t do it without you.


[1] Fencer in this context meaning someone who fights with swords or other edged weapons, not the Olympic-style sport of fencing that a modern reader might picture when reading the word.

[2] As translated by Michael Chidester and Colin Hatcher.

[3] Although it may not be obvious to the modern reader, the animal at the bottom is an elephant with a tower or castle on its back. I suspect that Fiore never actually saw an elephant.

[4] In case these terms don’t immediately have meaning, prudence == wisdom, celerity == speed, audacity == daring, and fortitude == strength.

[5] See what I did there?

[6] I assume that Fiore’s use of the term “measure” here is pure coincidence.

[7] If you’ve worked on a high-stakes, high-visibility BI project where requirements changed during implementation, or where not all stakeholders were fully committed to the project goals, this will probably feel very familiar.

Power BI dataflows best practices

I do a lot that’s not related to dataflows. In fact, dataflows take up a part of my day that would seem surprisingly small if your only insight into my calendar came from this blog.

Despite this, I like to believe that I’m keeping my finger on the pulse of this feature, and when I learned today that the dataflows team had published best practice guidance almost a month ago, I was shocked and surprised.

Here are those best practices: https://docs.microsoft.com/en-us/power-bi/service-dataflows-best-practices

Image by rawpixel from Pixabay
This is apparently where Matthew thinks documentation comes from

In my defense, the blog post where this guidance was announced was the overall September update for dataflows, and it was the last link at the bottom of the post… but I still should have noticed.

These practices were produced by the dataflows team, and are based on questions and support tickets from customers around the world. Definitely check them out, and see how you can incorporate them into your Power BI solutions!