Communicating the voice of the customer

My last post focused primarily on a problem that can arise when there’s a central team that sits between a team responsible for delivering solutions and teams that have problems to be solved. The advice in that article is valuable[1], but it’s also very general – and it introduces a problem without providing any advice to solve it, beyond the not-very-helpful “keep an eye open for this problem behavior playing out.”

In this article I’m going to share some more specific advice to help overcome this challenge if you’re part of that central team: my advice for communicating the voice of the customer.

In my experience, there are four key success factors for effectively communicating the voice of the customer when you’re not really the customer yourself.

Success Factor 1: Understand both the problem domain and the solution domain. Since this is what most of the last set of articles have covered already, I don’t plan to re-cover this familiar ground, but it’s still important to mention. If you’re going to act as a bridge between two worlds, you need a meaningfully deep understanding of both.

Success Factor 2: Don’t have an agenda. More specifically, don’t have an agenda other than the success of the solution team and the customers you enable. In order to speak with the voice of the customer, the people you’re communicating with need to understand that you’re really speaking for the customer, and not for yourself. This doesn’t mean you can’t have an opinion – it means that you need to share the customer’s truth even when it might not support that opinion.

Success Factor 3: Tell a story. If you only share numbers and statistics, you’re leaving the interpretation of those numbers up to your audience, and that audience doesn’t have the context that you do. You’re from the team that understands the needs of the customer – and you’re the one that doesn’t have an agenda. These two facts put you in an ideal position to tell a story that captures the customer’s scenarios, goals, priorities, and challenges, and to ensure that it is received in a way that inspires your audience to action.

Success Factor 4: Include numbers and statistics. If you only tell a story, your audience will often assume that you’re telling your own story to forward your own agenda. Having data available to back up the story, including specific data to back up each key story point, helps overcome any skepticism and ensure that your story can be received, believed, and acted upon. The amount of data you need will depend on factors including how much trust you’ve already earned with your audience, and how well the actions you’re hoping to inspire align with their current goals.[2] 

Somewhere between telling a story and including data[3] lies including the literal voice of the customer. When you’re meeting with customers, take thorough notes. When possible, record your customer conversations. Then, when preparing to tell your story, have verbatim customer quotes prepared to reinforce the key points of your story. This lets you say “here’s a customer problem I think you need to prioritize solving” and “here’s how these customers have independently described their experiences in their own words.” It’s easy for someone to say that you don’t really understand the customer’s problem, because you’re not really the customer. But it’s hard for someone to say that the customer doesn’t understand their own problem.[4] Bringing the voice of the customer to a story is like having another five or six aces up your sleeve, but it’s not technically cheating.

These four behaviors have proven indispensable in my work on the Power BI CAT team, and I see them as success factors in many of the teams I’ve worked with that also operate between the problem domain and the solution domain. If you’re part of a center of excellence or another team that follows this pattern, look for opportunities to incorporate these behaviors into your work. They’ve served me very well, and I suspect they’ll do the same for you.


[1] In any event, I think it’s valuable, but my opinion here may not be coming from a position of neutral impartiality. You can reach your own conclusion.

[2] In my experience, earning and retaining the trust of your audience is a much more important factor than the story or the data. Right now the only advice I can think of here is “act consistently with integrity and transparency” but maybe I can find a blog post on this topic at some point.

[3] I could call this “Success Factor 5” or even “Success Factor 3.5” but then I would need to go back up and edit the post introduction and I’m far too lazy for that sort of work.

[4] If you do find the solution team you work with saying that the customer’s problems aren’t real, this is a massive red flag that needs to be confronted. In my experience, if this attitude persists it’s time to start escalating or looking for a new team.

Measuring success and satisfaction

Back in 2020 I wrote a post titled “Successfully measuring / measuring success” that talked about a problem without meaningfully describing it, because at the time I didn’t believe that the problem itself was particularly relevant. I was more interested in the pattern, which at the time I thought was “measuring the wrong thing.”

In recent weeks I’ve found myself sharing this post more frequently, and have come to believe that the underlying pattern is actually something else. Let’s look at my beautiful 2010-era chart one more time.

Before Power BI we had PowerPoint, but data is still Power

In this chart, both series represent NSAT[1] scores – basically they’re saying “this is how happy our customers are with a thing.” This sounds pretty simple, but the tricky part is in the details. Who are “our customers” and what is the “thing” they’re happy or unhappy with?

In the context of the chart above, the yellow series was “customer satisfaction with what they got.” The blue series was “customer satisfaction with what we shipped.” Implied in these descriptions – and in the difference between the two data series – is that something happened between the time we shipped something and the time the customers got something, and that what happened was key to the customers’ satisfaction.

Without going into too many details[3], we were shipping a product that was used by partners, and those partners used our product to deliver an experience to their customers. In the FY06 timeframe we started to change the product that we shipped. The partners could still use the updated product to deliver an experience that met their customers’ needs – they just had to do a little more work to keep their customers happy. As the product changes continued, the partner load increased. They had to do more and more to fix what we gave them and to keep their customers happy. You can see the story play out in the blue series in the chart above.

We were looking at the yellow series, and falsely conflating “customer satisfaction” with “customer satisfaction in what we have shipped to partners.” We didn’t truly appreciate the importance of the party in the middle and their impact, and it ended up causing no end of problems. We were measuring customer satisfaction, but failing to appreciate what it was that customers were satisfied with – and how little that satisfaction related to what we had created.

And this is the pattern I’ve been seeing more often lately.

In my recent post on the role of the Power BI CAT team and how that role has parallels with Power BI centers of excellence, I described a success pattern where a central team serves as a value-add bridge between people with problems and people with solutions.

This pattern is one that I’ve seen provide significant value in multiple contexts… but it also introduces risk of measuring the wrong things, and overlooking real problems. This is a side-effect of the value that the central team provides. Customers are interacting with the work of the central team in addition to[4] the work of the solution team, and it may be difficult for them to understand what part of their experience is dependent on what factors.

In this situation there is a risk of the solution team overlooking or failing to appreciate and prioritize the problems their customers are experiencing. This is a problem that the “curate” function in the diagram above is designed to mitigate, but the risk is real, and the mitigation takes ongoing effort.

When a member of a solution team works with customers directly, it’s hard to overlook their challenges and pain. When that solution team member hears about customer challenges from an intermediary, the immediacy and impact can be lost. This effect is human nature, and the ongoing effort of the central team to curate customer feedback is vital to counteract it.[5]

As mentioned at the top of the article, I’ve seen this pattern more often recently, where a solution team is failing to recognize problems or opportunities because they’re looking at the wrong thing. They’re focused on what their end customer is saying, instead of looking at the bigger picture and the downstream value chain that includes them and their customers, but isn’t limited to these two parties. It’s easy to mistake “customer being happy” for “customer being happy with what we produce” if you’re not keeping an eye on the big picture.


If you’re in a situation like this, whether you’re a member of a solution team, a member of a central team, or a member of a customer/consumer team, you’ll do well to keep an eye open for this problem behavior playing out.


[1] You can read this article if you’re curious about NSAT as a metric and what the numbers mean, and are too lazy[2] to read the 2020 blog post I linked to above.

[2] I’m judging you, but not too harshly.

[3] I’m being deliberately vague here, trying to find a balance between establishing enough context to make a point and not sharing any decade-plus-old confidential information.

[4] Or in some circumstances, instead of.

[5] Please keep in mind that in most circumstances the central team is introduced when direct ongoing engagement between the solution team and customers can’t effectively scale. If you’re wondering why you’d want a central team in the first place, it may be because your current scenario doesn’t need it. If this is the case, please keep reading so you’ll be better prepared when your scenario gets larger or more complex and you need to start thinking about different solutions.

Problem Domain, Solution Domain

I’ve been thinking about problem domains and solution domains a lot lately. I’ve posted on this topic before, but the more I think about it, the more I think I should explore it more. Starting now.

Image of interlocking puzzle piece hearts from https://pixabay.com/illustrations/puzzle-heart-love-two-hearts-1721619/
When the right problems meet the right solutions, it can be magical

Let’s begin by defining our terms.[1]

A problem domain is a subject area where people work.

If you’re a business intelligence or data professional, the problem domains of interest are often a business function like finance, supply chain, or HR. The problem domain experts[2] are typically people who work in one of these fields, and who might come to you looking for solutions, or for help building solutions.

A solution domain is also a subject area where people work.

If you’re a business intelligence or data professional, the solution domains of interest are often some combination of data visualization, data modeling, data transformation, and so on. They may also be DAX, Power Query, Power BI, or another specific set of tools and technologies. The solution domain experts[3] are typically people who build data and BI applications and systems to solve problems in other problem domains.

On the other hand, if you’re a member of the Power BI product team[4], the solution domain you’re working in is commercial software development – and the problem domain of interest is building, deploying, monitoring, and/or administering Power BI solutions. Everything is relative, and whether a given subject area is a problem domain or a solution domain is a function of the context in which it is being evaluated.

Let’s pause to let that sink in for a minute. None of the information above is particularly new, and it may not seem profound at first glance, but these two terms are some of the most foundational concepts of building a data culture.

A successful and mature data culture is the product of the right people doing the right things with the right data as part of the right processes.[5] This means that a successful and mature data culture involves solution domain experts and problem domain experts having healthy partnerships and mutual respect… which is also a foundational concept that sounds simple until you look at it more closely.

If you think about the traditional relationship between business and IT, “partnership” probably isn’t the first word that leaps to mind. All too often this relationship is characterized by conflict and a lack of mutual respect that is in part a function of misaligned priorities. Like many entrenched conflicts it is also partly a function of history and the mistrust produced by historical wrongs – actual or perceived.

Most interesting things – interesting conversations, interesting projects, interesting jobs and careers – exist at the intersection of the problem domain and the solution domain. Interesting things happen at the edges where one thing ends and another thing begins. This is where complexity is found, because multiple domains are involved, and making sense of the whole requires expertise in each of them.

Unfortunately this is also where things tend to go wrong. Too often things fall through the cracks between the problem domain and the solution domain. Experts in one domain don’t value the other part of the picture, or they don’t see it as their job, or they assume that someone else will figure that part out… or maybe it’s none of these things, and they just lack the vocabulary and the context to close that gap.

Please take a moment to think honestly and critically about the problem domains and solution domains in which you operate, and your relationships with the folks in other domains with whom you interact. This is all I have for today, and although I don’t know exactly where I’m going down this path, I know I’m not done – and I know a little introspection can go a long way.


[1] These are my personal definitions that I’m making up as I write this post. You’ll find other definitions if you’re willing to go looking, and although those definitions will align broadly with these, they will have different emphasis and context and nuance because they’re not mine.

[2] The subject matter experts in a given problem domain.

[3] The subject matter experts in a given solution domain.

[4] Or another team building commercial tools for data and analytics.

[5] And the state in which these things are the norm rather than the exception.

Problems, not solutions

Imagine walking into a restaurant.

No, not that one. Imagine walking into a nicer restaurant than the one you thought of at first. A lot nicer.

Even nicer than this.

Imagine walking into a 3-star Michelin-rated best-in-the-world restaurant, the kind of place where you plan international travel around reservations, the kind of place where the chef’s name is whispered in a kind of hushed awe by other chefs around the world.

Now imagine being seated and then insisting that the chef cook a specific dish in a specific way, because that’s what you’re used to eating, because you know what you like and what you want.

I’ll just leave this here for no particular reason.

In this situation, one of three things is likely to happen:

  1. The chef will give you what you ask for, and your dining experience will be diminished because your request was granted.
  2. The chef will ask you to leave.
  3. The chef will instruct someone else to ask you to leave.[1]

Let’s step back from the culinary context of this imaginary scenario, and put it into the context of software development and BI.

Imagine a user emailing a developer or software team[2] and insisting that they need a feature developed that works in some specific way. “Just make it do this!” or maybe “It should be exactly like <legacy software feature> but <implemented in new software>!!”

I can’t really imagine the restaurant scene playing out – who would spend all that money on a meal just to get what they could get anywhere? But I don’t need to imagine the software scene playing out, because I’ve seen it day after day, month after month for decades, despite the fact that even trivial software customization can be more expensive than a world-class meal. I’ve also been on both sides of the conversation – and I probably will be again.

When you have a problem, you are the expert on the problem. You know it inside and out, because it’s your problem. You’ve probably tried to solve it – maybe you’ve tried multiple solutions before you asked for help. And while you were trying those ineffective solution approaches, you probably thought of what a “great” solution might look like.

So when you ask for help, you ask for the solution you thought of.

This is bad. Really bad.

“Give me this solution” or “give me this feature” is the worst thing to ask for. Because while you may be the expert on your problem, you’re not an expert on the solution. If you were, you wouldn’t be asking for help in the first place.

And to make matters worse, most of the people on the receiving end aren’t the IT equivalents of 3-star Michelin-rated chefs. They’re line cooks, and they give you what you asked for because they don’t know any better. And because the customer is always right, right?

Yeah, nah.

As a software professional, it’s your job to solve your customers’ problems, and to do so within constraints your customers probably know nothing about, and within an often-complex context your customers do not understand[3]. If you simply deliver what the customer asks for, you’ve missed the point, and missed an opportunity to truly solve the fundamental problem that needs to be solved.

If you’re a BI professional, every project and every feature request brings with it an opportunity. It’s the opportunity to ask questions.

Why do you need this?

When will you use it?

What are you doing today without the thing you’re asking for?

When will this be useful?

Who else will use it?[4]

As a software or BI professional, you’re the expert on the solution, just as your customer is the expert on the problem. You know where logic can be implemented, and the pros and cons of each option. You know where the right data will come from, and how it will need to be transformed. You know what’s a quick fix and what will require a lot of work – and might introduce undesirable side-effects or regressions in other parts of the solution.

With this expertise, you’re in the perfect position to ask the right questions to help you understand the problem that needs to be solved. You’re in the perfect position to take the answers to your questions and to turn them into what your customer really needs… which is often very different from what they’re asking for.

You don’t need to ask these questions every time. You may not even need to ask questions of your customers most of the time[5]. But if you’re asking these questions of yourself each time you’re beginning new work – and asking questions of your customers as necessary – the solutions you deliver will be better for it.

And when you find yourself on the requesting side (for example, when you find yourself typing into ideas.powerbi.com) you’re in the perfect position to provide information about the problem you need solved – not just the solution you think you need. Why not give it a try?

This is a complex topic. I started writing this post almost 100 years ago, way back in February 2020[6]. I have a lot more that I want to say, but instead of waiting another hundred years I’ll wrap up now and save more thoughts for another post or two.

If you’ve made it this far and you’re interested in more actual best practices, please read Lean Customer Development by Cindy Alvarez. This book is very accessible, and although it is targeted more at startups and commercial software teams it contains guidance and practices that can be invaluable for anyone who needs to deliver solutions to someone else’s problems.


 

[1] This seems like the most likely outcome to me.

[2] This could be a commercial software team or “the report guy” in your IT department. Imagine what works for you.

[3] If you’re interested in a fun and accessible look at how the Power BI team decides what features to build, check out this 2019 presentation from Power BI PM Will Thompson. It’s only indirectly related to this post, but it’s a candid look at some of the “often-complex context” in which Power BI is developed.

[4] Please don’t focus too much on these specific questions. They might be a good starting point, but they’re just what leaped to mind as I was typing, not a well-researched list of best practice questions or anything of the sort.

[5] If you’re a BI developer maintaining a Power BI application for your organization, you may have already realized that asking a ton of questions all the time may not be appreciated by the people paying your salary, so please use your own best judgment here.

[6] This probably explains why I so casually mentioned the idea of walking into a restaurant. I literally can’t remember the last time I was in a restaurant. Do restaurants actually exist? Did they ever?

Old videos, timeless advice

It’s 2021, which means my tech career turns 25[1] this year.

It me

Back in 1995 an awesome book was published: Dynamics of Software Development by Jim McCarthy. Jim was a director on the Visual C++ team at Microsoft back when that was a big deal[2]. The book is one of the first and best books I read as a manager of software development teams, and it has stood the test of time – I still have it on the bookshelf in my office[3].

Of course, I wasn’t managing a software development team in 1995. I discovered Jim’s wisdom in 1996 or 1997 when I was preparing to teach courses on the Microsoft Solutions Framework[4]. When I opened the box that the trainer kit came in, there was a CD-ROM with videos[5] of Jim McCarthy presenting to an internal Microsoft audience on his “rules of thumb” for reliably shipping great software. When I first watched them, it was eye-opening… almost life-changing. Although I did have a mentor at the time, I did not have a mentor like Jim.

The second time I watched the videos, it was with my whole team. Jim’s rules from these videos and his book became part of the team culture[6], and I’ve referred back to them many times over the decades.

Now you can too – the videos are available on YouTube:

Some of the rules may feel a bit dated if you’re currently shipping commercial software or cloud services[7] but even the dated ones are based on fundamental and timeless truths of humans and teams of humans.

I’m going to take some time this week to re-watch these videos, and to start off the new year with this voice from years gone by. If you build software and work with humans, maybe you should too.


[1] Damn, that is weird to type, since I’m sure I can’t even be 25 years old yet.

[2] It may also be a big deal today, but to me at least it doesn’t feel like quite as big a deal.

[3] Not that I’ve seen my office or my bookshelf in forever, so this assertion should be taken with a grain of 2020.

[4] Fun fact: if you remember MSF, you’re probably old too.

[5] Because in those days “video” and “internet” weren’t things that went together. This may or may not have been because the bits had to walk uphill both ways in the snow.

[6] To the extent the team could be said to have a shared culture. We were all very young, and very inexperienced, and were figuring things out as we went along.

[7] Back in 1995 CI/CD wasn’t part of the industry vocabulary, and I can count on zero hands the clients I worked with before joining Microsoft who had anything resembling a daily build.

Session resources: Patterns for adopting dataflows in Power BI

This morning I presented a new webinar for the Istanbul Power BI user group, covering one of my favorite subjects: common patterns for successfully using and adopting dataflows in Power BI.

This session represents an intersection of my data culture series in that it presents lessons learned from successful enterprise customers, and my dataflows series in that… in that it’s about dataflows. I probably didn’t need to point out that part.

The session slides can be downloaded here: 2020-09-23 – Power BI Istanbul – Patterns for adopting dataflows in Power BI

The session recording is available for on-demand viewing. The presentation is around 50 minutes, with about 30 minutes of dataflows-centric Q&A at the end. Please check it out, and share it with your friends!

 

Data Culture: The Importance of Community

The last two videos in our series on building a data culture covered different aspects of how business and IT stakeholders can partner and collaborate to achieve the goals of the data culture. One video focused on the roles and responsibilities of each group, and one focused on the fact that you can’t treat all data as equal. Each of these videos builds on the series introduction, where we presented core concepts about cultures in general, and data culture in particular.

Today’s video takes a closer look at where much of that business/IT collaboration takes place – in a community.

Having a common community space – virtual, physical, or both – where your data culture can thrive is an important factor in determining success. In my work with global enterprise Power BI customers, when I hear about increasing usage and business value, I invariably hear about a vibrant, active community. When I hear about a central BI team or a business group that is struggling, and I ask about a community, I usually hear that this is something they want to do, but never seem to get around to prioritizing.

Community is important.[1]


A successful data culture lets IT do what IT does well, and enables business to focus on solving their problems themselves… but sometimes folks on both sides of this partnership need help. Where do they find it, and who provides that help?

This is where the community comes in. A successful community brings together people with questions and people with the answers to those questions. A successful community recognizes and motivates people who share their knowledge, and encourages people to increase their own knowledge and to share it as well.

Unfortunately, many organizations overlook this vital aspect of the data culture. It’s not really something IT traditionally owns, and it’s not really something business can run on their own, and sometimes it falls through the cracks[2] because it’s not part of how organizations think about solving problems.

If you’re part of your organization’s journey to build and grow a data culture and you’re not making the progress you want, look more closely at how you’re running your community. If you look online you’ll find lots of resources that can give you inspiration and ideas, anything from community-building ideas for educators[3] to tips for creating a corporate community of practice.


[1] Really important. Really really.

[2] This is a pattern you will likely notice in other complex problem spaces as well: the most interesting challenges come not within a problem domain, but at the overlap or intersection of related problem domains. If you haven’t noticed it already, I suspect you’ll start to notice it now. That’s the value (or curse) of reading the footnotes.

[3] You may be surprised at how many of these tips are applicable to the workplace as well. Or you may not be surprised, since some workplaces feel a lot like middle school sometimes…

Power BI dataflows PowerShell scripts on GitHub

Last week I shared a post highlighting a common pattern for making API data available through dataflows in Power BI, and included a few diagrams to show how a customer was implementing this pattern.

In the post I mentioned that I was simplifying things a bunch to only focus on the core pattern. One of the things I didn’t mention is that the diagrams I shared were just one piece of the puzzle. Another part was the need to define dataflows in one workspace, and then use those as a template for creating other dataflows in bulk.

This is simple enough to do via the Power BI portal for an individual dataflow, but if you need to do it for every dataflow in a workspace, you might need a little more power – PowerShell, to be specific.

The rest of this post is not from me – it’s from the dataflows engineering team. It describes a set of PowerShell scripts they’ve published on GitHub, which address this specific problem. These are unsupported scripts presented as-is, so I’ll let their write-up speak for itself.


Microsoft Power BI dataflows samples

The document below describes the various PowerShell scripts available for Power BI dataflows. These rely on the Power BI public REST APIs and the Power BI PowerShell modules.

Power BI Dataflow PowerShell scripts

Below is a table of the various Power BI PowerShell modules found in this repository.

Description | Module Name | Download
Exports all dataflows from a workspace | ExportWorkspace.ps1 | GitHub Location
Imports all dataflows into a workspace | ImportWorkspace.ps1 | GitHub Location
Imports a single dataflow | ImportModel.ps1 | GitHub Location

For more information on PowerShell support for Power BI, please visit powerbi-powershell on GitHub.

Supported environments and PowerShell versions

  • Windows PowerShell v3.0 and up with .NET 4.7.1 or above.
  • PowerShell Core (v6) and up on any OS platform supported by PowerShell Core.

Installation

  1. The scripts depend on the MicrosoftPowerBIMgmt module, which can be installed as follows:
Install-Module -Name MicrosoftPowerBIMgmt

If you have an earlier version, you can update to the latest version by running:

Update-Module -Name MicrosoftPowerBIMgmt
  2. Download all the scripts from the GitHub Location into a local folder.
  3. After downloading, unblock the scripts by right-clicking each file and selecting “Unblock”. Otherwise you might get a warning when you run them.

Uninstall

If you want to uninstall all the Power BI PowerShell cmdlets, run the following in an elevated PowerShell session:

Get-Module MicrosoftPowerBIMgmt* -ListAvailable | Uninstall-Module -Force

Usage

The scripts below support two optional parameters:

  • -Environment: A flag to indicate specific Power BI environments to log in to (Public, Germany, USGov, China, USGovHigh, USGovMil). Default is Public
  • -V: A flag to indicate whether to produce verbose output. Default is false.
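For example, here’s how those optional flags might be combined with one of the scripts (the workspace name and folder here are hypothetical, and this is just a sketch of the syntax described above):

```powershell
# Hypothetical example: export dataflows from a workspace in the
# US government cloud, with verbose output enabled.
.\ExportWorkspace.ps1 -Workspace "Finance Dataflows" -Location C:\dataflows -Environment USGov -V
```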

Export workspace

Exports all the dataflow model.json from a Power BI workspace into a folder:

.\ExportWorkspace.ps1 -Workspace "Workspace1" -Location C:\dataflows

Import workspace

Imports all the dataflow model.json from a folder into a Power BI workspace. This script also fixes the reference models to point to the right dataflow in the current workspace:

.\ImportWorkspace.ps1 -Workspace "Workspace1" -Location C:\dataflows -Overwrite

Import dataflow

Imports a dataflow model.json into a Power BI workspace:

.\ImportModel.ps1 -Workspace "Workspace1" -File C:\MyModel.json -Overwrite
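Putting the export and import scripts together, copying all dataflows from one workspace to another might look like this sketch. The workspace names and the C:\dataflows staging folder are hypothetical, and the login step assumes the MicrosoftPowerBIMgmt module’s Connect-PowerBIServiceAccount cmdlet:

```powershell
# Sketch: migrate all dataflows from a dev workspace to a test workspace.
# Workspace names and the C:\dataflows staging folder are made up.

# Log in to the Power BI service
Connect-PowerBIServiceAccount

# Export every dataflow's model.json from the source workspace
.\ExportWorkspace.ps1 -Workspace "Dataflows Dev" -Location C:\dataflows

# Import the exported models into the target workspace, overwriting
# existing dataflows and fixing up cross-dataflow references
.\ImportWorkspace.ps1 -Workspace "Dataflows Test" -Location C:\dataflows -Overwrite
```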

Data Culture: Picking Your Battles

Not all data is created equal.

One size does not fit all.

In addition to collaboration and partnership between business and IT, successful data cultures have something else in common: they recognize the need for both discipline and flexibility, and have clear, consistent criteria and responsibilities that let all stakeholders know what controls apply to what data and applications.


Today’s video looks at this key fact, and emphasizes this important point: you need to pick your battles[1].

If you try to lock everything down and manage all data and applications rigorously, business users who need more agility will not be able to do their jobs – or more likely they will simply work around your controls. This approach puts you back into the bad old days before there were robust and flexible self-service BI tools – you don’t want this.

If you try to let every user do whatever they want with any data, you’ll quickly find yourself in the “wild west” days – you don’t want that either.

Instead, work with your executive sponsor and key stakeholders from business and IT to understand what requires discipline and control, and what supports flexibility and agility.

One approach will never work for all data – don’t try to make it fit.


[1] The original title of this post and video was “discipline and flexibility” but when the phrase “pick your battles” came out unscripted[2] as I was recording the video, I realized that no other title would be so on-brand for me. And here we are.

[2] In case you were wondering, it’s all unscripted. Every time I edit and watch a recording, I’m surprised. True story.

Dataflows and API data sources

These days more and more data lives behind an API, rather than in a more traditional data source like a database or a file. These APIs are often designed and optimized for small, chatty[1] interactions that don’t lend themselves well to use as a source for business intelligence.

[Image: a beehive]
I’d explain the choice of image for APIs, but it doesn’t take a genus to figure it out

These APIs are often slower[3] than a database, which can increase load/refresh times. Sometimes the load time is so great that a refresh may not fit within the window that an application’s functional requirements allow.

These APIs may also be throttled. SaaS application vendors often have a billing model that doesn’t directly support frequent bulk operations, so to avoid customer behaviors that affect their COGS and their bottom line, their APIs may be limited to a certain number of calls[4] for a given period.
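When you can’t avoid a throttled API, the caller needs to respect its limits. As a rough illustration (the endpoint below is hypothetical, not a real service), a PowerShell caller might back off and retry when the service signals throttling with HTTP 429:

```powershell
# Hypothetical API endpoint - for illustration only
$uri = "https://api.example.com/v1/orders"
$maxRetries = 5

for ($attempt = 1; $attempt -le $maxRetries; $attempt++) {
    try {
        # Invoke-RestMethod throws on non-success status codes
        $response = Invoke-RestMethod -Uri $uri -Method Get
        break
    }
    catch {
        $status = $_.Exception.Response.StatusCode.value__
        if ($status -eq 429 -and $attempt -lt $maxRetries) {
            # Throttled: wait with exponential backoff before retrying
            Start-Sleep -Seconds ([math]::Pow(2, $attempt))
        }
        else {
            throw
        }
    }
}
```

A dataflow that refreshes on a schedule the API can tolerate serves the same purpose at the platform level: downstream datasets read from the dataflow instead of hammering the API directly.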

The bottom line is that when you’re using APIs as a data source in Power BI, you need to take the APIs’ limitations into consideration, and often dataflows can help deliver a solution that accommodates those limitations while delivering the functionality your application needs.

I’ve covered some of the fundamental concepts of this approach in past blog posts, specifically this one from 2018 on Power BI Dataflows and Slow Data Sources, and this one from 2019 on Creating “data workspaces” for dataflows and shared datasets. I hadn’t planned on writing a dedicated new post, but after having this pattern come up in four or five different contexts in the past week or so, I thought one was warranted.

In late July I met with a customer to discuss their Power BI dataflows architecture. They showed me a “before” picture that looked something like this:

[Diagram: the “before” architecture]

One of their core data sources was the API for a commercial software application. Almost every Power BI application used some data from this API, because the application supports almost every part of their business. This introduced a bunch of familiar challenges:

  • Training requirements and a steeper-than-necessary learning curve for SSBI authors due to the complex and unintuitive API design
  • Long refresh times for large data extracts
  • Complex, redundant queries in many applications
  • Technical debt and maintenance due to the duplicate logic

Then they showed me an “after” picture that looked something like this:

[Diagram: the “after” architecture]

They had implemented a set of dataflows in a dedicated workspace. These dataflows have entities that pull data from the source APIs, and make them available for consumption by IT and business Power BI authors. All data transformation logic is implemented exactly once, and each author can easily connect to trusted tabular data without needing to worry about technical details like connection strings, API parameters, authentication, or paging. The dataflows are organized by the functional areas represented in the data, mapping roughly to the APIs in the source system as viewed through the lens of their audience.

The diagram above simplifies things a bit. For their actual implementation they used linked and computed entities to stage, prepare, and present the dataflows, but for the general pattern this is probably the right level of abstraction. Each IT-developed or business-developed application uses the subset of the dataflows that it needs, and the owners of the dataflows keep them current.

Life is good[5].


[1] There’s a lot of information available online covering chatty and chunky patterns in API design, so if you’re not sure what I’m talking about here, you might want to poke around and take a peek[2] at what’s out there.

[2] Please let me know if you found the joke in footnote[1].

[3] Possibly by multiple orders of magnitude.

[4] Or a given data volume, etc. There are probably as many variations in licensing as there are SaaS vendors with APIs.

[5] If you’re wondering about the beehive image and the obtuse joke it represents, the genus for honey bees is “apis.” I get all of my stock photography from the wonderful web site Pixabay. I will typically search on a random term related to my blog post to find a thumbnail image, and when the context is completely different like it was today I will pick one that I like. For this post I searched on “API” and got a page full of bees, which sounds like something Bob Ross would do…