Dataflows with benefits

Power BI datamarts are like dataflows with benefits.

In case you missed the announcements this week from Microsoft’s Build conference, datamarts are a major new Power BI capability that is now available in public preview. There are also preview docs available, but the best datamarts content I’ve seen so far is this fantastic video from Adam and Patrick at Guy in a Cube[1].

I’m going to assume that since you’re reading this blog, you didn’t miss the announcement. I’m also going to assume that some small part of you is asking “what the heck are datamarts, anyway?”

For me, datamarts are like dataflows with benefits[2].

It should come as no surprise to any regular reader of this blog that I’m a big fan of dataflows in Power BI. Dataflows let Power BI users build reusable data tables in a workspace using Power Query Online, and share them with other users for reuse in other workspaces. What’s not to love?[3]

If you’ve spent a lot of time working with dataflows, you can probably think of a few things you wished dataflows did differently, or better. These are some of the most common requests I’ve heard in customer conversations over the years:

  • “I wish I could define row-level security (RLS) on my dataflows so I could share them securely with more users.”
  • “I wish my users could connect to dataflows using SQL, because analysts in my org all know SQL.”
  • “I wish <operation that would benefit from a compute engine> performed better in dataflows.”[4]

You probably see where I’m going here. Datamarts in Power BI deliver solutions to these problems. They build on the strengths of dataflows, while enabling common scenarios where dataflows did not offer an obvious or simple solution.

Almost like datamarts were dataflows with benefits.

Datamarts, like dataflows, provide a Power Query Online experience to perform data preparation. Datamarts, like dataflows, allow users to create reusable data tables in a Power BI workspace.

But datamarts, unlike dataflows, store their data in a managed SQL database. Dataflows use CDM folders for their storage, which means CSV files with some extra metadata. Although this file-based approach provides some benefits for reuse in integration scenarios, it can also be a challenge for simply connecting to the data in tools other than Power BI.

With datamarts, the data produced by executing your Power Query queries[5] is loaded into tables in an Azure SQL database that’s managed by the Power BI service. Having data in a full-featured database, as opposed to folders full of text files, makes a lot of difference.

  • Datamarts support row-level security. Using a simple in-browser experience, a datamart author can define RLS rules that control which users can see which data when connecting to the datamart.
  • Anyone with the right permissions can query[6] the datamart’s underlying SQL database using any SQL query tool. This means that authorized users can perform exploratory data analysis in Excel, SQL Server Management Studio, Azure Data Studio, or whatever tool they’re most comfortable using. It’s a database. (See the sketch after this list for what that can look like from code.)
  • Merges and joins and other operations common to building a star schema perform much better, because these are things that SQL has been awesome at for decades.[7]
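
To make that “it’s a database” point a little more concrete, here’s a minimal sketch of querying a datamart’s SQL endpoint from Python with pyodbc. The server, database, and table names are placeholders rather than anything from a real datamart – you’d copy the actual connection string from the datamart’s settings in the Power BI service – and the authentication keyword assumes a recent ODBC Driver for SQL Server.

```python
# Minimal sketch: querying a Power BI datamart's SQL endpoint with pyodbc.
# Server, database, and table names are placeholders -- copy the real
# connection string from the datamart's settings in the Power BI service.
import pyodbc

conn_str = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=your-datamart-endpoint.datamart.pbidedicated.windows.net;"  # placeholder
    "Database=YourDatamart;"                                            # placeholder
    "Authentication=ActiveDirectoryInteractive;"  # sign in with your org account
    "Encrypt=yes;"
)

conn = pyodbc.connect(conn_str)
cursor = conn.cursor()

# Read-only: the endpoint only accepts SELECT statements (see [6] below).
cursor.execute("SELECT TOP 10 * FROM dim_customer")  # hypothetical table name
for row in cursor.fetchall():
    print(row)

conn.close()
```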

Is it just me, or is this sounding a lot like dataflows with benefits?

But wait, you might say, what about measures and that automatically created dataset thingie and all the other stuff they showed at Build, and which I don’t really understand yet? What about deciding when I should use a datamart over a dataflow? What about the expanded web authoring experience, and querying the datamart directly from within the browser??

Yeah, I’m not going to cover any of that in this post. The post is already too long, and I didn’t really have time to write this one as it is.[8] But I think it’s those things that make the product team scowl when I describe datamarts as “dataflows with benefits” because they’re really a lot more. But if you think about dataflows with benefits, you’re probably off to a good start, and heading in the right direction.

I’m going to end on this note: all of my personal Power BI projects going forward will be built using datamarts. Datamarts do everything I need dataflows to do, and for me they do it better. I’ll still always love dataflows, and there will likely still be places where dataflows make sense, but for me…


 

[1] And I’m not only saying that because Roche’s Maxim makes an awkward surprise appearance.

[2] In case the “with benefits” descriptor isn’t meaningful to you, I’m attempting to make a play on the phrase “friends with benefits.” You can check out the mildly NSFW Urban Dictionary definition if you really feel like you need to. Honestly, I wouldn’t recommend it, but you do you.

[3] I literally got goosebumps typing that “reusable data tables” sentence. Even after all these years, having the promise of self-service data preparation and reuse realized in Power BI still feels a little bit like magic.

[4] Yes, this typically leads into a conversation about the dataflows “enhanced compute engine” but since using that engine requires following specific design patterns, this isn’t always as straightforward a conversation as you might want it to be.

[5] Power Queries? I always struggle with what to use for the plural noun for the queries that you build in Power Query. Maybe I should ask Alex.

[6] I use the term “query” here to mean SELECT statements. Only read operations are permitted, so if you try to UPDATE or whatever, you’ll get an error. Use Power Query to transform the data as you load it, like you would with a traditional data mart or data warehouse.

[7] I don’t have enough hands-on with the datamarts preview at this point to say much more than “faster” but in my non-scientific testing queries that would take “I guess I’ll make another pot of coffee” in dataflows take “oh it’s done already” in datamarts.

[8] If you had any idea how mean my calendar is this year, you’d weep. I’m weeping right now.

Coming to the PASS Data Community Summit in November: The Hitchhiker’s Guide to Adopting Power BI in Your Organization

At the 2022 PASS Data Community Summit this November, I’m thrilled to be co-presenting a full-day pre-conference session with the one and only Melissa Coates [blog | Twitter | LinkedIn]. We’ll be presenting our all-day session live and in-person in Seattle on Tuesday, November 15, 2022.

What’s the Session?

The Hitchhiker’s Guide to Adopting Power BI in Your Organization

What’s the Session About?

The Power BI Adoption Roadmap is a collection of best practices and suggestions for getting more value from your data and your investment in Power BI. The Power BI Adoption Roadmap is freely available to everyone — but not everyone is really ready to start their journey without a guide. Melissa and I will be your guides…while you’re hitchhiking…on the road…to reach the right destination…using the roadmap. (You get it now, right?!?)

We’ll do an end-to-end tour of the Power BI Adoption Roadmap. During the session we’ll certainly talk about all of the key areas (like data culture, executive sponsorship, content ownership and management, content delivery scope, center of excellence, mentoring and user enablement, community of practice, user support, data governance, and system oversight).

Smart Power BI architecture decisions are important – but there’s so much more to a successful Power BI implementation than just the tools and technology. It’s the non-technical barriers, related to people and processes, that are often the most challenging. Self-service BI also presents constant challenges related to balancing control and oversight with freedom and flexibility. Implementing Power BI is a journey, and it takes time. Our goal is to give you plenty of ideas for how you can get more value from your data by using Power BI in the best ways.

We promise this won’t be a boring day merely regurgitating what you can read online. We’ll share lessons learned from customers, what works, what to watch out for, and why. There will be ample opportunity for Q&A, so you can get your questions answered and hear what challenges other organizations are facing. This will be a highly informative and enjoyable day for you to attend either in-person or virtually.

Who is the Target Audience?

To get the most from this pre-conference session, you need to be familiar with the Power BI Adoption Roadmap and the Power BI Implementation Planning guidance. You should have professional experience working with Power BI (or other modern self-service BI tools), preferably at a scope larger than a specific team. Deep technical knowledge about Power BI itself isn’t required, but the more you know about Power BI and its use, the more you’ll walk away with from this session.

We hope to see you there! More details and registration are available on the PASS Data Community web site.

Who wrote this blog post?

It was Melissa.

She wrote it and emailed it to me and I shamelessly[1] stole it, which may be why there haven’t been any footnotes[2]. I even stole the banner image[3].


[1] With her permission, of course.
[2] Until these ones.
[3] Yes, Jeff. Stealing from Melissa is a Principal-level behavior.

On building expertise

How do you learn a new skill? How do you go from beginner to intermediate to expert… or at least from knowing very little to knowing enough that other people start to think of you as an expert?

This post describes the learning approach that has worked for me when gaining multiple skills, from cooking to sword fighting to communication. This approach may work for you or it may not… but I suspect that you’ll find something useful even if your learning style is different.


When I’m building expertise[1] in a new area, there are typically five key activities that together help me make the learning progress I need. They don’t come in any particular order, and they all tend to be ongoing and overlapping activities, but in my experience they’re all present.

Practicing: Building expertise in an area or skill requires actively performing that skill. You can’t become great at something without doing that thing. Practice whenever you can, and look for opportunities to practice things that are outside your comfort zone. If your practice is always successful, this may be a sign that you’re not pushing yourself enough, and your progress may be slower than it needs to be.

Studying: It’s rare to learn something completely new. Even if you’re trailblazing a brand new area of expertise, you probably get there by learning about related better-known areas, from existing experts. Read whatever you can, watch whatever you can, listen to whatever you can. Find content that matches the way you prefer to learn, and spend as much time as you can consuming that content. Make it part of your daily routine.

Observing: Find existing experts and watch them work. Where reading books or watching videos exposes you to the static aspect of expertise, being “expert adjacent” exposes you to the dynamic aspect. Mindfully observing how an expert approaches a problem, how they structure their work area, how they structure their day, will give you insights into how you can improve in these aspects.

Networking: Find ways to engage with a community of like-minded people who share your interest in your chosen area of expertise. Not only will these activities provide ongoing opportunities to learn from peers, the questions and problems that other community members share can serve as motivation to explore topics you may not have otherwise thought of independently.

Teaching: Teaching a skill to others forces you to think about that skill in ways that would probably not be needed if you were learning on your own. Teaching forces you to look at problems and concepts in ways that expose your biases and blind spots, and to ask (and answer) questions that would never have occurred to you on your own. Teaching a skill is in many ways the best way to deeply learn that skill.

Please note that these activities aren’t sequential, and no one activity is dependent on the others. In my experience, all five activities are ongoing and overlapping, and each one complements and enables the others.

What does this look like in practice?

I grew up in a family where cooking wasn’t really a thing[2], so I started learning to cook as an adult. Despite this, my cooking and baking have become something of a gold standard for friends and acquaintances. My learning experience looked something like this:

Studying: I bought and read dozens of cookbooks. If a topic or book looked interesting, I bought it and read it. I subscribed to many cooking magazines and read them when they arrived each month and watched pretty much every cooking show I could fit into my schedule.[3]

Practicing: I cooked and baked almost every day. I tried new ingredients and recipes and techniques that I discovered through my study, and figured out what worked, and what I needed to do to make it work for me. I organized fancy dinners for friends as a forcing function, and to keep myself from getting too lazy.

Observing: When I dined in restaurants[4], I would try to sit at the chef’s counter or somewhere I could get a good view of the kitchen, and would mindfully watch how the chef prepared and served each dish.

Networking: I made foodie friends. I talked about food and cooking with them, and occasionally cooked with them as well. Sometimes we’d go to cooking classes together. Sometimes we’d borrow equipment or cookbooks from each other. Eventually we’d invite each other over for dinner. Those were the days…

Teaching: I found opportunities to share what I’d learned with others. When someone would exclaim “Oh my goodness – how did you make this?!?”[5] I would do my best to tell them and show them. Sometimes this was a conversation, sometimes it was a 1:1 tutoring session, sometimes it was a small group class. Each time I learned something about the topic I was teaching because of the questions people asked.

Cooking is just one example, but I have similar experiences for every topic where people have asked me questions like “how do you know all this?” or “how did you get so good at this?” For every area where I have developed a reasonable degree of expertise, it’s because I have done some combination of these things, often over many years. I have every reason to believe that this approach will work for you as well.

Ok… that’s the blog post, but I’m not quite done. Back in January when I started writing this one, I started with the three “content seeds” you see below.

Data Chef

FFS

That’s right. Past Matthew left me the phrase “Data Chef,” the link to the PowerBI.tips video he’d just finished watching, and the phrase “FFS.” Three months later I have no idea how these relate to the topic of building expertise. If you can figure it out, please let me know.


[1] You may have noticed that I’m deliberately using the phrase “building expertise” instead of a different phrase like “learning” or “acquiring a new” skill. I’ve chosen this phrase because the approach described here isn’t what I do when I’m learning something casually – it’s what I’ve done when building a wide and deep level of knowledge and capability around a specific problem or solution domain.

[2] I love my mother dearly, but when I was a child she would cook broccoli by boiling it until it was grey, and you could eat it with a spoon. I never realized that food was more than fuel until I was in my 20s and met my future wife.

[3] Yes, I’m old, in case talking about magazines didn’t give it away. I assume the same pattern today would involve subscribing to blogs or other content sources. Also, can you remember back when you needed to watch TV shows when they were on, not when you wanted to watch them?

[4] Back in the day I used to travel for work a lot, so I ate in restaurants a lot. Sometimes I was on the road two or three weeks per month, which translated into a lot of restaurant dinners.

[5] If you’ve never met me in person and have never eaten the food I make and share, you might be surprised by how often this happens. If you know me personally, you probably won’t be surprised at all.

Recipe: Chicken Liver Mousse

It’s been a few years since I shared a recipe, but this one has kept coming up in conversation lately and it feels like the right time to share. This recipe is from Laurie Riedman of Elemental@Gasworks[1], and words can’t express how awesome it is.

Ingredients

  • 2 pounds chicken liver, soaked in milk
  • ½ pound unsalted butter
  • 4 shallots, sliced
  • 2 cloves garlic, sliced
  • 2 Granny Smith apples, peeled and diced
  • 8 sheets gelatin, soaked in water
  • 1/3 cup Grand Marnier
  • Salt and pepper to taste

Technique

  1. Melt butter in sauté pan
  2. Add shallots, garlic and apple – cook until soft, but do not brown
  3. Turn heat to medium high and add drained chicken livers
  4. Sauté until just cooked – livers should still be pink inside
  5. Add Grand Marnier and reduce by half
  6. Stir in gelatin until dissolved
  7. Cool slightly and process until very smooth, adding salt and pepper to taste
  8. Put in terrine mold, cover, and weight it down
  9. Chill overnight

Serving suggestion

Serve with toasted baguette and pickles

The mousse will keep for months in the freezer. I made a big batch in 2020, vacuum sealed 5 or 6 generous portions, and have been thawing one every few months when I feel the need for something rich, savory, and delicious.


[1] Elemental@Gasworks was my favorite restaurant for years. Before I moved to the Seattle area I would dine at Elemental at least once per week when I was visiting. I learned a lot from eating Laurie’s food, and from watching her cook in the tiny, tiny kitchen. Elemental closed in 2012, but it comes up in conversation almost every day among those of us fortunate enough to have experienced it.

Thank you for sticking around – 200th post!

I write this blog mainly for myself.

I write about topics that are of interest to me, mainly because they’re interesting to me but also because there doesn’t seem to be anyone in the Power BI community covering them. There are dozens of blogs and YouTube channels[1] covering topics like data visualization, data modeling, DAX, and Power Query – but I haven’t found any other source that covers the less-tangible side of being a data professional that is so compelling to me.

I write on my own chaotic schedule. Some months I might post every other day, or I might go weeks or months without posting anything[2]. These days I try to post once per week, but that’s definitely a stretch goal rather than something to which I will commit. Sometimes my creativity flows freely, but sometimes writing comes from the same budget of emotion and energy that I need for work and life… and blogging typically ends up near the bottom of my priority list.

And yet, here we are, 40 months and 200 blog posts later. In the past few weeks I’ve seen dozens of people reference Roche’s Maxim of Data Transformation, which started out as a tongue-in-cheek self-deprecating joke and then took on a life of its own. Earlier this week I spent time with another team at Microsoft that has been collectively reading and discussing my recent series on problems and solutions, and looking for ways to change how their team works to deliver greater impact. More and more often I talk with customers who mention how they’re using some information or advice from this blog… and it’s still weird every single time.

In these dark days it’s easy to feel isolated and alone. It’s easy to dismiss and downplay online interactions and social media as superficial or unimportant. It’s easy to feel like no one notices, like no one cares, and like nothing I’m doing really makes a difference[3].

So for this 200th post, I wanted to take the time to say thank you to everyone who has taken the time to let me know that I’m not alone, and that someone is listening. It makes a big difference to me, even if I don’t always know how to show it.

Let’s keep doing this. Together.


[1] I recently discovered that two of my co-workers have their own little YouTube channel. Who knew?

[2] Please don’t even get me started on how long it’s been since I posted a new video.

[3] This is a reminder that Talking About Mental Health is Important. I have good days and bad days. Although I make a real effort to downplay the bad and to amplify the good, there’s always a voice inside my head questioning and criticizing everything I do. It’s important to talk about mental health because this is a voice that many people have, and not everyone knows that it’s not just them. Not everyone knows that the voice is lying.

Risk Management Basics

When I began my software career back in the 90s, one of the software-adjacent skills I discovered I needed was risk management. When you’re building anything non-trivial, it’s likely that something will go wrong. How do you find the right balance between analysis paralysis and blindly charging ahead? How do you know what deserves your attention today, and what can be safely put on the back burner and monitored as needed?

This is where risk management comes in.[1]

In this context, a risk is something that could possibly go wrong that would impact your work if it did go wrong. Risk management is the process of identifying risks, and deciding what to do about them.

One simple and lightweight approach for risk management[2] involves looking at two factors: risk likelihood, and risk impact.

Risk likelihood is just what it sounds like: how likely the risk is to occur. Once you’re aware that a risk exists, you can measure or estimate how likely that risk is to be realized. In many situations an educated guess is good enough. You don’t need to have a perfectly accurate number – you just need a number that no key stakeholders disagree with too much.[3] Rather than assigning a percentage value I prefer to use a simple 1-10 scale. This helps make it clear that it’s just an approximation, and can help prevent unproductive discussions about whether a given risk is 25% likely or 26% likely.

Risk impact is also what it sounds like: how bad would it be if the risk did occur? I also like to use a simple 1-10 scale for measuring risk impact, which is more obviously subjective than the risk likelihood. So long as everyone who needs to agree agrees that the impact of a given risk is 3 or 4 or whatever, that’s what matters.

Once you have identified risks and assigned impact and likelihood values to each one, multiply them together to get a risk score from 1 to 100. Sort your list by this score and you have a prioritized starting point for risk mitigation.
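
If it helps to see that arithmetic spelled out, here’s a tiny sketch of the scoring-and-sorting step. The risks and their 1-10 ratings are made-up examples, just to show the mechanics.

```python
# A tiny sketch of the likelihood x impact scoring described above.
# The risks and their 1-10 ratings are made-up examples.
risks = [
    {"risk": "Database server hardware failure", "likelihood": 3, "impact": 9},
    {"risk": "Delayed access to a data source", "likelihood": 7, "impact": 5},
    {"risk": "Key team member leaves", "likelihood": 4, "impact": 7},
    {"risk": "Giant meteor destroys the data center", "likelihood": 1, "impact": 10},
]

# Risk score = likelihood x impact, giving a value from 1 to 100.
for r in risks:
    r["score"] = r["likelihood"] * r["impact"]

# Sort descending by score for a prioritized starting point for mitigation.
for r in sorted(risks, key=lambda r: r["score"], reverse=True):
    print(f'{r["score"]:>3}  {r["risk"]}')
```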

Risk mitigation generally falls into one or more of these buckets:[4]

  1. Risk prevention – you take proactive steps to reduce the likelihood of the risk occurring.
  2. Risk preparation – you take proactive steps to plan for how you’ll respond to reduce the impact if the risk does occur.

For risks with high risk scores, you’ll probably want to do both – you’ll take steps to make the risk less likely to occur, and you’ll take steps to be ready in case it still does.

Here are a few examples of risks that might be identified when performing risk management for a BI project, along with examples of how each might be mitigated:

  • Risk: A database server might be unavailable due to hardware failure, thus interrupting business operations
    • Possible prevention: Purchase and configure server hardware with redundant storage and other subsystems
    • Possible preparation: Define and test a business continuity and disaster recovery[5] plan for recovering the database server
  • Risk: You might not get permissions to access a data source in a timely manner
    • Possible prevention: Prioritize getting access to all required data sources before committing to the project
    • Possible preparation: Identify an executive sponsor and escalation path for situations where additional data source access is required
  • Risk: A key team member might leave the team or the company
    • Possible prevention: Work to keep the team member happy and engaged
    • Possible preparation: Cross-train other members of your team to minimize the impact if that key member moves on
  • Risk: Your data center might lose power for a week
    • Possible prevention: Locate the data center in a facility with redundant power and a reliable power grid
    • Possible preparation: Purchase and install generators and fuel reserves
  • Risk: Your data center location might be destroyed by a giant meteor
    • Possible prevention: Um… nothing leaps to mind for this one
    • Possible preparation: Again um, but maybe using a geo-distributed database like Azure Cosmos DB to ensure that the destruction of one data center doesn’t result in downtime or data loss?[6]

You get the idea. I’m not going to assign likelihood or impact values to these hypothetical risks, but you can see how some are more likely than others, and some have a higher potential impact.

Now let’s get back to a question posed at the top of the post: how do you find the right balance between analysis paralysis and blindly charging ahead?

Even in simple contexts, it’s not possible to eliminate risk. Insisting that a mitigation strategy needs to eliminate a risk and not only reduce it is ineffective and counterproductive. It’s not useful or rational to refuse to get in a car because of the statistical risk of getting injured or killed in a collision – instead we wear seat belts and drive safely to find a balance.

And this is kind of what inspired this post:

The “perfect or nothing” mindset isn’t effective or appropriate for the real world. Choosing to do nothing because there isn’t an available perfect solution that eliminates a risk is simply willful ignorance.

Most real-world problems don’t have perfect solutions because the real world is complex. Instead of looking for perfect solutions we look for solutions that provide the right tradeoff between cost[7] and benefit. We implement those pragmatic solutions and we keep our eyes open, both for changes to the risks we face and to the possibility of new mitigations we might consider.

Whether or not risk management is a formal part of your development processes, thinking about risks and how you will mitigate them will help you ensure you’re not taken by surprise as often when things go wrong… as they inevitably do…


[1] Yes, I’m linking to a Wikipedia article for a technical topic. It’s surprisingly useful for an introduction, and any search engine you choose can help you find lots of articles that are likely to be more targeted and useful if you have a specific scenario in mind.

[2] This is the only approach to risk management that will be shared in this article. If you want something more involved or specialized, you’ll need to look elsewhere… perhaps starting with the Wikipedia article shared earlier, and following the links that sound interesting.

[3] If you are in a situation where “good enough” isn’t good enough, you’ll probably want to read more than just this introductory blog post. Are you starting to see a trend in these footnotes?

[4] That Wikipedia article takes a slightly different approach (direct link to section) but there’s a lot of overlap as well. What I describe above as “risk prevention” aligns most with their “risk reduction” and my “risk preparation” aligns most with their “risk retention” even though they’re not exact matches.

[5] The other BCDR.

[6] I had originally included the “giant meteor strike” risk as an example of things you couldn’t effectively mitigate, but then I remembered how easy Cosmos DB makes it to implement global data distribution. This made me realize how the other technical risks are also largely mitigated by using a managed cloud service… and this in turn made me realize how long ago I learned about mitigating risks for data projects. Anyway, at that point I wasn’t going back to pick different examples…

[7] However you want to measure that cost – money, time, effort, or some other metric.

Communicating the voice of the customer

My last post focused primarily on a problem that can arise when there’s a central team that sits between a team responsible for delivering solutions, and teams that have problems to be solved. The advice in that article is valuable[1], but it’s also very general. It also introduces a problem without providing any advice to solve the problem, beyond the not-very-helpful “keep an eye open for this problem behavior playing out.”

In this article I’m going to share more specific advice to help overcome this challenge if you’re part of that central team: my advice for communicating the voice of the customer.

In my experience, there are four key success factors for effectively communicating the voice of the customer when you’re not really the customer yourself.

Success Factor 1: Understand both the problem domain and the solution domain. Since this is what most of the last set of articles have covered already I don’t plan to re-cover this familiar ground, but it’s still important to mention. If you’re going to act as a bridge between two worlds, you need a meaningfully deep understanding of both.

Success Factor 2: Don’t have an agenda. More specifically, don’t have an agenda other than enabling the success of the solution team and the customers you enable. In order to speak with the voice of the customer, the people you’re communicating with need to understand that you’re really speaking for the customer, and not for yourself. This doesn’t mean you can’t have an opinion – it means that you need to share the customer’s truth even when it might not support that opinion.

Success Factor 3: Tell a story. If you only share numbers and statistics, you’re leaving the interpretation of those numbers up to your audience, and that audience doesn’t have the context that you do. You’re from the team that understands the needs of the customer – and you’re the one that doesn’t have an agenda. These two facts put you in an ideal position to tell a story that captures the customer’s scenarios, goals, priorities, and challenges and to ensure that it is received in a way that inspires your audience to action.

Success Factor 4: Include numbers and statistics. If you only tell a story, your audience will often assume that you’re telling your own story to forward your own agenda. Having data available to back up the story, including specific data to back up each key story point, helps overcome any skepticism and ensure that your story can be received, believed, and acted upon. The amount of data you need will depend on factors including how much trust you’ve already earned with your audience, and how well the actions you’re hoping to inspire align with their current goals.[2] 

Somewhere between telling a story and including data[3] lies including the literal voice of the customer. When you’re meeting with customers, take thorough notes. When possible, record your customer conversations. Then, when preparing to tell your story, have verbatim customer quotes prepared to reinforce the key points of your story. This lets you say “here’s a customer problem I think you need to prioritize solving” and “here’s how these customers have independently described their experiences in their own words.” It’s easy for someone to say that you don’t really understand the customer’s problem, because you’re not really the customer. But it’s hard for someone to say that the customer doesn’t understand their own problem.[4] Bringing the voice of the customer to a story is like having another five or six aces up your sleeve, but it’s not technically cheating.

These four behaviors have proven indispensable in my work on the Power BI CAT team, and I see them as success factors in many of the teams I’ve worked with that also operate between the problem domain and the solution domain. If you’re part of a center of excellence or another team that follows this pattern, look for opportunities to incorporate these behaviors into your work. They’ve served me very well, and I suspect they’ll do the same for you.


[1] In any event, I think it’s valuable, but my opinion here may not be coming from a position of neutral impartiality. You can reach your own conclusion.

[2] In my experience, earning and retaining the trust of your audience is a much more important factor than the story or the data. Right now the only advice I can think of here is “act consistently with integrity and transparency” but maybe I can find a blog post on this topic at some point.

[3]  I could call this “Success Factor 5” or even “Success Factor 3.5” but then I would need to go back up and edit the post introduction and I’m far too lazy for that sort of work.

[4] If you do find the solution team you work with saying that the customer’s problems aren’t real, this is a massive red flag that needs to be confronted. In my experience, if this attitude persists it’s time to start escalating or looking for a new team.

Measuring success and satisfaction

Back in 2020 I wrote a post titled “Successfully measuring / measuring success” that talked about a problem without meaningfully describing it, because at the time I didn’t believe that the problem itself was particularly relevant. I was more interested in the pattern, which at the time I thought was “measuring the wrong thing.”

In recent weeks I’ve found myself sharing this post more frequently, and have come to believe that the underlying pattern is actually something else. Let’s look at my beautiful 2010-era chart one more time.

Before Power BI we had PowerPoint, but data is still Power

In this chart, both series represent NSAT[1] scores – basically they’re saying “this is how happy our customers are with a thing.” This sounds pretty simple, but the tricky part is in the details. Who are “our customers” and what is the “thing” they’re happy or unhappy with?

In the context of the chart above, the yellow series was “customer satisfaction with what they got.” The blue series was “customer satisfaction with what we shipped.” Implied in these descriptions and the difference in the two data series is that something happened between the time we shipped something and the time the customers got something, and that the thing that happened was key to the customers’ satisfaction.

Without going into too many details[3], we were shipping a product that was used by partners, and those partners used our product to deliver an experience to their customers. In the FY06 timeframe we started to change the product that we shipped, and the partners used the updated product to deliver an experience that met their customers’ needs. Now they just had to do a little more work to keep their customers happy. As the product changes continued, the partner load increased. They had to do more and more to fix what we gave them and to keep their customers happy. You can see the story play out in the blue series in the chart above.

We were looking at the yellow series, and falsely conflating “customer satisfaction” with “customer satisfaction with what we had shipped to partners.” We didn’t truly appreciate the importance of the party in the middle and their impact, and it ended up causing no end of problems. We were measuring customer satisfaction, but failing to appreciate what it was that customers were satisfied with – and how little that satisfaction related to what we had created.

And this is the pattern I’ve been seeing more often lately.

In my recent post on the role of the Power BI CAT team and how that role has parallels with Power BI centers of excellence, I described a success pattern where a central team serves as a value-add bridge between people with problems and people with solutions.

This pattern is one that I’ve seen provide significant value in multiple contexts… but it also introduces risk of measuring the wrong things, and overlooking real problems. This is a side-effect of the value that the central team provides. Customers are interacting with the work of the central team in addition to[4] the work of the solution team, and it may be difficult for them to understand what part of their experience is dependent on what factors.

In this situation there is a risk of the solution team overlooking or failing to appreciate and prioritize the problems their customers are experiencing. This is a problem that the “curate” function in the diagram above is designed to mitigate, but the risk is real, and the mitigation takes ongoing effort.

When a member of a solution team works with customers directly, it’s hard to overlook their challenges and pain. When that solution team member hears about customer challenges from an interim party, the immediacy and impact can be lost. This effect is human nature, and the ongoing effort of the central team to curate customer feedback is vital to counteract it.[5]

As mentioned at the top of the article, I’ve seen this pattern more often recently, where a solution team is failing to recognize problems or opportunities because they’re looking at the wrong thing. They’re focused on what their end customer is saying, instead of looking at the bigger picture and the downstream value chain that includes them and their customers, but isn’t limited to these two parties. It’s easy to mistake “customer being happy” for “customer being happy with what we produce” if you’re not keeping an eye on the big picture.


If you’re in a situation like this, whether you’re a member of a solution team, a member of a central team, or a member of a customer/consumer team, you’ll do well to keep an eye open for this problem behavior playing out.


[1] You can read this article if you’re curious about NSAT as a metric and what the numbers mean and were too lazy[2] to read the 2020 blog post I linked to above.

[2] I’m judging you, but not too harshly.

[3] I’m being deliberately vague here, trying to find a balance between establishing enough context to make a point and not sharing any decade-plus-old confidential information.

[4] Or in some circumstances, instead of.

[5] Please keep in mind that in most circumstances the central team is introduced when direct ongoing engagement between the solution team and customers can’t effectively scale. If you’re wondering why you’d want a central team in the first place, it may be because your current scenario doesn’t need it. If this is the case, please keep reading so you’ll be better prepared when your scenario gets larger or more complex and you need to start thinking about different solutions.

Join me in London for SQLBits – March 8 through March 12

In less than two months I’ll be making my first work trip in over two years, flying to London to present, meet with customers, and learn exciting things at the 2022 SQLBits conference. If you can, you should register today and join me.

Here’s what I expect my week to look like:

  • Wednesday March 9, 09:00 to 17:00: I’ll be back on stage for The Day After Dashboard in a Day pre-conference learning day, co-presenting with Patrick LeBlanc, Alex Powers, and Lars Andersen.
  • Thursday March 10, 14:10 to 15:00: I’ll be joining SQLBits organizer and MVP Simon Sabin for the Maximising everyone’s super powers panel discussion on mental health.
  • Thursday March 10, 18:15 to 19:05: Prathy K from the London Power BI User Group has organized an evening “ask me anything” open Q&A session with a bunch of folks from the Power BI CAT team, which sounds like a perfect way to end the day. You can register for this evening meetup here.
  • Friday March 11, 13:40 to 14:00: I finally get to present on Roche’s Maxim of Data Transformation for a live, in-person audience, and I get 20 minutes to do it!
  • Friday March 11, 14:10 to 15:00: The BI Power Hour returns after a two-year pandemic hiatus, guaranteeing laughs and excitement[1] in a demo- and joke-filled exploration of how not to use Microsoft data technologies in the workplace. I’ll be joined by an international star-studded cast from the Power BI CAT team and the Microsoft data community, and I expect this session to be the can’t miss event of the decade.[2]
  • Saturday March 12, 08:00 to 08:50: I kick off the final day of the conference with Unleashing your personal superpower, an honest and sometimes-painful look at how to succeed in tech, despite your brain’s best efforts to stop you. I’m very excited to have this important session scheduled on the free-to-the-public day of the conference.

When I’m not on stage, I’m hoping to spend as much time as possible at the Microsoft product booth and the community zone. Conferences like SQLBits are an opportunity to have interesting conversations that can be an awkward fit for virtual channels, and I plan to get as much from my week as possible.

Update February 10: I’m planning to be in the community zone on Thursday afternoon immediately following the mental health panel discussion so we can keep that conversation going. I’m also planning to be back on Friday morning at 10:10 to talk about non-traditional career paths. If either of these conversations sounds interesting to you, you should plan on joining me.

Update February 12: My Saturday session has been moved from the end of the day to the beginning of the day. With this change, I can now have more time to hang out in the community zone on Saturday to continue the discussion.

If you’re in the London area – or if you can get there – and if attending an in-person event matches your personal risk profile, I hope you’ll register today and come say hi during the event. I’ll be the purple-haired guy with the mask on.

If you can’t come to London, I hope you’ll still register and attend. Most sessions are hybrid with virtual attendees welcome and included.


[1] Not guaranteeing that you will laugh or be excited. I’m just thinking about me here.

[2] Thinking back on the decade in question, this isn’t as high a bar as it might otherwise seem.

Reporting on your Spotify listening history

I’ve listened to music on Spotify for over 5,000 hours since I subscribed to their service. I listen the most in the early morning when I’m at the gym and before my morning meetings begin. I’ve listened to more Amon Amarth than I’ve listened to my next three top bands combined, but the album I’ve spent the most time listening to is “The Black Parade” by My Chemical Romance.[1] The song I’ve listened to the most is “O Father O Satan O Sun!” by Behemoth – since March 2017 I’ve listened to it 620 times, for a total time of 69 hours, 15 minutes.

How do I know this? I know because for the past few years I’ve been reporting on my Spotify listening data.

Spotify doesn’t have a viable API for this type of analysis, but you can request a copy of your listening history and other account data by emailing privacy@spotify.com[2]. It takes about a month to get your data, so if you want to report on your own listening history, you should probably email them today.

If you’re interested in using my report as-is or as a starting point for your own analysis, I’ve shared a PBIT file here:

Here’s what you’ll probably need to do:

  1. Request a copy of your Spotify listening history by emailing privacy@spotify.com.
  2. Wait approximately 30 days for Spotify to prepare your data.
  3. Download the zip file prepared by Spotify.
  4. Extract the zip file to a folder on your PC.
  5. Locate the folder that contains JSON files. This report uses only the “endsong*.json” files that contain your listening history – the same files the sketch after this list reads. The other files probably contain other interesting data – who knows?
  6. Open the PBIT file in Power BI Desktop.
  7. When prompted, enter the path to the folder containing your Spotify JSON data.
  8. Click “Load”.
  9. Cross your fingers, because I’ve never created a PBIT template before. Even though I’ve tested it, I don’t really know if this will work as hoped for you and I’m probably not going to provide any timely support if it doesn’t work.
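
If you want to poke at those “endsong*.json” files before (or instead of) loading them into the PBIT, here’s a rough sketch of doing it in Python with pandas. The folder path is a placeholder, and the field names (ms_played, master_metadata_album_artist_name) are assumptions based on what I understand the extended streaming history export to contain – check your own files if the columns don’t match.

```python
# Rough sketch: load all endsong*.json files from a Spotify export folder
# and compute total listening hours per artist. Field names are assumptions
# based on the extended streaming history export -- check your own files.
import glob
import os

import pandas as pd

folder = r"C:\path\to\your\spotify\export"  # placeholder path

frames = [pd.read_json(f) for f in glob.glob(os.path.join(folder, "endsong*.json"))]
history = pd.concat(frames, ignore_index=True)

hours_per_artist = (
    history.groupby("master_metadata_album_artist_name")["ms_played"]
    .sum()
    .div(1000 * 60 * 60)  # milliseconds -> hours
    .sort_values(ascending=False)
)
print(hours_per_artist.head(10))
```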

Music is a huge part of my life, and I’ve had dozens of music-related pet projects over the past 30 years. When I was in college, my “learning a new programming language” routine involved re-implementing my record collection manager in whatever new language that semester’s classes called for. Being able to play with this familiar and very-meaningful-to-me data in Power BI has been a fun way to spend a few hours here and there over the past few years.

Sadly, the reason I’m using this report today is to figure out what albums I need to add to my music collection when I delete my Spotify account. I’ve discovered so much awesome new music because of Spotify[3], and I am now using this report to prioritize what CDs to purchase. This isn’t why I originally made this report, but this is where we are today.

As you’re probably aware, Spotify has chosen to spread disinformation and hateful propaganda and to prioritize their profits over public health. This is their right to do so, since they’re apparently not breaking any laws, but I won’t give them any more of my money to knowingly cause harm and poison the public discourse.

If you want to call me out for being a snowflake or SJW or whatever, please do so on Twitter, not here. I’ll delete any blog comments on this theme, and no one will get to see how witty you are – wouldn’t that be a shame?

Whether or not you’re planning to delete your own Spotify account, feel free to download this PBIT and check out your own listening history. I found insights that surprised me – maybe you will too. This report is something of a work in progress, so you might also find interesting ways to complete and improve it that I didn’t think of.

As for my post-Spotify music streaming world, I’ll probably end up getting a family subscription to Amazon Music. The price is the same, and it looks like all of the top albums I’ve been listening to on Spotify are there.

I’ll also definitely be spending more time with the CD/MP3 collection I spent decades building before I discovered Spotify. I own thousands of albums by hundreds of artists, and curating this collection was an activity that gave me joy for many years. Now I’m feeling motivated and inspired to rediscover that collection, and the personal satisfaction that comes from mindfully expanding it.

Specifically, I’m adding my personal music collection to Plex Media Server, and streaming it to all of my devices using their excellent PlexAmp app. I’ve been moving in this direction since it became obvious Spotify was going to keep funding and promoting hatred and ignorance, and it’s already reminded me of the wide world of music that’s not included in the big streaming catalogs[4].

Plex and PlexAmp make it incredibly easy to manage and stream your personal media, and they have both a REST API and a SQLite database for tracking and reporting on library contents and listening activity – and it doesn’t take a month to get your data. Maybe there’s a silver lining here after all…


[1] If you’re wondering where Manowar is on this list, they’re #8. Please consider that Spotify only has a limited selection of Manowar’s music, and that when I listen to my personal music collection it isn’t included in this data.

[2] For more information, see the Privacy page at https://www.spotify.com/us/account/privacy or whatever variation on this URL works for your part of the world.

[3] Four of my top 10 artists are bands Spotify recommended to me, and which I probably would not have heard of were it not for their “release radar” and similar playlists.

[4] For example, my personal collection includes every album, EP, single, and DVD Manowar has ever released. I may not literally have an order of magnitude more Manowar than Spotify does, but it’s pretty close.