Problems, not solutions

Imagine walking into a restaurant.

No, not that one. Imagine walking a nicer restaurant than the one you thought of at first. A lot nicer.

restaurant-2697945_640
Even nicer than this.

Imagine walking into a 3-star Michelin-rated best-in-the-world restaurant, the kind of place where you plan international travel around reservations, the kind of place where the chef’s name is whispered in a kind of hushed awe by other chefs around the world.

Now imagine being seated and then insisting that the chef cook a specific dish in a specific way, because that’s what you’re used to eating, because you know what you like and what you want.

knife-1088529_640
I’ll just leave this here for no particular reason.

In this situation, one of three things is likely to happen:

  1. The chef will give you what you ask for, and your dining experience will be diminished because your request was granted.
  2. The chef will ask you to leave.
  3. The chef will instruct someone else to ask you to leave.[1]

Let’s step back from the culinary context of this imaginary scenario, and put it into the context of software development and BI.

Imagine a user emailing a developer or software team[2] and insisting that they need a feature developed that works in some specific way. “Just make it do this!” or maybe “It should be exactly like <legacy software feature> but <implemented in new software>!!”

I can’t really imagine the restaurant scene playing out – who would spend all that money on a meal just to get what they could get anywhere? But I don’t need to imagine the software scene playing out, because I’ve seen it day after day, month after month for decades, despite the fact that even trivial software customization can be more expensive than a world-class meal. I’ve also been on both sides of the conversation – and I probably will be again.

When you have a problem, you are the expert on the problem. You know it inside and out, because it’s your problem. You’ve probably tried to solve it – maybe you’ve tried multiple solutions before you asked for help. And while you were trying those ineffective solution approaches, you probably thought of what a “great” solution might look like.

So when you ask for help, you ask for the solution you thought of.

This is bad. Really bad.

“Give me this solution” or “give me this feature” is the worst thing to ask for. Because while you may be the expert on your problem, you’re not an expert on the solution. If you were, you wouldn’t be asking for help in the first place.

And to make matters worse, most of the people on the receiving end aren’t the IT equivalents of 3-star Michelin-rated chefs. They’re line cooks, and they give you what you asked for because they don’t know any better. And because the customer is always right, right?

Yeah, nah.

As a software professional, it’s your job to solve your customers’ problems, and to do so within constraints your customers probably know nothing about, and within an often-complex context your customers do not understand[3]. If you simply deliver what the customer asks for, you’ve missed the point, and missed an opportunity to truly solve the fundamental problem that needs to be solved.

If you’re a BI professional, every project and every feature request brings with it an opportunity. It’s the opportunity to ask questions.

Why do you need this?

When will you use it?

What are you doing today without the thing you’re asking for?

When will this be useful?

Who else will use it?[4]

As a software or BI professional, you’re the expert on the solution, just as your customer is the expert on the problem. You know where logic can be implemented, and the pros and cons of each option. You know where the right data will come from, and how it will need to be transformed. You know what’s a quick fix and what will require a lot of work – and might introduce undesirable side-effects or regressions in other parts of the solution.

With this expertise, you’re in the perfect position to ask the right questions to help you understand the problem that needs to be solved. You’re in the perfect position to take the answers to your questions and to turn them into what your customer really needs… which is often very different from what they’re asking for.

You don’t need to ask these questions every time. You may not even need to ask questions of your customers most of the time[5]. But if you’re asking these questions of yourself each time you’re beginning new work – and asking questions of your customers as necessary – the solutions you deliver will be better for it.

And when you find yourself on the requesting side (for example, when you find yourself typing into ideas.powerbi.com) you’re in the perfect position to provide information about the problem you need solved – not just the solution you think you need. Why not give it a try?

This is a complex topic. I started writing this post almost 100 years ago, way back in February 2020[6]. I have a lot more that I want to say, but instead of waiting another hundred years I’ll wrap up now and save more thoughts for another post or two.

If you’ve made it this far and you’re interested in more actual best practices, please read Lean Customer Development by Cindy Alvarez. This book is very accessible, and although it is targeted more at startups and commercial software teams it contains guidance and practices that can be invaluable for anyone who needs to deliver solutions to someone else’s problems.


 

[1] This seems like the most likely outcome to me.

[2] This could be a commercial software team or “the report guy” in your IT department. Imagine what works for you.

[3] If you’re interested in a fun and accessible look at how the Power BI team decides what features to build, check out this 2019 presentation from Power BI PM Will Thompson. It’s only indirectly related to this post, but it’s a candid look at some of the “often-complex context” in which Power BI is developed.

[4] Please don’t focus too much on these specific questions. They might be a good starting point, but they’re just what leaped to mind as I was typing, not a well-researched list of best practice questions or anything of the sort.

[5] If you’re a BI developer maintaining a Power BI application for your organization, you may have already realized that asking a ton of questions all the time may not be appreciated by the people paying your salary, so please use your own best judgment here.

[6] This probably explains why I so casually mentioned the idea of walking into a restaurant. I literally can’t remember the last time I was in a restaurant. Do restaurants actually exist? Did they ever?

Data Culture: Strategy is more important than tactics

Even though he lived 2,000 years ago, you’ve probably heard of the Chinese military strategist and general Sun Tzu. He’s known for a lot of things, but these days he’s best known for his work The Art of War[1], which captures military wisdom that is still studied and applied today

Even though Sun Tzu didn’t write about building a data culture[2], there’s still a lot we can learn from his writings. Perhaps the most relevant advice is this:

Building a data culture is hard. Keeping it going, and thriving, as the world and the organization change around you is harder. Perhaps the single most important thing[3] you can do to ensure long-term success is to define the strategic goals for your efforts.

Rather than doing all the other important and valuable tactical things, pause and think about why you’re doing them, and where you want to be once they’re done. This strategic reflection will prove invaluable, as it will help you prioritize, scope, and tune those tactical efforts.

Having a shared strategic vision makes everything else easier. At every step of the journey, any contributor can evaluate their actions against that strategic vision. When conflicts arise – as they inevitably will – your pre-defined strategic north star can help resolve them and to keep your efforts on track.


[1] Or possibly for the Sabaton album of the same name, which has a catchier bass line. And since Sabaton is a metal band led by history geeks, they also have this video that was released just a few weeks ago that looks at some of the history behind the album and song.

[2] Any more than Fiore wrote about business intelligence.

[3] I say “perhaps” because having an engaged executive sponsor is the other side of the strategy coin. Your executive sponsor will play a major role in defining your strategy, and in getting all necessary stakeholders on board with the strategy. Although I didn’t plan it this way, I’m quite pleased with the parallelism of having executive sponsorship be the first non-introductory video in the series, and having this one be the last non-summary video. It feels neat, and right, and satisfying.

Session resources: Patterns for adopting dataflows in Power BI

This morning I presented a new webinar for the Istanbul Power BI user group, covering one of my favorite subjects: common patterns for successfully using and adopting dataflows in Power BI.

This session represents an intersection of my data culture series in that it presents lessons learned from successful enterprise customers, and my dataflows series in that… in that it’s about dataflows. I probably didn’t need to point out that part.

The session slides can be downloaded here: 2020-09-23 – Power BI Istanbul – Patterns for adopting dataflows in Power BI

The session recording is available for on-demand viewing. The presentation is around 50 minutes, with about 30 minutes of dataflows-centric Q&A at the end. Please check it out, and share it with your friends!

 

Data Culture: Now you’re thinking with portal

In an ideal world, everyone knows where to find the resources and tools they need to be successful.

We don’t live in that world.

I’m not even sure we can see that world from here. But if we could see it, we’d be seeing it through a portal[1].

One of the most common themes from my conversations with enterprise Power BI customers is that organizations that are successfully building and growing their data cultures have implemented portals where they share the resources, tools, and information that their users need. These mature companies also treat their portal as a true priority – the portal is a key part of their strategy, not an afterthought.

This is why:

In every organization of non-trivial size there are obstacles that keep people from finding and using the resources, information, and data they need.

Much of the time people don’t know what they need, nor do they know what’s available. They don’t know what questions to ask[2], much less know where to go to get the answers. This isn’t their fault – it’s a natural consequence of working in a complex environment that changes over time on many different dimensions.

As I try to do in these accompanying-the-video blog posts I will let the video speak for itself, but there are a few key points I want to emphasize here as well.

  1. You need a place where people can go for all of the resources created and curated by your center of excellence
  2. You need to engage with your community of practice to ensure that you’re providing the resources they need, and not just the resources you think they need
  3. You need to keep directing users to the portal, again and again and again, until it becomes habit and they start to refer their peers

The last point is worth emphasizing and explaining. If community members don’t use the portal, it won’t do what you need it to do, and you won’t get the return you need on your investments.

Users will continue to use traditional “known good” channels to get information – such as sending you an email or IM – if you let them. You need to not let them.


[1] See what I did there?

[2] Even though they will often argue vehemently against this fact.

Migrating to Power BI

One aspect of building a data culture is selecting the right tools for the job. If you want more people working with more data, giving the tools they need to do that work is an obvious[1] requirement. But how many tools do you need, and which tools are the right tools?

Migrating to the cloud

It should be equally obvious that the answer is “it depends.” This is the answer to practically every interesting question. The right tools for an organization depend on the data sources it uses, the people who work with that data, the history that has gotten the organization to the current decision point, and the goals the organization needs to achieve or enable with the tools it selects.

With that said, it’s increasingly common[2] to see large organizations actively working to reduce the number of BI tools they support[3]. The reasons for this move to standardization are often the same:

  • Reduce licensing costs
  • Reduce support costs
  • Reduce training costs
  • Reduce friction involved in driving the behaviors needed to build and grow a data culture

Other than reducing the licensing costs[4], most of these motivations revolve around simplification. Having fewer tools means learning and using fewer tools. It means everyone learning and using fewer tools, which often results in less time and money spent to get more value from the use of those tools.

One of the challenges in eliminating a BI tool is ensuring that the purpose that tool fulfilled is now effectively fulfilled by the tool that replaces it. This is where migration comes in.

The Power BI team at Microsoft has published a focused set of guidance articles focused specifically on migrating from other BI tools to Power BI.

This documentation was written by the inestimable Melissa Coates of Coates Data Strategies, with input and technical review by the Power BI customer advisory team. If you’re preparing to retire another BI tool and move its workload to Power BI – or if you’re wondering where to start – I can’t recommend it highly enough.


[1] If this isn’t obvious to a given organization or individual, I’m reasonably confident that they’re not actively trying to build a data culture, and not reading this blog.

[2] I’m not a market analyst but I do get to talk to BI, data, and analytics leaders at large companies around the world, and I suspect that my sample size is large and diverse enough to be meaningful.

[3] I’m using the word “support” here – and not “use” – deliberately. It’s also quite common to see companies remove internal IT support from deprecated BI tools, but also let individual business units continue to use them – but also to pay for the tools and support out of their own budgets. This is typically a way to allow reluctant “laggard” internal customer groups to align with the strategic direction, but to do it on their own schedules.

[4] I’m pretty consistent in saying I don’t know anything about licensing, but even I understand that paying for two things costs more than paying for one of those things.

Data Culture: Picking Your Battles

Not all data is created equal.

One size does not fit all.

In addition to collaboration and partnership between business and IT, successful data cultures have something  else in common: they recognize the need for both discipline and flexibility, and have clear, consistent criteria and responsibilities that let all stakeholders know what controls apply to what data and applications.

2020-08-01-19-55-59-794--POWERPNT

Today’s video looks at this key fact, and emphasizes this important point: you need to pick your battles[1].

If you try to lock everything down and manage all data and applications rigorously, business users who need more agility will not be able to do their jobs – or more likely they will simply work around your controls. This approach puts you back into the bad old days before there were robust and flexible self-service BI tools – you don’t want this.

If you try to let every user do whatever they want with any data, you’ll quickly find yourself in the “wild west” days – you don’t want that either.

Instead, work with your executive sponsor and key stakeholders from business and IT to understand what requires discipline and control, and what supports flexibility and agility.

One approach will never work for all data – don’t try to make it fit.


[1] The original title of this post and video was “discipline and flexibility” but when the phrase “pick your battles” came out unscripted[2] as I was recording the video, I realized that no other title would be so on-brand for me. And here we are.

[2] In case you were wondering, it’s all unscripted. Every time I edit and watch a recording, I’m surprised. True story.

You can’t avoid problems you can’t see

The last post was about the dangers inherent in measuring the wrong thing – choosing a metric that doesn’t truly represent the business outcome[1] you think it does. This post is about different problems – the problems that come up when you don’t truly know the ins and outs of the the data itself… but you think you do.

This is another “inspired by Twitter” post – it is specifically inspired by this tweet (and corresponding blog post) from Caitlin Hudon[2]. It’s worth reading her blog post before continuing with this one – you go do that now, and I’ll wait.

Caitlin’s ghost story reminded me of a scary story of my own, back from the days before I specialized in data and BI. Back in the days when I was a werewolf hunter. True story.

Around 15 years ago I was a consultant, working on a project with a company that made point-of-sale hardware and software for the food service industry. I was helping them build a hosted solution for above-store reporting, so customers who had 20 Burger Hut or 100 McTaco restaurants[3] could get insights and analytics from all of them, all in one place. This sounds pretty simple in 2020, but in 2005 it was an exciting first-to-market offering, and a lot of the underlying platform technologies that we can take for granted today simply didn’t exist. In the end, we built a data movement service that took files produced by the in-store back-of-house system and uploaded them over a shared dial-up connection[4] from each restaurant to the data center where they could get processed and warehoused.

The analytics system supported a range of different POS systems, each of which produced files in different formats. This was a fun technical challenge for the team, but it was a challenge we expected. What we didn’t expect was the undocumented failure behavior of one of these systems. Without going into too much detail, this POS system would occasionally produce output files that were incomplete, but which did not indicate failure or violate any documented success criteria.

To make a long story short[5], because we learned about the complexities of this system very late in the game, we had some very unhappy customers and some very long nights. During a retrospective we engaged with of the project sponsors for the analytics solution because he had – years earlier – worked with the development group that built this POS system. (For the purposes of this story I will call the system “Steve” because I need a proper noun for his quote.)

The project sponsor reviewed all we’d done from a reliability perspective – all the validation, all the error handling, all the logging. He looked at this, then he looked at the project team and he said:

You guys planned for wolves. ‘Steve’ is werewolves.

Even after all these years, I still remember the deadpan delivery for this line. And it was so true.

We’d gone in thinking we were prepared for all of the usual problems – and we were. But we weren’t prepared for the horrifying reality of the data problems that were lying in wait. We weren’t prepared for werewolves.

Digging through my email from those days, I found a document I’d sent to this project sponsor, planning for some follow-up efforts, and was reminded that for the rest of the projects I did for this client, “werewolves” became part of the team vocabulary.

2019-10-30-12-37-25-597--WINWORD

What’s the moral of this story? Back in 2008 I thought the moral was to test early and often. Although this is still true, I now believe that what Past Matthew really needed was a data catalog or data dictionary with information that clearly said DANGER: WEREWOLVES in big red letters.

This line from Caitlin’s blog post could not be more wise, or more true:

The best defense I’ve found against relying on an oral history is creating a written one.

The thing that ended up saving us back in 2005 was knowing someone who knew something – we happened to have a project stakeholder who had insider knowledge about a key data source and its undocumented behavior. What could have better? Some actual <<expletive>> documentation.

Even in 2020, and even in mature enterprise organizations, having a reliable data catalog or data dictionary that is available to the people who could get value from it is still the exception, not the rule. Business-critical data sources and processes rely on tribal knowledge, time after time and team after team.

I won’t try to supplement or repeat the best practices in Caitlin’s post – they’re all important and they’re all good and I could not agree more with her guidance. (If you skipped reading her post earlier, this is the perfect time for you to go read it.) I will, however, supplement her wisdom with one of my favorite posts from the Informatica blog, from back in 2017.

I’m sharing this second link because some people will read Caitlin’s story and dismiss it because she talks about using Google Sheets. Some people will say “that’s not an enterprise data catalog.” Don’t be those people.

Regardless of the tools you’re using, and regardless of the scope of the data you’re documenting, some things remain universally true:

  • Tribal knowledge can’t be relied upon at any meaningful scale or across any meaningful timeline
  • Not all data is created equal – catalog and document the important things first, and don’t try to boil the ocean
  • The catalog needs to be known by and accessible to the people who need to use the data it described
  • Someone needs to own the catalog and keep it current – if its content is outdated or inaccurate, people won’t trust it, and if they don’t trust it they won’t use it
  • Sooner or later you’ll run into werewolves of your own, and unless you’re prepared in advance the werewolves will eat you

When I started to share this story I figured I would find a place to fit in a “unless you’re careful, your data will turn into a house when the moon is full” joke without forcing it too much, but sadly this was not the case. Still – who doesn’t love a good data werehouse joke?[6]

Maybe next time…


[1] Or whatever it is you’re tracking. You do you.

[2] Apparently I started this post last Halloween. Have I mentioned that the past months have been busy?

[3] Or Pizza Bell… you get the idea.

[4] Each restaurant typically had a single “data” phone line that used the same modem for processing credit card transactions. I swear I’m not making this up.

[5] Or at least short-ish. Brevity is not my forte.

[6] Or this data werehouse joke, for that matter?

Viral adoption: Self-service BI and COVID-19

I live 2.6 miles (4.2 km) from the epicenter of the coronavirus outbreak in Washington state. You know, the nursing home that’s been in the news, where over 10 people have died, and dozens more are infected.[1]

As you can imagine, this has started me thinking about self-service BI.

2020-03-10-17-44-57-439--msedge
Where can I find information I can trust?[2]
When the news started to come out covering the US outbreak, there was something I immediately noticed: authoritative information was very difficult to find. Here’s a quote from that last link.

This escalation “raises our level of concern about the immediate threat of COVID-19 for certain communities,” Dr. Nancy Messonnier, director of the CDC’s National Center for Immunization and Respiratory Diseases, said in the briefing. Still, the risk to the general public not in these areas is considered to be low, she said.

That’s great, but what about the general public in these areas?

What about me and my family?

When most of what I saw on Twitter was people making jokes about Jira tickets[3], I was trying to figure out what was going on, and what I needed to do. What actions should I take to stay safe? What actions were unnecessary or unhelpful?

Before I could answer these questions, I needed to find sources of information. This was surprisingly difficult.

Specifically, I needed to find sources of information that I could trust. There was already a surge in misinformation, some of it presumably well-intentioned, and some from deliberately malicious actors. I needed to explore, validate, confirm, cross-check, act, and repeat. And I was doing this while everyone around me seemed to be treating the emerging pandemic as a joke or a curiosity.

I did this work and made my decisions because I was a highly-motivated stakeholder, while others in otherwise similar positions were farther away from the problem, and were naturally less motivated at the time.[4]

And this is what got me thinking about self-service BI.

In many organizations, self-service BI tools like Power BI will spread virally. A highly-motivated business user will find a tool, find some data, explore, iterate, refine, and repeat. They will work with untrusted – and sometimes untrustworthy – data sources to find the information they need to use, and to make the decisions they need to make. And they do it before people in similar positions are motivated enough to act.

But before long, scraping together whatever data is available isn’t enough anymore. As the number of users relying on the insights being produced increases – even if the insights are being produced by a self-service BI solution – the need for trusted data increases as well.

Where an individual might successfully use disparate unmanaged sources successfully, a population needs a trusted source of truth.

At some point a central authority needs to step up, to make available the data that can serve as that single source of truth. This is easier said than done[5], but it must be done. And this isn’t even the hard part.

The hard part is getting everyone to stop using the unofficial and untrusted sources that they’ve been using to make decisions, and to use the trusted source instead. This is difficult because these users are invested in their current sources, and believe that they are good enough. They may not be ideal, but they work, right? They got me this far, so why should I have to stop using them just because someone says so?

This brings me back to those malicious actors mentioned earlier. Why would someone deliberately share false information about public health issues when lies could potentially cost people their lives? They would do it when the lies would help forward an agenda they value more than they value other people’s lives.

In most business situations, lives aren’t at stake, but people still have their own agendas. I’ve often seen situations where the lack of a single source of truth allows stakeholders to present their own numbers, skewed to make their efforts look more successful than they actually are. Some people don’t want to have to rebuild their reports – but some people want to use falsified numbers so they can get a promotion, or a bonus, or a raise.

Regardless of the reason for using untrusted sources, their use is damaging and should be reduced and eliminated. This is true of business data and analytics, and it is true of the current global health crisis. In both arenas, let’s all be part of the solution, not part of the problem.

Let us be a part of the cure, never part of the plague – we’ll only be remembered for what we create.[6]


[1] Before you ask, yes, my family and I are healthy and well. I’ve been working from home for over a week now, which is a nice silver lining; I have a small but comfortable home office, and can avoid the obnoxious Seattle-area commute.

[2] This article is the best single source I know of. It’s not authoritative source for the subject, but it is aggregating and citing authoritative sources and presenting their information in a form closer to the solution domain than to the problem domain.

[3] This is why I’ve been practicing social media distancing.

[4] This is the where the “personal pandemic parable” part of the blog post ends. From here on it’s all about SSBI. If you’re actually curious, I erred on the side of caution and started working from home and avoiding crowds before it was recommended or mandated. I still don’t know if all of the actions I’ve taken were necessary, but I’m glad I took them and I hope you all stay safe as well.

[5] As anyone who has ever implemented a single source of truth for any non-trivial data domain can attest.

[6] You can enjoy the lyrics even if Kreator’s awesome music isn’t to your taste.