Foster Kittens and Managed Self-Service BI

My family is a foster family for cats and kittens from the cat shelter where we’ve adopted some of our own cats. Usually a litter will stay with us for a month or two, but it depends on the kittens themselves, and on external factors.

While the kittens are with us, it’s our responsibility to help them grow, both physically and socially. The experts at the shelter are always available if we need help, but for the most part we have the knowledge and tools we need to be successful. In many cases we’re their first real exposure to humans, and we can prepare them to be loving and playful members of a family. Just not our family.

Once the kittens are ready to be adopted, we take them to the shelter, where they will be carefully matched with their forever family. This last part is important – it’s hard enough to let go, and knowing that each of them will find a good home is what makes it possible.

It’s really the best of both worlds – kind of like managed self-service BI with Power BI.[1]

 

Not unlike fostering kittens, managed self-service BI can be the best of both worlds. As an analyst working in Power BI, you can often pick up projects when the scope is still small and manageable, and when you can have fun playing around with the data and seeing what it’s likely to become.

I’m emphasizing the “managed” in managed self-service BI, because it’s best to not be completely on your own. Having someone backing you up, someone with the expertise and resources to get you through challenging spots with a helping hand, is just as important with BI as it is with kittens. An author on his own may make avoidable mistakes with long-term consequences, but a center of excellence or community of practice can provide training up front, and assistance along the way so the finished self-service solution is ready to grow up – and growing up is an important goal.

Just as my family includes our adult cats, that analyst working in Power BI has a day job. If we kept each litter of kittens we foster, things would soon become messy and unmanageable. If an analyst retained ownership of every Power BI solution he developed, he would struggle to stay on top of his core priorities.

Being able to hand off a self-service solution to a central BI team is what gives this story a happy ending. The BI team can give the analyst’s work the long-term home it deserves, and the analyst can get back to his job… while also keeping an eye open for the next self-service BI challenge to come along and steal his heart.

Of all the kittens I have loved, I miss Tiny the most.
Your head. I will bite it now.[2]
If you’re interested in learning more about the shelter where we volunteer, please visit the Meow Cat Rescue web site. Please also consider donating while you’re there – the global pandemic is making it harder for their awesome staff and volunteers to do what they do, and kitten season is upon us. If you appreciate the BI Polar blog and its companion YouTube channel, there’s no better way to say thank you than to donate to Meow. Even a small donation will help.


[1] I hope you saw that one coming

[2] This footage of Tiny attacking my head didn’t fit into the Power Kittens video, but I shared it on Twitter because it was just too cute to not share.

Being a Program Manager at Microsoft

My awesome friend Christopher also works at Microsoft, but his career and mine have taken different paths since we joined. Here we are together back in 2008, before either one of us worked at Microsoft. He’s the one not wearing a Manowar t-shirt.

No, no, the other one.

Our backgrounds are remarkably similar. We each spent years before joining Microsoft as Microsoft Certified Trainers, with an emphasis on software development rather than systems administration. We each became involved in the MCT community, and in the broader technical community, and to one extent or another this helped us find our first positions at Microsoft.

These days Christopher is working with student developers – I’ve seen him tweeting a lot recently about Django on Twitch, which probably means that he’s still teaching people about software development… or maybe watching Quentin Tarantino movies while drinking too much coffee. Honestly, either one is possible.

Anyway… Christopher reached out to me last month and asked if I’d be interested in telling college students who are about to graduate what it’s like to be a program manager at Microsoft. I said yes[1], then I said this:

I’ll let the video speak for itself. I tell my story for the first 15 minutes or so, and around the 14:50 mark I talk with Will Thompson and Tessa Hurr from the Power BI team and ask them to share their experiences as well. But there are a few things that I want to add that didn’t really fit well into the video.

First of all, every team has a need for many types of program managers. I’ve blogged about diversity enough that you probably know how I feel about the value of diverse teams, and this applies to PM teams at Microsoft as well. Since this video is targeted at college students and recent college graduates, I’d like to focus briefly on the career diversity dimension.

What does “career diversity dimension” mean? Looking at most PM teams I see program managers falling into three broad groups:

  1. New to career – program managers who are starting their careers after college, or after switching from a non-IT discipline.[2]
  2. Industry hires – program managers who are new to Microsoft, but who have an established career in a related field.[3] This is often someone who has been a consultant, developer or administrator who works hands-on with Microsoft or competitive tools and technologies.
  3. Veterans – program managers who have been at Microsoft long enough to succeed a few times, fail a few times, and understand what PM success can look like on multiple teams.[4]

Reading this list it may be easy to think that a progression of value is implied, but this is not the case. A successful team will find ways to get from each group the things that only they can contribute. A PM in one group will be able to see and do things that a PM in the other groups will not, and an experienced team leader will be able to direct each PM to the problems they are best suited to solve, and where they can add the most value.

The second thing I wanted to add to the video is that each team is different. I said this in the video, but I want to elaborate here. Each team will have its own culture, and some teams will be a better fit than others for a given PM. I’ve worked on multiple teams where I didn’t think I was contributing effectively, or where I felt that my contributions weren’t valued[5]. At one point this culture mismatch almost led me to leave Microsoft, but with the support of my manager at the time I instead found another team in another org where I could thrive.

And this leads me to the final point I wanted to add: Microsoft is huge. I’ve been a PM at Microsoft since October 2008, but I’ve had 4 or 5 major career changes since then, with very different responsibilities after each change, requiring very different contributions from me.

When I was interviewing for my first position in 2008, the hiring manager asked me why I wanted to work for Microsoft. I already had a successful career as a data warehousing and ETL consultant, and becoming a Microsoft employee would include a reduction in income, at least in the short term. Why give up what I’d built?

I hadn’t expected this question, and my answer was authentic and unscripted, and I’ve thought a lot about it over the past 11+ years. I told him that I wanted to join Microsoft because Microsoft had bigger and more challenging problems to solve than I would ever see as a consultant, and would never run out of new problems for me to help solve. If I joined Microsoft, I would never be bored.


And I was right.

If you’re a PM at Microsoft, please share your thoughts and experiences.

If you’re thinking about becoming a PM at Microsoft, please share your questions.

If you’ve joined Microsoft as a PM after watching this video and reading this blog, please send me an email, because I would love to say hi.


[1] It’s a good thing he asked when he did. I haven’t had a haircut since late February, and I don’t know when I’ll let anyone point a camera at my head again…

[2] This was kind of me when I first applied for a position at Microsoft in the 90s, although I already had a few years’ experience. I was not hired.

[3] This was me in 2008 when I was hired.

[4] This is me in 2020. I tend to talk about the successes more, but it’s the failures I think about the most, and where I learned the most along the way. Success is awesome, but it’s a lousy teacher.

[5] Yes, this sucked as much as you could imagine.

Data culture: Executive sponsorship

Continuing our series on data culture, we’re examining the importance of having an executive sponsor. This is one of the least exciting success factors for implementing Power BI and getting more insights from more data to deliver more value to the business… but it’s also one of the most important.

Let’s check it out:

Ok, what did we just watch?

This video (and the series it’s part of) includes patterns for success I’ve observed as part of my role on the Power BI CAT team[1], and will complement the guidance being shared in the Power BI Adoption Framework.

The presence of an executive sponsor is one of the most significant factors for a successful data culture. An executive sponsor is:

  • Someone in a position of authority who shares the goal of having important business decisions driven by accurate and timely data
  • A leader who can help remove barriers and make connections necessary to build enterprise data solutions
  • A source of budgetary[2] and organizational support for data initiatives
Because executives fly on planes, right?

Without an executive sponsor, the organizational scope of the data culture is often limited by the visibility that departmental BI successes can achieve. The data culture will grow gradually and may eventually attract executive attention… or may not.

Without an executive sponsor, the lifetime of a data culture is often limited by the individuals involved. When key users move to new roles or take on new challenges and priorities, the solutions they’ve developed can struggle to find new owners.

Without an executive sponsor, everything you do to build and sustain a data culture in your organization will be harder, and will be more likely to fail.

Who is your executive sponsor?

Update: A Twitter conversation about this video sparked a follow-up post. You can check it out here: Tough Love in the data culture.


[1] This is your periodic reminder that this is my personal blog, and all posts and opinions are mine and mine alone, and do not reflect the opinions of my employer or my teenage children.

[2] This aspect of sponsorship is a bigger deal than we’re going to cover in this post and video – organizations fund what’s important to them, and they don’t fund what’s not.

Building a data culture

tl;dr – to kick off 2020 we’re starting a new BI Polar video series focusing on building a data culture, and the first video introduces the series. You should watch it and share it.

Succeeding with a tool like Power BI is easy – self-service BI tools let more users do more things with data more easily, and can help reduce the reporting burden on IT teams.

Succeeding at scale with a tool like Power BI is not easy. It’s very difficult, not because of the technology, but because of the context in which the technology is used. Organizations adopt self-service BI tools because their existing approaches to working with data are no longer successful – and because the cost and pain[1] of change is now outweighed by the cost and pain of staying the course.

Tool adoption may be top-down, encouraged or mandated by senior management as a broad organization-wide effort. Adoption may be bottom-up, growing organically and virally in the teams and departments least well served by the existing tools and processes in place.

Both of these approaches[2] can be successful, and both of these approaches can fail. The most important success factor is a data culture in which the proper use of self-service BI tools can deliver the greatest value for the organization.


There must be a data culture on the other side of this door.

Without an organizational culture that values, encourages, recognizes, and rewards users and teams for their use of data, no tool and no amount of effort and skill is enough to achieve the full potential of the tools – or of the data.

In this new video series we’ll be covering practices that will help build a data culture. More specifically, we’ll introduce common practices that are exhibited by large organizations that have mature and successful data cultures. Each culture is unique, but there are enough commonalities to identify patterns and anti-patterns.

The content in this series will be informed by my work with enterprise Power BI customers as part of my day job[3], and will complement nicely[4] the content and guidance in the Power BI Adoption Framework.

Back in November when the 100th BI Polar blog post was published, I asked what everyone wanted to read about in the next 100 posts. There were lots of different ideas and suggestions, but the most common theme was around guidance like this. Hopefully you’ll enjoy the result – and hopefully you’ll let me know either way.


[1] I strongly believe that pain is a fundamental precursor to significant change. If there is no pain, there is no motivation to change. Only when the pain of not changing exceeds the perceived pain of going through the change will most people and organizations consider giving up the status quo. There are occasional exceptions, but in my experience these are very rare.

[2] Including any number of variations – these approaches are common points on a wide spectrum, but should not be interpreted as the only ways to adopt Power BI or other self-service BI tools.

[3] By day I’m a masked crime-fighter. Or a member of the Power BI customer advisory team. Or both. It varies from day to day.

[4] Hopefully this will be true. I’m at least as interested in seeing where this ends up as you are.

Power BIte: Power Platform dataflows

INTERIOR: pan over cubicles of happy, productive office workers

CLOSE-UP: office worker at desk

NARRATOR: Is that Susie I see, giving Power Platform dataflows a try?

SUSIE: That’s right! With dataflows I can have all of the data I need, right where I need it!!

NARRATOR: Dataflows. They’re not just for Power BI anymore.

OK, you may not remember that orange juice ad campaign from the late 1970s and early 80s[1], but I’ve had it stuck in my head since I started working on this post and video. I couldn’t figure out how to work it into the video itself, so here it is in written form.

Anyway, with that awkward moment behind us, you probably want to watch the video. Here it is:

As the video discusses, Power Apps now has a dataflows capability that is a natural complement to Power BI dataflows. Power Platform dataflows have been generally available since November 2019, after being in preview since the summer.

Power Platform dataflows use Power Query Online – and the same set of connectors, gateways, and transformation capabilities as Power BI dataflows. But there are a few key differences that are worth emphasizing.

Power Platform dataflows can load data into the Common Data Service, either into the standard Common Data Model entities used by Dynamics 365 apps, or into custom entities used by custom Power Apps. This is important – it makes dataflows behave more like a traditional ETL tool such as SSIS, in that at the end of the dataflow creation process you can map the columns in your queries to the columns in these existing tables[2].

Power Platform dataflows can load data into ADLSg2 for analytical scenarios, but Power Apps doesn’t have the same concept of “built-in storage” that Power BI does. That means if you want to use Power Platform dataflows to create CDM folders, you must configure your Power Apps environment to use an ADLSg2 resource in your Azure subscription.

The “link to data lake” feature in Power Apps feels to me like a better integration experience than what’s currently available in Power BI. In Power Apps you define the link at the environment level, not the tenant level – this provides more flexibility, and enables non-tenant admins[3] to configure and use the data lake integration.


The first time you create a Power Platform dataflow and select the “analytical entities” option, you’ll be prompted – and required – to link the Power Apps environment to an Azure Data Lake Storage resource. You’ll need to have an Azure subscription, but the process itself is pretty straightforward.


I can’t wait to hear what you think of this new capability. Please let me know in the comments or via Twitter.

See you in the new year!


[1] I just realized that this was 40 years ago. Were you even born yet?

[2] CDS entities aren’t tables by the strictest definition, but it’s close enough for our purposes today.

[3] I honestly don’t know enough about Power Apps security to go into too much depth on this point, but I am not a Power Apps admin and I was able to create a trial environment and link it to my own ADLSg2 resource in my own Azure subscription without affecting other users.

Video: A most delicious analogy

Every time I cook or bake something, I think about how the tasks and patterns present in making food have strong and significant parallels with building BI[1] solutions. At some point in the future I’m likely to write a “data mise en place” blog post, but for today I decided to take a more visual approach, starting with one of my favorite holiday recipes[2].

Check it out:

(Please forgive my clickbaitey title and thumbnail image. I was struggling to think of a meaningful title and image, and decided to have a little fun with this one.)

I won’t repeat all of the information from the video here, but I will share a view of what’s involved in making this self-service BI treat.

(Visio diagram: what’s involved in making this self-service BI treat)

When visualized like this, the parallels between data development and reuse are probably a bit more obvious. Please take a look at the video, and see what other parallels jump out at you.

And please let me know what you think. Seriously.


[1] And other types of software, but mainly BI these days.

[2] I published this recipe almost exactly a year ago. The timing isn’t intentional, but it’s interesting to me to see this pattern emerging as well…

Power BIte: Dataflows enhanced compute engine

The enhanced compute engine in Power BI dataflows has been in preview since June. It’s not really new, and I’ve posted about it before. But I still keep hearing questions about it, so I thought it might make sense to record a video[1].

This video.


I won’t go into too much more depth here – just watch the video, and if you want more details check out one of these existing posts:

Now to get back on schedule with that next video…


[1] Also, I’m behind on my video schedule – this was a motivating factor as well. November was an unexpectedly busy month[2], and between work, life, and not really having the video editing skills I need to keep to a schedule… Yeah.

[2] And I expected it to be very, very busy.

Power BIte: Turning datasets into dataflows

At this point I’ve said “Power BI dataflows enable reuse” enough times that I feel like a broken record[1]. What does this phrase actually mean, and how can you take advantage of dataflows to enable reuse in your Power BI applications?

This Power BIte video is a bit longer than its predecessors, and part of this is because it covers both the problem and the solution.

The problem is that self-service BI applications often start out as one-off efforts, but don’t stay that way. At least in theory, if the problem solved by the application was widespread and well understood, there would be an existing solution already developed and maintained by IT, and business users wouldn’t need to develop their own solutions.

Successful applications have a tendency to grow. For self-service BI, this could mean that more and more functionality gets added to the application, or it could mean that someone copies the relevant portions of the application and uses them as the starting point for a new, different-but-related, application.

Once this happens, there is a natural and gradual process of drift[2] that occurs, as each branch of the application tree grows in its own direction. A predictable consequence of this drift in Power BI applications is that query definitions that start off as common will gradually become out of sync, meaning that “the same data” in two datasets will actually contain different values.

Moving queries that need to be shared across multiple applications from multiple datasets into a single dataflow is a simple and effective solution to this problem. There’s no dedicated tooling for this solution in Power BI today, but the steps are still simple and straightforward.
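To make that concrete, here’s a minimal sketch of what a dataset’s query can look like once the shared logic lives in a dataflow. The workspace, dataflow, and entity names here are hypothetical, and the query Power BI Desktop generates when you connect to a dataflow navigates by workspace and dataflow IDs rather than by name:

```
let
    // All dataflows the current user can access
    Source = PowerBI.Dataflows(null),
    // Navigate to the shared entity (names are illustrative)
    Workspace = Source{[workspaceName = "Shared Data"]}[Data],
    Dataflow = Workspace{[dataflowName = "Common Queries"]}[Data],
    Customers = Dataflow{[entity = "Customers", version = ""]}[Data]
in
    Customers
```

Each dataset that needs the shared Customers query connects this way, so there’s a single query definition to maintain – and “the same data” really is the same data.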

P.S. This is the first Power BIte video recorded in my home office. After struggling unsuccessfully to get decent audio quality in my office at work, I’m trying out a new environment and some new tools. I know there’s still work to be done, but hopefully this is a step in the right direction. As always, I’d love to know what you think…


 

[1] For my younger readers, this phrase is a reference to when Spotify used to be called “records” and the most common service outage symptom was a repeat of the audio buffer until the user performed a hard reset of the client application. True story.

[2] Is there a better term for this? I feel like there should be an existing body of knowledge that I could reference, but my searching did not yield any promising results. The fact that “Logical Drift” is the name of a band probably isn’t helping.

Power BIte: Creating dataflows by attaching external CDM folders

This week’s Power BIte is the fourth and final entry in a series of videos[1] that present different ways to create new Power BI dataflows, and the results of each approach.

When creating a dataflow by attaching an external CDM folder, the dataflow will have the following characteristics:

  • Data ingress path: ingress via Azure Data Factory, Databricks, or whatever Azure service or app has created the CDM folder.
  • Data location: data is stored in ADLSg2, in the CDM folder created by the data ingress process.
  • Data refresh: the data is refreshed based on the execution schedule and properties of the data ingress process, not by any setting in Power BI.

The key to this scenario is the CDM folder storage format. CDM folders provide a simple and open way to persist data in a data lake. Because CDM folders are implemented using CSV data files and JSON metadata, any application can read from and write to CDM folders. This includes multiple Azure services that have libraries for reading and writing CDM folders, as well as 3rd party data tools like Informatica that have implemented their own CDM folder connectors.
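As a small sketch of how open the format is, here’s a minimal Power Query example that reads a CDM folder’s metadata directly from ADLSg2 and lists the entities it describes. The storage account and folder path are hypothetical:

```
let
    // Files in the CDM folder (hypothetical account and path)
    Files = AzureStorage.DataLake("https://myaccount.dfs.core.windows.net/powerbi/WideWorldImporters-Orders"),
    // model.json holds the metadata; the entity data lives in CSV files
    Model = Json.Document(Files{[Name = "model.json"]}[Content]),
    // Names of the entities the folder describes
    EntityNames = List.Transform(Model[entities], each [name])
in
    EntityNames
```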

CDM folders enable scenarios like this one, which is implemented in a sample and tutorial published on GitHub by the Azure data team:

  • Create a Power BI dataflow by ingesting order data from the Wide World Importers sample database and save it as a CDM folder
  • Use an Azure Databricks notebook that prepares and cleanses the data in the CDM folder, and then writes the updated data to a new CDM folder in ADLS Gen2
  • Attach the CDM folder created by Databricks as an external dataflow in Power BI[2]
  • Use Azure Machine Learning to train and publish a model using data from the CDM folder
  • Use an Azure Data Factory pipeline to load data from the CDM folder into staging tables in Azure SQL Data Warehouse and then invoke stored procedures that transform the data into a dimensional model
  • Use Azure Data Factory to orchestrate the overall process and monitor execution

That’s it for this mini-series!

If all this information still doesn’t make sense, now is the time to ask questions.


[1] New videos every Monday morning!

[2] I added this bullet to the list because it fits in with the rest of the post – the other bullets are copied from the sample description.

Power BIte: Creating dataflows by importing model.json

This week’s Power BIte is the third in a series of videos[1] that present different ways to create new Power BI dataflows, and the results of each approach.

When creating a dataflow by importing a model.json file previously exported from Power BI, the dataflow will have the following characteristics:

  • Data ingress path: ingress via the mashup engine hosted in the Power BI service.
  • Data location: data is stored in the CDM folder defined for the newly created dataflow.
  • Data refresh: the dataflow is refreshed based on the schedule and policies defined in the workspace.

Let’s look at the dataflow’s model.json metadata to see some of the details.

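I won’t reproduce the whole file, but this sketch of the top of a pretty-printed model.json gives the idea – the names and values are illustrative:

```
{
  "name": "My Dataflow",
  "culture": "en-US",
  "modifiedTime": "2019-11-06T14:50:53.981Z",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "Customers",
      "attributes": [ ],
      "partitions": [ ]
    }
  ]
}
```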

At the top of the file we can see the dataflow name on line 2…

…and that’s pretty much all that’s important here. The rest of the model.json file will exactly match what was exported from the Power BI portal, and will look a lot like this or this. Boom.

For a little more detail (and more pictures, in case you don’t want to watch a four-minute video) check out this post from last month, when this capability was introduced.

If this information doesn’t make sense yet, please hold on. We still have one more incoming Power BIte in this series, and then we’ll have the big picture view.

I guarantee[3] it will make as much sense as anything on this blog.


[1] New videos every Monday morning![2]

[2] Did you notice that I just copied the previous post and made some small edits to it? That seemed very appropriate given the topic…

[3] Or your money back.