The Power BI Adoption Framework – it’s Power BI AF

You may have seen things that make you say “that’s Power BI AF” but none of them have come close to this. It’s literally the Power BI AF[1].

That’s right – this week Microsoft published the Power BI Adoption Framework on GitHub and YouTube. If you’re impatient, here’s the first video – you can jump right in. It serves as an introduction to the framework, its content, and its goals.

Without attempting to summarize the entire framework, I'll say that this content provides guidance, practices, and resources to help organizations build a data culture, establish a Power BI center of excellence, and manage Power BI at any scale.

Even though I blog a lot about Power BI dataflows, most of my job involves working with enterprise Power BI customers – global organizations with thousands of users across the business who are building, deploying, and consuming BI solutions built using Power BI.

Each of these large customers takes their own approach to adopting Power BI, at least when it comes to the details. But with very few exceptions[2], each successful customer will align with the patterns and practices presented in the Power BI Adoption Framework – and when I work with a customer that is struggling with their global Power BI rollout, their challenges are often rooted in a failure to adopt these practices.

There’s no single “right way” to be successful with Power BI, so don’t expect a silver bullet. Instead, the Power BI Adoption Framework presents a set of roles, responsibilities, and behaviors that have been developed after working with customers in real-world Power BI deployments.

If you look on GitHub today, you’ll find a set of PowerPoint decks broken down into five topics, plus a few templates.


These slide decks are still a little rough. They were originally built for use by partners who could customize and deliver them as training content for their customers[3], rather than for direct use by the general public, and as of today they’re still a work in progress. But if you can get past the rough edges, there’s definitely gold to be found. This is the same content I used when I put together my “Is self-service business intelligence a two-edged sword?” presentation earlier this year, and for the most part I just tweaked the slide template and added a bunch of sword pictures.

And if the slides aren’t quite ready for you today, you can head over to the official Power BI YouTube channel where this growing playlist contains bite-size training content to supplement the slides. As of today there are two videos published – expect much more to come in the days and weeks ahead.

The real heroes of this story[4] are Manu Kanwarpal and Paul Henwood. They're both cloud solution architects working for Microsoft in the UK. They've put the Power BI AF together, delivered its content to partners around the world, and are now working to make it available to everyone.

What do you think?

To me, this is one of the biggest announcements of the year, but I really want to hear from you after you’ve checked out the Power BI AF. What questions are still unanswered? What does the AF not do today that you want or need it to do tomorrow?

Please let me know in the comments below – this is just a starting point, and there’s a lot that we can do with it from here…


[1] If you had any idea how long I’ve been waiting to make this joke…

[2] I can’t think of a single exception at the moment, but I’m sure there must be one or two. Maybe.

[3] Partners can still do this, of course.

[4] Other than you, of course. You’re always a hero too – never stop doing what you do.

Power BIte: Dataflows enhanced compute engine

The enhanced compute engine in Power BI dataflows has been in preview since June. It’s not really new, and I’ve posted about it before. But I still keep hearing questions about it, so I thought it might make sense to record a video[1].

This video.


I won’t go into too much more depth here – just watch the video, and if you want more details check out one of these existing posts:

Now to get back on schedule with that next video…


[1] Also, I’m behind on my video schedule – this was a motivating factor as well. November was an unexpectedly busy month[2], and between work, life, and not really having the video editing skills I need to keep to a schedule… Yeah.

[2] And I expected it to be very, very busy.

Power BIte: Turning datasets into dataflows

At this point I’ve said “Power BI dataflows enable reuse” enough times that I feel like a broken record[1]. What does this phrase actually mean, and how can you take advantage of dataflows to enable reuse in your Power BI applications?

This Power BIte video is a bit longer than its predecessors, partly because it covers both the problem and the solution.

The problem is that self-service BI applications often start out as one-off efforts, but don’t stay that way. At least in theory, if the problem solved by the application was widespread and well understood, there would be an existing solution already developed and maintained by IT, and business users wouldn’t need to develop their own solutions.

Successful applications have a tendency to grow. For self-service BI, this could mean that more and more functionality gets added to the application, or it could mean that someone copies the relevant portions of the application and uses them as the starting point for a new, different-but-related, application.

Once this happens, a natural and gradual process of drift[2] sets in, as each branch of the application tree grows in its own direction. A predictable consequence of this drift in Power BI applications is that query definitions that start out identical gradually fall out of sync, meaning that “the same data” in two datasets will actually contain different values.

Moving queries that need to be shared across multiple applications from multiple datasets into a single dataflow is a simple and effective solution to this problem. There’s no dedicated tooling for this solution in Power BI today, but the steps are still simple and straightforward.

P.S. This is the first Power BIte video recorded in my home office. After struggling unsuccessfully to get decent audio quality in my office at work, I’m trying out a new environment and some new tools. I know there’s still work to be done, but hopefully this is a step in the right direction. As always, I’d love to know what you think…



[1] For my younger readers, this phrase is a reference to when Spotify used to be called “records” and the most common service outage symptom was a repeat of the audio buffer until the user performed a hard reset of the client application. True story.

[2] Is there a better term for this? I feel like there should be an existing body of knowledge that I could reference, but my searching did not yield any promising results. The fact that “Logical Drift” is the name of a band probably isn’t helping.

Power BI Premium Dedicated Capacity Load Assessment Tool on GitHub

Last month[1] at the Microsoft Business Applications Summit (MBAS), Power BI program managers David Magar and Swati Gupta showed off a new load assessment tool for Power BI Premium capacity workloads.


This new tool was included as part of the BRK2046 session on Power BI Premium at MBAS. The whole session is valuable, but the tool itself comes in around the 32-minute mark, and there's a demo at the 37-minute mark. The tool is available today on GitHub.

This tool will help Power BI Premium customers better plan for how their specific workloads (reports, dashboards, datasets, dataflows, and patterns of access) will perform on a given Premium capacity.

The tool is built on top of a Power BI Embedded (PBIE) load generation tool that my teammate Sergei Gundorov developed to help ISVs better handle load on their PBIE capacities. That tool grabs a user's token and uses it to render reports again and again, cycling through preset filter values and incrementing a “render counter”. It stops rendering when the authentication token expires, so the result is an empirical benchmark: “report X can run against capacity Y, Z times in 1 hour”.
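To make the “render counter” idea concrete, here's a minimal sketch of that benchmark loop. This is not the actual tool (which uses PowerShell and real browser-based report renders) – it's just an illustration of the logic, with a hypothetical render_report placeholder standing in for a real render request:

```python
# Conceptual sketch of the benchmark loop described above -- not the actual
# load assessment tool. render_report() is a hypothetical placeholder for a
# real embedded-report render request.
import itertools
import time


def render_report(report_id: str, filter_value: str) -> None:
    """Hypothetical stand-in: request one render of the report with a filter applied."""
    time.sleep(0.5)  # simulate the time a real render would take


def benchmark(report_id: str, filter_values: list, token_lifetime_seconds: float) -> int:
    """Render the report over and over until the auth token would expire."""
    render_count = 0
    expires_at = time.monotonic() + token_lifetime_seconds
    # Cycle through the preset filter values so each render is a little different.
    for filter_value in itertools.cycle(filter_values):
        if time.monotonic() >= expires_at:
            break
        render_report(report_id, filter_value)
        render_count += 1
    # The result is the empirical benchmark: "report X rendered N times
    # before the token expired."
    return render_count


print(benchmark("sales-report", ["2018", "2019"], token_lifetime_seconds=5))
```

The real tool does the equivalent with actual report renders in browser windows, which is what makes the resulting numbers meaningful for capacity planning.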

The publicly available tool uses Sergei's work as its starting point and adds a simple PowerShell-based, menu-driven UX that anybody can run. The tool enables users to:

  • choose multiple reports to run at once
  • choose the credentials used for each report
  • define filter values to cycle through between renders for each report
  • define how many users (browser windows) should request the report at once

Once all definitions are set, the tool launches multiple browser windows, each targeting a different report, and users can see the load test happening on screen.

The tool was an effective way for David and Swati to generate “interesting” load scenarios for their MBAS session. They used it to demonstrate how the symptoms of an overloaded capacity (such as query wait time buildup and frequent evictions) are visible in the Power BI Premium metrics app. If you haven’t already watched the session recording, be sure to check it out.

The dedicated capacity load assessment tool is published on GitHub for anyone to use. There’s a post introducing it on the Power BI blog.

The folks at Artis Consulting have already taken the tool that Sergei developed and that was shown at MBAS, and have released a “Realistic” load test tool, also on GitHub. This tool builds on the original one and makes it easier to simulate users interacting with reports in a more realistic manner, such as selecting filters and bookmarks.

If you’re interested in understanding how your Power BI application will scale under load on your dedicated capacity[2], check out these tools and consider how to incorporate them into your development and deployment processes.


[1] Yes, MBAS took place in June, and this is getting posted in October. I originally wrote this post in early July, and I put it on hold to wait for the official blog post to be done so I could include a link. It’s been languishing in my drafts ever since. Life comes at you fast…

[2] It’s worth emphasizing that this tool and this post apply only to dedicated Power BI capacity. If you are using shared capacity, you should not use this tool.

Power BIte: Creating dataflows by attaching external CDM folders

This week’s Power BIte is the fourth and final entry in a series of videos[1] that present different ways to create new Power BI dataflows, and the results of each approach.

When creating a dataflow by attaching an external CDM folder, the dataflow will have the following characteristics:

  • Data ingress path: Ingress via Azure Data Factory, Databricks, or whatever Azure service or app has created the CDM folder.
  • Data location: Data is stored in ADLSg2 in the CDM folder created by the data ingress process.
  • Data refresh: The data is refreshed based on the execution schedule and properties of the data ingress process, not by any setting in Power BI.

The key to this scenario is the CDM folder storage format. CDM folders provide a simple and open way to persist data in a data lake. Because CDM folders are implemented using CSV data files and JSON metadata, any application can read from and write to CDM folders. This includes multiple Azure services that have libraries for reading and writing CDM folders and 3rd party data tools like Informatica that have implemented their own CDM folder connectors.
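To make the format a bit more concrete, here's a rough, simplified sketch of what the model.json metadata for a CDM folder with a single entity might look like (expressed here as a Python dictionary). The entity, attribute names, and partition location are made-up placeholders, and real model.json files written by Power BI or Azure services contain additional metadata – treat this as an illustration of the shape, not a reference for the schema:

```python
import json

# Illustrative, simplified model.json for a CDM folder containing one entity.
# Names, data types, and the partition location are placeholders; real files
# written by Power BI dataflows or Azure services include more metadata.
model = {
    "name": "WideWorldImporters-Sample",
    "version": "1.0",
    "entities": [
        {
            "$type": "LocalEntity",
            "name": "Orders",
            "attributes": [
                {"name": "OrderID", "dataType": "int64"},
                {"name": "OrderDate", "dataType": "dateTime"},
                {"name": "TotalDue", "dataType": "decimal"},
            ],
            # Each partition points at one of the CSV data files in the folder.
            "partitions": [
                {
                    "name": "Orders-Part001",
                    "location": "https://<account>.dfs.core.windows.net/<filesystem>/Orders/Orders-Part001.csv",
                }
            ],
        }
    ],
}

print(json.dumps(model, indent=2))
```

Any tool that can write CSV files plus a JSON document shaped like this can produce a CDM folder – which is exactly what makes it possible to attach folders created outside Power BI as external dataflows.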

CDM folders enable scenarios like this one, which is implemented in a sample and tutorial published on GitHub by the Azure data team:

  • Create a Power BI dataflow by ingesting order data from the Wide World Importers sample database and save it as a CDM folder
  • Use an Azure Databricks notebook that prepares and cleanses the data in the CDM folder, and then writes the updated data to a new CDM folder in ADLS Gen2
  • Attach the CDM folder created by Databricks as an external dataflow in Power BI[2]
  • Use Azure Machine Learning to train and publish a model using data from the CDM folder
  • Use an Azure Data Factory pipeline to load data from the CDM folder into staging tables in Azure SQL Data Warehouse and then invoke stored procedures that transform the data into a dimensional model
  • Use Azure Data Factory to orchestrate the overall process and monitor execution

That’s it for this mini-series!

If all this information doesn’t make sense yet, now is the time to ask questions.


[1] New videos every Monday morning!

[2] I added this bullet to the list because it fits in with the rest of the post – the other bullets are copied from the sample description.

Where has the time gone?

My last post was apparently my 100th post since BI Polar kicked off last October.


That’s an average of right around two posts per week, although my actual writing output has been much less even and predictable than this number might suggest.


After 100 posts and a little over one year, where should BI Polar go?

What are the topics you would like to see emphasized in the next year and the next hundred posts?

For the past month or so I’ve been sticking to a three-posts-per-week schedule, but I don’t know how sustainable that is. I’m thinking about switching to a Monday-Wednesday schedule, with one post each Monday to accompany the week’s new YouTube video, plus one additional post each week. This feels like a much more reasonable long-term plan.

So… what topics or themes are you interested in? What would you like to see more of, based on what you’ve seen over the last year? I can’t promise I’ll do what you want, but I can promise that I’ll read every comment, and I expect I’ll be inspired by whatever ideas you have…

Quick Tip: Working with dataflow-created CDM folders in ADLSg2

If you’re using your own organizational Azure Data Lake Storage Gen2 account for Power BI dataflows, you can use the CDM folders that Power BI creates as a data source for other efforts, including data science with tools like Azure Machine Learning and Azure Databricks.


This capability has been in preview since early this year, so it’s not really new, but there are enough pieces involved that it may not be obvious how to begin – and I continue to see enough questions about this topic that another blog post seemed warranted.

The key point is that because dataflows are writing data to ADLSg2 in CDM folder format, Azure Machine Learning and Azure Databricks can both read the data using the metadata in the model.json file.

This JSON file serves as the “endpoint” for the data in the CDM folder: it’s a single resource that you can connect to, without having to worry about the complexities of the various subfolders and files that the CDM folder contains.
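As an illustration, here's a minimal sketch of how a client could use model.json to discover an entity and load its data, using Python with the azure-identity, azure-storage-file-datalake, and pandas packages. The storage account, workspace, and dataflow names are placeholders you'd fill in for your own environment, and the “powerbi” filesystem name reflects the default location Power BI uses for dataflow storage – the tutorial mentioned below remains the authoritative starting point:

```python
# Minimal sketch: read a dataflow-created CDM folder directly from ADLSg2.
# Assumes the azure-identity, azure-storage-file-datalake, and pandas packages.
# The storage account, workspace, and dataflow paths below are placeholders.
import io
import json
from urllib.parse import urlparse

import pandas as pd
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storage-account>.dfs.core.windows.net"
FILESYSTEM = "powerbi"  # default filesystem Power BI uses for dataflow storage
MODEL_PATH = "<workspace>/<dataflow>/model.json"

service = DataLakeServiceClient(account_url=ACCOUNT_URL, credential=DefaultAzureCredential())
filesystem = service.get_file_system_client(FILESYSTEM)

# model.json is the "endpoint": it lists every entity, its columns, and the
# partition files that hold the actual CSV data.
model = json.loads(filesystem.get_file_client(MODEL_PATH).download_file().readall())

entity = model["entities"][0]
columns = [attribute["name"] for attribute in entity["attributes"]]

# Load the first partition; the CSV files carry no header row, so the column
# names come from the metadata rather than from the file itself.
location = entity["partitions"][0]["location"]
partition_path = urlparse(location).path.split("/", 2)[2]  # strip "/<filesystem>/"
csv_bytes = filesystem.get_file_client(partition_path).download_file().readall()

df = pd.read_csv(io.BytesIO(csv_bytes), header=None, names=columns)
print(entity["name"], df.shape)
```

Because the column names and data types live in the metadata rather than in the CSV files themselves, this same pattern works for any entity in any CDM folder, regardless of which service wrote it.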

This tutorial is probably the best place to start if you want to know more[1]. It includes directions and sample code for creating and consuming CDM folders from a variety of different Azure services – and Power BI dataflows. If you’re one of the people who has recently asked about this, please go through this tutorial as your next step!


[1] It’s the best resource I’m aware of – if you find a better one, please let me know!