Have you looked at the Power BI roadmap lately?

In case you missed it, Microsoft has published the “2020 release wave 1” release plan for the Power Platform, including Power BI.

You can find the goodness here: Power Platform: 2020 release wave 1 plan.

I have the map, and the road… where are dataflows on this thing?

Even though you won’t see the term “roadmap” anywhere in the release plan[1] docs, this is how I think of them – because they’re the best, most current, and most complete public view of what Microsoft is planning for Power BI and the rest of the Power Platform.

Check it out today, and check back in from time to time – the release plan is updated periodically[2] as the teams have more clarity and detail to share.


[1] Yes, these were called “release notes” not too long ago. No, I don’t know why picking a name and sticking with it is so hard. Yes, I will do my best to call these “roadmap” even though this isn’t their official name. Hashtag power rebel.

[2] I think the docs team publishes updates every week, but not every article gets modified in each update. I’m also not 100% sure about the weekly publishing schedule, which is why I buried this in a footnote that no one will actually read.

Power BI and ACLs in ADLSg2

In addition to using Azure Data Lake Storage Gen2 as the location for Power BI dataflows data, Power BI can also use ADLSg2 as a data source. As organizations choose ADLSg2 as the storage location for more and more data, this capability is key to enabling analysts and self-service BI users to get value from the data in the lake.

Oh buoy, that is one big data lake!

But how do you do this in as secure a manner as possible, so that the right users have the minimum necessary permissions on the right data?

The short answer is that you let the data source handle secure access to the data it manages. ADLSg2 has a robust security model, which supports both Azure role-based access control (RBAC) and POSIX-like access control lists (ACLs)[1].

The longer answer is that this robust security model may make it more difficult to know how to set up permissions in the data lake to meet your analytics and security requirements.

Earlier this week I received a question from a customer on how to get Power BI to work with data in ADLSg2 that is secured using ACLs. I didn’t know the answer, but I knew who would know, and I looped in Ben Sack from the dataflows team. Ben answered the customer’s questions and unblocked their efforts, and he said that I could turn them into a blog post. Thank you, Ben![2]

Here’s what you should know:

1 – If you’re using ACLs, the URL you load in the connector (or use to access ADLS Gen2 via the API or any other client) must include at least a filesystem name.

i.e. the path in the Power BI connector must be at least: https://storageaccountname.dfs.core.windows.net/FileSystemName/

2 – To read the contents of a file, the filesystem and all parent folders must have the “x” (execute) ACL, and the file itself must have the “r” (read) ACL.

i.e. if you want to access the file: https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/File1.csv

3 – To list the files in a folder, the filesystem and all parent folders must have the “x” ACL, and the immediate parent folder must also have the “r” ACL.

i.e. if you want to view and access the files in this subfolder: https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/

4 – Default ACLs are a great way to have ACLs propagate to child items, but they must be set before creating subfolders and files – otherwise you need to explicitly set ACLs on each item.[3]

5 – If permission management is going to be dynamic, use groups as much as possible rather than assigning permissions to individual users[4]. First assign the groups ACLs on folders and files, then manage access via membership in the group.

6 – If you get an error accessing a path deep in the filesystem, work your way down from the filesystem level, fixing ACL settings at each step.

i.e. if you are having trouble accessing https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/SubFolder2/File(s)

First try: https://StorageAccountName.dfs.core.windows.net/FileSystemName

Then: https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1

And so on.
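Rules 1 through 3 above can be expressed as a small function. This is a plain-Python sketch of the rules as stated, not a call to any Azure API – given a file or folder URL, it returns the minimum ACL each path segment needs (using the conventional `rwx` triplet notation):

```python
from urllib.parse import urlparse

def required_acls(url: str, operation: str = "read") -> dict:
    """Return the minimum ACL each path segment needs for the given
    operation ("read" a file, or "list" a folder), per rules 1-3 above."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        raise ValueError("URL must include at least a filesystem name (rule 1)")

    acls = {}
    path = ""
    for segment in segments:
        path = f"{path}/{segment}" if path else segment
        acls[path] = "--x"  # the filesystem and every parent folder need execute

    if operation == "read":
        acls[path] = "r--"  # the file itself needs read
    else:  # "list"
        acls[path] = "r-x"  # the listed folder needs read as well as execute

    return acls

# For the example file URL from rule 2:
required_acls(
    "https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/File1.csv"
)
# -> {"FileSystemName": "--x",
#     "FileSystemName/SubFolder1": "--x",
#     "FileSystemName/SubFolder1/File1.csv": "r--"}
```

Note that “execute” on a folder grants traversal only – a user with “x” but not “r” on SubFolder1 can read File1.csv if they know its path, but can’t list the folder to discover it.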

Update: James Baker, a Program Manager on the Azure Storage team, has published on GitHub a PowerShell script to recursively set ACLs. Thanks to Simon for commenting on this post to make me aware of it, to Josh from the Azure support team for pointing me to the GitHub repo, and of course to James for writing the actual script!
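Rule 4 is the one that most often surprises people, so here’s a toy illustration. This is plain Python mimicking the creation-time behavior of default ACLs, not the Azure SDK or James’s script – the point is simply that a default ACL set on a folder applies only to items created afterwards:

```python
class Node:
    """Toy model of an ADLSg2 folder or file, for illustrating rule 4."""

    def __init__(self, name, parent=None):
        self.name = name
        self.acl = None          # access ACL on this item
        self.default_acl = None  # default ACL, inherited by NEW children only
        if parent is not None and parent.default_acl is not None:
            # a default ACL on the parent is applied at creation time
            self.acl = parent.default_acl
            self.default_acl = parent.default_acl

root = Node("FileSystemName")
early = Node("CreatedBeforeDefault", parent=root)  # no default ACL exists yet

root.default_acl = "group::r-x"                    # now set a default ACL
late = Node("CreatedAfterDefault", parent=root)    # inherits it at creation

# early.acl is still None -> it must be fixed explicitly, per rule 4
# late.acl == "group::r-x" -> it picked up the default automatically
```

This is exactly the gap the recursive PowerShell script closes: it walks the existing items and sets their ACLs explicitly, since the default ACL won’t reach them retroactively.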


[1] This description is copied directly from the ADLSg2 documentation, which you should also read before acting on the information in this post.

[2] Disclaimer: This post is basically me using my blog as a way to get Ben’s assistance online so more people can get insights from it. If the information is helpful, all credit goes to Ben. If anything doesn’t work, it’s my fault. Ok, it may also be your fault, but it’s probably mine.

[3] This one is very important to know before you begin, even though it may be #3 on the list.

[4] This is a best practice pretty much everywhere, not just here.

Tough love in the data culture

When I shared on Twitter the most recent installment in the BI Polar “data culture” series, BI developer Chuy Varela replied to agree with some key points from the video on executive sponsorship… and to share some of his frustrations as well[1].


Chuy’s comments made me think of advice that I received a few years back from my manager at the time. She told me:

Letting others fail is a Principal-level behavior.

Before I tie this tough love into the context of a data culture and executive sponsorship, I’d like to share the context in which the advice was given.

At the time I had just finished a few 80-hour work weeks, including working 12+ hour days through a long holiday weekend. Much of that time was spent performing tasks that weren’t my responsibility – another member of the extended team was dropping the ball, and with a big customer-facing milestone coming up I wanted to make everything as close to perfect as possible.

Once the milestone had been met, my manager pulled me aside and let me know how unhappy she was that I had been making myself ill to deliver on someone else’s commitments[4]. She went on to elaborate that because of my well-intentioned extra effort, there were two negative consequences:

  1. The extended team member would not learn from the experience, so future teams were more likely to have the same challenges.
  2. The executive sponsors for the current effort would believe that the current level of funding and support was enough to produce the high quality and timely deliverables that actually required extra and unsustainable work.

She was right, of course. I’ve added two phrases to my professional vocabulary because of her advice.

The first phrase[5] gets used when someone is asking me to do work:

That sounds very important, and I hope you find the right folks to get it done. It sounds too important for me to take on, because I don’t have the bandwidth to do it right, and it needs to be done right.

The second phrase gets used when I need to stop working on something:

After this transition I will no longer be performing this work. If the work is still important to you, we’ll need to identify and train someone to replace me. If the work is no longer important, no action is required and the work will no longer be done.

Now let’s think about data culture and executive sponsorship in a bottom-up organization. How does a sponsor – or potential sponsor – react when a BI developer or analyst is delivering amazing insights when it’s not really their job?

If the sponsor is bought in to the value of a data culture, their reaction is likely to include making it their job, and ensuring that the analyst gets the support and resources to make this happen. This can take many different forms[6], but should always include an explicit recognition of the work and the value it delivers.

And what if the analyst who has enabled more data-driven decisions is ready to move on to new challenges? What happens to the solutions they’ve built and the processes that rely on their work? What then?

Again, if the sponsor is committed to a data culture, then their reaction will involve making sure that it becomes someone’s responsibility to move the analyst’s work forward. They will assign the necessary resources – data resources, human resources, financial resources – to ensure that the data-driven insights continue.

If you’re working on helping to build a data culture from the bottom up, there are a few opportunities to apply this advice to that end. Please understand that there is risk involved in some of these approaches, so be certain to take the full context into consideration before taking action.

  1. Work with your immediate manager(s) to amplify awareness of your efforts. If your managers believe in the work that you’re doing, they should be willing to help increase visibility.
  2. Ensure that everyone understands that your work needs support to be sustained. You’ve done something because it was the right thing to do, but you can’t keep doing it on top of all of your other responsibilities.
  3. When it comes time to change roles, ask who will be taking over the solutions you’ve built in your current role, and how you can transition responsibility to them.

Each of these potential actions is the first step down a more complicated path, and the organization’s response[7] will tell you a lot about the current state of the data culture.

For the first potential course of action, managerial support in communicating the value and impact of current efforts can demonstrate the art of the possible[8] to a potential executive sponsor who may not be fully engaged. If you can do this for your current team or department, imagine what a few more empowered folks could do for the business unit! You see where I’m going…

If this first course of action is the carrot, the other two might be the stick. Not everyone will respond to our efforts the way we might like. The thinking may go: “If you’re building these solutions today without any additional funding or support, why would I give you more resources?” In these cases, the withdrawal or removal of existing capabilities may be what’s necessary to communicate the value of work that’s already been completed.

Wow, this post ended up being much longer than planned. I’ll stop now. Please let me know what you think!


[1] I don’t actually believe in New Year’s resolutions, but I am now retroactively adding “start a blog post with a sentence that contains six[2] or more hyperlinks” to my goals for 2020[3].

[2] Yes, I had to go back to count them.

[3] Nested footnotes achievement unlocked!

[4] If your reaction at this point is “what sort of manager would yell at you for doing extra work?” the answer is “an awesome manager.” This may have been the best advice I ever received.

[5] I’ve probably never said either phrase exactly like this, but the sentiment is there.

[6] Including promotion, transfer to a new role, and/or adding capacity to the analyst’s team so the analyst can focus more on the new BI work.

[7] Remember that a culture is what people do, not what people say – have I mentioned that before?

[8] The art of the possible is likely to warrant its own post and video before too long, so I won’t go into too much depth here.

Data culture: Executive sponsorship

Continuing on our series on data culture, we’re examining the importance of having an executive sponsor. This is one of the least exciting success factors for implementing Power BI and getting more insights from more data to deliver more value to the business… but it’s also one of the most important factors.

Let’s check it out:

Ok, what did we just watch?

This video (and the series it’s part of) includes patterns for success I’ve observed as part of my role on the Power BI CAT team[1], and will complement the guidance being shared in the Power BI Adoption Framework.

The presence of an executive sponsor is one of the most significant factors for a successful data culture. An executive sponsor is:

  • Someone in a position of authority who shares the goal of having important business decisions driven by accurate and timely data
  • A leader who can help remove barriers and make connections necessary to build enterprise data solutions
  • A source of budgetary[2] and organizational support for data initiatives
Because executives fly on planes, right?

Without an executive sponsor, the organizational scope of the data culture is often limited by the visibility that departmental BI successes can achieve. The data culture will grow gradually and may eventually attract executive attention… or may not.

Without an executive sponsor, the lifetime of a data culture is often limited by the individuals involved. When key users move to new roles or take on new challenges and priorities, the solutions they’ve developed can struggle to find new owners.

Without an executive sponsor, all of the efforts you take to build and sustain a data culture in your organization will be harder, and will be more likely to fail.

Who is your executive sponsor?

Update: A Twitter conversation about this video sparked a follow-up post. You can check it out here: Tough Love in the data culture.


[1] This is your periodic reminder that this is my personal blog, and all posts and opinions are mine and mine alone, and do not reflect the opinions of my employer or my teenage children.

[2] This aspect of sponsorship is a bigger deal than we’re going to cover in this post and video – organizations fund what’s important to them, and they don’t fund what’s not.

Building a data culture

tl;dr – to kick off 2020 we’re starting a new BI Polar video series focusing on building a data culture, and the first video introduces the series. You should watch it and share it.

Succeeding with a tool like Power BI is easy – self-service BI tools let more users do more things with data more easily, and can help reduce the reporting burden on IT teams.

Succeeding at scale with a tool like Power BI is not easy. It’s very difficult, not because of the technology, but because of the context in which the technology is used. Organizations adopt self-service BI tools because their existing approaches to working with data are no longer successful – and because the cost and pain[1] of change has come to be outweighed by the cost and pain of maintaining course.

Tool adoption may be top-down, encouraged or mandated by senior management as a broad organization-wide effort. Adoption may be bottom-up, growing organically and virally in the teams and departments least well served by the existing tools and processes in place.

Both of these approaches[2] can be successful, and both of these approaches can fail. The most important success factor is a data culture in which the proper use of self-service BI tools can deliver the greatest value for the organization.

The most important success factor is a data culture

There must be a data culture on the other side of this door.

Without an organizational culture that values, encourages, recognizes, and rewards users and teams for their use of data, no tool and no amount of effort and skill is enough to achieve the full potential of the tools – or of the data.

In this new video series we’ll be covering practices that will help build a data culture. More specifically, we’ll introduce common practices that are exhibited by large organizations that have mature and successful data cultures. Each culture is unique, but there are enough commonalities to identify patterns and anti-patterns.

The content in this series will be informed by my work with enterprise Power BI customers as part of my day job[3], and will complement nicely[4] the content and guidance in the Power BI Adoption Framework.

Back in November when the 100th BI Polar blog post was published, I asked what everyone wanted to read about in the next 100 posts. There were lots of different ideas and suggestions, but the most common theme was around guidance like this. Hopefully you’ll enjoy the result – and hopefully you’ll let me know either way.


[1] I strongly believe that pain is a fundamental precursor to significant change. If there is no pain, there is no motivation to change. Only when the pain of not changing exceeds the perceived pain of going through the change will most people and organizations consider giving up the status quo. There are occasional exceptions, but in my experience these are very rare.

[2] Including any number of variations – these approaches are common points on a wide spectrum, but should not be interpreted as the only ways to adopt Power BI or other self-service BI tools.

[3] By day I’m a masked crime-fighter. Or a member of the Power BI customer advisory team. Or both. It varies from day to day.

[4] Hopefully this will be true. I’m at least as interested in seeing where this ends up as you are.

Power BIte: Power Platform dataflows

INTERIOR: pan over cubicles of happy, productive office workers

CLOSE-UP: office worker at desk

NARRATOR: Is that Susie I see, giving Power Platform dataflows a try?

SUSIE: That’s right! With dataflows I can have all of the data I need, right where I need it!!

NARRATOR: Dataflows. They’re not just for Power BI anymore.

OK, you may not remember that orange juice ad campaign from the late 1970s and early 80s[1], but I’ve had it stuck in my head since I started working on this post and video. I couldn’t figure out how to work it into the video itself, so here it is in written form.

Anyway, with that awkward moment behind us, you probably want to watch the video. Here it is:

As the video discusses, Power Apps now has a dataflows capability that is a natural complement to Power BI dataflows. Power Platform dataflows have been generally available since November 2019, and were in preview since the summer before that.

Power Platform dataflows use Power Query Online – and the same set of connectors, gateways, and transformation capabilities as Power BI dataflows. But there are a few key differences that are worth emphasizing.

Power Platform dataflows can load data into the Common Data Service, either into the standard Common Data Model entities used by Dynamics 365 apps, or into custom entities used by custom Power Apps. This is important – it makes dataflows more like a traditional ETL tool such as SSIS data flows, in that at the end of the dataflow creation process you can map the columns in your queries to the columns in these existing tables[2].

Power Platform dataflows can load data into ADLSg2 for analytical scenarios, but Power Apps doesn’t have the same concept of “built-in storage” that Power BI does. That means if you want to use Power Platform dataflows to create CDM folders, you must configure your Power Apps environment to use an ADLSg2 resource in your Azure subscription.

The “link to data lake” feature in Power Apps feels to me like a better integration experience than what’s currently available in Power BI. In Power Apps you define the link at the environment level, not the tenant level – this provides more flexibility, and enables non-tenant admins[3] to configure and use the data lake integration.


The first time you create a Power Platform dataflow and select the “analytical entities” option, you’ll be prompted – and required – to link the Power Apps environment to an Azure Data Lake Storage resource. You’ll need an Azure subscription to use, but the process itself is pretty straightforward.


I can’t wait to hear what you think of this new capability. Please let me know in the comments or via Twitter.

See you in the new year!


[1] I just realized that this was 40 years ago. Were you even born yet?

[2] CDS entities aren’t tables by the strictest definition, but it’s close enough for our purposes today.

[3] I honestly don’t know enough about Power Apps security to go into too much depth on this point, but I am not a Power Apps admin and I was able to create a trial environment and link it to my own ADLSg2 resource in my own Azure subscription without affecting other users.

New resource: Generating CDM folders from Azure Databricks

Most of my blog posts that discuss the integration of Azure data services and Power BI dataflows via Common Data Model folders[1][2][3] include links to a tutorial and sample originally published in late 2018 by the Azure team. This has long been the best resource to explain in depth how CDM folders fit in with the bigger picture of Azure data.

Now there’s something better.

Microsoft Solutions Architect Ted Malone has used the Azure sample as a starting point for a GitHub project of his own, and has extended this sample project to start making it suitable for more scenarios.


The thing that has me the most excited (beyond having Ted contributing to a GitHub repo, and having code that works with large datasets) is the plan to integrate with Apache Atlas for lineage and metadata. That’s the good stuff right there.

If you’re following my blog for more than just Power BI and recipes, this is a resource you need in your toolkit. Check it out, and be sure to let Ted know if it solves your problems.


[1] Power BIte: Creating dataflows by attaching external CDM folders

[2] Quick Tip: Working with dataflow-created CDM folders in ADLSg2

[3] Dataflows, CDM folders and the Common Data Model