In addition to using Azure Data Lake Storage Gen2 as the location for Power BI dataflows data, Power BI can also use ADLSg2 as a data source. As organizations choose ADLSg2 as the storage location for more and more data, this capability is key to enabling analysts and self-service BI users to get value from the data in the lake.

But how do you do this in as secure a manner as possible, so that the right users have the minimum necessary permissions on the right data?
The short answer is that you let the data source handle secure access to the data it manages. ADLSg2 has a robust security model, which supports both Azure role-based access control (RBAC) and POSIX-like access control lists (ACLs)[1].
The longer answer is that this robust security model may make it more difficult to know how to set up permissions in the data lake to meet your analytics and security requirements.
Earlier this week I received a question from a customer on how to get Power BI to work with data in ADLSg2 that is secured using ACLs. I didn’t know the answer, but I knew who would know, and I looped in Ben Sack from the dataflows team. Ben answered the customer’s questions and unblocked their efforts, and he said that I could turn them into a blog post. Thank you, Ben![2]
Here’s what you should know:
1 – If you’re using ACLs, you must at least specify a filesystem name in the URL to load in the connector (or if you access ADLS Gen2 via API or any other client).
i.e. Path in Power BI Connector must at least be: https://storageaccountname.dfs.core.windows.net/FileSystemName/
2 – For every file you want to read its contents, all parent folders and filesystem must have the “x” ACL. And the file must have a “r” ACL.
i.e. if you want to access the file: https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/File1.csv
- FileSystemName: “x”
- Subfolder1: “x”
- File1.csv “r”
- Path in Power BI Connector: https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/File1.csv
3 – For files you want to list, all parent folders and filesystem must have the “x” ACL. The immediate parent folder must also have a “r” ACL.
i.e. if you want to view and access the files in this subfolder: https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/
- FileSystemName: “x”
- Subfolder1: “x,,r”
- Each file in the subfolder should have “r”
- Path in Power BI Connector: https://storageaccountname.dfs.core.windows.net/FileSystemName/SubFolder1
4 – Default ACLs are great way to have ACLs propagate to child items. But they have to be set before creating subfolders and files, otherwise you need to explicitly set ACLs on each item.[3]
5 – If permission management is going to be dynamic, use groups as much as possible rather than assigning permissions to individual users[4]. First, ACL the groups to folders/files and then manage access via membership in the group.
6 – If you have an error accessing a path that is deep in the filesystem, work your way from the filesystem level downwards, fixing ACL settings in each step.
i.e. if you are having trouble accessing https:/StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1/SubFolder2/File(s)
First try: https://StorageAccountName.dfs.core.windows.net/FileSystemName
Then: https://StorageAccountName.dfs.core.windows.net/FileSystemName/SubFolder1
And so on.
Update: James Baker, a Program Manager on the Azure Storage team has published on GitHub a PowerShell script to recursively set ACLs. Thanks to Simon for commenting on this post to make me aware of it, Josh from the Azure support team for pointing me to the GitHub repo, and of course to James for writing the actual script!
[1] This description is copied directly from the ADLSg2 documentation, which you should also read before acting on the information in this post.
[2] Disclaimer: This post is basically me using my blog as a way to get Ben’s assistance online so more people can get insights from it. If the information is helpful, all credit goes to Ben. If anything doesn’t work, it’s my fault. Ok, it may also be your fault, but it’s probably mine.
[3] This one is very important to know before you begin, even though it may be #3 on the list.
[4] This is a best practice pretty much everywhere, not just here.
There’s also a PowerShell script available from MSFT to apply ACL’s retroactively – it’s super useful. Unless, of course, you want to spend days doing it manually in Storage Explorer. To each their own.
LikeLiked by 1 person
Any chance you could share a link to that script so I can add it (and credit to you, of course) to the post? My search-fu is weak today…
LikeLike
I shot you the link via LinkedIn.
LikeLike
Pingback: Using ACLs to Secure Azure Data Lake Data – Curated SQL