Microsoft Fabric and OneLake: Data governance and enterprise adoption

The data internet this week is awash with news and information about Microsoft Fabric. My Introducing Microsoft Fabric post on Tuesday got just under ten thousand views in the first 24 hours, which I believe is a record for this blog.

Even more exciting than the numbers are the comments. Bike4thewin replied with this excellent comment and request:

I would love to hear your thought on how to adopt this on Enterprise level and what could be the best practices to govern the content that goes into OneLake. In real life, I’m not sure you want everyone in the organisation to be able to do all of this without compromising Data Governance and Data Quality.

There’s a lot to unpack here, so please understand that this post isn’t a comprehensive answer to all of these topics – it’s just my thoughts as requested.

In the context of enterprise adoption, all of the guidance in the Power BI adoption roadmap and my video series on building a data culture applies to Fabric and OneLake. This guidance has always been general best practices presented through the lens of Power BI, and most of it is equally applicable to the adoption of other self-service data tools. Start there, knowing that although some of the details will be different, this guidance is about the big picture more than it is about the details.

In the context of governance, let’s look at the Power BI adoption roadmap again, this time focusing on the governance article. To paraphrase this article[1], the goal of successful governance is not to prevent people from working with data. The goal should be to make it as easy as possible for people to work with data while aligning that work with the goals and culture of the organization.

Since I don’t know anything about the goals or culture that inform Bike4thewin’s question, I can’t respond to them directly… but reading between the lines I think I see an “old school” perspective on data governance rearing its head. I think that part of this question is really “how do I keep specific users from working with specific data, beyond using security controls on the data sources?”

The short answer is you probably shouldn’t, even if you could. Although saying “no” used to work sometimes, no matter what your technology stack is, saying “yes, and” is almost always the better approach. This post on data governance and self-service BI[2] provides the longer answer.

As you’re changing the focus of your governance efforts to be more about enabling the proper use of data, Fabric and OneLake can help.

Data in OneLake can be audited and monitored using the same tools and techniques you use today for other items in your Power BI tenant. This is a key capability of Fabric as a SaaS data platform – the data in Fabric can be more reliably understood than data in general, because of the SaaS foundation.

The more you think about the “OneDrive for data” tagline for OneLake, the more it makes sense. Before OneDrive[3], people would store their documents anywhere and everywhere. Important files would be stored on users’ hard drives, or on any number of file servers that proliferated wildly. Discovering a given document was typically a combination of tribal knowledge and luck, and there were no reliable mechanisms to manage or govern the silos and the sprawl. Today, organizations that have adopted OneDrive have largely eliminated this problem – documents get saved in OneDrive, where they can be centrally managed, governed, and secured.

To make things even more exciting, the user experience is greatly improved. People can choose to save their documents in other locations, but every Office application saves to OneDrive by default. Documents in OneDrive can be easily discovered, accessed, and shared by the people who need to work with them, and easily monitored and governed by the organization. People still create and use the documents they need, and there are still consistent security controls in place, but the use of a central managed SaaS service makes things better.

Using OneLake has the potential to deliver the same type of benefits for data that OneDrive delivers for documents. I believe that when we’re thinking about what users do with OneLake we shouldn’t be asking “what additional risk is introduced by letting users do the things they’re already doing, but in a new tool?” Instead, we should ask “how do we enable users to do the things they’re already doing, using a platform that provides greater visibility to administrators?”

In addition to providing administrator capabilities for auditing and monitoring, OneLake also includes capabilities for data professionals who need to discover and understand data. The Power BI data hub[4] has been renamed the OneLake data hub in Fabric, and allows users to discover data in the lake for which they already have permissions, or which the owners have marked as discoverable.

The combination of OneLake and the OneLake data hub provides compelling benefits for data governance: it’s easier for users to discover and use trusted data without creating duplicates, and it’s easier for administrators to understand who is doing what with what data.

I’ll close with two quick additional points:

  1. Right before we announced Fabric, the Power BI team announced the preview of new admin monitoring capabilities for tenant administrators. I haven’t had the chance to explore these new capabilities, but they’re designed to make administrative oversight easier than ever.
  2. I haven’t mentioned data quality, even though it’s part of the comment to which this post is responding. Data quality is a big and complicated topic, and I don’t think I can do it justice in a timely manner… so I’m going to take a pass on this one for now.

Thanks so much for the awesome comments and questions!

[1] And any number of posts (1 | 2 | 3 | 4 | 5 | 6 | 7 | …) on this site as well.

[2] The linked post is from exactly two years ago, as I write this new post. What are the odds?

[3] In this context I’m thinking specifically about OneDrive for Business, not the consumer OneDrive service.

[4] The data hub was originally released in preview in late 2020, and has been improving since then. It’s one of the hidden gems in Power BI, and is a powerful tool for data discovery… but since I haven’t blogged about it before now, I guess I can’t complain too loudly when people don’t know it exists.
