Dataflows with benefits

Power BI datamarts are like dataflows with benefits.

In case you missed the announcements this week from Microsoft’s Build conference, datamarts are a new major new capability coming to Power BI that are now available in public preview. There are also preview docs available, but the best datamarts content I’ve seen so far is this fantastic video from Adam and Patrick at Guy in a Cube[1].

I’m going to assume that since you’re reading this blog, you didn’t miss the announcement. I’m also going to assume that some small part of you is asking “what the heck are datamarts, anyway?”

For me, datamarts are like dataflows with benefits[2].

It should come as no surprise to any regular reader of this blog that I’m a big fan of dataflows in Power BI. Dataflows let Power BI users build reusable data tables  in a workspace using Power Query Online, and share them with other users for reuse in other workspaces. What’s not to love?[3]

If you’ve spent a lot of time working with dataflows, you can probably think of a few things you wished dataflows did differently, or better. These are some of the most common requests I’ve heard in customer conversations over the years:

  • “I wish I could define row-level security (RLS) on my dataflows so I could share them securely with more users.”
  • “I wish my users could connect to dataflows using SQL, because analysts in my org all know SQL.”
  • “I wish <operation that would benefit from a compute engine> performed better in dataflows.”[4]

You probably see where I’m going here. Datamarts in Power BI deliver solutions to these problems. Datamarts in Power BI build on the strength of dataflows, while enabling common scenarios where dataflows did not offer an obvious or simple solution.

Almost like datamarts were dataflows with benefits.

Datamarts, like dataflows, provide a Power Query Online experience to perform data preparation. Datamarts, like dataflows, allow users to create reusable data tables in a Power BI workspace.

But datamarts, unlike dataflows, store their data in a managed SQL database. Dataflows use CDM folders for their storage, which means CSV files with some extra metadata. Although this file-based approach provides some benefits for reuse in integration scenarios, it can also be a challenge for simply connecting to the data in tools other than Power BI.

With datamarts, the data produced by executing your Power Query queries[5] is loaded into tables in an Azure SQL database that’s managed by the Power BI service. Having data in a full-featured database, as opposed to folders full of text files, makes a lot of difference.

  • Datamarts support row-level security. Using a simple in-browser user experience a datamart author can define RLS rules that restrict what users can see what data when connecting to the datamart.
  • Anyone with the right permissions can query[6] the datamart’s underlying SQL database using any SQL query tool. This means that authorized users can perform exploratory data analysis in Excel, SQL Server Management Studio, Azure Data Studio, or whatever tool they’re most comfortable using. It’s a database.
  • Merges and joins and other operations common to building a star schema perform much better, because these are things that SQL has been awesome at for decades.[7]

Is it just me, or is this sounding a lot like dataflows with benefits?

But wait, you might say, what about measures and that automatically created dataset thingie and all the other stuff they showed at Build, and which I don’t really understand yet? What about deciding when I should use a datamart over a dataflow? What about the expanded web authoring experience, and querying the datamart directly from within the browser??

Yeah, I’m not going to cover any of that in this post. The post is already too long, and I didn’t really have time to write this one as it is.[8] But I think it’s those things that make the product team scowl when I describe datamarts as “dataflows with benefits” because they’re really a lot more. But if you think about dataflows with benefits, you’re probably off to a good start, and heading in the right direction.

I’m going to end on this note: all of my personal Power BI projects going forward will be built using datamarts. Datamarts do everything I need dataflows to do, and for me they do it better. I’ll still always love dataflows, and there will likely still be places where dataflows make sense, but for me…


 

[1] And I’m not only saying that because Roche’s Maxim makes an awkward surprise appearance.

[2] In case the “with benefits” descriptor isn’t meaningful to you, I’m attempting to make a play on the phrase “friends with benefits.” You can check out the mildly NSFW Urban Dictionary definition if you really feel like you need to. Honestly, I wouldn’t recommend it, but you do you.

[3] I literally got goosebumps typing that “reusable data tables” sentence. Even after all these years, having the promise of self-service data preparation and reuse realized in Power BI still feels a little bit like magic.

[4] Yes, this typically leads into a conversation about the dataflows “enhanced compute engine” but since using that engine requires following specific design patterns, this isn’t always as straightforward a conversation as you might want it to be.

[5] Power Queries? I always struggle with what to use for the plural noun for the queries that you build in Power Query. Maybe I should ask Alex.

[6] I use the term “query” here to mean SELECT statements. Only read operations are permitted, so if you try to UPDATE or whatever, you’ll get an error. Use Power Query to transform the data as you load it, like you would with a traditional data mart or data warehouse.

[7] I don’t have enough hands-on with the datamarts preview at this point to say much more than “faster” but in my non-scientific testing queries that would take “I guess I’ll make another pot of coffee” in dataflows take “oh it’s done already” in datamarts.

[8] If you had any idea how mean my calendar is this year, you’d weep. I’m weeping right now.

Coming to the PASS Data Community Summit in November: The Hitchhiker’s Guide to Adopting Power BI in Your Organization

At the 2022 PASS Community Data Summit this November, I’m thrilled to be co-presenting a full-day pre-conference session with the one and only Melissa Coates [blog | Twitter | LinkedIn]. We’ll be presenting our all-day session live and in-person in Seattle on Tuesday, November 15, 2022.

What’s the Session?

The Hitchhiker’s Guide to Adopting Power BI in Your Organization

What’s the Session About?

The Power BI Adoption Roadmap is a collection of best practices and suggestions for getting more value from your data and your investment in Power BI. The Power BI Adoption Roadmap is freely available to everyone — but not everyone is really ready to start their journey without a guide. Melissa and I will be your guides…while you’re hitchhiking…on the road…to reach the right destination…using the roadmap. (You get it now, right?!?)

We’ll do an end-to-end tour of the Power BI Adoption Roadmap. During the session we’ll certainly talk about all of the key areas (like data culture, executive sponsorship, content ownership and management, content delivery scope, center of excellence, mentoring and user enablement, community of practice, user support, data governance, and system oversight).

Smart Power BI architecture decisions are important – but there’s so much more to a successful Power BI implementation than just the tools and technology. It’s the non-technical barriers, related to people and processes, that are often the most challenging. Self-service BI also presents constant challenges related to balancing control and oversight with freedom and flexibility. Implementing Power BI is a journey, and it takes time. Our goal is to give you plenty of ideas for how you can get more value from your data by using Power BI in the best ways.

We promise this won’t be a boring day merely regurgitating what you can read online. We’ll share lessons learned from customers, what works, what to watch out for, and why. There will be ample opportunity for Q&A, so you can get your questions answered and hear what challenges that other organizations are facing. This will be a highly informative and enjoyable day for you to attend either in-person or virtually.

Who is the Target Audience?

To get the most from this pre-conference session: You need to be familiar with the Power BI Adoption Roadmap and the Power BI Implementation Planning guidance. You should have professional experience working with Power BI (or other modern self-service BI tools), preferably at a scope larger than a specific team. Although deep technical knowledge about Power BI itself isn’t required, but the more you know about Power BI and its use, the more you’ll walk away with from this session.

We hope to see you there! More details and to register: link to the PASS Data Community web site.

Who wrote this blog post?

It was Melissa.

She wrote it and emailed it to me and I shamelessly[1] stole it, which may be why there haven’t been any footnotes[2]. I even stole the banner image[3].


[1] With her permission, of course.
[2] Until these ones.
[3] Yes, Jeff. Stealing from Melissa is a Principal-level behavior.