Dataflows with benefits

Power BI datamarts are like dataflows with benefits.

In case you missed the announcements this week from Microsoft’s Build conference, datamarts are a new major new capability coming to Power BI that are now available in public preview. There are also preview docs available, but the best datamarts content I’ve seen so far is this fantastic video from Adam and Patrick at Guy in a Cube[1].

I’m going to assume that since you’re reading this blog, you didn’t miss the announcement. I’m also going to assume that some small part of you is asking “what the heck are datamarts, anyway?”

For me, datamarts are like dataflows with benefits[2].

It should come as no surprise to any regular reader of this blog that I’m a big fan of dataflows in Power BI. Dataflows let Power BI users build reusable data tables  in a workspace using Power Query Online, and share them with other users for reuse in other workspaces. What’s not to love?[3]

If you’ve spent a lot of time working with dataflows, you can probably think of a few things you wished dataflows did differently, or better. These are some of the most common requests I’ve heard in customer conversations over the years:

  • “I wish I could define row-level security (RLS) on my dataflows so I could share them securely with more users.”
  • “I wish my users could connect to dataflows using SQL, because analysts in my org all know SQL.”
  • “I wish <operation that would benefit from a compute engine> performed better in dataflows.”[4]

You probably see where I’m going here. Datamarts in Power BI deliver solutions to these problems. Datamarts in Power BI build on the strength of dataflows, while enabling common scenarios where dataflows did not offer an obvious or simple solution.

Almost like datamarts were dataflows with benefits.

Datamarts, like dataflows, provide a Power Query Online experience to perform data preparation. Datamarts, like dataflows, allow users to create reusable data tables in a Power BI workspace.

But datamarts, unlike dataflows, store their data in a managed SQL database. Dataflows use CDM folders for their storage, which means CSV files with some extra metadata. Although this file-based approach provides some benefits for reuse in integration scenarios, it can also be a challenge for simply connecting to the data in tools other than Power BI.

With datamarts, the data produced by executing your Power Query queries[5] is loaded into tables in an Azure SQL database that’s managed by the Power BI service. Having data in a full-featured database, as opposed to folders full of text files, makes a lot of difference.

  • Datamarts support row-level security. Using a simple in-browser user experience a datamart author can define RLS rules that restrict what users can see what data when connecting to the datamart.
  • Anyone with the right permissions can query[6] the datamart’s underlying SQL database using any SQL query tool. This means that authorized users can perform exploratory data analysis in Excel, SQL Server Management Studio, Azure Data Studio, or whatever tool they’re most comfortable using. It’s a database.
  • Merges and joins and other operations common to building a star schema perform much better, because these are things that SQL has been awesome at for decades.[7]

Is it just me, or is this sounding a lot like dataflows with benefits?

But wait, you might say, what about measures and that automatically created dataset thingie and all the other stuff they showed at Build, and which I don’t really understand yet? What about deciding when I should use a datamart over a dataflow? What about the expanded web authoring experience, and querying the datamart directly from within the browser??

Yeah, I’m not going to cover any of that in this post. The post is already too long, and I didn’t really have time to write this one as it is.[8] But I think it’s those things that make the product team scowl when I describe datamarts as “dataflows with benefits” because they’re really a lot more. But if you think about dataflows with benefits, you’re probably off to a good start, and heading in the right direction.

I’m going to end on this note: all of my personal Power BI projects going forward will be built using datamarts. Datamarts do everything I need dataflows to do, and for me they do it better. I’ll still always love dataflows, and there will likely still be places where dataflows make sense, but for me…


 

[1] And I’m not only saying that because Roche’s Maxim makes an awkward surprise appearance.

[2] In case the “with benefits” descriptor isn’t meaningful to you, I’m attempting to make a play on the phrase “friends with benefits.” You can check out the mildly NSFW Urban Dictionary definition if you really feel like you need to. Honestly, I wouldn’t recommend it, but you do you.

[3] I literally got goosebumps typing that “reusable data tables” sentence. Even after all these years, having the promise of self-service data preparation and reuse realized in Power BI still feels a little bit like magic.

[4] Yes, this typically leads into a conversation about the dataflows “enhanced compute engine” but since using that engine requires following specific design patterns, this isn’t always as straightforward a conversation as you might want it to be.

[5] Power Queries? I always struggle with what to use for the plural noun for the queries that you build in Power Query. Maybe I should ask Alex.

[6] I use the term “query” here to mean SELECT statements. Only read operations are permitted, so if you try to UPDATE or whatever, you’ll get an error. Use Power Query to transform the data as you load it, like you would with a traditional data mart or data warehouse.

[7] I don’t have enough hands-on with the datamarts preview at this point to say much more than “faster” but in my non-scientific testing queries that would take “I guess I’ll make another pot of coffee” in dataflows take “oh it’s done already” in datamarts.

[8] If you had any idea how mean my calendar is this year, you’d weep. I’m weeping right now.

15 thoughts on “Dataflows with benefits

  1. Bryan Campbell

    Looks like a data mart is a sql server, 100gb limit, using primary and foreign keys to relate tables (which if we added column store technology, would be analysis services; the measures are added on top and likely converted to T-sql?, the etl is power query.

    All w no sql code on end user, unless you want to write T-sql; appears (but can’t do it yet) you’ll be able to save (and I bet publish) views on top of the data mart…

    Like

  2. bryanca888

    long term, i see this bridging the skill sets of excel power users (who got good at power query because of power bi, but didnt make the next step to SQL skillset due to access or aptitude…)
    microsoft hit a homerun with power bi/power query 8 years ago, and the ball has still not started coming down… oh yeah, the activity logs are “sorta” working and the tenant scanner API knows nothing of this ‘datamart’… 😉

    Like

  3. Pingback: Dataflows with benefits — BI Polar | ERP and BI

  4. Mukesh

    I have followed all your blogs and read multiple times related to dataflow and implemented in my organization and I too love dataflows, but now will be waiting to for your blogs to uncover what Datamart offers. I have a load data stage when I load all the data( mostly excel ) on incremental basis then then use another dataflow to transform it as per business requirement.

    Now my question is- shall I used output created from first stage of dataflow in Datamart? If yes then here when I tried connecting using Power BI as well as Power Platform connectors but here linked entities are not there. Is it something we need to wait for this features?

    If no then going forward shall we just use Datamart for all – load, transform, model and visualize?

    Like

    1. Richard

      So, I gave it a try yesterday. I felt like Datamart is a wrap up box including dataflow design model and dataset inside it since it included staging phase – syphon data from datasource, modeling as dataflow design, and release in the other end as dataset to build report on. I noticed that it did not allow originating a dataflow from datamart, which is one important benefit less than dataflow – the reusability.

      Like

  5. Andrew

    I just come here to read the footnotes.

    But seriously thanks for taking time out of your mean calendar to write this (and the rest of the blog). As someone who’s trying to decide what our PowerBI “landscape” could look like, it’s always refreshing to read an informal yet informative blog with a little humour and sarcasm thrown in to keep us all sane. It’s like having a chat with a mate over a few beers, which I would love to do one day! But I think you’ve got far more interesting things to do plus I don’t know if I could afford all the beers I clearly owe you for the help you’ve already given me!

    Like

  6. Artur

    The biggest question to all of the question. Currenly we cannot use Dataflow in Power Query of Excel (which after all this moths of exsting Dataflows is still surprising me). Is will be different with Datamarts?

    Like

    1. Dataflows are now supported in Excel for folks in the Office Insiders program (learn more here: https://insider.office.com/ ) and should be available for everyone in the near-ish future. If you don’t have them already, you shouldn’t have too long to wait.

      As for datamarts, you can connect directly to the SQL database using the SQL Server connector in Excel and can connect to the dataset as well.

      Like

  7. Pingback: Power BI dataflows vs datamarts: What's the difference??? - Guy in a Cube

  8. Pingback: Power BI Datamarts | James Serra's Blog

Leave a Reply to Andrew Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s