Dataflows in Power BI: Overview Part 2 – Hands-on

Part 1 of this series introduced Power BI dataflows from a largely conceptual perspective. This post is going to be much more hands-on. Let’s look at what’s involved with creating a dataflow in Power BI.

To get started, you’ll need an app workspace. Dataflows are created in Power BI workspaces just like datasets, reports, and dashboards. But because dataflows are designed for sharing and reuse, you can’t create them in your personal “My Workspace”, which is designed for… not-sharing-and-reuse.

I’m going to start with this workspace:

[Screenshot: a v2 workspace on Premium capacity]

You’ll probably notice that this is a new “v2” workspace. You’ll probably also notice that this workspace is assigned to Power BI Premium capacity. Neither of these factors is required to get started with Power BI dataflows, but for some specific features[1] they will be important.

In my workspace I’ll select the “create” button and choose “dataflow” from the menu.

Oh. Curses.

[Screenshot: error message when creating the dataflow]

I swear I didn’t plan this, but let’s use it as a learning experience.[2]

The use of dataflows needs to be enabled by a Power BI capacity administrator for each Premium capacity node. Because dataflows consume compute and storage resources, requiring a capacity administrator to explicitly enable them gives the administrator control over how the capacity they manage is used. This is great for the capacity administrator… but annoying for the blogger.

Anyway, as I was saying, I’m using this new workspace, which is on Power BI shared capacity, since I’m not doing anything in this post that relies on Premium. Yeah.

[Screenshot: the new workspace]

So… In my new workspace I’ll select the “create” button and choose “dataflow” from the menu. For real this time.

[Screenshot: creating a new dataflow]

I’m now prompted to add new entities, which is good, because this is exactly what I want to do. If you think about a dataflow as being a self-service data warehouse or data lake, an entity is like a table within it.

Creating a new entity opens Power Query, where I can select a data source and provide my connection details. The details are a little different from Power Query in Power BI Desktop or Excel, but the experience is the same. I have an interactive preview of the data, I can apply transformations at the column or table level, I can view and edit the transformation steps I’ve applied, and I can view the underlying “M” Power Query script in the advanced editor. Users who are familiar with Power Query in Power BI Desktop should find this a very familiar experience.

[Screenshot: the Power Query editor]
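
To make this concrete, here’s a minimal sketch of the kind of “M” query that might define an entity. The CSV URL, column names, and data types below are illustrative placeholders rather than anything from my actual dataflow; your query will reflect whatever source and transformations you choose in the editor.

let
    // Hypothetical CSV source; replace with your own connection details
    Source = Csv.Document(
        Web.Contents("https://example.com/sales.csv"),
        [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.None]
    ),
    // Promote the first row to column headers
    PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Set explicit data types so the entity has a well-defined schema
    ChangedTypes = Table.TransformColumnTypes(
        PromotedHeaders,
        {{"OrderDate", type date}, {"Amount", type number}}
    )
in
    ChangedTypes

Nothing about this script is specific to dataflows; the same query would work in Power BI Desktop or Excel, which is very much the point.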

When I’m done, meaning the query has been given a meaningful name and all transformations have been applied, I can create more entities[3] or save the dataflow, giving it a name and an optional description.

[Screenshot: saving the dataflow]

When I save the dataflow, I am given the option to refresh now, or to set up a refresh schedule. I can do both of these things later from the dataflows list in the workspace page, but it’s convenient to be prompted when saving the dataflow. Refreshing the dataflow will execute the query for each entity in the dataflow, and will populate the underlying Azure Data Lake Storage with the results of the queries.

[Screenshot: the refresh prompt]

Once I’ve created and refreshed the dataflow, it’s time to use it. Dataflows are designed to be used as data sources for analysis,[4] so it makes sense to connect to the dataflow from Power BI Desktop and use it to start building my data model.

Power BI Desktop now includes a “Power BI dataflows (Beta)” option in the Get Data menu. When I select it, I can see all of the workspaces in my Power BI tenant that I have permission to access and that contain dataflows. Inside the workspace I can see each of the dataflows it contains, and inside the dataflow I can see each entity. And just as with other data sources, I can select, preview, and load entities into my data model, or edit them in Power BI Desktop to further refine and prepare the data. Dataflows are built and managed in Power BI, but they act like any other data source that an analyst or business user might use.

[Screenshot: Power BI dataflows in the Get Data menu]
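
If you open the resulting query in Power Query, you’ll see that the connector generates an “M” script that navigates from the tenant down to a single entity, along these lines. The workspace, dataflow, and entity names below are placeholders, and the exact navigation field names may differ slightly from what the connector writes for you:

let
    // Enumerate the dataflows the current user has access to
    Source = PowerBI.Dataflows(null),
    // Navigate to one workspace, then one dataflow, then one entity
    // (the names and field names here are illustrative placeholders)
    Workspace = Source{[workspaceName = "Dataflows Demo"]}[Data],
    Dataflow = Workspace{[dataflowName = "Sales"]}[Data],
    Entity = Dataflow{[entity = "Orders"]}[Data]
in
    Entity

Because it’s just another query, I can add my own transformation steps on top of it before loading the result into the model.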

When I’m done building my PBIX file, which can contain data from multiple dataflows in multiple workspaces, I can publish it to the Power BI service into any workspace where I have the necessary permissions. A dataflow in one workspace can serve as a data source for datasets and reports in any number of workspaces in the tenant.

That’s the basic end-to-end flow for creating and using a dataflow in Power BI. As you can imagine, there’s more to it, but that will wait for another post…


[1] None of which will be used or mentioned in this post.

[2] If you’re wondering how this Premium workspace contains dataflows but I can’t create dataflows in this Premium capacity, ask me in person some time. It’s a long story.

[3] If a dataflow is like a database, and an entity is like a table, who wants a database with just one table in it?

[4] Remember the last post?

