Are Power BI Dataflows a Master Data Management Tool?

Important: This post was written and published in 2018, and the content below no longer represents the current capabilities of Power BI. Please consider this post to be an historical record and not a technical resource. All content on this site is the personal output of the author and not an official resource from Microsoft.

Are Power BI dataflows a master data management tool?

This guy really wants to know.

MDM
Image from https://www.pexels.com/photo/close-up-photography-of-a-man-holding-ppen-1076801/

Spoiler alert: No. They are not.

When Microsoft first announced dataflows[1] were coming to Power BI earlier this year, I started hearing a surprising question[2]:

Are dataflows for Master Data Management in the cloud?

The first few times I heard the question, it felt like an anomaly, a non sequitur. The answer[3] seemed so obvious to me that I wasn’t sure how to respond.[4]

But after I’d heard this more frequently, I started asking questions in return, trying to understand what was motivating the question. A common theme emerged: people seemed to be confusing the Common Data Service for Apps used by PowerApps, Microsoft Flow, and Dynamics 365, with dataflows – which were initially called the Common Data Service for Analytics.

The Common Data Service for Apps (CDS) is a cloud-based data service that provides secure data storage and management capabilities for business data entities. Perhaps most relevant to this article, CDS provides a common storage location, which “enables you to build apps using PowerApps and the Common Data Service for Apps directly against your core business data already used within Dynamics 365 without the need for integration.”[5] That common location stores data that can be used by multiple applications and processes, and CDS also defines business logic and rules that apply to any application or user manipulating data stored in CDS entities.[6]

And that is starting to sound more like master data management.

When I think about Master Data Management (MDM) systems, I think of systems that:

  • Serve as a central repository for critical organizational data, to provide a single source of truth for transactional and analytical purposes.
  • Provide mechanisms to define and enforce data validation rules to ensure that the master data is consistent, complete, and compliant with the needs of the business.
  • Provide capabilities for matching and de-duplication, as well as cleansing and standardization for the master data they contain.
  • Include interfaces and tools to integrate with related systems in multiple ways, to help ensure that the master data is used (and used appropriately) throughout the enterprise.
  • (yawn)
    And all the other things they do, I guess.[7]

Power BI dataflows do not do these things.
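
To make the second and third bullets a little more concrete, here is a minimal Python sketch of what validation, standardization, and de-duplication of master records can look like. Everything in it is an assumption made up for illustration: the Customer record, the two validation rules, and the exact-match key are hypothetical and do not come from any MDM product, from CDS, or from Power BI. Real MDM systems typically use far richer rules and fuzzy matching.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical master-data record, invented for this sketch.
    customer_id: str
    name: str
    email: str


def validate(record: Customer) -> list[str]:
    """Return the rule violations for one record (an empty list means valid)."""
    problems = []
    if not record.name.strip():
        problems.append("name is required")
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", record.email.strip()):
        problems.append("email is malformed")
    return problems


def match_key(record: Customer) -> tuple[str, str]:
    """Standardize the matching fields: trim, lowercase, collapse whitespace."""
    name = " ".join(record.name.lower().split())
    return (name, record.email.strip().lower())


def deduplicate(records: list[Customer]) -> list[Customer]:
    """Keep the first record seen for each standardized match key."""
    survivors: dict[tuple[str, str], Customer] = {}
    for record in records:
        survivors.setdefault(match_key(record), record)
    return list(survivors.values())


records = [
    Customer("C1", "Contoso Ltd", "info@contoso.example"),
    Customer("C2", "  contoso   LTD ", "INFO@contoso.example"),
    Customer("C3", "", "not-an-email"),
]

# Enforce the rules first, then collapse duplicates among the valid records.
valid = [r for r in records if not validate(r)]
golden = deduplicate(valid)
```

Here the third record fails both rules, and the first two collapse into a single surviving record once their names and email addresses are standardized. The point of the sketch is that this work changes and curates the master records themselves, which is a different job from preparing data for analysis.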

While CDS has many of these characteristics, dataflows fit in here primarily in the context of integration. Dataflows can consume data from CDS and other data sources to make them available for analysis, but their design does not provide any capabilities for the curation of source data, or for transaction processing in general.

Hopefully it is now obvious that Power BI dataflows are not an MDM tool. Dataflows do provide complementary capabilities for self-service data preparation and reuse, and this can include data that comes from MDM systems. But are dataflows themselves for MDM? No, they are not.


[1] At the time, they weren’t called dataflows. Originally they were called the Common Data Service for Analytics, which may well have been part of the problem.

[2] There were many variations on how the question was phrased – this is perhaps the simplest and most common version.

[3] “No.”

[4] Other than by saying “no.”

[5] Taken directly from the documentation.

[6] Please understand that the Common Data Service for Apps is much more than just this. I’m keeping the scope deliberately narrow because this post isn’t actually about CDS.

[7] MDM is a pretty complex topic, and it’s not my intent to go into too much depth. If you’re really interested, you probably want to seek out a more focused source of information. MDM Geek may be a good place to start.

8 thoughts on “Are Power BI Dataflows a Master Data Management Tool?”

  1. SUHAIL ALI

    You are absolutely correct that dataflows are not an MDM solution, but I’ve been thinking about this as well for the last few days. I would say all the components are there: PowerApps as the front end, CDS as the model, and a combination of dataflows and Microsoft Flow for integration can build a pretty good MDM solution. Microsoft Flow can integrate with D&B and with Melissa Data for third-party enrichment, and Dynamics connectors can be used to build a real-time connection to the MDM model in CDS.

    Microsoft could at this point, with not too much effort, build an improved MDM offering by integrating MDS with CDS and creating Flow components for fuzzy matching and data quality rules. Those components could easily be ported to Power Query and used within Azure Data Factory. Standard Power BI dashboards on top of CDS can be used to monitor, say, the customer golden record, and the AI offering can be used to score customer data quality. It could easily compete with MDM solutions out there that start at $300K.

    1. The Power Platform and Azure definitely provide a large slice of the capabilities that an MDM solution would require. I’m personally skeptical about the amount of effort that would be required to deliver the missing functionality to close the gap with other MDM offerings, but that might just be me.

      1. SUHAIL ALI

        What MDM feature(s) do you think require the most effort to build out a competent offering? I am eager to pursue this on my own for “MDM-lite” projects where some combination of cost, custom requirements and lack of integration with the greater Power/Azure platform is a deciding factor. I have built a simple MDM solution if you could call it that on the Power stack with good success but I want to go into this eyes wide open.

        I also passionately feel a good Power/Azure based MDM solution will make implementing an enterprise data warehouse solution based on dataflows a closer reality since most of the heavy ETL work of building dimensions will be off-loaded to the MDM solution. I’m over-simplifying a complicated, multi-faceted architectural topic but hopefully you’ll understand my motivation.

      2. I don’t know if I’m qualified to comment on this. If you truly want to go in with “eyes wide open” you’ll invest the time to complete some competitive market research, and to understand the strengths and weaknesses of existing market offerings in the context of the space you’re considering. It’s been 5+ years since I worked on an MDM product, and I have not kept up with the details since then.

  2. SUHAIL ALI

    You are right, Matthew. My enthusiasm overtook my rational skepticism. You didn’t discourage me and that’s all I was looking for. I’ll report back if you are at all interested in where I end up.

    1. I would *love* to hear where you go with this!

      I know that a lot of organizations still use Excel workbooks or SharePoint lists to curate their “master” data. With the right processes in place this is better than nothing, but I would hesitate to call either one an MDM tool. I agree that between CDM, PowerApps, Flow, and Power BI, the Power Platform has the building blocks you would need to build a “light” MDM solution, but since I haven’t looked into this in any depth, I’m not sure where you would begin…
