These days more and more data lives behind an API, rather than in a more traditional data source like databases or files. These APIs are often designed and optimized for small, chatty interactions that don’t lend themselves well to use as a source for business intelligence.
These APIs are often slower than a database, which can increase load/refresh times. Sometimes the load time is so great that a refresh may not fit within the minimum window based on an application’s functional requirements.
These APIs may also be throttled. SaaS application vendors often have a billing model that doesn’t directly support frequent bulk operations, so to avoid customer behaviors that affect their COGS and their bottom line, their APIs may be limited to a certain number of calls for a given period.
The bottom line is that when you’re using APIs as a data source in Power BI, you need to take the APIs’ limitations into consideration, and often dataflows can help deliver a solution that accommodates those limitations while delivering the functionality your application needs.
I’ve covered some of the fundamental concepts of this approach in past blog posts , specifically this one from 2018 on Power BI Dataflows and Slow Data Sources, and this one from 2019 on Creating “data workspaces” for dataflows and shared datasets. I hadn’t planned on having a dedicated new blog post, but after having this pattern come up in 4 or 5 different contexts in the past week or so, I thought maybe a post was warranted.
In late July I met with a customer to discuss their Power BI dataflows architecture. They showed me a “before” picture that looked something like this:
One of their core data sources was the API for a commercial software application. Almost every Power BI application used some data from this API, because the application supports almost every part of their business. This introduced a bunch of familiar challenges:
- Training requirements and a steeper-then-necessary learning curve for SSBI authors due to the complex and unintuitive API design
- Long refresh times for large data extracts
- Complex, redundant queries in many applications
- Technical debt and maintenance due to the duplicate logic
Then they showed me an “after” picture that looked something like this:
They had implemented a set of dataflows in a dedicated workspace. These dataflows have entities that pull data from the source APIs, and make them available for consumption by IT and business Power BI authors. All data transformation logic is implemented exactly once, and each author can easily connect to trusted tabular data without needing to worry about technical details like connection strings, API parameters, authentication, or paging. The dataflows are organized by the functional areas represented in the data, mapping roughly to the APIs in the source system as viewed through the lens of their audience.
The diagram above simplifies things a bit. For their actual implementation they used linked and computed entities to stage, prepare, and present the dataflows, but for the general pattern the diagram above is probably the right level of abstraction. Each IT-developed or business-developed application uses the subset of the dataflows that it needs, and the owners of the dataflows keep them up to date and current.
Life is good.
There’s a lot of information available online covering chatty and chunky patterns in API design, so if you’re not sure what I’m talking about here, you might want to poke around and take a peek at what’s out there.
 Please let me know if you found the joke in footnote.
 Possibly by multiple orders of magnitude.
 Or a given data volume, etc. There are probably as many variations in licensing as there are SaaS vendors with APIs.
 If you’re wondering about the beehive image and the obtuse joke it represents, the genus for honey bees is “apis.” I get all of my stock photography from the wonderful web site Pixabay. I will typically search on a random term related to my blog post to find a thumbnail image, and when the context is completely different like it was today I will pick one that I like. For this post I searched on “API” and got a page full of bees, which sounds like something Bob Ross would do…