Dataflows in Power BI: Overview Part 5 – Data Refresh

Dataflows are all about enabling data reuse through self-service data preparation, and before you can reuse data, you need to have data. To get data into your dataflow entities, you need to refresh your dataflow. As with Power BI datasets, there are multiple options for refreshing the data in your dataflows.

The simplest way to refresh your dataflow is to click the “refresh” icon in the dataflows list in your workspace. This will trigger a manual refresh. Each of the queries for the entities in the dataflow will execute, and the results of those queries will be stored in the underlying CDM folders in Azure Data Lake Storage.

01 - Manual refresh

You can also configure scheduled refresh for your dataflows. In the “…” menu in the dataflows list, select “Settings” and then set up one or more times for the dataflow’s queries to run. This will ensure that the data in your dataflow entities remains as current as you need it to.

02 - Settings

03 - Scheduled Refresh

This settings page is also where you configure the gateway and credentials used to refresh the dataflow.

If you’re using Power BI Premium capacity, you can also enable incremental refresh for dataflow entities that contain a DateTime column. To configure incremental refresh for an entity, click on the right-most icon in the entity list for a dataflow.

04 - Incremental

After turning on incremental refresh for the entity, specify the DateTime column on which the incremental refresh logic is applied.

05 - Incremental Settings

The logic for dataflows is the same as it is for datasets, so instead of going into detail here, I’ll just point you to the incremental refresh documentation.[1]
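
If you haven’t read those docs yet, the short version is that incremental refresh only loads the rows whose DateTime values fall inside a window that moves forward over time. Just to make the idea concrete, here’s a rough “M” sketch of that filtering pattern – the server, table, and column names are placeholders, the boundary values stand in for the RangeStart/RangeEnd parameters that the service manages for datasets, and for dataflow entities the service generates all of this for you based on the settings in the dialog:

    let
        // Stand-ins for the RangeStart/RangeEnd boundaries that the service manages
        // during an incremental refresh; hard-coded here only to keep the sketch
        // self-contained.
        RangeStart = #datetime(2018, 1, 1, 0, 0, 0),
        RangeEnd = #datetime(2018, 11, 1, 0, 0, 0),
        // Placeholder source – any source with a DateTime column works the same way
        Source = Sql.Database("myserver", "mydatabase"),
        Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
        // Only rows whose DateTime value falls inside the current refresh window are
        // loaded; data outside the window is left as-is in storage
        FilteredRows = Table.SelectRows(Orders, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
    in
        FilteredRows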

Regardless of how you refresh your dataflow – manual or scheduled, full or incremental – you can view the refresh history in the dataflows list by selecting “Refresh history” from the menu.

06 - Refresh history menu

This will show you a list of times the dataflow was refreshed, whether the refresh was manual or scheduled, and whether the refresh succeeded or failed.

07 - Refresh history details

For more detail about any refresh in the list, you can click the “download arrow” next to the refresh. This will download a CSV file containing per-entity information for the refresh. If your refresh fails, this is probably the best place to look, and if you reach out for support with a failed refresh, the information in this file will be valuable to share with support personnel.
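
If you’d rather dig into that CSV with Power Query itself, loading it only takes a couple of steps. Here’s a minimal sketch, where the file path is a placeholder and the exact columns in the file may vary:

    let
        // Path is a placeholder for wherever you saved the downloaded file
        Source = Csv.Document(File.Contents("C:\Downloads\RefreshHistory.csv"), [Delimiter = ",", Encoding = 65001]),
        // The first row of the file contains the column headers
        PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true])
    in
        PromotedHeaders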

That’s it for refresh. The next post in this series will introduce linked entities, which will add a little more complexity to the refresh story, but that’s a topic for another post…


[1] It’s late. I’m lazy.

Power BI Dataflows and World Bank Climate Change Data

It is increasingly common to include external data in business intelligence and data science work. If you’re working to understand retail sales patterns, you may want to include weather data, traffic data, event data, crime data, and other externally sourced data sets to supplement your internal corporate data.

The World Bank publishes a number of freely available data sets, including this one, which provides per-country, per-year metrics related to global climate change. The data set is available in three formats: CSV, XML, and XLS. Each version is accessible via HTTP.

Using Power Query in Power BI Desktop makes it easy to include this data in a PBIX file and dataset, but it doesn’t provide easy opportunities for reuse. And because significant transformation is required before the World Bank data can be used, each dataset that incorporates this data needs to duplicate that transformation logic, which is far from ideal.

Power BI dataflows present an obvious and elegant solution to this problem. With dataflows, a user can define entities with Power Query, and those entities are stored in and managed by the Power BI service. The queries and the transformation logic they contain can be defined once, and the data produced by the queries can be used by authorized users in as many workspaces and datasets as necessary.

But the nature of the World Bank API introduces a few challenges.

  • The Excel data is in the binary XLS format, not XLSX, which means that it requires a gateway to refresh. This isn’t a show-stopper, but I always prefer to avoid using an on-premises gateway to refresh online data. It just feels wrong.
  • The XML and CSV data isn’t actually exposed as XML and CSV files – it’s exposed as Zip files that contain XML and CSV files. And Power Query doesn’t support Zip files as a data source.

Or does it?

BI specialist Ken Russel, with a little help from MVP Chris Webb[1], figured out how to make this work. Ken published a Power Query “M” script that will decompress a Zip file, essentially exposing its content to the rest of the script as if it were a folder containing files.

Life is good.

With this part of the problem solved, the rest is pretty straightforward, especially since I already had a Power BI Desktop file that uses the World Bank Excel API and solves most of the remaining problem.

With a little text editing aided by the Power Query extension for Visual Studio Code, I ended up with two queries: one for per-country metadata, and one for the climate change data itself[2]. I can now get into the fun and easy part: dataflows.
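
To give a sense of what these queries look like without reproducing them in full, here’s a simplified sketch of the climate change data query. UnzipContents stands in for the decompression function from Ken’s post (not included here), and the URL, file name, and column names are placeholders rather than the actual World Bank endpoints – treat this as an outline of the approach, not a copy of the real query:

    let
        // Placeholder URL – the real endpoint returns a Zip file over HTTP
        ZipFile = Web.Contents("https://api.worldbank.org/placeholder/climate-change?downloadformat=csv"),
        // UnzipContents is the helper described above: it takes the Zip binary and
        // returns a table of [Name, Content] rows, one row per file in the archive
        Files = UnzipContents(ZipFile),
        // Pick the data file out of the archive (the file name is a placeholder)
        DataFile = Files{[Name = "climate_change_data.csv"]}[Content],
        Parsed = Csv.Document(DataFile, [Delimiter = ",", Encoding = 65001]),
        PromotedHeaders = Table.PromoteHeaders(Parsed, [PromoteAllScalars = true]),
        // Unpivot the per-year columns into Year/Value pairs for easier analysis
        Unpivoted = Table.UnpivotOtherColumns(PromotedHeaders, {"Country Name", "Country Code", "Indicator Name", "Indicator Code"}, "Year", "Value")
    in
        Unpivoted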

I’ll start by opening my workspace, creating a new dataflow, and adding an entity.

01 - Create new dataflow

02 - New entity

Since I already have the query written, I’ll choose “Blank Query” from the available data sources, and I’ll just paste in the query text.

03 - Blank Query

04 - Query 1

At this point, all I need to do is give the query a name that will be meaningful and understandable to the users who will have access to the dataflow. As you can see in the screen shot below, even though the query was written largely by hand, all of the steps are maintained and can be interacted with in the Power Query editor.

05 - Edit Query

Once I finish with the first entity, I can follow the same basic steps with the second entity and then save the dataflow.

06 - Save dataflow

Since the dataflow can’t be used until its entities contain data, I’ll refresh it immediately. The dataflow and its last refresh time will now be shown in the dataflows list in my workspace.[3]

07 - Refreshing

From this point on, no user in my organization who wants to include World Bank climate change data in her reports will need to connect to the World Bank API and transform the data until it is fit for purpose. All they will need to do is connect to Power BI dataflows from Power BI Desktop, where the data is always available, always current, and always consistent, managed by the Power BI service. Life is good.

08 - Get data

Obviously, this pattern doesn’t apply only to World Bank data. It applies to any external data source where multiple users need to use the data in different solutions. It can also apply to internal/organizational data sources that need to be transformed to prepare them for analytics.


[1] See Ken’s awesome blog post here for the full details.

[2] You can download the queries from OneDrive: World Bank Metadata.pq, and World Bank Climate Change Data.pq.

[3] I could also set up a refresh schedule to ensure that the data in the entities is always current, but I’m saving this for a future blog post.

Dataflows in Power BI: Overview Part 4 – CDM Folders

One key aspect of Power BI dataflows is that they use Azure Data Lake Storage gen2 for their data storage. As mentioned in part 1, the technology is not exposed to Power BI users. If you’re working in Power BI, a dataflow is just a collection of entities in a workspace, with data that can be reused. But if you’re trying to understand dataflows, it’s worth looking under the hood at some of the details.

Power BI stores dataflow data in a format known as CDM Folders. The “CDM” part stands for Common Data Model[1] and the “Folder” part… is because they’re folders, with files in them.

Each CDM folder is a simple and self-describing structure. The folder contains one or more[2] CSV files for each entity, plus a JSON metadata file. Having the data in a simple and standard format like CSV means that it is easy for any application or service to read the data.[3] Having a JSON metadata file in the folder to describe its contents means that any consumer can read the JSON to easily understand the contents and their structure.

CDM folder contents

The JSON metadata file contains:

  • The names and locations of all files in the folder.
  • Entity metadata, including names, descriptions, attributes, data types, last modified dates, and so on.
  • Lineage information for the entities – specifically, the Power Query “M” query that defines the entity.
  • Information about how each entity conforms (or does not conform) to Common Data Model standard entity schemas.

If you’re interested in seeing this for yourself, the JSON metadata for a dataflow can be exported from the Power BI portal. Just select “export JSON” from the menu in the dataflows list.

export json
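
And if you want to poke at the exported file, you can even use “M” to do it. Here’s a minimal sketch that lists the entities the metadata describes – it assumes the exported JSON has a top-level entities array whose items carry a name field, which matches the metadata contents described above, and the file path is a placeholder:

    let
        // Path is a placeholder for wherever you saved the exported JSON
        Model = Json.Document(File.Contents("C:\Temp\MyDataflow.json")),
        // Each item in the entities array describes one entity in the dataflow
        Entities = Model[entities],
        EntityNames = List.Transform(Entities, each Record.Field(_, "name"))
    in
        EntityNames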

You don’t need to know any of this to use dataflows in Power BI. But if you’re interested in getting the most from dataflows in your end-to-end data architecture, there’s no time like the present to see how things work.


[1] The Common Data Model is a bigger topic than we’ll cover here, but if you’re interested in an introduction, you can check out this excellent session from Microsoft Ignite 2018.

[2] For simple scenarios, each entity will be backed by a single text file. For more complex scenarios involving partitioned data or incremental refresh, there will be one file per partition.

[3] Please note that in the default configuration, Power BI manages the underlying storage, and it is not available to other applications or services, so this won’t do you all that much good to start off. Power BI will, however, provide integration with Azure that will make this important and valuable.

Sword Fighting as Software Metaphor: Demos

I participated last night in The Sword Experience. This is a delightfully fun event organized as part of Microsoft’s annual “Giving” campaign, and I was very happy to donate to support a great cause and spend 3+ hours pretending to sword fight with actor Adrian Paul[1] and a bunch of like-minded Microsoft employees.


I can’t wait to do it again, but there were a few things that bothered me, especially when Mr. Paul “corrected” my footwork during some of the warmup exercises. I’ve spent much of the last four years studying various historical martial arts[2] and practicing them as a full-contact combat sport, and footwork, balance, and structure are the foundation of all of that. Damn it, man, I know how to do this the right way, and you’re trying to make me do it wrong!!

Sigh.

Of course I didn’t say this, and of course it would have completely missed the point if I had. The event was about stage combat and fight choreography, not about actual sword fighting. Even though the two things may look the same from a distance, they have fundamentally different goals.

And this got me thinking about software demos, and how they relate to building production software. These things look similar from a distance as well, but they also have fundamentally different goals.

In an actual sword fight, you want to make small, fast movements that can’t be predicted, and which make contact before their threat is recognized. You want the fight to be over immediately and decisively. In a stage fight, you want to make large, easily visible movements that are clearly expressive of threat, but which are not actually presenting one. You want the fight to last for a long time, and to be interesting to observe.

When building production software, you want to make a solution that is secure, that performs well, and that is easy to maintain and extend. The structure of your solution, and the processes used to deploy and support it, reflect these goals. Typically you do not optimize production software around its ease of understanding, and instead invest in training new team members over time.

When building a software demo, you want to make a solution that is easy to understand, and that communicates and reinforces the concepts and information that are the foundation of the demo. The structure of the demo is simplified to eliminate any details that do not directly support the demo’s goals, even at the expense of fundamental characteristics that would be required in any production system that uses the concepts and technologies in the demo.

A demo tells a story that reflects the reality of a production system, but deliberately glosses over the complicated and messy bits – just like a choreographed fight reflects the reality of an actual fight, minus all the violence and consequences.

You can learn from stage combat, and you can learn from demos. Just don’t mistake them for the real thing.[3] [4]

And now it’s time for me to go cut something with a real sword, just to make sure the things I practiced last night don’t stick around…

IMG_20181020_153015.jpg

Isn’t that a sight just overflowing with promise? Like an empty Visual Studio solution, where anything is possible…


[1] Yes, the guy from the Highlander TV series. That guy.

[2] For a great introduction to this topic, check out this short video. This video is how I discovered that HEMA was a thing, and I never looked back.

[3] I was hoping to incorporate the YouTube trope “will this martial art work on the street??” into this post somewhere, but I didn’t find a place where it fit. Maybe next time.

[4] In both situations, don’t be “that guy.” Don’t be the guy who complains about how a demo isn’t “real world” because it isn’t production ready. And don’t be the guy who complains that the stage combat moves you’re being shown aren’t martially sound. Seriously.

Dataflows in Power BI: Overview Part 3 – Premium

I hadn’t planned on writing so soon about Power BI dataflows on Power BI Premium dedicated capacity, but yesterday’s unexpected message has forced my hand.

Dataflows not enabled error message

Yes, this unexpected message.

Before we look at dataflows on Premium capacity, let’s look at Premium capacity in general. Power BI Premium allows an organization to use dedicated compute and storage capacity that is available only for its assigned workloads. Unlike cloud services that use shared capacity (which is the case for Power BI when Premium capacity is not being used), workloads on Premium dedicated capacity don’t suffer from the “noisy neighbor” problem[1]. Premium also gives organizations more control and predictability, and offers some additional features that are not available in shared capacity.

Since I’m not an expert on Power BI Premium[2], and since it’s a pretty big topic on its own, here are some useful links with more information if this simple overview leaves you wanting more details:

  • Power BI Premium marketing/landing page: This is the home page for Power BI Premium, and it has lots of links and is pretty.
  • Power BI Premium technical documentation: This is the “What is Microsoft Power BI Premium?” topic in the Power BI docs, and it includes technical details, a video from a guy in a cube[3], and links to lots of other related docs.
  • Power BI whitepapers: This is the page where Microsoft publishes all of its long-form whitepapers for Power BI, and there are multiple papers dedicated to Premium.

Now let’s talk about dataflows and how they relate to Premium.

Every Premium capacity node has a set of storage and compute resources available. These resources are used to run the reports, dashboards, and queries in workspaces that are assigned to the capacity node. Power BI capacity administrators have control over which users can use the capacity nodes that they manage. And one way they can exercise that control is to permit or prevent the creation of dataflows in workspaces on the capacity.

Dataflows are disabled by default on Power BI Premium.

This might sound strange at first – why would Microsoft build an awesome new thing, and have it turned off?

The reason is rooted in why organizations use Power BI Premium capacity in the first place: to have more control and predictability over their critical workloads. Using dataflows introduces new processing and storage load, which could potentially impact existing loads. Requiring a capacity administrator to explicitly enable dataflows ensures that the people responsible for critical BI applications are not surprised.

We’ve already seen the results of this setting. How about the admin experience itself?

Well, I’m not a capacity administrator for any Power BI Premium capacity nodes. Honestly, if I were, I might never have written this post – I would probably have just fixed my short-term problem and moved on. Since I’m not a capacity administrator, I can’t take screen shots of the admin experience, but thankfully someone who is a capacity administrator[4] took screen shots and shared them with me.

Admin 1

On the admin page, the capacity administrator selects “workloads” and then explicitly enables the “dataflows” workload.

Admin 2

Once the dataflows workload has been enabled, the capacity administrator can also specify the maximum percentage of the node’s memory that can be used for dataflows. This provides an additional level of control, and can help ensure that no matter what dataflows are included in workspaces assigned to the capacity node, sufficient resources remain available to the “traditional” workloads in a Power BI application.

And, of course, remember that dataflows are not a Premium-only feature. If you’re a Power BI Pro user and aren’t using Premium capacity, you don’t need to worry about any of this.

If you’re interested in a comparison of what’s available in dataflows in Power BI Pro vs. Power BI Premium, check out the online documentation.

If you don’t feel like reading the docs, here’s a quick cheat sheet. Only these specific dataflows features are limited to Power BI Premium:

  1. Linked and computed entities
  2. Incremental refresh of entities

All other dataflows features, including the ability to use your own organizational data lake storage and the ability to attach external CDM folders as dataflows, are available without Premium.


[1] In case you haven’t heard this term before, it is indeed a real thing, and even has a big mention on Wikipedia.

[2] Or anything else, for that matter, but we don’t talk about that.

[3] I hate to give this away, but I know this guy and he doesn’t actually have a cube. Don’t tell anyone, because he’d probably be embarrassed.

[4] Kudos and thanks to Anton from the dataflows PM team.

Dataflows in Power BI: Overview Part 2 – Hands-on

Part 1 of this series introduced Power BI dataflows from a largely conceptual perspective. This post is going to be much more hands-on. Let’s look at what’s involved with creating a dataflow in Power BI.

To get started, you’ll need an app workspace. Dataflows are created in Power BI workspaces just like datasets, reports, and dashboards. But because dataflows are designed for sharing and reuse, you can’t create them in your personal “My Workspace,” which is designed for… not-sharing-and-reuse.

I’m going to start with this workspace:

Starting workspace

You’ll probably notice that this is a new “v2” workspace. You’ll probably also notice that this workspace is assigned to Power BI Premium capacity. Neither of these factors is required to get started with Power BI dataflows, but for some specific features[1] they will be important.

In my workspace I’ll select the “create” button and then select “dataflow” from the menu.

Oh. Curses.

Dataflows not enabled error message

I swear I didn’t plan this, but let’s use it as a learning experience.[2]

The use of dataflows needs to be enabled by a Power BI capacity administrator for a given Premium capacity node. Dataflows use compute and storage resources. Requiring a capacity administrator to explicitly enable dataflows gives them control over how the capacity they manage is used. This is great for the capacity administrator… but annoying for the blogger.

Anyway, as I was saying, I’m using this new workspace, which is on Power BI shared capacity, since I’m not doing anything in this post that relies on Premium. Yeah.

new workspace

So… In my new workspace I’ll select the “create” button and then select “dataflow” from the menu. For real this time.

new dataflow

I’m now prompted to add new entities, which is good, because this is exactly what I want to do. If you think about a dataflow as being a self-service data warehouse or data lake, an entity is like a table within it.

Creating a new entity opens Power Query, where I can select a data source and provide my connection details. The details are a little different from Power Query in Power BI Desktop or Excel, but the experience is the same. I have an interactive preview of the data, I can transform at the column or table level, I can view and edit the transformation steps I’ve applied, and I can view the underlying “M” Power Query script in the advanced editor. Users who are familiar with Power Query in Power BI Desktop should find this to be a very familiar experience.

editor
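
To give you a feel for what ends up in the advanced editor, here’s an example of the kind of “M” a simple entity might be built from. The source URL and column names below are placeholders, not the query shown in the screen shot above:

    let
        // Placeholder source – any supported connector works the same way
        Source = Csv.Document(Web.Contents("https://example.com/sales.csv"), [Delimiter = ",", Encoding = 65001]),
        PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
        // Column-level transformations show up as steps, just like in Power BI Desktop
        ChangedTypes = Table.TransformColumnTypes(PromotedHeaders, {{"OrderDate", type date}, {"Amount", type number}})
    in
        ChangedTypes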

When I’m done – when the query has been given a meaningful name and all transformations have been applied – I can create more entities[3] or save the dataflow, giving it a name and an optional description.

save

When I save the dataflow, I am given the option to refresh now, or to set up a refresh schedule. I can do both of these things later from the dataflows list in the workspace page, but it’s convenient to be prompted when saving the dataflow. Refreshing the dataflow will execute the query for each entity in the dataflow, and will populate the underlying Azure Data Lake Storage with the results of the queries.

refresh now

Once I’ve created and refreshed the dataflow, it’s time to use it. Dataflows are designed to be used as data sources for analysis[4], so it makes sense to connect to the dataflow from Power BI Desktop and use it to start building my data model.

Power BI Desktop now includes a “Power BI dataflows (Beta)” option in the Get Data menu. When I select it, I can see all of the workspaces in my Power BI tenant that I have permission to access and that contain dataflows. Inside each workspace I can see the dataflows it contains, and inside each dataflow I can see its entities. And just like with other data sources, I can select, preview, and load the data into my data model – or edit it in Power BI Desktop to further refine and prepare it. Dataflows are built and managed in Power BI, but they act like any other data source that an analyst or business user might use.

get data
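
For the curious: the query that this experience generates is itself just “M”. A rough sketch of its shape is below. It assumes the PowerBI.Dataflows connector function and navigates by name for readability, while the auto-generated query typically navigates by workspace and dataflow IDs – and the workspace, dataflow, and entity names here are placeholders:

    let
        Source = PowerBI.Dataflows(null),
        // Navigate from workspace to dataflow to entity; the names are placeholders
        Workspace = Source{[workspaceName = "Contoso Analytics"]}[Data],
        Dataflow = Workspace{[dataflowName = "World Bank Climate Data"]}[Data],
        Entity = Dataflow{[entity = "Climate Change Data"]}[Data]
    in
        Entity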

When I’m done building my PBIX file, which can contain data from multiple dataflows in multiple workspaces, I can publish it to the Power BI service into any workspace where I have the necessary permissions. A dataflow in one workspace can serve as a data source for datasets and reports in any number of workspaces in the tenant.

That’s the basic end-to-end flow for creating and using a dataflow in Power BI. As you can imagine, there’s more to it, but that will wait for another post…


[1] None of which will be used or mentioned in this post.

[2] If you’re wondering how this Premium workspace contains dataflows but I can’t create dataflows in this Premium capacity, ask me in person some time. It’s a long story.

[3] If a dataflow is like a database, and an entity is like a table, who wants a database with just one table in it?

[4] Remember the last post?

Dataflows in Power BI: Overview Part 1 – Concepts

If you watched the Microsoft Ignite session recording that I called out in my last “resources” post, the pictures below will look familiar[1]. If you haven’t watched it, you may want to do so now, because it will provide a more structured[2] introduction to the topic than this overview, which promises to be quite rambling.

This is (probably) what your modern BI application looks like from a distance. You have a bunch of data sources that are producing data from which you want to gain insights. You have an ETL system that pushes that data into a data warehouse[3]. You have a data warehouse that serves as a location to stage, cleanse and standardize the data, and transform it into a shape more suited for analysis. You have an analytics model that caches the data from the data warehouse and enables high-speed ad hoc queries. And you have reports and dashboards that use this analytics model to deliver interactive visualizations and business insights to users. Life is good.

Until it isn’t.

Life is good until the demands of the business outpace the ability of the IT team to keep up. If it takes weeks or months for a central team to deliver what business users need, the opportunity has already passed, and business users wait for no process. Business finds a way, and that’s how shadow data and shadow IT are born.

And of course this is one of the reasons we have self-service business intelligence tools. SSBI tools are designed to make it easy[4] for analysts and other non-developers to build BI solutions without the need for direct IT involvement. This can free IT to focus on major strategic initiatives that deliver big value, while allowing business users to solve their own day-to-day problems. Life is good.

Until it isn’t.

Life is good until the BI solutions that the analysts have built become too complex, too numerous, or too widely used to operate without some oversight, support, or management from IT.  Life is good until the lack of a data warehouse in a self-service BI solution re-introduces some or all of the problems that a data warehouse solves[5].

You see, most self-service BI tools don’t include a data warehouse or data lake. Although they will integrate with data warehouses or data lakes that are built by data professionals, they don’t let analysts and business users perform data warehousing tasks in tools that are designed for them.

This is where dataflows come in.

I should emphasize that dataflows are much more than just a self-service data warehouse / data lake in Power BI. There’s a reason that this post is only part one of a series. But this is a good place to start.

Dataflows are a capability in Power BI that allows users to build reusable data sets called entities, using a familiar Power Query user experience.

Unlike Power BI datasets, which are implemented as Analysis Services Tabular models, dataflows are implemented as folders and files in Azure Data Lake Storage gen2.

As with Power BI datasets, the underlying technology is opaque to the user. An analyst doesn’t need to know or care that the data model she designs in Power BI Desktop is actually an Analysis Services model. Similarly, she doesn’t need to know that the dataflow she designs in Power BI is actually a bunch of files in a folder. The Power BI service manages the details and presents them through a friendly user experience, making it easier to build better BI applications.

If you look at the image at the top of this post, you’ll see that the “self-service” box skips over the data warehouse tier. Dataflows fill this gap, and allow the same users who are familiar with datasets, reports, and queries in Power BI to also create the reusable building blocks of data that can then serve as a data source for their datasets.

Like I said up above, dataflows are much more than this, but even this is a pretty darned big deal. Enabling self-service data preparation inside Power BI, backed by Azure Data Lake Storage, and integrated into the Power BI tools with the goal of simple and manageable data reuse, is incredibly exciting[6]. And this is just the beginning…

 


[1] Because I’ve taken them directly from the presentation deck.

[2] I assume it’s more structured. I haven’t actually watched it myself. Please let me know if it’s any good.

[3] And/or data lake. For the purposes of this discussion, they both play the same role, even though the technology and functional characteristics are different.

[4] For a given value of “easy.”

[5] The data warehouse pattern solves many common and significant problems that are present in analytics applications. I won’t go into detail here, because there’s a ton of available information already out there, including this Wikipedia page.

[6] To me. And now, to you.

Dataflows in Power BI: Resources

As you may be aware, the Power BI team is working on new self-service data preparation capabilities, and after a few early name changes[1] has settled on the name “dataflows” for this new feature. Given my decades-long love affair with data preparation and ETL, dataflows have been a natural fit for me, and I’ve been kicking the tires for the past few months while dataflows have been in private preview. Now that we have finally[2] announced the public preview of dataflows, I’m planning on kicking off the new blog with a bunch of dataflow posts, and keeping them coming.

This is the first one.

Rather than start off with my personal take[3], I’d like to share a set of existing resources available online. Start here.

Dataflows documentation: This is the official documentation for dataflows in Power BI, and will continue to be revised and expanded.

Read this: https://docs.microsoft.com/en-us/power-bi/service-dataflows-overview

Dataflows whitepaper: This is a 20-page technical whitepaper from Microsoft Technical Fellow Amir Netz, and is my personal favorite resource on dataflows. If you want to know how the team building dataflows thinks about the feature – and where they’re thinking about taking it in the future – you will love it too.

Read this: https://go.microsoft.com/fwlink/?linkid=2011419&clcid=0x409

Microsoft Business Applications Release Notes: Microsoft publishes forward-looking release notes for its Business Applications portfolio, including Dynamics 365 and Power BI. This is the best place to go for information about upcoming capabilities in Power BI in general, and this is true of dataflows as well.

Read this: https://docs.microsoft.com/en-us/business-applications-release-notes/October18/intelligence-platform/power-bi-service/self-service-data-prep

Microsoft Ignite: At the Microsoft Ignite conference last month, I co-presented an introductory Power BI dataflows session with Miguel Llopis. It’s only 45 minutes in length, and if you’re brand new to dataflows, this is a great place to start.

Watch this: https://myignite.techcommunity.microsoft.com/sessions/65437

Update: The Ignite 2018 sessions are no longer current. Look here for Ignite 2019 sessions instead: https://myignite.techcommunity.microsoft.com/sessions

Microsoft Business Applications Summit: Back in July, Adi and Yaron from the dataflows team presented what I believe was the first public conference session on dataflows. If you’re interested in hearing about dataflows directly from the folks who are building it, this is for you.

Watch this: https://www.microsoft.com/en-us/businessapplicationssummit/video/BAS2018-2117

Update: The MBAS 2018 sessions are no longer current. Look here for MBAS 2020 sessions instead: https://community.powerbi.com/t5/MBAS-Gallery/bd-p/MBAS_Gallery

More videos: To help support the dataflows public preview announcement, my teammate Adam Saxton and I each recorded short overview videos.[4]

 

 

Wow. One blog post down. Let’s see if I can keep this up.


[1] Don’t even get me started.

[2] It felt like forever, didn’t it?

[3] There will be enough of that soon enough.

[4] Adam’s is around 12 minutes long. Mine is around twice that. Once the recording was complete, the dataflows team let me know they wanted something around 10 minutes. I think they may have asked the wrong person…

Welcome to BI Polar

Before I joined Microsoft, I was a pretty active blogger. I blogged mainly about SQL Server Integration Services (SSIS) and other data-related topics, but would also often cover non-data-related topics of personal interest. The blog title “BI Polar” fit for multiple reasons.

When I joined Microsoft, my blogging efforts dried up quickly. Not only did I have an exciting new job that consumed most of my free time, but I also just wasn’t comfortable sharing my personal perspective on Microsoft technologies as a Microsoft employee, because everything seemed so new and big and complicated…

Anyway… ten years[1] later, I’m back.

This is my personal blog. I’ll probably post about Power BI, Azure, and general data-related topics. I’ll probably post about cooking and baking. I’ll probably post about mental health and diversity. I’ll almost certainly post about heavy metal and swords. The tone will likely be irreverent, because the voice will be mine – as will all the opinions.

If you want to provide feedback, please send it to me via Twitter.

Let’s do this.


[1] Give or take a month or two. Who’s counting?