Avoid Repetitive Steps in MS Fabric Dataflow Gen2 with Custom Functions
Parth Lad
8 min read
Table of contents
While working with Dataflow Gen2 or Power BI, If you keep finding yourself needing to apply the same transformations to different queries or values, making a custom function in Power Query can save you loads of time. It's like using SQL functions to make repetitive tasks easier. A Power Query custom function uses native M functions, takes parameters as arguments, and returns an output.
If you're unfamiliar with Power Query or would like a refresher, be sure to check out these blog post to learn more!
How to Create a Function in Power Query
Creating a custom function in Power Query is straightforward. Here's the basic syntax:
FunctionName = (Parameter1 as Type1, Parameter2 as Type2, ...) as ReturnType =>
let
// Function logic here
Result = SomeExpression
in
Result
There are two main ways to create custom functions. You can either start from scratch with a Blank Query or turn an existing query (table) transformation into a function. Check out these screenshots for reference:
Custom Functions in The Real World Data Transformations
Imagine you're working with a large flat table filled with sales data spanning various products and customers. To improve data management and analysis, you decide to split this table into Fact and Dimension tables.
Let's say you want to turn this flat table into more organized tables where each product and customer has a unique ID. You'll need to merge both dimension tables back into the fact table using the Name columns, which is similar to doing a left join in SQL to get primary keys from dimension tables as foreign keys in fact table. This approach works well, but if we want to do the same steps for multiple fact tables, we will need to repeat these steps again and again.
This is where creating custom functions becomes useful. We can use the second approach by converting an existing query into a function by right-clicking on the table. I have already transformed the flat table to create DIMCustomer and DIMProduct tables with primary key columns, as shown below:
I've added the M code for these three tables with all transformation steps at the end of this blog for you to check out.
Steps to Create Custom Functions
Let's go through the steps to make these custom functions. We'll create two functions called GetProductKey
and GetCustomerKey
. These will match the Name columns in the fact table with those in the dimension tables.
Before we make the functions, we need to do a few transformation steps:
Reference the DIMProduct Table:
Start by creating a reference of the DIMProduct table..
Filter the ProductName Column:
We want to call this function on the ProductName column in the FCTSales table, so filter the table by the ProductName column(select any single value).
Transform to Obtain a Scalar Value:
After filtering, select the ProductKey column and transform it to get the minimum or maximum value. This step is crucial to obtain a scalar value, which can be returned from a function.
Rename and Edit the Query:
Rename the query to
GetProductKey
and open the Advanced Editor to see the M Code for all the transformations we performed.Update the Query to Use Parameters:
Modify the query to take ProductName as a text parameter and use the parameter instead of the static filter "Apple" to make it dynamic. Here's how the final function code looks:
//Add the following line to pass ProductName as parameter. (ProductName as text) as number => let Source = DIMProduct, //In the following step, we have replaced "Apple" with ProductName parameter. //#"Filtered rows" = Table.SelectRows(Source, each ([ProductName] = "Apple")), #"Filtered rows" = Table.SelectRows(Source, each ([ProductName] = ProductName)), #"Calculated maximum" = List.Max(#"Filtered rows"[ProductKey]) in #"Calculated maximum"
Here's a screenshot showing the
GetProductKey
function:
You can follow the same steps to create the
GetCustomerKey
function. Simply reference the DIMCustomer table, filter by the CustomerName column, and perform similar transformations to obtain the CustomerKey as a scalar value.
Using Custom Functions in Your Fact Table
Now that we've created our custom functions, let's use them in the FCTSales table. Here are the steps:
Invoke Custom Function:
Go to the Add Column menu at the top and click on "Invoke custom function." A window will pop up where you can map column values as parameters.
Map Column Values:
Select the "Use Values in a Column" option and from the dropdown menu, choose the "ProductName" column and hit "OK."
Rename and Update Data Type:
A new column will appear. Rename this column to "ProductKey" by replacing "Invoked custom function" in both places. Then, change the data type to
Int64.Type
.Repeat for CustomerKey:
Follow the same steps for the
GetCustomerKey
function to add the "CustomerKey" column.Clean Up and Reorganize:
Remove the "ProductName" and "CustomerName" columns from the fact table and Reorganize the remaining columns as needed to finalize your table.
Summary
To wrap things up, custom functions in Power Query make it way easier to avoid repeating the same data tasks in tools like Dataflow Gen2 and Power BI. This guide shows you how to create and use custom functions to organize sales data into Fact and Dimension tables. By turning your existing queries into useful functions, you can manage large datasets effortlessly and automate complex tasks. We've gone through the steps and provided M code examples for creating GetProductKey
and GetCustomerKey
functions, which demonstrate how to filter and rearrange data columns quickly.
References
Using custom functions in Power Query - Power Query | Microsoft Learn
Understanding Power Query M functions - PowerQuery M | Microsoft Learn
M Code for Reference
As I mentioned, I've added the M code for all tables below. The first line has sample data since I used the "Enter Data" option in Power Query.
M Code for DIMProduct Table
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLWNTAEIiDHsaAgJxVIe+Vn5Cm45IOYhgYgQs/UQClWJ1rJCFkHiOOUmAeEIC2JeakKwbmZJRlAjikQG+iZm4L1GCPrAXH8ixLz0tGsMYJYYwSxxgRZC4jjC9SRD3JgTmZyqoJXfl5qMZBnAVKmZwDUEwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SaleID = _t, Date = _t, ProductName = _t, CustomerName = _t, Quantity = _t, Price = _t]),
#"Changed column type" = Table.TransformColumnTypes(Source, {{"SaleID", Int64.Type}, {"Date", type date}, {"ProductName", type text}, {"CustomerName", type text}, {"Quantity", Int64.Type}, {"Price", type number}}),
#"Removed other columns" = Table.SelectColumns(#"Changed column type", {"ProductName"}),
#"Removed duplicates" = Table.Distinct(#"Removed other columns", {"ProductName"}),
#"Added index" = Table.AddIndexColumn(#"Removed duplicates", "ProductKey", 1, 1, Int64.Type),
#"Reordered columns" = Table.ReorderColumns(#"Added index", {"ProductKey", "ProductName"})
in
#"Reordered columns"
M Code for DIMCustomer Table
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLWNTAEIiDHsaAgJxVIe+Vn5Cm45IOYhgYgQs/UQClWJ1rJCFkHiOOUmAeEIC2JeakKwbmZJRlAjikQG+iZm4L1GCPrAXH8ixLz0tGsMYJYYwSxxgRZC4jjC9SRD3JgTmZyqoJXfl5qMZBnAVKmZwDUEwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SaleID = _t, Date = _t, ProductName = _t, CustomerName = _t, Quantity = _t, Price = _t]),
#"Changed column type" = Table.TransformColumnTypes(Source, {{"SaleID", Int64.Type}, {"Date", type date}, {"ProductName", type text}, {"CustomerName", type text}, {"Quantity", Int64.Type}, {"Price", type number}}),
#"Removed other columns" = Table.SelectColumns(#"Changed column type", {"CustomerName"}),
#"Removed duplicates" = Table.Distinct(#"Removed other columns", {"CustomerName"}),
#"Added index" = Table.AddIndexColumn(#"Removed duplicates", "CustomerKey", 1, 1, Int64.Type),
#"Reordered columns" = Table.ReorderColumns(#"Added index", {"CustomerKey", "CustomerName"})
in
#"Reordered columns"
M Code for FCTSales Table
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLWNTAEIiDHsaAgJxVIe+Vn5Cm45IOYhgYgQs/UQClWJ1rJCFkHiOOUmAeEIC2JeakKwbmZJRlAjikQG+iZm4L1GCPrAXH8ixLz0tGsMYJYYwSxxgRZC4jjC9SRD3JgTmZyqoJXfl5qMZBnAVKmZwDUEwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SaleID = _t, Date = _t, ProductName = _t, CustomerName = _t, Quantity = _t, Price = _t]),
#"Changed column type" = Table.TransformColumnTypes(Source, {{"SaleID", Int64.Type}, {"Date", type date}, {"ProductName", type text}, {"CustomerName", type text}, {"Quantity", Int64.Type}, {"Price", type number}}),
#"Invoked custom function" = Table.TransformColumnTypes(Table.AddColumn(#"Changed column type", "ProductKey", each GetProductKey([ProductName])), {{"ProductKey", Int64.Type}}),
Custom = Table.TransformColumnTypes(Table.AddColumn(#"Invoked custom function", "CustomerKey", each GetCustomerKey([CustomerName])), {{"CustomerKey", Int64.Type}})
in
Custom
Written by
Parth Lad
I'm a data analyst who loves finding insights from numbers and visualizing them. I write about Data Analytics, Data Engineering, Power BI, and DAX on Medium & Hashnode.
Follow me for more!✌️😉