Avoid Repetitive Steps in MS Fabric Dataflow Gen2 with Custom Functions

PL

Parth Lad

8 min read

Microsoft
microsoftfabric
PowerBI
PowerQuery
Azure
data
Databases
ETL
Cover Image for Avoid Repetitive Steps in MS Fabric Dataflow Gen2 with Custom Functions

While working with Dataflow Gen2 or Power BI, If you keep finding yourself needing to apply the same transformations to different queries or values, making a custom function in Power Query can save you loads of time. It's like using SQL functions to make repetitive tasks easier. A Power Query custom function uses native M functions, takes parameters as arguments, and returns an output.

If you're unfamiliar with Power Query or would like a refresher, be sure to check out these blog post to learn more!

How to Create a Function in Power Query

Creating a custom function in Power Query is straightforward. Here's the basic syntax:

FunctionName = (Parameter1 as Type1, Parameter2 as Type2, ...) as ReturnType =>
let
    // Function logic here
    Result = SomeExpression
in
    Result

There are two main ways to create custom functions. You can either start from scratch with a Blank Query or turn an existing query (table) transformation into a function. Check out these screenshots for reference:

Screenshot of the Power Query user interface in Power BI. The left pane shows a dropdown under "Get data" with options like Excel workbook, Dataflows, SQL Server database, Text/CSV, Web page, and Blank query (highlighted). The right pane shows a context menu for an item under which “Create function...” is highlighted.

Custom Functions in The Real World Data Transformations

Imagine you're working with a large flat table filled with sales data spanning various products and customers. To improve data management and analysis, you decide to split this table into Fact and Dimension tables.

Let's say you want to turn this flat table into more organized tables where each product and customer has a unique ID. You'll need to merge both dimension tables back into the fact table using the Name columns, which is similar to doing a left join in SQL to get primary keys from dimension tables as foreign keys in fact table. This approach works well, but if we want to do the same steps for multiple fact tables, we will need to repeat these steps again and again.

This is where creating custom functions becomes useful. We can use the second approach by converting an existing query into a function by right-clicking on the table. I have already transformed the flat table to create DIMCustomer and DIMProduct tables with primary key columns, as shown below:

The image shows two query tables in Power Query. The first table, "DIMProduct," lists product keys and product names: 1 for Apple, 2 for Banana, 3 for Orange, and 4 for Mango. The second table, "DIMCustomer," lists customer keys and customer names: 1 for John Doe, 2 for Jane Smith, and 3 for Alice Jones. Both tables are part of a Queries list that also includes "FCTSales".

I've added the M code for these three tables with all transformation steps at the end of this blog for you to check out.

Steps to Create Custom Functions

Let's go through the steps to make these custom functions. We'll create two functions called GetProductKey and GetCustomerKey. These will match the Name columns in the fact table with those in the dimension tables.

Before we make the functions, we need to do a few transformation steps:

  1. Reference the DIMProduct Table:

    Start by creating a reference of the DIMProduct table..

    Screenshot of a data table in Power Query. The table has columns for ProductKey and ProductName, listing items such as Apple, Banana, Orange, and Mango. A context menu is open, offering options like Copy, Paste, Delete, Rename, Enable staging, Duplicate, and Reference.

  2. Filter the ProductName Column:

    We want to call this function on the ProductName column in the FCTSales table, so filter the table by the ProductName column(select any single value).

    Power Query editor displaying a table with columns labeled "ProductKey" and "ProductName" and rows filled with data for Apple, Banana, Orange, and Mango. The "ProductName" column shows a filter dropdown with only "Apple" selected. The "Queries" pane on the left lists four queries including "FCTSales" and "DIMCustomer."

  3. Transform to Obtain a Scalar Value:

    After filtering, select the ProductKey column and transform it to get the minimum or maximum value. This step is crucial to obtain a scalar value, which can be returned from a function.

    A screenshot of the “Transform” tab in Power Query Editor, part of Microsoft Fabric or Microsoft Power BI. The active pane shows the “Statistics” dropdown menu with options: Sum, Minimum, Maximum (selected), Median, Average, Standard deviation, Count values, and Count distinct values. The "Queries" pane on the left shows a list of queries including "FCTSales", "DIMProduct", "DIMCustomer", and "DIMProduct (2)".

  4. Rename and Edit the Query:

    Rename the query to GetProductKey and open the Advanced Editor to see the M Code for all the transformations we performed.

    Power Query editor screenshot showing a custom formula to get the maximum ProductKey value from filtered rows using the List.Max function. The "Query settings" panel on the right lists the applied steps, including "Source," "Filtered rows," and "Calculated maximum." The name of the query is "GetProductKey."

    A screenshot of the "Power Query" window in Microsoft Power BI with the "Advanced editor" highlighted in yellow. The editor shows code that filters a table to select rows where the "ProductName" is "Apple" and calculates the maximum value of "ProductKey" from these filtered rows.

  5. Update the Query to Use Parameters:

    Modify the query to take ProductName as a text parameter and use the parameter instead of the static filter "Apple" to make it dynamic. Here's how the final function code looks:

     //Add the following line to pass ProductName as parameter.
     (ProductName as text) as number => 
     let
       Source = DIMProduct,
       //In the following step, we have replaced "Apple" with ProductName parameter.
       //#"Filtered rows" = Table.SelectRows(Source, each ([ProductName] = "Apple")),
       #"Filtered rows" = Table.SelectRows(Source, each ([ProductName] = ProductName)),
       #"Calculated maximum" = List.Max(#"Filtered rows"[ProductKey])
     in
       #"Calculated maximum"
    

    Here's a screenshot showing the GetProductKey function:

    Screenshot of a query interface with four queries listed: FCTSales, DIMProduct, DIMCustomer, and GetProductKey. The GetProductKey query is selected and it displays a parameter input for "ProductName" with buttons labeled "Invoke" and "Clear".

You can follow the same steps to create the GetCustomerKey function. Simply reference the DIMCustomer table, filter by the CustomerName column, and perform similar transformations to obtain the CustomerKey as a scalar value.

Using Custom Functions in Your Fact Table

Now that we've created our custom functions, let's use them in the FCTSales table. Here are the steps:

  1. Invoke Custom Function:

    Go to the Add Column menu at the top and click on "Invoke custom function." A window will pop up where you can map column values as parameters.

    A screenshot of Power Query in Microsoft Fabric. The "Invoke custom function" dialog box is open, with fields for "Function name" and "ProductName." The drop-down menu under "ProductName" displays options such as "Enter a value," "Use values in a column," "Select a parameter," and "Select a query." The background shows a data table with columns for "SaleID," "Date," and "ProductName," containing items like Apple, Banana, Orange, and Mango. The Queries pane lists FCTSales, DIMProduct, DIMCustomer, and GetProductKey. The Query settings pane shows the name "FCTSales" and applied steps including "Source" and "Changed column type."

  2. Map Column Values:

    Select the "Use Values in a Column" option and from the dropdown menu, choose the "ProductName" column and hit "OK."

  3. Rename and Update Data Type:

    A new column will appear. Rename this column to "ProductKey" by replacing "Invoked custom function" in both places. Then, change the data type to Int64.Type.

  4. Repeat for CustomerKey:

    Follow the same steps for the GetCustomerKey function to add the "CustomerKey" column.

  5. Clean Up and Reorganize:

    Remove the "ProductName" and "CustomerName" columns from the fact table and Reorganize the remaining columns as needed to finalize your table.

Summary

To wrap things up, custom functions in Power Query make it way easier to avoid repeating the same data tasks in tools like Dataflow Gen2 and Power BI. This guide shows you how to create and use custom functions to organize sales data into Fact and Dimension tables. By turning your existing queries into useful functions, you can manage large datasets effortlessly and automate complex tasks. We've gone through the steps and provided M code examples for creating GetProductKey and GetCustomerKey functions, which demonstrate how to filter and rearrange data columns quickly.

References

M Code for Reference

As I mentioned, I've added the M code for all tables below. The first line has sample data since I used the "Enter Data" option in Power Query.

M Code for DIMProduct Table

let
  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLWNTAEIiDHsaAgJxVIe+Vn5Cm45IOYhgYgQs/UQClWJ1rJCFkHiOOUmAeEIC2JeakKwbmZJRlAjikQG+iZm4L1GCPrAXH8ixLz0tGsMYJYYwSxxgRZC4jjC9SRD3JgTmZyqoJXfl5qMZBnAVKmZwDUEwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SaleID = _t, Date = _t, ProductName = _t, CustomerName = _t, Quantity = _t, Price = _t]),
  #"Changed column type" = Table.TransformColumnTypes(Source, {{"SaleID", Int64.Type}, {"Date", type date}, {"ProductName", type text}, {"CustomerName", type text}, {"Quantity", Int64.Type}, {"Price", type number}}),
  #"Removed other columns" = Table.SelectColumns(#"Changed column type", {"ProductName"}),
  #"Removed duplicates" = Table.Distinct(#"Removed other columns", {"ProductName"}),
  #"Added index" = Table.AddIndexColumn(#"Removed duplicates", "ProductKey", 1, 1, Int64.Type),
  #"Reordered columns" = Table.ReorderColumns(#"Added index", {"ProductKey", "ProductName"})
in
  #"Reordered columns"

M Code for DIMCustomer Table

let
  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLWNTAEIiDHsaAgJxVIe+Vn5Cm45IOYhgYgQs/UQClWJ1rJCFkHiOOUmAeEIC2JeakKwbmZJRlAjikQG+iZm4L1GCPrAXH8ixLz0tGsMYJYYwSxxgRZC4jjC9SRD3JgTmZyqoJXfl5qMZBnAVKmZwDUEwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SaleID = _t, Date = _t, ProductName = _t, CustomerName = _t, Quantity = _t, Price = _t]),
  #"Changed column type" = Table.TransformColumnTypes(Source, {{"SaleID", Int64.Type}, {"Date", type date}, {"ProductName", type text}, {"CustomerName", type text}, {"Quantity", Int64.Type}, {"Price", type number}}),
  #"Removed other columns" = Table.SelectColumns(#"Changed column type", {"CustomerName"}),
  #"Removed duplicates" = Table.Distinct(#"Removed other columns", {"CustomerName"}),
  #"Added index" = Table.AddIndexColumn(#"Removed duplicates", "CustomerKey", 1, 1, Int64.Type),
  #"Reordered columns" = Table.ReorderColumns(#"Added index", {"CustomerKey", "CustomerName"})
in
  #"Reordered columns"

M Code for FCTSales Table

let
  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLWNTAEIiDHsaAgJxVIe+Vn5Cm45IOYhgYgQs/UQClWJ1rJCFkHiOOUmAeEIC2JeakKwbmZJRlAjikQG+iZm4L1GCPrAXH8ixLz0tGsMYJYYwSxxgRZC4jjC9SRD3JgTmZyqoJXfl5qMZBnAVKmZwDUEwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SaleID = _t, Date = _t, ProductName = _t, CustomerName = _t, Quantity = _t, Price = _t]),
  #"Changed column type" = Table.TransformColumnTypes(Source, {{"SaleID", Int64.Type}, {"Date", type date}, {"ProductName", type text}, {"CustomerName", type text}, {"Quantity", Int64.Type}, {"Price", type number}}),
  #"Invoked custom function" = Table.TransformColumnTypes(Table.AddColumn(#"Changed column type", "ProductKey", each GetProductKey([ProductName])), {{"ProductKey", Int64.Type}}),
  Custom = Table.TransformColumnTypes(Table.AddColumn(#"Invoked custom function", "CustomerKey", each GetCustomerKey([CustomerName])), {{"CustomerKey", Int64.Type}})
in
  Custom

Written by

PL

Parth Lad

I'm a data analyst who loves finding insights from numbers and visualizing them. I write about Data Analytics, Data Engineering, Power BI, and DAX on Medium & Hashnode.

Follow me for more!✌️😉