Power Automate: Group By Multiple Columns on a Collection

When recently working on a flow, I had  a scenario where we need to group an existing collection using multiple columns. Typically this would be a very straight forward to accomplish using code.  I thought the same can be achieved in flow easily.

However, after checking the functions reference documentation there isn't a function available that could accomplish this task easily at the time of writing this post. As a result, I had to perform a couple of additional steps to accomplish this.

I will describe the approach I took to achieve this, this is just one approach to solve the problem, there might be other efficient ways to do this.  

Let's take a look at the collection that we will be using and the collection that will be generated.

On the left hand side we will be using the Collection that we will Group By using the data elements DepartmentId and Title. The final result is shown on the right side after we do the Group By.

Here is how the process works:

  1. Initialize two array variables
    1. An array to store the values of each unique id that is used to group the records. If you just have single column, you can just add the unique column element into the array. If you have multiple columns, generate a unique id from those columns. e.g. Join the columns using a delimiter
    2. An array to store the final collection that is grouped by the columns.
  2. Loop through all the items of the source collection and append each unique id into the array that stores the unique group ids. Remember that if we use multiple columns we generate the unique id by joining the columns into one string and adding it to the array. This makes it easier to compare.
  3. Initialize a running object variable that stores the current grouped object during looping. This will be used for generating the grouped object that will be added to our final array variable that we defined in step 1.2
  4. Initialize a running array variable that will store the list of records that we are grouping. This array will be added as a property on the current grouped object in step 3. In our example this would be the Employees array on the right side of the final outcome picture associated to a group.
  5. Loop through each unique group id and than add a nested loop on the initial array. For each item in the initial array that matches the unique group id, add it to the current group array object defined in step 4.
  6. Add the current group array object to the current group object before the loop ends and continues to the next item on the group array.

If the above approach sounds confusing don't worry, it will make sense when we walk through the steps within the flow.

First, let's assume we have an input array that resembles the above structure, in this case I have define an array variable 'Employees' for the sake of simplicity.

Step 1

Now initialize the array that stores the unique group id.

Initialize the array to store the final grouped array

Step 2

Now, let's loop the through the initial Employees array and add the unique group id into the GroupIds array. Here we are using the concat function to join the elements into a delimited string.

contains(variables('GroupIds'),concat(item()['DepartmentId'],'|',item()['Title']))

In this example the GroupId would be 1|Developer and 2|manager

Step 3 & 4

Now, let's initialize the CurrentGroup running object and the CurrentGroupEmployees running array

Step 5 & 6

Now let's loop through each unique GroupId that we want to group the employees on.

After, we group the Employees, we than set our CurrentGroup object with the  group information and the employees.

This completes our flow. As you can see, that's a lot of steps and variables.

This approach has couple of downsides

  • When there are a lot of records, this approach would take time to group the data and would be slow to complete. If you are using this data for a canvas app, you could leverage the formulas provided by canvas app to Group and Count records instead of building it on the flow.
  • Multiple variables and conditions have been used to accomplish a task that should have been completed through functions.

Until Microsoft provides more functions for the collections, I think we would have to use workarounds like this. If you can think of a better approach I would be interested in knowing about it.