Tag Archives

2 Articles

Combining Excel files using Power Query for Excel

For the information in this post I am heavily indebted to Michiel Rozema who originally figured this out.

This post is an extension to my previous post on combining text (CSV) files using Power Query for Excel. In this post we will go through the same steps again, but this time for Excel sheets. Combining Excel sheets is a bit harder than text files however.

Being able to do this is very handy if you have multiple Excel sheets reporting on different periods, regions or products and you want to combine the data from those sheets into one table to use in your reporting.

First we need to make sure that ‘Advanced Query Editing’ is enabled. To do this, open Excel, go to the Power Query tab and choose Options. Then make sure the checkmark for ‘Advanced Query Editing’ is enabled. In order to successfully combine multiple Excel sheets we will need to modify the query’s script.

Just like with combining text files using Power Query, get started by clicking ‘From File’ and then ‘From Folder’.

Specify the folder where your Excel files are located and choose ‘OK’.

As before Power Query returns the list of Excel files in the folder:

Lets click on the first cell of the first row to get the sheets of the Excel workbook:

If you now click on ‘Table’ in ‘Data’ column of the Sheet1 row you will get the contents of the sheet itself:

Let’s first fix the headers. Right-click on the little table icon in the top left and choose ‘Use First Row As Headers’.

Now, in order to load the rest of the Excel files we need to modify the code a bit. To do this click on the little script icon just above the table:

 

 

The query editor opens and shows the M-script you have generated in the previous steps. Here is mine:

If we take a closer look at this M-script we see the following structure:

What happens is that we define A and then in steps apply functions on A. In each step we take as input the output of the previous step. In the end we return the result of the last function.

The M-script above is quite static in the sense that it applies only to the Excel sheet we selected (in this case ‘Sales Data 1.xlsx’). What we need to do is find a way to provide a parameter to this code and execute the code for every Excel sheet in the directory.

To make this work we will need to define a function based on the M-script we currently have. To do that we need to change the M-script to:

What we do here is defining a function with two parameters (the path to the files and a filename). Make sure you match the casing of ‘in’ and ‘let’ correctly! Do not use any capitals. Change your M-script accordingly and click ‘Done’.

Your Query Editor should now look like this:

Now, let’s make sure we understand what this query does by changing the name from ‘Query1’ to something like ‘GetExcel’ by double clicking on ‘Query1’ in this screen and entering the new name. We will use this name to invoke the function.

Click ‘Done’. We will invoke this function in a just a little while.

On the Power Query tab click ‘From File’ and choose ‘From Folder’ again. Re-enter the folder where your Excel files are stored and click ‘OK’. Now right-click a colum header and choose ‘Insert Column’ and choose ‘Custom…’. A formula editing screen opens:

Here we will need to invoke the function we defined earlier by entering the following:

Click ‘OK’ and rename the column by right-clicking the column name and choosing ‘Rename’. I renamed the column to ‘FileContents’:

Now click the icon to select the columns to expand from your Excel sheet and click ‘OK’.

Now we have the contents of the files combined with information about the files in one table. To get rid of all columns except for the contents select the three contents columns, right-click and select ‘Remove Other Columns’:

Optionally you can rename the columns. When done click ‘Done’ to get the data in Excel.

The best part is that if you add a sheet to the directory and click ‘Refresh’ in the Query tab of the ribbon, the data will get added to the result set:

So, if you have set this up and new data needs to be added as long as the structure of your Excel sheets does not change you can just click ‘Refresh’ and it works! This is the amazing power of Power Query.

 

 

 

 

Combining text (csv) files using Power Query for Excel

For the information in this post I am heavily indebted to Michiel Rozema who originally figured this out.

In this post we will have a look at how text (CSV) files in a folder can be combined using Power Query for Excel. In a later post will we extend this and load Excel files. Having this capability is very handy if you have files reporting on the same info (say sales figures) from different regions or multiple periods. You can keep the files as is but load them into Excel / Power Pivot as one table.

To start, load Excel and make sure Power Query is installed. Then on the Power Query tab click From File and then choose From Folder.

Next, specify the path where the text files are stored and choose Ok. What you will get is a list of files in the folder:

In this sample I have two csv files in the folder. The columns contain information about the file, such as date modified and file name. Some columns are special however and the column we are interested in is the first column (Content). Clicking on ‘Binary’ in that column in any row gets you to the content of the file you clicked. That is nice, but what we really want is to load all content of the two files. To do that we need to click the little icon next to the Content column header that looks a bit like two downward pointing arrows: . Click it and the contents of the files is combined together and shown in a new column (Column1):

The information from all files in the folder is now present, but all in one column. We need to split it by right-clicking the column, and choosing ‘Split Column’ and ‘By Delimiter’. I chose ‘Semicolumn’ as delimiter and clicked ‘OK’.

Now we still have the column headers from the CSV files in our set. In order to use the first row as column headers click the little table icon in the top left () and choose ‘Use First Row As Headers’.

Now we need to filter out the column headers from the second CSV file. I did this by filtering the first column just like you would in Excel. The result I ended up with is this:

Click ‘Done’ and your data will load into Excel, ready to be used!

%d bloggers like this: