Generic Consolidator

Default Keys: Ctrl + Shift + Alt + C

Intro:

I admit, consolidating data from lots of Excel files is not something I do every day. However! It happens frequently enough that it’s given me ample time to think long and hard about the poor life decisions that had brought to those, mind numbing, consolidation moments… I am joking of course — at least mostly. Really what went through my head, while I was mechanically plugging away at my consolidations, were wishes for some tool that would automate the process for me. I firmly believe that analysts’ time should not be focused on consolidating data, but on the analysis of the data. To that end, I went ahead and built the tool for myself, and am sharing it with you all here as well. It’s the Generic Consolidator Nutility that is part of my Nutilities add-in pack. If you work in Finance, it’s almost guaranteed that you or your team is sending out templates to get forecast or accrual information. With a little time spent up front setting up your templates to be predictable and uniform, this Nutility and article can help you move away from consolidating data, to actually using it.

Throughout the rest of this article, I will go over some examples and instructions on how to use the Generic Consolidator Nutility to automate the consolidation of Excel files. I, myself, have recently used it to consolidate over 200 Excel files. While the example I will demo for you below will be smaller, for ease of presentation, the process is exactly the same as I used for 200+ files, and can easily be scaled upwards of 1000 files if needed.

This Nutility requires you to have the Nutilities Excel add-in installed in order to use it. If you don’t know what Nutilities is, or where to get it, please start at my web-page here: http://iwishexcel.com . The add-in is completely free to use.

High-Level How It Works

Before I go into any examples or detailed instructions, let me first give you a high-level understanding of how the Generic Consolidator Nutility works.

Upon running the Nutility it will create a new workbook with what is called the Control Sheet — see the picture below. The way the Control Sheet operates is very straightforward. Each row will hold the information about a range in an Excel file that you want copied to a consolidated sheet. When you have all the ranges you want consolidated filled in on the rows, you click the “Click to Consolidate” button. The Nutility will then create a new sheet called Consolidated Data, and go row-by-row copying the ranges you have specified to the Consolidated Data sheet. The Control Sheet has 4 main areas to understand, and I have labeled them in the order that they are used.

Area 1 holds the information on where to find each Excel file, as well as which sheet in that file we want data from. When the Control Sheet is created, it allows you to choose a folder where your excel files are kept, and then automatically populates this area for you — you are free to make manual adjustments as necessary to this area.

Area 2 allows you to specify the range of cells you want consolidated. This must be filled in manually.

Area 3 is the button to click once areas 1 & 2 are populated.

Area 4 is used by the Nutility to keep track of how the row was processed. If it has errors consolidating any of the rows, it will note them here, so the user will want to glance at this area once complete to make sure everything was processed OK.

Once the Nutility has finished running, you will find all your consolidated data on the newly created, Consolidated Data sheet — see the picture below. There are 2 main areas to this sheet. Area 1 contains the information about where each row of data came from, the path, file, sheet, and range info. I’ve found this very useful to have because sometimes the folder structure or file-name tells you a lot about the data, such as the VP or Business Unit it belongs to. Area 2 contains the data you wanted consolidated. When the data is copied over, cell values, formats, and comments are kept intact, everything else is removed (formulas for example).

Instructions and Examples

Let’s now take a look at some examples with detailed instructions. In this first example, let’s say that I have all the Excel files I want to consolidate data from sitting in a single folder called “Generic Consolidator Example 1 – Single Folder”.

In that folder I have 3 files I want consolidated.

And the data in each of those files is pictured as below. You can see that each file has a sheet called “Template”, and the data I want starts in cell B5 for each. You can also see that they each have a different number of rows. So for File1 the range I want is B5:D5, while in File2 it’s B5:D6, and in File3 it’s B5:D7. I point this out because the Generic Consolidator Nutility does have a feature to handle this situation, which I will discuss later in this demo.

Now let’s go through the step-by-step running of the Nutility for this example scenario. To start, open Excel and run the Nutility (Ctrl + Shift + Alt + C).

First thing, you will see a popup as pictured below. This popup lets you know that you’ve run the Generic Consolidator, and that you are about to create a Control Sheet for it. It also asks you the question on how you want Area 1 of the Control Sheet
(shown above earlier in the article) pre-populated.

In this example, if you click yes, it would include not only all 3 files in my selected folder, but any Excel files it finds in any folders that exist within the select folder, or any of their subfolders…etc. To say that another way, in this example, it would also check the circled folder as well as the current folder in the picture below.

For this demo though, I have selected No. So the Nutility will ignore the subfolders and only take the files in the currently selected folder — in this case Files 1 – 3.

After making my selection on the first popup, I am met with a second popup. This popup wants to know if I want the Nutility to pre-populate Area 1 of the Control Sheet (shown above earlier in the article) with all the sheet names from each Excel file it finds. It’s important to note here that the way it finds the sheet names is by opening each file and looking at the sheets, this could take a bit of time if you are doing 50+ files. If you know beforehand the name of the sheet(s) you are looking to consolidate, it is recommended that you select No here, and fill in the sheet name manually on the Control Sheet. In the case of this example that I am demonstrating, I know that I am looking for the Template sheet in every file, so I would typically select No. However, for clarity, I will show what happens for either selection, a little bit further on in the article.

Having made my selection on the second popup, I am now asked to select the folder where my files are located using a standard folder browser (pictured below). Select the folder you want and hit OK.

At this point the Control Sheet will be generated and pre-populated based on your selections. The next picture below shows what is created when I selected No on the second popup. You can see it listed each file only once, and left the Worksheet Name column blank. In this case I would fill in “Template” on each row for the Worksheet Name.

If I had selected Yes for the second popup, the pre-poluated Control Sheet would look like the next picture below. You can see here that each file is listed multiple times, once for each sheet in the file. In this case, since I am only concerned with getting data from the Template sheet, I used an auto-filter and deleted all the rows that were not the Template sheet.

Either way gets me to the same place. Not having sheet pre-propulated is fastest if I already know the name of the sheet I am looking for, and it’s the same in every workbook. A quick thing to note, your Control Sheet can be saved off and reused in the future. So if you have static Excel files that people periodically open and change, saving the Control Sheet is a good idea.

Now having specified which File/Sheet for each row on my Control Sheet, all that’s left is for me to specify what range I want copied from each of those rows. For the example scenario I’ve setup, we would fill in the ranges as pictured below. If in your situation you needed to copy a couple ranges from the same sheet, you could do that by having the same sheet on 2 different rows, and filling in a different range on each. With this setup, I am telling the Nutility to literally go in and copy B5:D5 from the Template sheet in File1, B5:D6 from the Template sheet in File2… etc. This is all well and good, but I will not always know ahead of time how far out or down the data extends on the Template sheet.

In situations where you do not know how far out or down your data goes, you can leave those cells blank (as pictured below) on your Control Sheet, and the Nutility will try to figure it out for you. You can have the Nutility guess either one, or both, the Last Column and Last Row. In this example, I’ve left both blank, but from the way my templates are setup, I could have filled in the Last Column as “D” because it’s always the same — no guessing necessary.

The way the Nutility guesses these values is simple. If guessing the Last Row, it will start at the bottom of the sheet, then search all rows going up, until it finds a row with a value in it — this will be the row that it guesses. For the Last Column, it does the same thing, but with columns this time. So if you setup your template to not have any other data, other than what you want consolidated, to be at the bottom-right of your template, you can just have the Generic Consolidator guess the Last Column/Row for you. For this demo, because of how my templates were setup, explicitly specifying the Last Column/Row, or leaving them blank, will result in the exact same output on the Consolidated Data sheet.

Now that I have all the Files/Sheets/Ranges to consolidate listed on my Control Sheet, I’m ready to run the consolidation by clicking the “Click to Consolidate” button on the Control Sheet. At this point it will create a new Consolidated Data sheet in the workbook, then go row-by-row on the Control Sheet, copying the specified ranges onto the Consolidated Data sheet. In order to protect your data, the consolidation will not run unless the only workbook open at the time is the one housing the Control Sheet. So when you press the button, you may get notified to close and save any other workbooks, and then try pressing the consolidation button again.

Once the consolidation process is complete, you will get a message stating how long it took, and whether it completed successfully, or with errors. You should then see the Consolidated Data sheet, as pictured below. Here you can see that it captured the 6 rows of data, while at the same time keeping the info about where the data came from — including the range that was copied.

As I said before, sometimes the folder-structure/file-name will have important info about your data that you might need for your analysis, such as the template’s owner, or what part of the business the template refers to. Having the range that was copied is especially useful for cases where you let the Nutility guess the Last Column/Row, or for situations where you consolidated more than one range from a single sheet and want to be able to distinguish them later.

When you received the consolidation completion message, if it stated that it completed with errors, it is important that you go back and look at your Control Sheet. The message is letting you know that while some data was consolidated, it is not complete. As pictured below, the Process Status column lets you know which rows were processed successfully. In this case all rows were successful. However, if any were not, you would find a different message on the offending row — an error stating that the row’s file does not exist, for example.

Things To Be Aware Of

At this point, the demo is complete, the following are just some things that are worth knowing about the Nutility.

If it encounters a file that is password-protected in order to open, it will pause and ask for the password.

When it is getting sheet names, or opening the files to consolidate data, it has to open the Excel files the same as you would. What this means is that if the files are located on a shared drive, and you are at home on VPN, and it takes you 45 seconds just to open each file from home, the Nutility will not be able to open them any faster. It is HIGHLY RECOMMENDED that you copy over to your computer the entire folder structure you are consolidating, and perform the consolidation process locally on your computer — then deleting the folder structure off your computer once complete. This is the safest, fastest way to do the consolidations. When I recently did 200+ files, I copied the 8GB of folders from the shared drive at work before going home to consolidate. While it took 20 minutes to copy the folders, it saved me a lot of time and hassle in the long-run.

The Control Sheet sheet should just be thought of as a set of instructions for the automated consolidation process. When the control sheet is being created it does try to pre-populate itself as much as possible. However, in the end it does not matter how it gets populated, as long as it gets populated with the correct info. So if you have any method for populating it faster, do that. For example. If I have 100 files that I want to copy just 2 sheets from each, and I know the sheet names ahead of time, I will have it create the Control Sheet without sheet names. Then I’d just copy the block of 100 files it has listed down below, and manually paste the first sheet name with the first block, and then the second sheet name with the second block. It would be much slower to have the Nutility fill in the sheet names for all the files because then it would have to open the 100 files first.

The order of the rows on the Control Sheet does not matter. While I have said that the consolidation process goes row-by-row processing the Control Sheet, that is not technically true. In order to optimize the processing speed, it will open a workbook only once, and then process all rows relating to that workbook while it is open. If those rows are not next to each other on the Control Sheet, then it will go out of order.