Apr 04, 2019
Organization is a beautiful thing. It allows us to observe trends that are initially intangible in order to make decisions that shape our day-to-day processes in business. We see the troughs and crests in the waves of AOR client requests, interpret the workload of an over-stressed team, complete a post-mortem to determine how our time was used, and so much more.
But before we can witness these revelations, we must organize. Raw, improperly organized data can cripple as much as good data can empower. So how do we organize? In the wide world of web applications, Google Sheets comes through as a very powerful tool in organizing that monstrous spreadsheet you just exported from your project management software.
As an example, I'd like to walk you through the setup of a frequently requested hours report that we generate here at Mission.
Basic Document Setup
You've exported your data that you'd like to organize, so what's next? As a general rule, there are several sheets in my workbook that I like to set up in preparation for organizing any data. These sheets can be grouped into two types: referencing and working. Below is a list of the sheets we'll be using in our example.
1. datasource_ref: It is a good idea to keep an unmolested copy of the raw data that has been exported. You are preemptively saving yourself from accidentally corrupting the data with an errant keystroke, or deleting something you believe is not needed...only to find out an hour later that it would have been useful.
2. vlookup_month_ref: If I intend to use the VLOOKUP function in my workbook, then I will create a sheet with the required data beforehand (or pull it from previously created workbooks). In the instance of our example, you will be using this function to generate secondary keys.
3. vlookup_team_ref: This is similar to the above, except the key type here is the team an employee is a part of, as opposed to a month of the year.
4. organized: There is usually only one organized sheet in my workbook. If you need to alter any data within the datasource_ref, this is where you would do that. In our example worksheet, this is where you translate the duration value of hours worked to decimal, and add the secondary keys of team and month, so that the original data can be organized by these newly associated values. For the purpose of this exercise, I've created several organized sheets so we can work through the steps!
The working sheets can be the end game, and for our purposes, they are. This is the organized data that we have been searching for. These are the numbers we send up the chain to show what someone has been working on and for how long. The only reason you may want to push beyond working sheets is to polish the data and make it more visually appealing with pretty colors.
Organizing Your Data
Your raw data has been placed in your datasource_ref sheet and you’ve created your blank organized reference sheet...so what is next? First, you will want to identify what columns of data you’d like to include from your datasource_ref. It’s my general rule of thumb to export every piece of data for the datasource_ref, and then select choice columns for the purpose at hand.
In this instance, we will be needing every column and every line. Since I know that I will want to insert separate columns for the secondary keys, I want to pull each column individually. The best way I currently know to execute this is to use our first function, ARRAYFORMULA.
Enables the display of values returned from an array formula into multiple rows and/or columns and the use of non-array functions with arrays.
In layman’s terms, it allows you to execute a function that is normally used on a cell-by-cell basis across multiple cells. However, initially we will use it to transpose the data from datasource_ref sheet to our organized sheet.
You can view the example in organized_1, A1. I want to pull all values from column A (Employee) in datasource_ref. Therefore, I use [=arrayformula(datasource_ref!A:A)]. Since I want all of the data from datasource_ref, I duplicate this function across the next three columns to also pull over columns B:D, Project, Date, and Hours.
Since I want everything transposed from datasource_ref, you may ask why I didn’t:
1. Use [=arrayformula(datasource_ref!A:D)], which would pull everything at once with one formula
2. Use the FILTER or QUERY functions.
Well, to the first point, I know I want to add secondary key columns between the original columns from datasource_ref. If I used the formula in A and then tried to insert columns between, it would break! To the second point, we would encounter the same issue. Also, those functions are designed to sift through data and pull specific entries based on specific content within the search. Additionally, they are a bit more complicated than what we need, and simpler is always better...when it works.
Before we start adding secondary keys, let’s take a look at our data. It looks like our hours format is duration. Wouldn’t it be so much easier to work with it in decimal form? Let’s alter our ARRAYFORMULA in D1 to take care of that. Move to organized_2 to see the result!
First off, you will notice that the ARRAYFORMULA in D1 is now restricted to only pull datasource_ref!D1. We cannot multiply a word, so that would end in an error, which we don’t want. So we add another ARRAYFORMULA to D2 [=arrayformula(datasource_ref!D2:D*24)]. This is where we see ARRAYFORMULA truly come into its own. We are telling it to pull data from the datasource_ref and multiply each value by 24 (24 just happens to be the number you multiply a duration by to translate it to decimal.) Then, it is simple as formatting that column to display as a number rather than a decimal, and that’s it.
Next, let’s add our first secondary key. While I have the employee listed in A1, I want to be able to organize data based on department. In order to do this, we will be using our next function, VLOOKUP, in combination with ARRAYFORMULA.
Or vertical lookup, searches down the first column of a range for a key and returns the value of a specified cell in the row found.
Enter our first vlookup_ref sheet, vlookup_team_ref. VLOOKUP allows you to automatically assign a value or content to a cell based on the contents of an adjacent cell; however, you first need to provide those relationships to the function in the form of the table. If you take a look at vlookup_team_ref, you can see that I have assigned a department to each team member.
Going back to organized_2, I know that I want to have the Department column first, so let’s insert a column to the left of Employee (currently A). Then, we’ll use the combination of ARRAYFORMULA and VLOOKUP in the formula [=ARRAYFORMULA(VLOOKUP(B2:B,vlookup_team_ref!A1:B9,2,false))] in A2. This formula is telling VLOOKUP to search all cells in B2:B, hence the need for ARRAYFORMULA. At the same time, it is searching A1:B9 in vlookup_team_ref for a matching value. The "2" in the formula lets VLOOKUP know that the value it should return is in the second column of A1:B9, and false tells it that it is not in alpha order. See it in action in the organized_3 sheet.
If we only used VLOOKUP in cell A2, it would only search vlookup_team_ref!A1:B9 for the value in B2, and that formula would have to be copied into every cell. ARRAYFORMULA lets us only stretch that function down the entirety of A, with only one formula and no mass copying/pasting. Also, this is our first instance of nesting functions within each other! Next, we’ll take a look at nesting three functions.
Alright, we’re down to our final secondary key. While it is nice to know the date that time was logged for, it may be easier to group it by month. For this, we’ll use the MONTH function, in combination with VLOOKUP and ARRAYFORMULA.
Returns the month of the year a specific date falls in, in numeric format.
We don’t want a numeric value, which is why we’ll be combining it with VLOOKUP, but for now, let’s get the basics set up. We’ll start by adding a new column to the left of E, and give it a header of Month. Then, we’ll enter the start of our formula in E2 with [=ARRAYFORMULA(MONTH(D2:D))]. We have a pretty solid understanding at this point of ARRAYFORMULA; it’s now applying MONTH to every cell in D2:D and returning that value in E2:E.
So how do we translate that numeric value to the full name of the month? VLOOKUP sounds good! Enter our second vlookup_ref, vlookup_month_ref. Take a look. Here, we’re associating the numeric value with the name of the month. How do we apply it to the formula we’ve already created? That would look something like this [=ARRAYFORMULA(VLOOKUP(MONTH(D2:D),vlookup_month_ref!A1:B12,2,False))]. In this formula, we’re telling VLOOKUP to look at the numeric value generated by MONTH and search against the table in vlookup_month_ref, all at once. Voila! You can see the final result on the organized_complete sheet.
And now, we get to our working sheets. All of your data is compiled and ready to be organized. How? Through the wonders that are Pivot Tables.
Are used to narrow down a large data set or see relationships between data points.
They are a fast, reliable way for you to view your data in an organized fashion. The best part is that you can reorganize in an almost infinite amount of combinations. It all depends on setting up your rows, columns, values, and filters properly.
To create a Pivot Table, highlight your data set, and select Pivot Table from the Tools menu in sheets. It will automatically create a new sheet, and the Pivot Table editor will appear on the right-hand side. You can view an example of two Pivot Tables that I’ve created based on our organized data set.
The first example lets us see how many hours each employee has spent working on a particular project, organized by the employee. Take a look at how I have set up the data in the editor.
The second example utilizes those secondary keys that we added in. It shows us how many hours each department worked on a project on a month-by-month basis, organized by project.
Congratulations! Your ability to organize large sheets of data is now the envy of the office (or you’ll just be that weird spreadsheets wizard.) Either way, you can feel confident in your knowledge of these not-so-simple functions to continue growing your skillset. Remember, there is also a large support community for Google Sheets. Chances are, you aren’t trying to do something that someone hasn’t already tried. Before you know it, you’ll be writing out these formulas without having to reference the hints or forums.