This post describes how you can use the INDEX() and MATCH() functions to look up data in a matrix.
Background to VLOOKUP()
Many users are familar with the VLOOKUP() (vertical lookup) function that simply looks up a value in the leading column of a table and returns the value in another specified column of the same table. The required parameters are:
VLOOKUP(Lookup_value,Table_array,Col_index_num,Range_lookup) where Range_lookup is optional.
With the data in the table below…
|4||Orange||Orange (what else!)|
will search for “Banana” (1st parameter) in the leftmost column of the table ($A$2:$B$4) (2nd parameter) and return the value that is in the 2nd (3rd-parameter) column.
It should be noted that the 1st parameter (the value of which, here, is “Banana“) can also be a cell reference (e.g. $D$1) rather than statically entered into the result cell formula.
Problems with VLOOKUP()
I’ve never like this function for several reasons:
- Any changes to the structure of the table (such as the insertion of a new column) may result in the incorrect field value for that row entry being returned.
- You need to determine what column number the information you want return resides in. (This is obviously fine for small tables, but you shouldn’t have to waste time determining this in the first place!)
- The table needs to be (ascendingly?) sorted by the first column to work correctly.
- It takes longer than it ought to replicate the query to return the value for a different column.
So what are our alternatives? I prefer to use the INDEX() and MATCH() functions. For my mind this approach is not only more flexible, but it’s also more intuitive.
The premise is much the same, however, instead of matching only the leftmost column, you can also match a row and column to determine the coordinates of the value you want returned. It’s effectively a combination of the VLOOKUP() (vertical lookup) and HLOOKUP()* (horizontal lookup) functions.
* HLOOKUP() simply matches a value to the first row (cf. column) of the table, returning the value in a corresponding column (cf. row)
Using INDEX() and MATCH()
I’ll describe the INDEX() and MATCH() functions separately. There are two permutations of the INDEX() function: array and reference. For the purposes of this post, I’ll use array. The INDEX() function is described as:
Using the same data from the previous example, we designate the array parameter as the table that contains the data we wish to return.
|4||Orange||Orange (what else!)|
To retrieve the same result from the first example, we can use the formula:
This returns the value from the array ($A$2:$B$4 – 1st parameter) that is in the 2nd row (2nd parameter) and the 2nd column (3rd parameter). Using the function in this way INDEX() is certainly no better than using VLOOKUP() (in fact, it’s worse, because there is no dynamic functionality – that is, there is no “searching” involved.)
So, how do we dynamically determine the row and column numbers based on certain criteria? We can use the MATCH() function, which can be described as:
To determine the row we are interested in (here, row 2 – the “Banana” row) we can use the formula:
This searches the data array ($A$2:$A$4 – the 2nd parameter) for the lookup value (“Banana” – the 1st parameter) and ensures that it matches exactly (Match_type – the 3rd parameter – equals 0; returns #N/A if there is no match.)
Similarly, we use this function to determine our required column number (here, column 2 – the “Colour” column) we can use:
This searches the data array ($A$1:$B$1 – the 2nd parameter) for the lookup value (“Colour” – the 1st parameter) and ensures that it matches exactly (Match_type – the 3rd parameter – equals 0; returns #N/A if there is no match.)
It should be easy to recognise from this point that for a much larger table, this takes a lot of the guess work out of determining the appropriate column number.
Accomodating changes to the table structure
As mentioned, one of the advantages of using the INDEX() and MATCH() functions is its ability to adapt when changes are made to the table structure.
|4||Orange||$0.80||Orange (what else!)||8|
=INDEX($A$2:$D$4,MATCH(“Banana”,$A$2:$A$4,0),MATCH(“Colour”,$A$1:$D$1,0)) will still return “Yellow” (once columns C and D are accounted for) as the 3rd parameter of the INDEX() function is now 3, while…
=VLOOKUP(“Banana”,$A$2:$D$4,2) will now return $2.50 as this is the value in the 2nd column of the “Banana” row, that is, it hasn’t adapted to the change in table structure.
Using named ranges
I love named ranges (as much as one can love an Excel® function) and using them can simplify even the most complicated formula. I will describe named ranges in more detail in another article, but for the sake of this piece, simply put, named ranges are names that refer to a cell or cell range. Using the above example, you could define or “name” the $A$2:$D$4 cell range as “Data“, $A$1:$D$1 as “Fields“, $A$2:$A$4 as “Fruits“, $B$2:$B$4 as “Prices” and so on. You can name ranges by selecting Insert -> Name -> Define from the Excel® file menu. Once this has been done, your formula simplifies to:
An additional example – the best use of INDEX() and MATCH()
This combination of functions is probably best utilised when using 2-dimensional matrices that have column and row headings.
With the appropriate named ranges, your formula to determine the distance between two given points could be:
- As mentioned with the VLOOKUP() example, “Banana” and “Colour” could both be cell references (you could even use a drop-down menu!)
- You could use VLOOKUP() and MATCH() (e.g. VLOOKUP(“Banana”,$A$2:$B$4,MATCH(“Colour”,$A$1:$B$1,0)) but you are still limited to your “row” search term being in the first row and that the list is sorted by the first row.
- Complications can arise if there are not unique values in a column or row (e.g. “Apples” can be “Red” or “Green“, so could have two entries in the table.) There are ways in which you can handle this (sometimes using SMALL() and BIG() that are beyond the scope of this article.)
Othewise, I’ve found this to be a good alternative to the VLOOKUP() function; one that’s easier to use and more powerful.
There are certainly aspects of the functions that I have omitted. I’d welcome any comments or criticism, so please feel free to add any comments. If you require further clarification on any part of this article, please leave a comment, or you can email me at: firstname.lastname@example.org
Ryan Kirgan is from Sydney, Australia.