Skip to content

Matching data in a table in Excel® using INDEX and MATCH – a VLOOKUP alternative

This post describes how you can use the INDEX() and MATCH() functions to look up data in a matrix.

Background to VLOOKUP()

Many users are familar with the VLOOKUP() (vertical lookup) function that simply looks up a value in the leading column of a table and returns the value in another specified column of the same table. The required parameters are:

VLOOKUP(Lookup_value,Table_array,Col_index_num,Range_lookup) where Range_lookup is optional.

For example:

With the data in the table below…

A B
1 Fruit Colour
2 Apple Red
3 Banana Yellow
4 Orange Orange (what else!)

…the formula

=VLOOKUP(“Banana”,$A$2:$B$4,2)

Result: Yellow

will search for “Banana” (1st parameter) in the leftmost column of the table ($A$2:$B$4) (2nd parameter) and return the value that is in the 2nd (3rd-parameter) column.

It should be noted that the 1st parameter (the value of which, here, is “Banana“) can also be a cell reference (e.g. $D$1) rather than statically entered into the result cell formula.

Problems with VLOOKUP()

I’ve never like this function for several reasons:

  1. Any changes to the structure of the table (such as the insertion of a new column) may result in the incorrect field value for that row entry being returned.
  2. You need to determine what column number the information you want return resides in. (This is obviously fine for small tables, but you shouldn’t have to waste time determining this in the first place!)
  3. The table needs to be (ascendingly?) sorted by the first column to work correctly.
  4. It takes longer than it ought to replicate the query to return the value for a different column.

So what are our alternatives? I prefer to use the INDEX() and MATCH() functions. For my mind this approach is not only more flexible, but it’s also more intuitive.

The premise is much the same, however, instead of matching only the leftmost column, you can also match a row and column to determine the coordinates of the value you want returned. It’s effectively a combination of the VLOOKUP() (vertical lookup) and HLOOKUP()* (horizontal lookup) functions.

* HLOOKUP() simply matches a value to the first row (cf. column) of the table, returning the value in a corresponding column (cf. row)

Using INDEX() and MATCH()

I’ll describe the INDEX() and MATCH() functions separately. There are two permutations of the INDEX() function: array and reference. For the purposes of this post, I’ll use array. The INDEX() function is described as:

=INDEX(array,row_num,column_num)

For example:

Using the same data from the previous example, we designate the array parameter as the table that contains the data we wish to return.

A B
1 Fruit Colour
2 Apple Red
3 Banana Yellow
4 Orange Orange (what else!)

To retrieve the same result from the first example, we can use the formula:

=INDEX($A$2:$B$4,2,2)

Result: Yellow

This returns the value from the array ($A$2:$B$4 – 1st parameter) that is in the 2nd row (2nd parameter) and the 2nd column (3rd parameter). Using the function in this way INDEX() is certainly no better than using VLOOKUP() (in fact, it’s worse, because there is no dynamic functionality – that is, there is no “searching” involved.)

So, how do we dynamically determine the row and column numbers based on certain criteria? We can use the MATCH() function, which can be described as:

=MATCH(Lookup_value,Lookup_array,Match_type)

To determine the row we are interested in (here, row 2 – the “Banana” row) we can use the formula:

=MATCH(Banana“,$A$2:$A$4,0)

This searches the data array ($A$2:$A$4 – the 2nd parameter) for the lookup value (“Banana” – the 1st parameter) and ensures that it matches exactly (Match_type – the 3rd parameter – equals 0; returns #N/A if there is no match.)

Result: 2

Similarly, we use this function to determine our required column number (here, column 2 – the “Colour” column) we can use:

=MATCH(Colour“,$A$1:$B$1,0)

This searches the data array ($A$1:$B$1 – the 2nd parameter) for the lookup value (“Colour” – the 1st parameter) and ensures that it matches exactly (Match_type – the 3rd parameter – equals 0; returns #N/A if there is no match.)

Result: 2

It should be easy to recognise from this point that for a much larger table, this takes a lot of the guess work out of determining the appropriate column number.

=INDEX($A$2:$B$4,MATCH(“Banana”,$A$2:$A$4,0),MATCH(“Colour”,$A$1:$B$1,0))

Result: Yellow

Accomodating changes to the table structure

As mentioned, one of the advantages of using the INDEX() and MATCH() functions is its ability to adapt when changes are made to the table structure.

A B C D
1 Fruit Price Colour Qty
2 Apple $1.50 Red 2
3 Banana $2.50 Yellow 7
4 Orange $0.80 Orange (what else!) 8

=INDEX($A$2:$D$4,MATCH(“Banana”,$A$2:$A$4,0),MATCH(“Colour”,$A$1:$D$1,0)) will still return “Yellow” (once columns C and D are accounted for) as the 3rd parameter of the INDEX() function is now 3, while…

=VLOOKUP(“Banana”,$A$2:$D$4,2) will now return $2.50 as this is the value in the 2nd column of the “Banana” row, that is, it hasn’t adapted to the change in table structure.

Using named ranges

I love named ranges (as much as one can love an Excel® function) and using them can simplify even the most complicated formula. I will describe named ranges in more detail in another article, but for the sake of this piece, simply put, named ranges are names that refer to a cell or cell range. Using the above example, you could define or “name” the $A$2:$D$4 cell range as “Data“, $A$1:$D$1 as “Fields“, $A$2:$A$4 as “Fruits“, $B$2:$B$4 as “Prices” and so on. You can name ranges by selecting Insert -> Name -> Define from the Excel® file menu. Once this has been done, your formula simplifies to:

=INDEX(Data,MATCH(“Banana”,Fruits,0),MATCH(“Colour”,Fields,0))

Result: Yellow

An additional example – the best use of INDEX() and MATCH()

This combination of functions is probably best utilised when using 2-dimensional matrices that have column and row headings.

For example:

A B C D
1 Origin/Destination Sydney Melbourne Brisbane
2 Sydney 0 885 926
3 Melbourne 885 0 1697
4 Brisbane 926 1697 0

With the appropriate named ranges, your formula to determine the distance between two given points could be:

=INDEX(Distances,MATCH(“Brisbane”,Origins,0),MATCH(“Sydney”,Destinations,0))

Results: 926

Other points

  • As mentioned with the VLOOKUP() example, “Banana” and “Colour” could both be cell references (you could even use a drop-down menu!)
  • You could use VLOOKUP() and MATCH() (e.g. VLOOKUP(“Banana”,$A$2:$B$4,MATCH(“Colour”,$A$1:$B$1,0)) but you are still limited to your “row” search term being in the first row and that the list is sorted by the first row.
  • Complications can arise if there are not unique values in a column or row (e.g. “Apples” can be “Red” or “Green“, so could have two entries in the table.) There are ways in which you can handle this (sometimes using SMALL() and BIG() that are beyond the scope of this article.)

Othewise, I’ve found this to be a good alternative to the VLOOKUP() function; one that’s easier to use and more powerful.

There are certainly aspects of the functions that I have omitted. I’d welcome any comments or criticism, so please feel free to add any comments. If you require further clarification on any part of this article, please leave a comment, or you can email me at: ryan@kirgs.com

Ryan Kirgan is from Sydney, Australia.

Ryan Kirgan

Ryan Kirgan

Share

{ 11 } Comments

  1. Arvin | April 22, 2009 at 8:49 pm | Permalink

    is it possible for VLOOKUP to have 2 reference(lookup) “B1 and A3” and will look for sheet2 in Columns A and B if it matches the Cells B1 and A3 from sheet1 it will get the value from sheet2 Column C and return to sheet1
    SHEET1
    1 A B C D E F G H I
    2 1 2 3 4 5 6 7 8
    3 name1 5
    4 name2 6
    5 name3 7
    6 name4 8
    7 name5 9
    8 name6 10
    9 name7 11
    10 name8 12

    SHEET2
    A B C
    1 name1 5
    2 name2 6
    3 name3 7
    4 name4 8
    5 name5 9
    6 name6 10
    7 name7 11
    8 name8 12

  2. Ryan Kirgan | April 23, 2009 at 11:16 am | Permalink

    @Arvin

    Hi Arvin,

    I’m not too sure what you’re after as the table has become distorted in WordPress, making it difficult to work out what values are in B1 & A3.

    Could you email me your spreadsheet to ryan@kirgs.com.

    You can certainly use lookup functions across worksheets and you can also match multiple criteria. As with most things, there are many ways to go about doing this, but with a bit more information I can hopefully provide you with a suitable approach – perhaps one that uses INDEX() and MATCH() rather than VLOOKUP() ;)

    Ryan

  3. Ryan Kirgan | April 23, 2009 at 12:33 pm | Permalink

    Hi again,

    I had a look at the email of your post (that maintained the correct formatting for some reason) and think I have an idea as to what you are after. My interpretation is that you want to match two criteria; the name in the leftmost column of Sheet1 and the number in the top row Sheet1 – if there’s a match in Sheet2, then return the value in the rightmost column of Sheet2.

    Lets start with Sheet2 because this is where your raw data is kept. In fact, if it’s ok, I’ll rename the worksheet to Data

    Data

    A B C
    1 1 name1 5
    2 2 name2 6
    3 3 name3 7
    4 4 name4 8
    5 5 name5 9
    6 6 name6 9
    7 7 name7 10
    8 8 name8 11
    9 9 name9 12

    At this point, I’d always recommend using named ranges on these columns, but to keep the solution in cell reference terms, we can substitute this in later.

    Now, turning our attention to Sheet1 (which I’ll call Output)…

    Output

    A B C D E F G H I
    1 1 2 3 4 5 6 7 8
    2 name1 ??? ??? ??? ??? ??? ??? ??? ???
    3 name2 ??? ??? ??? ??? ??? ??? ??? ???
    4 name3 ??? ??? ??? ??? ??? ??? ??? ???
    5 name4 ??? ??? ??? ??? ??? ??? ??? ???
    6 name5 ??? ??? ??? ??? ??? ??? ??? ???
    7 name6 ??? ??? ??? ??? ??? ??? ??? ???
    8 name7 ??? ??? ??? ??? ??? ??? ??? ???
    9 name8 ??? ??? ??? ??? ??? ??? ??? ???

    To reiterate, what I think you want to do is populate the coloured cells (containing ???) when there is a match to the column and row. As mentioned, there are more ways to do this than I care to mention, but the first that comes to mind involves using INDEX() and MATCH() (thank goodness, being the title of the blog entry and all ;)

    Let’s start by clicking on the first cell we want to calculate on the Output worksheet: B2.

    From earlier, INDEX() can be described as:

    =INDEX(array,row_num,column_num)

    Here, the array we are after is the rightmost column in the Data worksheet (Data!$C$1:$C$8). Being only one column wide, the only column_num we can have is 1. But how do we work out the row_num when there are two criteria? One way to do this is to concatenate the two search terms. You do this by using an ampersand (&). You then concatenate the columns you want to search so you’re comparing apples to apples (so to speak.)

    Describing MATCH():

    =MATCH(Lookup_value,Lookup_array,Match_type)

    So our Lookup_value for this first cell is $A2&B$1. We use the dollar ($) symbol in front of the A in $A1 to confine the search queries to the first column (i.e. the ones with name1name8 in them). Similarly, we use the dollar ($) symbol in from of the 1 in B$1 to confine the searches to the first row. That way, when we copy the formula to the other cells, we’re always matching the leftmost/top column that corresponds the cell we’re in.

    As mentioned, our Lookup_array is also a concatenation, however this time it is Data!$B$1:$B$8&Data!$A$1:$A$8. We have dollar ($) symbols in front of of all row and column references here because the raw data doesn’t move (i.e. its position is not relative the cells).

    This results in:
    =MATCH(Output!$A2&Output!B$1,Data!$B$1:$B$8&Data!$A$1:$A$8,0)

    So…putting it altogether we have:

    =INDEX(Data!$C$1:$C$8,MATCH(Output!$A2&Output!B$1,Data!$B$1:$B$8&Data!$A$1:$A$8,0),1)

    but pressing return results in #VALUE. This is because concatenating strings in a function such as this returns an array. To formulate cells like this instead of press Enter after entering the formula, you need to press Shift and Enter at the same time. This will put curly brackets ({ and }) around the formula. In doing so, for this example, you should now see the number 5 in B2 which corresponds to the two criteria in the raw data.

    You only need to do this once, as we’re going to now place the cursor on the bottom right-hand corner of B2 until the cross appears and will drag the formula to the other cells in the table. You can’t copy and paste the formula to more than one cell as Excel will complain that you “cannot change part of an array.”

    The only problem now is that there are #N/As where there isn’t a corresponding value in the raw data :| What’s a quick way to get rid of this. What I do is use a simple if statement to return nothing if the result is an #N/A. What I do in this instance is use a combination of the IF() and ISNA() functions.

    To describe this in broad terms:
    =IF(ISNA(cell value),nothing,something)

    i.e. if the cell value is #N/A then return nothing; otherwise, return something. In this case, it’s the MATCH() function that is returning the #N/A, so we only need to use that as the in the above.

    Our formula for B2 now becomes:

    ={IF(ISNA(MATCH(Output!$A2&Output!B$1,Data!$B$1:$B$8&Data!$A$1:$A$8,0)),”",INDEX(Data!$C$1:$C$8,MATCH(Output!$A2&Output!B$1,Data!$B$1:$B$8&Data!$A$1:$A$8,0),1))}

    (without the curly brackets – these are put in the cells formula after pressing Shift + Enter.

    As you can see, this formula is ridiculously long. Named ranges will improve this. For instance defining:

    dataID => Data!$A$1:$A$8

    dataName => Data!$B$1:$B$8
    dataResult => Data!$C$1:$C$8

    reduces the formula to:

    ={IF(ISNA(MATCH($A2&B$1,dataName&dataID,0)),”",INDEX(dataResult,MATCH($A2&B$1,dataName&dataID,0),1))}

    Output

    A B C D E F G H I
    1 1 2 3 4 5 6 7 8
    2 name1 5
    3 name2 6
    4 name3 7
    5 name4 8
    6 name5 9
    7 name6 10
    8 name7 11
    9 name8 12

    Hopefully this is what you were after

  4. Ryan Kirgan | April 23, 2009 at 12:47 pm | Permalink

    You can download the file from here: Multiple Criteria INDEX MATCH demo for Alvin.xls (22k)

  5. Ryan Kirgan | April 23, 2009 at 5:22 pm | Permalink

    I thought about this a bit more…

    If you don’t want to use INDEX() and MATCH() you might like to try:

    =SUMPRODUCT(($A2=dataName)*(B$1=dataID)*(dataResult)) in B2. It’s not an array formula (so no Ctrl + Shift + Enter) and it’s a lot more intuitive.

    Where there’s no match, you’ll see a 0. You can hide these in a number of ways:

    1. =IF(SUMPRODUCT(($A2=dataName)*(B$1=dataID)*(dataResult))=0,"",SUMPRODUCT(($A2=dataName)*(B$1=dataID)*(dataResult)))

    2. Select the cells, right-click and select “Format Cells”. From the “Custom” menu, append two semi-colons to the format. e.g. “General” becomes “General;;”

    Plus others.

  6. Wealth Coach | August 29, 2011 at 11:15 pm | Permalink

    I am trying to figure out how many widgets I need to order for every item listed. I have a sheet of items (rows) where for each row there may be an amount needed or not. For each row that has a quantity needed, I need to produce another sheet with ONLY the rows where there is an actual requirement with the quantity needed, item name and state. Here is sample data of the listing in Worksheet A:

    Name State Qty
    Apples WA 5
    Oranges FL
    Car MI 8
    Furniture VA 3
    Corn IN

    Notice that the Qty is null in many of the rows. Desired output on Worksheet B:

    Name State Qty
    Apples WA 5
    Car MI 8
    Furniture VA 3

    So the output does not include the empty qty records. I’ve tried various vlookup formulas and others and am just having a hard time. I’d prefer to do this with a formula rather than VBA macro or such and not using a filter. Any help would be gratifying.

  7. Ryan Kirgan | September 7, 2011 at 4:08 pm | Permalink

    @Wealth Coach
    Unfortunately, I don’t think that VLOOKUP or INDEX/MATCH are the right tools for you to use to output a filtered list.

    In essence you are simply filtering the list, rather than looking up a value against a set of criteria.

    What about using a Pivot Table whereby you filter the table to display only those with non-zero quantities? You can have the Pivot Table refer to a dynamic named range. You have to manually refresh the Pivot Table data when you make any changes (which you can do via menus), otherwise you can make it update automatically with a few lines of VBA:

    Private Sub Worksheet_Activate()
    PivotTables(1).PivotCache.Refresh
    End Sub

    I’ve uploaded an example of the file here: http://blog.kirgs.com/wp-content/examples/Example-for-Wealth-Coach.xlsm

  8. Ryan Kirgan | September 7, 2011 at 4:17 pm | Permalink

    On this last comment to @Wealth Coach,

    The Dynamic Named Range refers to the following:

    =OFFSET('Input Data'!$A$1,0,0,COUNTA('Input Data'!$A:$A),COUNTA('Input Data'!$1:$1))

    And will expand as you add more rows below your data set in the ‘Input Data’ worksheet.

    Reply to me if you have any other questions or this isn’t what you are after.

  9. Elena Rush | May 22, 2013 at 12:41 pm | Permalink

    Say you have a table of employee records. The first column is an employee number, and the remaining columns are various data about the employee. Any time you have an employee number in the worksheet, you can use VLOOKUP to return a specific datum about the employee. The syntax is VLOOKUP(value,data range,col no.,FALSE). It says to Excel, “Go to the data range. Find a row that has (value) in the first column of the data range. Return the (col no.)th value from that row. Once you get the hang of it, it is very simple and powerful.

  10. Margaret | March 4, 2014 at 7:36 am | Permalink

    Hi!

    I would like to match a set of values (product codes in table 1) to another set of values (in several tables as shown in table 2, 3) to see if they match. If yes, then return a value. I appreciate your suggestion. Thank you.

    For example,
    Table 1.
    Store Product Return
    1 1-234 2
    1 2-345 2
    1 3-456 2
    2 1-234 3
    2 4-567 3
    2 6-789 3

    Table 2.
    Return Product
    2 1-234
    2 2-345
    2 3-456

    Table 3.
    Return Product
    3 1-234
    3 4-567
    3 6-789

  11. Ryan Kirgan | March 4, 2014 at 10:16 am | Permalink

    Hi Margaret,

    This is a common request whereby you want to match multiple criteria. The three general methods I employ to do this us:

    1. INDEX & MATCH using Array Formulas
    2. SUMPRODUCT
    3. SUMIFS

    Given the blog post is based on INDEX & MATCH, the formula I’ve used is as follows:

    =INDEX(,MATCH( &"::"&, &"::"&,0),1)

    After typing this formula you need to press Ctrl + Shift as it is an Array Formula (i.e. it references/joins multiple ranges). You’ll note the formula will display in between { and } brackets to indicate as much.

    I include the “::” string in between as a delimeter between the fields we are querying. In essence, the string we are matching is a concatenation of the Product and the Store.

    So introducing a string in between, the query string becomes 1-234::1 for Product = 1-234 and Store = 1. If it is not there, it is difficult for us to tell the difference between Product = 1-234 and Store = 1 and Product = 1-23 from Store = 41, for example.

    I’ve included a solution to your request using all three methods here: Multiple Criteria Lookup for Margaret (RK 2014-03-04).xls

    Hope this helps.

Post a Comment

Your email is never published nor shared. Required fields are marked *
*