Excel for Polyglots: Comparative audits to keep languages in sync

Duolingo, Memrise, Anki, Microsoft Excel. Huh, wait – Excel? How is that a language learning app?

Well, the Office software has some handy features that just happen to be right up our street as language learners. Namely, the ability to curate and administer lists in table form. And it just happens that this can be particularly useful if you learn more than one language.

One source of frustration as a polyglot learner is the discrepancy of vocabulary level between languages. This can be most obvious with fairly close language pairs. For instance, when practising Icelandic, I often realise that I know a term in Norwegian, but not in the language I am trying to speak.

So how best to address these discrepancies?

Language auditing

Getting into the habit of performing a regular language audit, such as revisiting beginner materials, is a good strategy for any learner. But one particularly powerful method for multi-language learners is the comparative audit.

In short, a comparative audit is simply taking stock of which words you know in one language, but not the other.

At the very early stages of learning a language, this can be as easy as scanning down a list. But when you get to the point of having hundreds and hundreds of words in your vocab store, the task is mammoth.

Enter Excel, data wizard!

Microsoft Excel and VLOOKUP

Most of us will have used Excel or another spreadsheet program at some point. But like me, you might not have gone beyond basic numerical information and a few simple sum functions.

It turns out that Excel is pretty good at handling textual data too. Are you thinking what I’m thinking? Yes, vocabulary lists! And it has a special function, VLOOKUP, which allows you to compare data between two tables. Sounds just perfect for our comparative audit.

Here’s how to enlist Excel to your polyglot cause in a few simple(-ish) steps.

Step 1: Port your data into Excel

First things first – you have to get your vocabulary data into Excel. The easiest way is to export from your program of choice as a CSV (comma-separated values) or tab-delimited text file. If you use Anki, this is as easy as heading to File > Export and selecting ‘Notes in Plain Text (*.txt)’.

Ensure that you only export the basic data and no media or tags. Ideally, you should just be exporting a word and definition / translation field. My Norwegian and Icelandic decks, for example, are populated by vocab notes with an English and Target Language field.

Export a separate file for each of the two languages you want to compare. In my case, I end up with two files, norwegian.txt and icelandic.txt.

Exporting data from Anki

Step 2: Import your vocab into Excel

In Microsoft Excel, create a fresh spreadsheet document, and head to File > Import. Select Text File, hit Import and locate your first exported vocabulary file from above. To preserve accented characters in our Anki list, select Unicode (UTF-8) as the File origin.

Importing vocabulary into Excel – note that ‘Unicode (UTF-8)’ has been selected as the file origin to make sure accented characters are handled correctly.

Create a second sheet in the same document, and import your other list of vocabulary into that. You should now have a two-sheet spreadsheet document, each sheet showing a list of words in a different language. For clarity, make sure you name your sheets too. Simply double-click on the tab titles “Sheet 1” etc. to do that.
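If you would like to sanity-check an exported file in code before importing it, a minimal Python sketch might look like the one below. It assumes the export is tab-delimited UTF-8 with the English field first and the target-language field second, and that any header lines begin with `#`. The helper name `read_vocab` is made up for illustration and is not part of Anki or Excel.

```python
import csv
import io


def read_vocab(text_stream):
    """Parse tab-delimited rows into (english, target) pairs."""
    pairs = []
    for row in csv.reader(text_stream, delimiter="\t"):
        # Skip blank rows and header lines (assumed to start with '#').
        if not row or row[0].startswith("#"):
            continue
        # Ignore rows that lack a second field.
        if len(row) < 2:
            continue
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


# In-memory sample standing in for a file such as icelandic.txt.
sample = io.StringIO("house\thús\nbook\tbók\n")
print(read_vocab(sample))
```

For a real file, you could pass `open("icelandic.txt", encoding="utf-8", newline="")` instead of the sample. The `utf-8` encoding plays the same role as choosing ‘Unicode (UTF-8)’ in Excel’s import dialog.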

Step 3: Format your lists as tables

In each sheet, click and drag across the table to select your whole vocabulary list as a block. Now, click Format as Table in the Home section of the function ribbon / toolbar. It doesn’t really matter which style you use – I choose the colour I like best!

Once that’s done, change the new column headers to something more meaningful than the default values. I use English and Norwegian in my example below. One caveat: for the VLOOKUP trick to work, both tables need a column holding the same kind of entry, and it must be the first column of the table you look things up in. Giving it the same title in both tables keeps the formula easy to follow. Here, English will be my common column between Norwegian and Icelandic.

My Norwegian vocabulary data formatted as a table in Microsoft Excel

Now, instantly, this is already more useful to us than static lists. Formatting as a table means you can use the column heading drop-downs to sort and filter your entries. Try it – sort alphabetically on the target language column. You’ve turned your data into a nifty dictionary! Not our primary goal, but a nice trick on the way.

Before we go on, it’s a good idea to name our tables so they are easy to refer to later. To do this, click anywhere in your table, then switch to the Table tab in the ribbon / toolbar and type a name into the Table Name box. The simpler, the better – below, I just call mine Icelandic.

Naming a table in Excel

But now it’s the turn of our star, VLOOKUP. This is where the real magic happens.

Step 4: Adding a comparative column

Click on the target language column header of your second table and copy it (CTRL + C). Now, go to your first table, select the cell next to the target language column header (C1 in my example), and paste (CTRL + V). It should add a blank new column within that table. Let’s fill it up!

In the first cell under that new column header, we type in our VLOOKUP formula. This will depend on what you have named your tables and sheets, but mine looks like this:

=VLOOKUP([@English], Icelandic, 2, 0)

Let’s dissect that. The first item in the brackets, [@English], is the value we’ll look up: the English entry in the current row of the first table. The second item, Icelandic, is the table we’ll look for that value in. Remember, we named that table a little earlier. VLOOKUP always searches the first column of that table, which is why English needs to come first there. The third item, 2, is the number of the column we want the result from, which is the target language column of the Icelandic table. Finally, the fourth value, 0, is a flag to Excel that we want exact matches only.

If that boggles, simply start typing =VLOOKUP( in the cell. That calls up Excel’s formula hints and point-and-click formula building, which should help you tie things together accurately.

After doing that, something special happens – suddenly, the whole column is filled with entries. If the English term was found in the Icelandic table, the corresponding Icelandic word is pulled in. If not, we simply get #N/A.

A quick note if that doesn’t work immediately: check that the cells in that third column are formatted as General, not Text.
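If code is easier for you to read than formulas, the exact-match lookup can be pictured as a dictionary lookup in Python. This is only an analogy for what VLOOKUP does, not how Excel implements it, and the function names and sample words are made up for illustration. Keys are case-folded because VLOOKUP ignores case, the first match wins as it does in Excel, and `None` stands in for #N/A.

```python
def build_index(table):
    """Map each English entry to its translation, keeping the first match."""
    index = {}
    for english, target in table:
        # setdefault leaves an existing key untouched, so the first row wins.
        index.setdefault(english.casefold(), target)
    return index


def add_lookup_column(first_table, second_table):
    """Append the second table's translation to each row, or None if absent."""
    index = build_index(second_table)
    return [
        (english, target, index.get(english.casefold()))
        for english, target in first_table
    ]


# Hypothetical sample data: (English, target language) pairs.
norwegian = [("house", "hus"), ("dog", "hund"), ("book", "bok")]
icelandic = [("house", "hús"), ("book", "bók")]

for row in add_lookup_column(norwegian, icelandic):
    print(row)
```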

Our first step in creating a cross-referencing table in Excel using VLOOKUP.

Not very tidy, is it? That #N/A is simply stating that the lookup resulted in nothing at all.

Step 5: Tying off the loose ends

We can make it all look better by wrapping it in another Excel formula, IFERROR. Change the formula in that first cell to:

=IFERROR(VLOOKUP([@English], Icelandic, 2, 0), "-")

This tells Excel to carry out our VLOOKUP function, but to return a dash if it results in an error (i.e., no data). Suddenly, it’s looking a lot neater.

Cross-referencing vocabulary in Excel after some tidying is applied with IFERROR.

Now it is crystal clear where you know a word in one language but not the other. To make things even clearer, click the dropdown on that third column, and filter it to show just the dashed elements. There is your list of words to work on in the second language!
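The same tidy-then-filter idea can be sketched in Python. This assumes rows of (English, first language, second language) in which a failed lookup is stored as `None`. The helper names and sample rows are hypothetical.

```python
PLACEHOLDER = "-"


def tidy(rows):
    """Replace failed lookups (None) with a dash, as IFERROR does for #N/A."""
    return [
        (english, first, second if second is not None else PLACEHOLDER)
        for english, first, second in rows
    ]


def only_gaps(rows):
    """Keep rows whose third column is the dash, like the column filter."""
    return [row for row in rows if row[2] == PLACEHOLDER]


# Hypothetical sample rows: (English, Norwegian, Icelandic or None).
rows = [("house", "hus", "hús"), ("dog", "hund", None), ("book", "bok", "bók")]

print(only_gaps(tidy(rows)))
```

Flipping the comparison in `only_gaps` to `!=` would give the opposite view: everything you know in both languages.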

Filtering your vocabulary items in Excel.

Alternatively, filter on everything but the dashes to revel in the wealth of words you know in both. Enjoy that moment of pride!

For reference, here’s an example Excel file comparing sample vocabulary in French and Spanish.

Where to go from here?

What you do next is up to you. But now, you have the data in your hands, and data is power: what you know, you can act on. Export the filtered list of gaps to work on learning missing vocabulary in any number of ways.
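As one possible route, the gap list could be written back out as tab-delimited text, ready to import into a flashcard program. Here is a minimal Python sketch, assuming the gaps are held as (English, known translation) pairs. The helper name `gaps_to_tsv` and the sample words are made up for illustration.

```python
import csv
import io


def gaps_to_tsv(gaps):
    """Return the gap rows as tab-delimited text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(gaps)
    return buffer.getvalue()


# Hypothetical gaps: English prompts with the Norwegian word already known.
print(gaps_to_tsv([("dog", "hund"), ("cat", "katt")]))
```

To save the result, write the returned string to a file opened with `encoding="utf-8"` so that accented characters survive the round trip.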

Clearly, you can take these techniques a lot further, too. Currently, the table only checks one way, such as Icelandic to Norwegian in my example. But you can experiment with the same techniques to create much more complex and comprehensive spreadsheets to interrogate both ways.
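One way to picture the two-way check is as a pair of set differences. The Python sketch below assumes each language's vocabulary is held as a dictionary keyed on the English entry. The names and sample words are illustrative only.

```python
def two_way_audit(first, second):
    """Compare two {english: translation} dicts in both directions."""
    first_keys, second_keys = set(first), set(second)
    return {
        "only_first": sorted(first_keys - second_keys),
        "only_second": sorted(second_keys - first_keys),
        "both": sorted(first_keys & second_keys),
    }


# Hypothetical sample vocabularies keyed on the English entry.
norwegian = {"house": "hus", "dog": "hund"}
icelandic = {"house": "hús", "whale": "hvalur"}

print(two_way_audit(norwegian, icelandic))
```

In Excel terms, this is roughly what you would get by adding a lookup column to each table rather than just the first one.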

Lastly, I’ve used Microsoft Excel in this example, but the same functionality is available in other spreadsheet programs, too. The free alternative Google Sheets, for example, has its own VLOOKUP function that works in an almost identical manner. Play around with the tools available, and you can add that dull old spreadsheet package to your list of exciting, innovative language apps!

Have you given this trick a spin? Have any interesting and useful variations on it? Please share in the comments!

8 thoughts on “Excel for Polyglots: Comparative audits to keep languages in sync”

  1. Nomsi says:

    I like this flottar nördarformúlur / fine nerdete formler! For all Excel nerds out there, XLOOKUP is on its way

  2. Richard West-Soley says:

    Vi er alle nerder nå! 🙂 XLOOKUP sounds interesting, will have a look at that – thanks for the heads up!

  3. Anita Moon says:

    Hi. This is an excellent article but most of it was way over my head. I have a two-column Excel spreadsheet. English words are in one column and their Spanish equivalents in adjacent cells in the other column. I would like to generate a text-to-speech file that I could listen to when I’m away from the computer (mp3?) My problem is that I don’t know how to make the English column be read by an English (American) voice and then pause a few seconds then the adjoining cell in the second column be read by a Spanish voice (Latin American) and so-on and so-forth down the worksheet for as many pairs of words I choose. Any ideas on your part? Thanks!

  4. Richard West-Soley says:

    Hi and thanks! That’s a really great question, although after a bit of Googling, it doesn’t seem like the answer is very straightforward, sadly. The most helpful thing I could find involves writing a macro using the speech synthesis commands in Visual Basic for Applications (VBA) – the documentation for that is here: https://docs.microsoft.com/en-us/office/vba/api/excel.speech

    However, there seems to be an issue setting the language for individual items in cells; it looks like Excel will take the language from the Windows settings as the default, so it wouldn’t be possible to specify English, then Spanish, and keep switching back and forth while a macro runs.

    This really got me thinking, though – I’ll keep looking to see if there’s a simpler way to do this, as I can imagine it would be really helpful for many of us! 🙂
