Excel for Polyglots: Comparative audits to keep languages in sync

Duolingo, Memrise, Anki, Microsoft Excel. Huh, wait – Excel? How is that a language learning app?

Well, the Office software has some handy features that just happen to be right up our street as language learners. Namely, the ability to curate and administer lists in table form. And it just happens that this can be particularly useful if you learn more than one language.

One source of frustration as a polyglot learner is the discrepancy of vocabulary level between languages. This can be most obvious with fairly close language pairs. For instance, when practising Icelandic, I often realise that I know a term in Norwegian – but not in the language I am trying to speak.

So how best to address these discrepancies?

Language auditing

Getting into the habit of performing a regular language audit, such as revisiting beginner materials, is a good strategy for any learner. But one particularly powerful method for multi-language learners is the comparative audit.

In short, a comparative audit is simply taking stock of which words you know in one language, but not the other.
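Stripped to its core, the audit is a set difference. Here is a minimal Python sketch of the idea, purely for illustration; the word lists are invented sample data, not taken from my decks.

```python
# Hypothetical sample data: the English glosses learned so far in each language.
norwegian_known = {"house", "dog", "book", "to read"}
icelandic_known = {"house", "book", "water"}

# Terms known in Norwegian that are still missing in Icelandic.
gaps_in_icelandic = sorted(norwegian_known - icelandic_known)

print(gaps_in_icelandic)
```

Everything that follows is a friendlier, point-and-click way of doing the same comparison on real lists.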

At the very early stages of learning a language, this can be as easy as scanning down a list. But when you get to the point of having hundreds and hundreds of words in your vocab store, the task is mammoth.

Enter Excel, data wizard!

Microsoft Excel and VLOOKUP

Most of us will have used Excel or another spreadsheet program at some point. But like me, you might not have gone beyond basic numerical information and a few simple sum functions.

It turns out that Excel is pretty good at handling textual data too. Are you thinking what I’m thinking? Yes, vocabulary lists! And it has a special function, VLOOKUP, which allows you to compare data between two tables. Sounds just perfect for our comparative audit.

Here’s how to enlist Excel to your polyglot cause in a few simple(-ish) steps.

Step 1: Port your data into Excel

First things first – you have to get your vocabulary data into Excel. The easiest way is to export from your program of choice as a CSV (comma-separated values) or tab-delimited text file. If you use Anki, this is as easy as heading to File > Export and selecting ‘Notes in Plain Text (*.txt)’.

Ensure that you only export the basic data and no media or tags. Ideally, you should just be exporting a word and definition / translation field. My Norwegian and Icelandic decks, for example, are populated by vocab notes with an English and Target Language field.

Export a separate file for each of the two languages you want to compare. In my case, I end up with two files, norwegian.txt and icelandic.txt.
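If you are curious what those exported files look like to a program, here is a small Python sketch that reads a tab-delimited export into word pairs. It assumes the English field comes first and the target language second, as in my decks; the sample text stands in for norwegian.txt, and the read_vocab helper is a made-up name for this example.

```python
import csv
import io

# Hypothetical stand-in for the contents of norwegian.txt: one note per line,
# English field first, then the target-language field, separated by a tab.
sample_export = "house\thus\ndog\thund\nbook\tbok\n"


def read_vocab(text_file):
    """Return a list of (english, target) pairs from a tab-delimited export."""
    reader = csv.reader(text_file, delimiter="\t")
    # Skip blank or incomplete lines, plus any comment lines starting with '#'
    # (an assumption: some exports may begin with header lines of that kind).
    return [
        (row[0].strip(), row[1].strip())
        for row in reader
        if len(row) >= 2 and not row[0].startswith("#")
    ]


norwegian = read_vocab(io.StringIO(sample_export))
print(norwegian)
```

With a real file, you would pass open("norwegian.txt", encoding="utf-8", newline="") in place of the io.StringIO object.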

Exporting data from Anki

Step 2: Import your vocab into Excel

In Microsoft Excel, create a fresh spreadsheet document, and head to File > Import. Select Text File, hit Import and locate your first exported vocabulary file from above. To preserve accented characters in our Anki list, select Unicode (UTF-8) as the File origin.

Importing vocabulary into Excel – note that ‘Unicode (UTF-8)’ has been selected as the file origin to make sure accented characters are handled correctly.
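Why does the file origin setting matter so much? This short Python sketch shows what happens when UTF-8 text is read with the wrong encoding. The phrase is just sample text with non-ASCII letters, and Latin-1 is only one example of a legacy encoding an importer might assume.

```python
# An Icelandic phrase with non-ASCII letters, stored as UTF-8 bytes.
raw_bytes = "þakka þér".encode("utf-8")

# Decoding with the right encoding restores the original text.
correct = raw_bytes.decode("utf-8")

# Decoding with a legacy single-byte encoding turns each accented letter
# into two unrelated characters.
garbled = raw_bytes.decode("latin-1")

print(correct)
print(garbled)
```

The garbled version is longer than the original, because every two-byte letter is shown as two separate characters.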

Create a second sheet in the same document, and import your other list of vocabulary into that. You should now have a two-sheet spreadsheet document, each sheet showing a list of words in a different language. For clarity, make sure you name your sheets too. Simply double-click on the tab titles “Sheet 1” etc. to do that.

Step 3: Format your lists as tables

In each sheet, click and drag across the table to select your whole vocabulary list as a block. Now, click Format as Table in the Home section of the function ribbon / toolbar. It doesn’t really matter which style you use – I choose the colour I like best!

Once that’s done, change the new column headers to something more meaningful than the default values. I use English and Norwegian in my example below. One caveat – both tables need a column in common for the VLOOKUP trick to work, and giving it the same title in each keeps things easy to follow. Here, English will be my common column between Norwegian and Icelandic.

My Norwegian vocabulary data formatted as a table in Microsoft Excel

Instantly, this is already more useful to us than static lists. Formatting as a table means you can use the column heading drop-downs to sort and filter your entries. Try it – sort alphabetically on the target language column. You’ve turned your data into a nifty dictionary! Not our primary goal, but a nice trick on the way.

Before we go on, it’s a good idea to name our tables so they are easy to refer to later. To do this, click anywhere in your table, then switch to the Table tab in the ribbon / toolbar and type a name into the Table Name box. The simpler, the better – below, I just call mine Icelandic.

Naming a table in Excel

But now it’s the turn of our star, VLOOKUP. This is where the real magic happens.

Step 4: Adding a comparative column

Click on the target language column header of your second table and copy it (CTRL + C). Now, go to your first table, select the cell next to the target language column header (C1 in my example), and paste (CTRL + V). It should add a blank new column within that table. Let’s fill it up!

In the first cell under that new column header, we type in our VLOOKUP formula. This will depend on what you have named your tables and sheets, but mine looks like this:

=VLOOKUP([@English], Icelandic, 2, 0)

Let’s dissect that. The first item in the brackets, [@English], is the value we want to look up: the English entry on the current row of the first table. The second item, Icelandic, is the table we’ll search. Remember, we named that table a little earlier. VLOOKUP always searches the first column of that table, which is why English needs to come first there. The third item, 2, is the number of the column to pull the result from, which is the target language column of the Icelandic table. Finally the fourth value, 0, is a flag to Excel that we want exact matches only.

If that boggles, simply start typing =VLOOKUP( in the cell. That calls up Excel’s formula hints and point-and-click formula building, which should help you tie things together accurately.

After doing that, something special happens – suddenly, the whole column is filled with entries. If the English term was found in the Icelandic table, the corresponding Icelandic word is pulled in. If not, we simply get #N/A.
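For anyone who finds code easier to read than spreadsheet formulas, here is a rough Python equivalent of what the lookup does on each row. It is a sketch on invented sample data, not Excel’s actual implementation; the vlookup function is my own stand-in, and it ignores case because Excel’s exact match does too.

```python
# Hypothetical sample tables as (english, target) rows, mirroring the two sheets.
norwegian = [("house", "hus"), ("dog", "hund"), ("book", "bok")]
icelandic = [("house", "hús"), ("book", "bók"), ("water", "vatn")]


def vlookup(value, table, col_index):
    """Return column `col_index` (1-based) of the first row whose first column
    matches `value`, ignoring case; return None when nothing matches."""
    for row in table:
        if row[0].casefold() == value.casefold():
            return row[col_index - 1]
    return None  # plays the role of Excel's #N/A


# Build the third column: the Icelandic word for each Norwegian row's English term.
comparison = [
    (english, target, vlookup(english, icelandic, 2))
    for english, target in norwegian
]
print(comparison)
```

Note that the search only ever looks at the first column of the lookup table, just as VLOOKUP does.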

A quick note if that doesn’t work immediately: check that the cells in that third column are formatted as General, not Text.

Our first step in creating a cross-referencing table in Excel using VLOOKUP.

Not very tidy, is it? That #N/A is simply stating that the lookup resulted in nothing at all.

Step 5: Tying off the loose ends

We can make it all look better by wrapping it in another Excel formula, IFERROR. Change the formula in that first cell to:

=IFERROR(VLOOKUP([@English], Icelandic, 2, 0), "-")

This tells Excel to carry out our VLOOKUP function, but to return a dash if it results in an error (i.e., no data). Suddenly, it’s looking a lot neater.
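The same "look it up, or fall back to a dash" pattern is a one-liner in Python, where a dictionary’s get method accepts a default value. Again, this is only an illustrative sketch with made-up sample entries.

```python
# Hypothetical sample data, keyed on the shared English term.
norwegian = {"house": "hus", "dog": "hund", "book": "bok"}
icelandic = {"house": "hús", "book": "bók", "water": "vatn"}

# dict.get returns the fallback "-" when the English term has no Icelandic entry,
# much like wrapping the lookup in IFERROR.
rows = [
    (english, target, icelandic.get(english, "-"))
    for english, target in norwegian.items()
]
print(rows)
```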

Cross-referencing vocabulary in Excel after some tidying is applied with IFERROR.

Now it is crystal clear where you know a word in one language but not the other. To make things even clearer, click the dropdown on that third column, and filter it to show just the dashed elements. There is your list of words to work on in the second language!

Filtering your vocabulary items in Excel.

Alternatively, filter on everything but the dashes to revel in the wealth of words you know in both. Enjoy that moment of pride!
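In code, those two filters are simply two passes over the same rows. A small Python sketch, using invented rows in the shape of the finished table (English, Norwegian, Icelandic or a dash):

```python
# Hypothetical rows shaped like the finished table: English, Norwegian, Icelandic.
rows = [
    ("house", "hus", "hús"),
    ("dog", "hund", "-"),
    ("book", "bok", "bók"),
    ("to read", "å lese", "-"),
]

# Filter on the dashes: terms still to learn in the second language.
gaps = [english for english, _, icelandic in rows if icelandic == "-"]

# Filter on everything but the dashes: terms known in both languages.
known_in_both = [english for english, _, icelandic in rows if icelandic != "-"]

print(gaps)
print(known_in_both)
```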

For reference, here’s an example Excel file comparing sample vocabulary in French and Spanish.

Where to go from here?

What you do next is up to you. But now, you have the data in your hands, and data is power: what you know, you can act on. Export the filtered list of gaps to work on learning missing vocabulary in any number of ways.
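As one possible next step, the gap list could be written back out as tab-delimited text, ready to import into a flashcard program. This Python sketch writes to an in-memory buffer so it runs anywhere; the gap entries are sample data, and the two-column layout is an assumption.

```python
import csv
import io

# Hypothetical gap list: English term plus the word already known in Norwegian.
gaps = [("dog", "hund"), ("to read", "å lese")]

# Write one tab-separated line per term into an in-memory text buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(gaps)

exported = buffer.getvalue()
print(exported)
```

To save a real file instead, open one with encoding="utf-8" and newline="" and pass it to csv.writer in place of the buffer.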

Clearly, you can take these techniques a lot further, too. Currently, the table only checks one way, such as Icelandic to Norwegian in my example. But you can experiment with the same techniques to create much more complex and comprehensive spreadsheets to interrogate both ways.
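To give a flavour of what a two-way check might look like, here is a Python sketch that lists every English term found in either language and marks the gaps on both sides. It is one possible approach on invented sample data, not a recipe for the spreadsheet itself.

```python
# Hypothetical sample data, keyed on the shared English term.
norwegian = {"house": "hus", "dog": "hund", "book": "bok"}
icelandic = {"house": "hús", "book": "bók", "water": "vatn"}

# Every English term that appears in at least one of the two lists.
all_terms = sorted(norwegian.keys() | icelandic.keys())

# One row per term, with a dash wherever a language is missing the word.
audit = [
    (term, norwegian.get(term, "-"), icelandic.get(term, "-"))
    for term in all_terms
]

for row in audit:
    print(row)
```

In the spreadsheet, the closest equivalent is adding a matching lookup column to the second table as well.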

Lastly, I’ve used Microsoft Excel in this example, but the same functionality is available in other spreadsheet programs, too. The free alternative Google Sheets, for example, has its own VLOOKUP function that works in an almost identical manner. Play around with the tools available, and you can add that dull old spreadsheet package to your list of exciting, innovative language apps!

Have you given this trick a spin? Have any interesting and useful variations on it? Please share in the comments!

A dictionary won't always help you learn words in their natural habitat: the sentence.

Sentence building: Go beyond words with Tatoeba

Learning and assimilating vocabulary in a foreign language isn’t simply a case of learning lists of words: context matters. Just like a careful zoologist observing animals in the wild, it’s important to study words in their natural habitat: the sentence.

Unfortunately, a lot of reference material for language learners fails to provide this context. If you’re looking for single words in your foreign language, there are myriad look-up tools available, but only a few take steps to set the word in situ. Google Translate, for example, is surprisingly good at this, and better than many online dictionaries. If you type in a single word, many entries come with a list of translations and a useful list of cross-referenced, related terms too. Arguably, that is a lot more useful to language learners than the actual machine translation feature!

Google Translate is great for single word look-ups.

However, there is little else online in terms of whole-sentence reference, apart from “basic phrases in…” pages. Indexed, systematic lists of example sentences, complete with translation support, are harder to find.

Habeas corpus (linguisticus)

One open-source resource, though, is changing that. Tatoeba – from the Japanese ‘for example’ – is a vast, and rapidly growing, corpus of thousands of sentences in scores of languages. Moreover, it’s expanding continually through user contributions. And you, as a native speaker of your own language (even if it’s English!), can help expand it further.

With many of the entries including native-speaker audio, it is a fantastic (and still quite untapped) resource for language learners. It’s full of colloquialisms, handy turns of phrase, and authentic language use. There are many ways you can work it into your own learning; here are just a few ideas for starters.

Words in context

Learnt a new word, but not sure exactly how native speakers use it? Type that single word into Tatoeba, and if you’re lucky, a whole load of sentences will come up. It’s a fantastic way to put your new vocab into context, something which definitely helps me to commit new words to memory. If sound is provided, it’s an instant way to practise / improve your pronunciation too, much like the brilliantly useful Forvo website for single words.

Putting your vocab in context with Tatoeba.

Build your own sentence lists

With your free Tatoeba account, you can create your own lists to store favourite sentences. Simply click the list icon next to a sentence – you’ll quickly start to build quite extensive, custom ‘vocab in context’ learning resources.

There are also collaborative lists, which means you can work together with others. This might be with classmates, or perhaps even a teacher you’re working with remotely on iTalki. Conversely, it’s also an excellent way for teachers to collate and share useful phrase lists as teaching resources.

Combine with Anki

Anki Flashcards is a firm favourite of many linguaphiles for drilling vocab. You can combine it with Tatoeba by exporting your lists from that site as CSV files, then importing them directly into the Anki program. For now, the Tatoeba export will only extract the text, and no associated sound files. But if you’re willing to fiddle, here’s a short guide on including available sound files in your Tatoeba-Anki port.
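If you ever want to tidy an exported list before importing it, a few lines of Python will do. The sketch below turns CSV rows into front / back card pairs. Note that the column layout (id, sentence, translation) is purely an assumption for illustration and not Tatoeba’s documented format, and to_cards is a made-up helper name, so check your own export first.

```python
import csv
import io

# Hypothetical export contents. The column layout (id, sentence, translation)
# is an assumption for illustration, not Tatoeba's documented format.
sample_csv = (
    '1,"Hun leser en bok.","She is reading a book."\n'
    '2,"Hunden sover.","The dog is sleeping."\n'
)


def to_cards(csv_text):
    """Return (front, back) pairs, dropping the id column and short rows."""
    reader = csv.reader(io.StringIO(csv_text))
    return [(row[1], row[2]) for row in reader if len(row) >= 3]


cards = to_cards(sample_csv)
print(cards)
```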

If you’re a polyglottal sucker for punishment, you can even export the lists with a translation other than your native language, in order to practise two languages at once. See the screenshot below for a rather scary Norwegian-Greek export setup – I’m sure you can think up even more testing pairings!

Changing the language pairing in a Tatoeba export.

Find ready-made Tatoeba Anki decks

If all the to-and-fro of exporting puts you off, then don’t despair – some Tatoeba decks have already been imported to Anki as shared decks. Check here for a list of them (several including sound files).

Contribute

Finally, the best way to grow the resource is to become part of it. You can add, correct, record and otherwise extend Tatoeba as a member. If you’ve found it useful, it’s an excellent way to give back.

Tatoeba is one more tool in the linguaphile’s online arsenal, and can be worked into a learning routine in many ways. Feel free to share your own experiences and tips in the comments below!