TTS can lend your learning some robotic voice magic. Image by Oliver Brandt on FreeImages.com

Disembodied Voices: Using TTS as a Native Speaker Boost

Native speaker modelling is a prerequisite for learning to speak Modern Foreign Languages. But when listening materials are scarce, or you struggle to find exactly the material you want to learn, then text-to-speech (TTS) can lend a helping hand.

Text-to-speech: native speakers out of thin air

TTS, or speech synthesis, has come on in leaps and bounds since the early days of tinny Speak ‘n’ Spell voices. At first mimicking chiefly (American) English, many projects have since diversified to conjure native speaker voices in many languages out of thin air. Polyglot TTS technologies such as Google Cloud’s offering are at the cutting edge of machine learning developments, and sound more and more human all the time.

Using these disembodied tongues for language learning is nothing new, of course. Switching the language of your digital voice assistant has become a pretty well-known polyglot trick for some robotic speaking practice. Siri has been speaking fluent Bokmål on my phone for a while now. As a result, I’ve become a dab hand at asking in Norwegian what the weather will be like!

TTS Toolkit

But as handy as voice assistants are, you can leverage the power of TTS much more directly than tapping into Siri or Alexa. At its most basic level, plain old TTS is brilliantly useful for hearing a spoken representation of a word or phrase you are unsure of.

For example, if you are tired of guessing how to reel off “das wäre sehr schön” in German (that would be very nice), never fear. Simply paste it into Google Translate and hit the speaker icon. The platform already offers an impressive number of languages with speech support.

Google Translate offers TTS features – simply type / paste in your target language and click the speaker icon.

But it gets even better. Several other resources allow you to do more than just listen; they offer a download function too. This way, you can keep your most useful speech synthesis files and incorporate them into your own materials. Combined with vocabulary mining tools such as mass-sentence site Tatoeba, you can begin to curate large, offline collections of target language material with text and audio.

One notable and powerful multilingual voice synthesis site is TTSMP3.com. For a start, it offers plenty of language options. On top of that, several languages include a choice of voices, too. That is more than enough to sate your curiosity when you wonder “How do you say that?”

With a little Google digging, you may also find specialist TTS projects devoted to your particular language of study. For instance, Irish learners are spoilt by the Abair website (Abair means ‘say’ in Irish). Not only does it provide downloadable Irish narration, but it does this in a choice of three different regional accents. If you learn Irish too, you will well know what a godsend this is!

Irish TTS in a range of regional dialects on the Abair website.

Incorporating TTS MP3s

With your MP3s downloaded, you can now incorporate them into your own resources. Easiest of all is simply to insert them as media in PowerPoint presentations or Word documents. I personally like to add them manually to Anki cards to add audio support when I revise vocabulary.

The quick and easy way to reach Anki’s media folder on the desktop program is to open up the Preferences panel, then the Backups tab, and finally to tap the “Open backup folder” link that appears. In the same location as that backup directory, you should see another folder with the name collection.media. Anything you put in here will be synced along with the rest of your Anki data.

Note: always back up your Anki decks before tinkering in the program folders!

Anki’s media folder

Drop your saved TTS files into this folder. Note that subfolders still don’t seem to work reliably, so keep everything in that single folder. Logical file naming will definitely help!
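If you have more than a handful of files, the copying and naming can be scripted. The following is only a minimal sketch of one possible naming scheme, not anything Anki itself prescribes: the function names `flat_name` and `copy_to_media`, and the language-prefix convention, are my own assumptions. It turns a phrase into a flat, ASCII-only filename and copies the downloaded MP3 straight into the media folder, with no subfolders.

```python
import re
import shutil
import unicodedata
from pathlib import Path


def flat_name(lang: str, phrase: str) -> str:
    """Build a flat, ASCII-only MP3 filename from a language code and a phrase.

    Hypothetical naming scheme: '<lang>_<phrase_with_underscores>.mp3'.
    """
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", phrase)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into a single underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_only.lower()).strip("_")
    return f"{lang}_{slug}.mp3"


def copy_to_media(src: Path, media_dir: Path, lang: str, phrase: str) -> Path:
    """Copy one downloaded MP3 directly into media_dir (no subfolders)."""
    target = media_dir / flat_name(lang, phrase)
    shutil.copy2(src, target)
    return target


if __name__ == "__main__":
    print(flat_name("de", "das wäre sehr schön"))  # de_das_ware_sehr_schon.mp3
```

Bear in mind that letters with no plain-ASCII decomposition (such as ø) and non-Latin scripts are simply dropped by this approach, so you would need to adapt the scheme for languages like Norwegian or Greek.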

Finally, when you create / edit a vocabulary note, use the following format to add your sound as a playable item, replacing filename as appropriate:

[sound:filename.mp3]

When viewed on your device, you should see a play button on flashcards with embedded sounds. Magic!
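If you are preparing lots of cards at once, typing the tag by hand gets tedious. As a small illustration, here is a sketch that builds the `[sound:…]` text for you. The helper names `sound_tag` and `with_audio` are hypothetical, and the choice to append the tag after the field text is just one option.

```python
from pathlib import Path


def sound_tag(filename: str) -> str:
    """Return Anki's sound reference for a file stored in the media folder."""
    # Only the bare filename is used, since the media folder has no subfolders.
    return f"[sound:{Path(filename).name}]"


def with_audio(field_text: str, filename: str) -> str:
    """Append the sound reference to the text of a note field."""
    return f"{field_text} {sound_tag(filename)}"


if __name__ == "__main__":
    print(with_audio("das wäre sehr schön", "de_das_ware_sehr_schon.mp3"))
    # das wäre sehr schön [sound:de_das_ware_sehr_schon.mp3]
```

You can then paste the resulting text into the relevant field when you create or edit a note.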

An Anki card with embedded sound

If you prefer an even easier route, then there is an Anki plugin specially created for automatically adding TTS to notes. I prefer the manual method, as it satisfies the tech control freak in me.

A human touch

Of course, TTS is not a perfect substitute for a native speaker recording. For those times when only a human voice will do, Forvo is a goldmine. The site is replete with native sounds across a dizzying array of languages, all recorded by native speaker volunteers. Just as with the TTS examples above, the sounds can be downloaded for use in your own resources, too.

To round off our trek through native speaker sites – real and synthesised – just a final note on copyright. If you intend to share the resources you create, always check the usage notes of the website of origin. Sites commonly have fair use policies for non-commercial projects, so making resources for personal use is usually not a problem at all. If you plan to sell your resources, though, you may well need to opt for a commercial plan with the respective platform.

Robotic resources can plug a real gap in native speaker support, especially for niche languages, or niche subjects in more mainstream ones. Do you use TTS in innovative ways in your own learning? Have you come across other specialist or language-specific TTS projects? If so, please share in the comments!

A dictionary won't always help you learn words in their natural habitat: the sentence.

Sentence building: Go beyond words with Tatoeba

Learning and assimilating vocabulary in a foreign language isn’t simply a case of learning lists of words: context matters. Just like a careful zoologist observing animals in the wild, it’s important to study words in their natural habitat: the sentence.

Yet a lot of reference material for language learners fails to provide this context. If you’re looking for single words in your foreign language, there are myriad look-up tools available. Unfortunately, only a few take steps to set the word in situ. Google Translate, for example, is surprisingly better than many online dictionaries at providing context. If you type in a single word, many entries come with a list of translations and a useful list of cross-referenced, related terms too. That is arguably a lot more useful to language learners than the actual machine translation feature!

Google Translate is great for single word look-ups.

However, there is little else online in terms of whole-sentence reference, apart from “basic phrases in…” pages. Indexed, systematic lists of example sentences, complete with translation support, are harder to find.

Habeas corpus (linguisticus)

One open-source resource, though, is changing that. Tatoeba – from the Japanese ‘for example’ – is a vast, and rapidly growing, corpus of thousands of sentences in scores of languages. Moreover, it’s expanding continually through user contributions. And you, as a native speaker of your own language (even if it’s English!), can help expand it further.

With many of the entries including native-speaker audio, it is a fantastic (and still quite untapped) resource for language learners. It’s full of colloquialisms, handy turns of phrase, and authentic language use. There are many ways you can work it into your own learning; here are just a few ideas for starters.

Words in context

Learnt a new word, but not sure exactly how native speakers use it? Type that single word into Tatoeba, and if you’re lucky, a whole load of sentences will come up. It’s a fantastic way to put your new vocab into context, something which definitely helps me to commit new words to memory. If sound is provided, it’s an instant way to practise / improve your pronunciation too, much like the brilliantly useful Forvo website for single words.

Putting your vocab in context with Tatoeba.

Build your own sentence lists

With your free Tatoeba account, you can create your own lists to store favourite sentences. Simply click the list icon next to a sentence – you’ll quickly start to build quite extensive, custom ‘vocab in context’ learning resources.

There are also collaborative lists, which means you can work together with others. This might be with classmates, or perhaps even a teacher you’re working with remotely on iTalki. Conversely, it’s also an excellent way for teachers to collate and share useful phrase lists as teaching resources.

Combine with Anki

Anki Flashcards is a firm favourite of many linguaphiles for drilling vocab. You can combine it with Tatoeba by exporting your lists from that site as CSV files, then importing them directly into the Anki program. For now, the Tatoeba export will only extract the text, and no associated sound files. But if you’re willing to fiddle, here’s a short guide on including available sound files in your Tatoeba-Anki port.
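For the fiddlers, the tidy-up between export and import can be scripted. This is only a rough sketch under some loud assumptions: that your exported list is tab-separated with the sentence ID, the sentence and its translation in that order, and that you have already saved any available audio into Anki's media folder under the hypothetical name `<ID>.mp3`. The function name `tatoeba_to_anki` is made up for illustration. Check the layout of your own export before relying on it.

```python
def tatoeba_to_anki(export_text: str, audio_ids: set[str]) -> str:
    """Turn a tab-separated list export into two-column text for Anki import.

    Assumed input columns (check your own export): ID, sentence, translation.
    audio_ids holds the IDs for which '<ID>.mp3' already sits in the media folder.
    """
    lines = []
    for raw in export_text.splitlines():
        parts = raw.split("\t")
        if len(parts) < 3:
            continue  # skip blank or incomplete rows
        sentence_id, text, translation = parts[:3]
        if sentence_id in audio_ids:
            # Attach the sound reference so the card gets a play button.
            text += f" [sound:{sentence_id}.mp3]"
        lines.append(f"{text}\t{translation}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "123\tJeg liker kaffe.\tΜου αρέσει ο καφές.\n456\tHei!\tΓεια!"
    print(tatoeba_to_anki(sample, {"123"}))
```

Save the result as a UTF-8 text file and import it into Anki as tab-separated text, mapping the first column to the front of the card and the second to the back.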

If you’re a polyglottal sucker for punishment, you can even export the lists with a translation other than your native language, in order to practise two languages at once. See the screenshot below for a rather scary Norwegian-Greek export setup – I’m sure you can think up even more testing pairings!

Changing the language pairing in a Tatoeba export.

Find ready-made Tatoeba Anki decks

If all the to-and-fro of exporting puts you off, then don’t despair – some Tatoeba decks have already been imported to Anki as shared decks. Check here for a list of them (several including sound files).

Contribute

Finally, the best way to grow the resource is to become part of it. You can add, correct, record and otherwise extend Tatoeba as a member. If you’ve found it useful, it’s an excellent way to give back.

Tatoeba is one more tool in the linguaphile’s online arsenal, and can be worked into a learning routine in many ways. Feel free to share your own experiences and tips in the comments below!