Disembodied Voices : Using TTS as a Native Speaker Boost

Native speaker modelling is a prerequisite for learning to speak Modern Foreign Languages. But when listening materials are scarce, or you struggle to find exactly the material you want to learn, then text-to-speech (TTS) can lend a helping hand.

Text-to-speech : native speakers out of thin air

TTS, or speech synthesis, has come on in leaps and bounds since the early days of tinny Speak ‘n’ Spell voices. At first mimicking chiefly (American) English, many projects have since diversified to conjure native speaker voices in many languages out of thin air. Polyglot TTS technologies such as Google Cloud’s offering sit at the cutting edge of machine learning developments, and sound more and more human all the time.

Using these disembodied tongues for language learning is nothing new, of course. Switching the language of your digital voice assistant has become a pretty well-known polyglot trick for some robotic speaking practice. Siri has been speaking fluent Bokmål on my phone for a while now. As a result, I’ve become a dab hand at asking what the weather will be like in Norwegian!

TTS Toolkit

But as handy as voice assistants are, you can leverage the power of TTS much more directly than tapping into Siri or Alexa. At its most basic level, plain old TTS is brilliantly useful for hearing a spoken representation of a word or phrase you are unsure of.

For example, if you are tired of guessing how to reel off “das wäre sehr schön” (“that would be very nice”) in German, never fear. Simply paste it into Google Translate and hit the speaker icon. The platform already offers an impressive number of languages with speech support.

Google Translate offers TTS features – simply type / paste in your target language and click the speaker icon.

But it gets even better. Several other resources allow you to do more than just listen; they offer a download function too. This way, you can keep your most useful speech synthesis files and incorporate them into your own materials. Combined with vocabulary mining tools such as mass-sentence site Tatoeba, you can begin to curate large, offline collections of target language material with text and audio.

One notable and powerful multilingual voice synthesis site is TTSMP3.com. For a start, it offers plenty of language options. On top of that, several languages include a choice of voices, too. That is more than enough to sate your curiosity when you wonder “How do you say that?”

With a little Google digging, you may also find specialist TTS projects devoted to your particular language of study. For instance, Irish learners are spoilt by the Abair website (Abair means ‘say’ in Irish). Not only does it provide downloadable Irish narration, but it does this in a choice of three different regional accents. If you learn Irish too, you will well know what a godsend this is!

Irish TTS in a range of regional dialects on the Abair website.

Incorporating TTS MP3s

With your MP3s downloaded, you can now incorporate them into your own resources. Easiest of all is simply to insert them as media in PowerPoint presentations or Word documents. I personally like to attach them manually to Anki cards for audio support when I revise vocabulary.

The quick and easy way to reach Anki’s media folder on the desktop program is to open up the Preferences panel, then the Backups tab, and finally to tap the “Open backup folder” link that appears. In the same location as that backup directory, you should see another folder with the name collection.media. Anything you put in here will be synced along with the rest of your Anki data.

Note: always back up your Anki decks before tinkering in the program folders!

Anki’s media folder

Drop your saved TTS files into this folder. Note that subfolders still don’t seem to work reliably, so keep everything in that single folder. Logical file naming will definitely help!
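To make the naming point concrete, here is a minimal Python sketch of one way to give downloaded files predictable, flat names before copying them across. The function names, the language-code prefix and the naming scheme itself are all assumptions made for illustration, not anything Anki requires. The media folder path is something you would supply yourself.

```python
import hashlib
import re
import shutil
import unicodedata
from pathlib import Path


def flat_media_name(lang: str, phrase: str, ext: str = ".mp3") -> str:
    """Build a flat file name such as 'de_das_ware_sehr_schon.mp3'.

    The 'lang_phrase' scheme is a hypothetical convention for this sketch.
    """
    # Decompose accented letters and drop anything that is not plain ASCII.
    # Characters with no ASCII decomposition (e.g. ø, ß) are simply dropped.
    ascii_text = (
        unicodedata.normalize("NFKD", phrase)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_text.lower()).strip("_")
    if not slug:
        # Non-Latin scripts leave nothing behind, so fall back to a short hash.
        slug = hashlib.sha1(phrase.encode("utf-8")).hexdigest()[:8]
    return f"{lang}_{slug}{ext}"


def copy_to_media(source: Path, media_dir: Path, lang: str, phrase: str) -> Path:
    """Copy one downloaded TTS file straight into the media folder (no subfolders)."""
    target = media_dir / flat_media_name(lang, phrase, source.suffix.lower())
    shutil.copy2(source, target)
    return target
```

Calling `copy_to_media(Path("download.mp3"), media_dir, "de", "das wäre sehr schön")`, with `media_dir` pointing at your own collection.media folder, would copy the file in under the name `de_das_ware_sehr_schon.mp3`. As ever, back up first.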

Finally, when you create / edit a vocabulary note, use the following format to add your sound as a playable item, replacing filename as appropriate:

[sound:filename.mp3]

When viewed on your device, you should see a play button on flashcards with embedded sounds. Magic!

An Anki card with embedded sound
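If you have a whole batch of phrases, typing the sound tag by hand gets tedious. The following Python sketch builds the tag from a file name and writes a tab-separated text file that could then be brought in through Anki’s import feature. The two-field layout (front with audio, then back) and the function names are assumptions for illustration; adjust them to match your own note type.

```python
import csv
from pathlib import Path


def sound_tag(filename: str) -> str:
    """Return the playable-sound markup for a file in collection.media."""
    return f"[sound:{filename}]"


def write_import_file(rows, out_path: Path) -> None:
    """Write (front, back, audio filename) rows as tab-separated text.

    The sound tag is appended to the front field. This layout is an
    assumption for the sketch, not something Anki prescribes.
    """
    with out_path.open("w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        for front, back, audio in rows:
            writer.writerow([f"{front} {sound_tag(audio)}", back])
```

For instance, passing the row `("das wäre sehr schön", "that would be very nice", "de_das_ware_sehr_schon.mp3")` produces a line whose first field ends in `[sound:de_das_ware_sehr_schon.mp3]`. The audio file itself still needs to be sitting in the media folder for the play button to work.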

If you prefer an even easier route, then there is an Anki plugin specially created for automatically adding TTS to notes. I prefer the manual method, as it satisfies the tech control freak in me.

A human touch

Of course, TTS is not a perfect substitute for a native speaker recording. For those times when only a human voice will do, Forvo is a goldmine. The site is replete with native sounds across a dizzying array of languages, all recorded by native speaker volunteers. Just as with the TTS examples above, the sounds can be downloaded for use in your own resources, too.

To round off our trek through native speaker sites – real and synthesised – just a final note on copyright. If you intend to share the resources you create, always check the usage notes of the website of origin. Sites commonly have fair use policies for non-commercial projects, so making resources for personal use is usually not a problem at all. If you plan to sell your resources, though, you may well need to opt for a commercial plan with the respective platform.

Robotic resources can plug a real gap in native speaker support, especially for niche languages, or niche subjects in more mainstream ones. Do you use TTS in innovative ways in your own learning? Have you come across other specialist or language-specific TTS projects? If so, please share in the comments!

Polyglossic

Love Learning Languages
