There’s been a lot on the grapevine of late about AI-powered leaps forward in text-to-speech voices. From providing accent models to in-depth speaking games, next-gen TTS is poised to have a huge impact on language learning.
The catch? Much of the brand-new tech isn’t available to the average user-on-the-street yet.
That’s why I was thrilled to happen across TTS service ElevenLabs recently. ElevenLabs’ stunning selection of voices powers a number of eLearning and audiobook sites already, and it’s no hype to say that they sound as close to human as you can get right now.
Even better, you can sign up for a free account that gives you 10,000 characters of text-to-speech conversion each month. For $5 a month, you can raise that to 30,000 characters and unlock voice-cloning features. Just imagine the hours of fun if you want to hear ‘yourself’ speak any number of languages!
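To get a feel for how far that monthly allowance stretches, a quick character count helps. The sketch below is a rough budgeting aid only. It assumes every character, spaces and punctuation included, counts once towards the quota (how ElevenLabs actually meters usage may differ), and the function name `remaining_quota` is made up for illustration.

```python
FREE_MONTHLY_CHARS = 10_000   # free-tier figure quoted above
PAID_MONTHLY_CHARS = 30_000   # $5-a-month figure quoted above


def remaining_quota(texts, monthly_limit=FREE_MONTHLY_CHARS):
    """Return the characters left after converting every text in `texts`.

    Assumption: each character (spaces and punctuation included) counts once.
    A negative result means the texts would overrun the allowance.
    """
    used = sum(len(text) for text in texts)
    return monthly_limit - used


if __name__ == "__main__":
    # Two short practice passages; swap in your own.
    passages = ["Hej! Jag heter Anna.", "Jag skulle vilja ha en kaffe, tack."]
    print(remaining_quota(passages))
```

Pass `monthly_limit=PAID_MONTHLY_CHARS` to check the same passages against the larger allowance.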
Using ElevenLabs in Your Own Learning
There’s plenty to do for free, though. For instance, if you enjoy the island technique in your learning, you can get ElevenLabs to record your passages for audio practice / rote memorising. I make this an AI double-whammy, using ChatGPT to help prepare my topical ‘islands’ before pasting them into ElevenLabs.
The ChatGPT > ElevenLabs workflow is also brilliant for dialogue modelling. On my recent Sweden trip, I knew that a big conversational contact point would be ordering at coffee shops. This is the prompt I used to get a cover-all-bases model coffee-shop convo:
Create a comprehensive model dialogue in Swedish to help me learn and practise for the situation “ordering coffee in a Malmö coffee shop”.
Try to include the language for every eventuality / question I might be asked by the coffee shop employee. Ensure that the language is colloquial and informal, and not stilted.
The output will be pasted into a text-to-speech generator, so don’t add speaker names to the dialogue lines – just a dash will suffice to indicate a change of speaker.
I then ran off the audio file with ElevenLabs, and hey presto! Custom real-world social prep. You can’t specify different voices in the same file, of course. But you could run off the MP3 twice, in different voices, then splice it up manually in an audio editor like Audacity for the full dialogue effect. Needless to say, it’s also a great way for teachers to make custom listening activities.
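If you would rather not generate the whole dialogue twice, one variation is to split the text by speaker first and convert each half in a different voice. The sketch below is only that, a sketch: it assumes the dialogue follows the dash convention from the prompt above, that the two speakers strictly alternate, and that any line without a dash continues the previous turn. The name `split_by_speaker` is hypothetical.

```python
import re

# Matches a leading hyphen, en dash or em dash plus any following spaces.
DASH_PREFIX = re.compile(r"^\s*[-\u2013\u2014]\s*")


def split_by_speaker(dialogue):
    """Split a dash-marked dialogue into two lists of lines, one per speaker.

    Assumption: each dash-prefixed line is a change of speaker and the two
    speakers alternate. Lines without a dash are appended to the current turn.
    """
    turns = []
    for raw in dialogue.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if DASH_PREFIX.match(line):
            turns.append(DASH_PREFIX.sub("", line))
        elif turns:
            turns[-1] += " " + line
        else:
            turns.append(line)
    # Even-numbered turns belong to the first speaker, odd to the second.
    return turns[0::2], turns[1::2]


if __name__ == "__main__":
    sample = "- Hej! Vad vill du ha?\n- En kaffe, tack.\n- Något mer?\n- Nej tack, det är bra."
    first_speaker, second_speaker = split_by_speaker(sample)
    print("\n".join(first_speaker))
    print("---")
    print("\n".join(second_speaker))
```

Each list can then be pasted into ElevenLabs under its own voice, and the resulting clips interleaved in Audacity. It also uses up fewer characters of the monthly allowance than converting the full dialogue twice.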
The ElevenLabs voices are truly impressive – it’s worth setting up a free account just to play with the options and come up with your own creative use cases. TTS is only set to get better in the coming months – I’m excited to see where it leads!