ElevenLabs Hits the Right Note: A.I. Songwriting for Language Learners

In case you missed it, A.I. text-to-speech leader ElevenLabs is the latest platform to join the generative music scene – so language learners and teachers have another choice for creating original learning songs.

ElevenLabs’ Creative Platform ElevenMusic takes a much more structured approach to music creation that other platforms I’ve tried. Enter your prompt (or full lyrics), and it will build a song from block components – verse, chorus, bridge – just as you might construct one as a human writer. It makes for a much more natural-sounding track.

ElevenLabs music creation

ElevenLabs music creation

As you’d expect from voice experts ElevenLabs, the service copes with a wide range of languages and the diction is very convincing. A tad more so, I think, than the current iteration of the first big name on the block, Suno AI. No doubt the latter will have some tricks up its sleeve to keep up the pace – but for now, ElevenLabs is the place to go for quick and catchy learning song.

Anyway, here’s one I made earlier – a rather natty French rock and roll song about the Moon landings. Get those blue suede Moon boots on!

It’s definitely worth having a play on the site to see what you can come up with for you or your classes. ElevenLabs has a free tier, of course, so you can try it out straight away. [Note: that’s my wee affiliate link, so if you do sign up and hop on a higher tier later, you’re helping keep Polyglossic going!]

A swirl of IPA symbols in the ether. Do LLMs 'understand' phonology? And are they any good at translation?

Tencent’s Hunyuan-MT-7B, the Translation Whizz You Can Run Locally

There’s been a lot of talk this week about a brand new translation model, Tencent’s Hunyuan-MT-7B. It’s a Large Language Model (LLM) trained to perform machine translation. And it’s caused a big stir by beating heftier (and heavier) models by Google and OpenAI in a recent event.

This is all the more remarkable given that it’s really quite a small model by LLM standards. Hunyuan actually manages its translation-beating feat packed into just 7 billion parameters (the information nodes that models learn from). Now that might sound a lot. But fewer usually means weaker, and the behemoths are nearing post-trillion param levels already.

So Hunyuan is small. But in spite of that, it can translate accurately and reliably – market-leader beatingly so – between over 30 languages, including some low-resource ones like Tibetan and Kazakh. And its footprint is truly tiny in LLM terms – it’s lightweight enough to run locally on a computer or even tablet, using inference software like LMStudio or PocketPal.

The model is available in various GGUF formats at Hugging Face. The 4-bit quantised version comes in at just over 4 GB, making it iPad-runnable. If you want greater fidelity, then 8-bit quantised is still only around 8 GB, easily handleable in LMStudio with a decent laptop spec.

So is it any good?

Well, I ran a few deliberately tricky English to German tasks through it, trying to find a weak spot. And honestly, it’s excellent – it produces idiomatic, native-quality translations that don’t sound clunky. What I found particularly impressive was its ability to paraphrase where a literal translation wouldn’t work.

There are plenty of use cases, even if you’re not looking for a translation engine for a full-blown app. Pocketising it means you have a top-notch multi-language translator to use offline, anywhere. For language learners – particularly those struggling with the lower-resource languages the model can handle with ease – it’s another source of native-quality text to learn from.

Find out more about the model at Hugging Face, and check out last week’s post for details on loading it onto your device!

Ultra-Mobile LLMs : Getting the Most from PocketPal

If you were following along last week, I was deep into the territory of running open, small-scale Large Language Models (LLMs) locally on a laptop in the free LMStudio environment. There are lots of reasons you’d want to run these mini chatbots, including the educational, environmental, and security aspects.

I finished off with a very cursory mention of an even more mobile vehicle for these, PocketPal. This free, open source app (available on Google and iOS) allows for easy (no computer science degree required) searching, downloading and running LLMs on smartphones and tablets. And, despite the resource limitations of mobile devices compared with full computer hardware, they run surprisingly well.

PocketPal is such a powerful and unique tool, and definitely worth a spotlight of its own. So, this week, I thought I’d share some tips and tricks I’ve found for smooth running of these language models in your pocket.

Full-Fat LLMs?

First off, even small, compact models can be (as you’d expect) unwieldy and resource-heavy files. Compressed, self-contained LLM models are available as .gguf files from sources like Hugging Face, and they can be colossal. There’s a process you’ll hear mentioned a lot in the AI world called quantisation, which compresses models to varying degrees. Generally speaking, the more compression, the more poorly the model performs. But even the most highly compressed small models can weigh in at 2gb and above. After downloading them, these mammoth blobs then load into memory, ready to be prompted. That’s a lot of data for your system to be hanging onto!

That said, with disk space, a good internet connection, and decent RAM, it’s quite doable. On a newish MacBook, I was comfortably downloading and running .gguf files 8gb large and above in LMStudio. And you don’t need to downgrade your expectations too much to run models in PocketPal, either.

For reference, I’m using a 2023 iPad Pro with the M2 chip – quite a modest spec now – and a 2024 iPhone 16. On both of them, the sweet spot seems to be a .gguf size of around 4gb – you can go larger, but there’s a noticeable slowdown and sluggishness beyond that. A couple of the models I’ve been getting good, sensible and usable results from on mobile recently are:

  • Qwen3-4b-Instruct (8-bit quantised version) – 4.28gb
  • Llama-3.2-3B-Instruct (6-bit quantised version) – 3.26gb

The ‘instruct’ in those model names refers to the fact that they’ve been trained to follow instructions particularly keenly – one of the reasons they give such decent practical prompt responses with a small footprint.

Optimising PocketPal

Once you have them downloaded, there are a couple of things you can tweak in PocketPal to eke out even more performance.

The first is to head to the settings and switch on Metal, Apple’s hardware-accelerated API. Then, increase the “Layers on GPU” setting to around 80 or so – you can experiment with this to see what your system is happy with. But the performance improvement should be instantaneous, the LLM spitting out tokens at multiple times the default speed.

What’s happening with this change is that iOS is shifting some of the processing from the device’s CPU to the GPU, or graphical processing unit. That may seem odd, but modern graphics chips are capable of intense mathematical operations, and this small switch recruits them into doing some of the heavy work.

Additionally, on some recent devices, switching on “Flash Attention” can bring extra performance enhancements. This interacts with the way LLMs track how much weight to give certain tokens, and how that matrix is stored in memory during generation. It’s pot luck whether it will make a difference, depending on device spec, but I see a little boost.

Tweaking PocketPal’s settings to run LLMs more efficiently

Tweaking PocketPal’s settings to run LLMs more efficiently

Making Pals – Your Own Custom Bots

When you’re all up and running with your PocketPal LLMs, there’s another great feature you can play with to get very domain-specific results – “Pal” creation. Pals are just system prompts – instructions that set the boundaries and parameters for the conversation – in a nice wrapper. And you can be as specific as you want with them, instructing the LLM to behave as a language learning assistant, a nutrition expert, a habits coach, and such like – with as many rules and output notes as you see fit. It’s an easy way to turn a very generalised tool into something focused and with real-world application.

So that’s my PocketPal in-a-nutshell power guide. I hope you can see why it’s worth much more than just a cursory mention at the end of last week’s post! Tools like PocketPal and LMStudio put you right at the centre of LLM development, and I must admit it’s turned me into a models geek – I’m already looking forward to what new open LLMs will be unleashed next.

So what have you set your mobile models doing? Please share your tips and experiences in the comments!

Small LLMs

LLMs on Your Laptop

I mentioned last week that I’m spending a lot of time with LLMs recently. I’m poking and prodding them to test their ‘understanding’ (inverted commas necessary there!) of phonology, in particular with non-standard speech and dialects.

And you’d be forgiven for thinking I’m just tapping my prompts into ChatGPT, Claude, Gemini or the other big commercial concerns. Mention AI, and those are the names people come up with. They’re the all-bells-and-whistles web-facing services that get all the public fanfare and newspaper column inches.

The thing is, that’s not all there is to Large Language Models. There’s a whole world of open source (or the slightly less open ‘open weights’) models out there. Some of them offshoots of those big names, while others less well-known. But you can download all of them to run offline on any reasonably-specced laptop.

LMStudio – LLMs on your laptop

Meet LMStudio – the multi-platform desktop app that allows you to install and interrogate LLMs locally. It all sounds terribly technical, but at its most basic use – a custom chatbot – you don’t need any special tech skills. Browsing, installing and chatting with models is all done via the tab-based interface. You can do much more with it – the option to run it as a local server is super useful for development and testing – but you don’t have to touch any of that.

Many of the models downloadable within LMStudio are small models – just a few gigabytes, rather than the behemoths behind GPT-5 and other headline-grabbing releases. They feature the same architecture as those big-hitters, though. And in many cases, they are trained to approach, even match, their performance on specific tasks like problem-solving or programming. You’ll even find reasoning models, that produce a ‘stepwise-thinking’ output, similar to platforms like Gemini.

A few recent models for download include:

  • Qwen3 4B Thinking – a really compact model (just over 2gb) which supports reasoning by default
  • OpenAI’s gpt-oss-20b – the AI giant’s open weights offering, released this August
  • Gemma 3 – Google’s multimodal model optimised for use on everyday devices
  • Mistral Small 3.2 – the French AI company’s open model, with vision capabilities

So why would you bother, when you can just fire up ChatGPT / Google / Claude in a few browser clicks?

LLMs locally – but why?

Well, from an academic standpoint, you have complete control over these models if you’re exploring their use cases in a particular field, like linguistics or language learning. You can set parameters like temperature, for instance – the degree of ‘creativity wobble’ the LLM has (0 being a very rigid none, and 1 being, well, basically insane). And if you can set parameters, you can report these in your findings, which allows others to replicate your experiments and build on your knowledge.

Small models also run on smaller hardware – so you can develop solutions that people don’t need a huge data centre for. If you do hit upon a use case or process that supports researchers, then it’s super easy for colleagues to access the technology, whatever their recourse to funding support.

Secondly, there’s the environmental impact. If the resource greed of colossal data centres is something that worries you (and there’s every indication that it should be a conversation we’re all having ), then running LLMs locally allows you to take advantage of them without heating up a server farm somewhere deep inside the US. The only thing running hot will be your laptop fan (it does growl a bit with the larger models – I take that as a sign to give it a rest for a bit!).

And talk of those US server farms leads on to the next point: data privacy. OpenAI recently caused waves with their suggestion that user conversations are not the confidential chats many assume them to be. If you’re not happy with your prompts and queries passing out of your control and into the data banks of a foreign state, then local LLMs offer not a little peace of mind too.

Give it a go!

The best thing? LMStudio is completely free. So download it, give it a spin, and see whether these much smaller-footprint models can give you what you need without entering the ecosystem of the online giants.

Lastly, don’t have a laptop? Well, you can also run LLMs locally on phones and tablets too. Free app PocketPal (on iOS and Android) runs like a cut-down version of LMStudio. Great for tinkering on the go!

A robot reading a script. The text-to-speech voices at ElevenLabs certainly sound intelligent as well as natural!

ElevenLabs : 5-Star Tool for Language Work and Study

If you’re a regular reader, you’ll know how impressed I’ve been at ElevenLabs, the text-to-speech creator that stunned the industry when its super-realistic voices were unleashed on the world. Since then, it’s made itself irreplaceable in both my work and study, and it bears spreading the word again: ElevenLabs is a blow-your-socks-off kind of tool for creating spoken audio content.

Professional Projects

In my work developing language learning materials for schools, arranging quality narration used to involve coordinating with agencies and studios — a process that was both time-consuming and costly. We’ve had issues with errors, too, which cost a project time with re-recordings. And that’s not to mention the hassle keeping sections up-to-date. Removing ‘stereo’ from an old vocab section (who has those now?) would usually trigger a complete re-record.

With ElevenLabs, I can now produce new sections promptly, utilising its impressive array of voices across multiple languages. The authenticity and clarity of these voices are fantastic – I really can’t understate it – and it’s made maintaining the biggest language learning site for schools so much easier.

Supporting Individual Learning

As a language learner, ElevenLabs is more than worth its salt, too. It’s particularly good for assembling short listening passages – about a minute long – to practise ‘conversation islands’—a well-regarded polyglot technique for achieving conversational fluency.

Beyond language learning, the tool can be a great support to other academic projects. I’ve created concise narrations of complex topics, converting excerpts from scholarly papers into audio format. Listening to these clips in spare moments (or even in the background while washing up) has helped cement some key concepts, and prime my mind for conventional close study.

Flexible and Affordable Plans

ElevenLabs offers a range of pricing options to suit different needs:

Free Plan: Ideal for those starting out, this plan provides 10,000 characters per month, roughly equating to 10 minutes of audio.

Starter Plan: At £5 per month, you receive 30,000 characters (about 30 minutes of audio), along with features like voice cloning and commercial use rights.

Creator Plan: For £22 per month, this plan offers 100,000 characters (around 100 minutes of audio), plus professional voice cloning and higher-quality outputs.

For messing around, that free plan is not too stingy at all – you can really get a feel for the tool from it. Personally, I’ve not needed to move beyond the starter plan yet, which is pretty much a bargain at around a fiver a month.

Introducing ElevenReader

And there’s more! Complementing the TTS service, ElevenLabs has introduced ElevenReader, a free tool that narrates PDFs, ePubs, articles, and newsletters in realistic AI voices. Available on both iOS and Android platforms, the app doesn’t even consume credits from your ElevenLabs subscription plan.

Seriously, I can’t even believe this is still free – go and try it!

Final Thoughts

ElevenLabs has truly transformed the way I create and consume spoken content. It truly is my star tool from the current crop of AI-powered utilities.

The ElevenLabs free tier is enough for most casual users to have a dabble – go and try it today!

A robot playwright - now even more up-to-date with SearchGPT.

Topical Dialogues with SearchGPT

As if recent voice improvements weren’t enough of a treat, OpenAI has just introduced another killer feature to ChatGPT, one that can likewise beef up your custom language learning resources. SearchGPT enhances the LLM’s ability to access and incorporate bang up-to-date information from the web.

It’s a development that is particularly beneficial for language learners seeking to create study materials that reflect current events and colloquial language use. With few exceptions until now, LLMs like ChatGPT have had a ‘data cutoff’, thanks to mass text training having an end-point (albeit a relatively recent one). Some LLMs, like Microsoft’s Copilot, have introduced search capabilities, but their ability to retrieve truly current data could be hit and miss.

With SearchGPT, OpenAI appear to have cracked search accuracy a level to rival AI search tool Perplexity – right in the ChatGPT app. And it’s as simple as highlighting the little world icon that you might already have noticed under the prompt field.

The new SearchGPT icon in the ChatGPT prompt bar.

The new SearchGPT icon in the ChatGPT prompt bar.

Infusing Prompts with SearchGPT

Switching this on alongside tried-and-tested language learning prompt techniques yields some fun – and pedagogically useful – results. For instance, you can prompt ChatGPT to generate dialogues or reading passages based on the latest news from your target language country/ies. Take this example:

A language learning dialogue on current affairs in German, beefed up by OpenAI's SearchGPT

A language learning dialogue on current affairs in German, beefed up by OpenAI’s SearchGPT

SearchGPT enables content that mirrors real-life discussion with contemporary vocabulary and expressions (already something it was great at). But it also incorporates accurate, up-to-the-minute, and even cross-referenced information. That’s a big up for transparency.

Unsure where that info came from? Just click the in-text links!

Enhancing Speaking Practice with Authentic Contexts

Beyond reading, these AI-generated dialogues serve as excellent scripts for speaking practice. Learners can role-play conversations, solo or group-wise, to improve pronunciation, intonation, and conversational flow. This method bridges the gap between passive understanding and active usage, a crucial step in achieving fluency.

Incorporating SearchGPT into your language learning content creation toolbox reconnects your fluency journey with the real, evolving world. Have you used it yet? 

Apples and oranges, generated by Google's new image algorithm Imagén 3

Google’s Imagén 3 : More Reliable Text for Visual Resources

If you use AI imaging for visual teaching resources, but decry its poor text handling, then Google might have cracked it. Their new algorithm for image generation, Imagén 3, is much more reliable at including short texts without errors.

What’s more, the algorithm is included in the free tier of Google’s LLM, Gemini. Ideal for flashcards and classroom posters, you now get quite reliable results when prompting for Latin-alphabet texts on the platform. Image quality seems to have improved too, with a near-photographic finish possible:

A flashcard produced with Google Gemini and Imagén 3.

A flashcard produced with Google Gemini and Imagén 3.

The new setup seems marginally better at consistency of style, too. Here’s a second flashcard, prompting for the same style. Not quite the same font, but close (although in a different colour).

A flashcard created with Google Gemini and Imagén 3.

A flashcard created with Google Gemini and Imagén 3.

It’s also better at real-world details like flags. Prompting in another engine for ‘Greek flag’, for example, usually results in some terrible approximation. Not in Imagén 3 – here are our apples and oranges on a convincing Greek flag background:

Apples and oranges on a square Greek flag, generated by Google's Imagén 3

Apples and oranges on a square Greek flag, generated by Google’s Imagén 3

It’s not perfect, yet. For one thing, it performed terribly with non-Latin alphabets, producing nonsense each time I tested it. And while it’s great with shorter texts, it does tend to break down and produce the tell-tall typos with anything longer than a single, short sentence. Also, if you’re on the free tier, it won’t allow you to create images of human beings just yet.

That said, it’s a big improvement on the free competition like Bing’s Image Creator. Well worth checking out if you have a bunch of flashcards to prepare for a lesson or learning resource!

Greek text on a packet of crisps

Language Lessons from Packaging (And A Little Help from ChatGPT)

If you love scouring the multilingual packaging of household products from discounter stores (a niche hobby, I must admit, even for us linguists), then  there’s a fun way to automate it with LLMs like ChatGPT.

Take the back of this packet of crisps. To many, a useless piece of rubbish. To me (and some of you, I hope!), a treasure of language in use.

Greek text on a packet of crisps - food and household item packaging can be a great source of language in use.

Greek text on a packet of crisps

Normally, I’d idly read through these, looking up any unfamiliar words in a dictionary. But, using an LLM app with an image facility like ChatGPT, you can automate that process. What’s more, you can request all sorts of additional info like dictionary forms, related words, and so on.

From Packaging to Vocab List

Take a snap of your packaging, and try this prompt for starters:

Create a vocabulary list from the key content words on the packaging label. For each word, list:
– its dictionary form
– a new, original sentence illustrating the word in use
– common related words

The results should be an instantly useful vocab list with added content for learning:

Vocabulary list from food packaging by ChatGPT

Vocabulary list compiled by ChatGPT from a food packaging label

I added a note-taking stage to round it off. It always helps me to write down what I’m learning, adding a kinaesthetic element to the visual (and aural, if you’ve had ChatGPT speak its notes out loud). Excuse the scrawl… (As long as your notes are readable by you, they’re just fine!)

Handwritten vocabulary notes derives from crisp packet packaging

Notes on a crisp packet…

It’s a fun workflow that really underscores the fact that there are free language lessons all around us.

Especially in the humblest, and often least glamorous, of places.

ChatGPT takes conversation to the next level with Advanced Voice Mode

ChatGPT Advanced Voice Mode is Finally Here (For Most of Us!)

Finally – and it has taken SO much longer to get it this side of the Pond – Advanced Voice Mode has popped up in my ChatGPT. And it’s a bit of a mind-blower to say the least.

Multilingually speaking, it’s a huge step up for the platform. For a start, its non-English accents are hugely improved – no longer French or German with an American twang. Furthermore, user language detection seems more reliable, too. Open it up, initiate a conversation in your target language, and it’s ready to go without further fiddling.

But it’s the flexibility and emotiveness of those voices which is the real game-changer. There’s real humanity in those voices, now, reminiscent of Hume’s emotionally aware AI voices. As well as emotion, there’s variation in timbre and speed. What that means for learners is that it’s now possible to get it to mimic slow, deliberate speech when you ask that language learning staple “can you repeat that more slowly, please?”. It makes for a much more adaptive digital conversation partner.

Likewise – and rather incredibly – it’s possible to simulate a whole range of regional accents. I asked for Austrian German, and believe me, it is UNCANNILY good. Granted, it did occasionally verge on parody, but as a general impression, it’s shocking how close it gets. It’s a great way to prepare for speaking your target language with real people, who use real, regionally marked speech.

Advanced Voice Mode, together with its recently added ability to remember details from past conversations (previously achievable only via a hack), is turning ChatGPT into a much cannier language learning assistant. It was certainly worth the wait. And for linguaphiles, it’ll be fascinating to see how it continues to develop as an intelligent conversationalist from here.

Mapping out conversational probabilities - it's much easier with flowcharts.

Vocabulary Flowcharts : Preparing for Probabilities with ChatGPT

The challenge in preparing for a speaking task in the wild is that you’re dealing with multiple permutations. You ask your carefully prepared question, and you get any one of a number of likely responses back. That, in turn, informs your next question or reply, and another one-of-many comebacks follows.

It’s probability roulette.

What if you could map all of these conversational pathways out, though? Flowcharts have long been the logician’s tool of choice for visualising processes that involve forking choices. Combined with generative AI’s penchant for assembling real-world language, we have a recipe for much more dynamic language prep resources than a traditional vocab list.

And, thanks to a ready-made flowchart plugin for ChatGPT – courtesy of the charting folks at Whimsical.com – it’s really easy to knock one together.

Vocabulary Flowcharts in Minutes

In your ChatGPT account, you’ll need to locate the Whimsical GPT. Then, it’s just a case of detailing the conversational scenario you want to map out. Here’s an example for ‘opening a bank account in Germany’:

Create a flowchart detailing different conversational choices and paths in German for the scenario “Opening a bank account as a non-resident of Germany planning to work there for six months.” Include pathways for any problems that might occur in the process. Ensure all the text reflects formal, conversational German.

The result should be a fairly detailed ‘probability map’ of conversational turns:

A 'vocabulary flowchart' in German, created by the Whimsical.com GPT on ChatGPT.

A ‘vocabulary flowchart’ in German, created by the Whimsical.com GPT on ChatGPT.

Vocabulary flowcharts are another tool in your AI arsenal for speaking prep. Have you given them a whirl yet? Tell us about your own prep in the comments!