Apples and oranges, generated by Google's new image model Imagen 3

Google’s Imagen 3: More Reliable Text for Visual Resources

If you use AI image generation for visual teaching resources but despair at its poor text handling, Google might have cracked it. Their new image model, Imagen 3, is much more reliable at rendering short texts without errors.

What’s more, the model is included in the free tier of Google’s LLM, Gemini. Prompting for Latin-alphabet text on the platform now gives quite reliable results, ideal for flashcards and classroom posters. Image quality seems to have improved too, with a near-photographic finish now possible:

A flashcard produced with Google Gemini and Imagen 3.

The new setup seems marginally better at consistency of style, too. Here’s a second flashcard, prompted for the same style. It’s not quite the same font, but close (although in a different colour).

A flashcard created with Google Gemini and Imagen 3.

It’s also better at real-world details like flags. Prompting another engine for ‘Greek flag’, for example, usually results in some terrible approximation. Not so with Imagen 3 – here are our apples and oranges on a convincing Greek flag background:

Apples and oranges on a square Greek flag, generated by Google’s Imagen 3

It’s not perfect yet. For one thing, it performed terribly with non-Latin alphabets, producing nonsense each time I tested it. And while it’s great with shorter texts, it does tend to break down and produce those tell-tale typos with anything longer than a single short sentence. Also, if you’re on the free tier, it won’t allow you to create images of human beings just yet.

That said, it’s a big improvement on free competitors like Bing’s Image Creator. Well worth checking out if you have a bunch of flashcards to prepare for a lesson or learning resource!
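And if you have a whole batch to produce, it may be worth knowing that Imagen 3 is also available programmatically via Google Cloud’s Vertex AI SDK (a paid route, separate from the free Gemini tier described above). Here’s a minimal sketch of what a scripted flashcard prompt could look like – the project details, model ID and prompt are illustrative assumptions, not anything used for the images in this post:

```python
# Minimal sketch: generating a flashcard-style image with Imagen 3
# via the Vertex AI Python SDK (a paid route, separate from the
# free Gemini web app discussed above).
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

# Hypothetical project details - substitute your own.
vertexai.init(project="your-gcp-project", location="us-central1")

# The model ID is an assumption based on current Imagen 3 naming; check the docs.
model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")

images = model.generate_images(
    prompt=(
        "A bright classroom flashcard of apples and oranges with the caption "
        "'las manzanas y las naranjas' in a clean sans-serif font"
    ),
    number_of_images=1,
    aspect_ratio="1:1",
)

# Save the first generated image to disk.
images[0].save(location="flashcard.png")
```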

Shelves of helpful robots - a bit like Poe, really!

Which LLM? Poe offers them all (and then some!)

One of the most frequent questions when I give AI training to language professionals is "which is your favourite platform?" It’s a tricky one to answer, not least because we’re currently in the middle of the AI Wars – new, competing models are coming out all the time, and my personal choice of LLM changes with each new release.

That said, I’m a recent, if late, convert to Poe – an app that gives you all of them in one place. The real clincher is the inclusion of brand-new models before they’re widely available elsewhere.

To illustrate just how handy that is: a couple of weeks ago, Meta dropped Llama 3.1, the first of their models to really challenge the frontrunners. However, unless you have a computer powerful enough to run it locally, or access to Meta AI (US-only right now), you’ll be waiting a while to try it.

Enter Poe. Within a couple of days, all flavours of Llama 3.1 were available. And the best thing? You can interact with most of them for nothing.

The Poe Currency

Poe works on a currency of Compute Points, which pay for your messages to each model. More powerful models guzzle through Compute Points at a higher rate, and models tend to become cheaper as they age. Meta’s Llama-3.1-405B-T, for example, costs 335 points per message, while OpenAI’s GPT-4o-Mini comes in at a bargain 15 points per request.

Users of Poe’s free tier get a pretty generous 3000 Compute Points every day. That’s enough credit to work quite extensively with some of the older models, with barely any limitation at all. But it’s also enough to get some really useful mileage out of Llama 3.1 – around eight requests daily. And, thanks to that, I can tell you: Llama 3.1 is great at creating language learning resources!
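If you want to sanity-check that daily allowance, here’s a quick back-of-the-envelope sketch in Python using the per-message costs quoted above (prices on Poe shift over time, so treat the figures as a snapshot rather than gospel):

```python
# Rough daily-message budget on Poe's free tier, using the
# per-message costs quoted in this post.
DAILY_COMPUTE_POINTS = 3000

costs_per_message = {
    "Llama-3.1-405B-T": 335,
    "GPT-4o-Mini": 15,
}

for bot, cost in costs_per_message.items():
    # Integer division: how many whole messages the daily allowance covers.
    print(f"{bot}: about {DAILY_COMPUTE_POINTS // cost} messages per day")

# Llama-3.1-405B-T: about 8 messages per day
# GPT-4o-Mini: about 200 messages per day
```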

Mind you, with the right prompt, most of the higher-end models are great at that these days. Claude-3.5-Sonnet is another favourite – check out my interactive worksheet experiments with it here. And yes, Claude-3.5-Sonnet is available on Poe, at a cost of 200 points per message (already down from its initial price a few weeks back!). Even the image generation model Flux has made its way onto the platform, just days after the launch hype. And it’s a lot better with text-in-image, too (handy if you’re creating illustrated language materials).

Poe pulls together all sorts of cloud providers in a marketplace-style setup to offer the latest bots, and it’s a model that works. The latest and greatest will always burn through your stash of Compute Points faster, but there’s still no easier way to be amongst the first to try a new LLM!