You Don’t Need to Be a Developer to Start Playing with AI Models in Python

I’ve been singing the praises of local models of late, for so many reasons. From intelligent OCR to data crunching with enhanced privacy, there are gains to be had and they’re easy to access with free inferencing software like LM Studio and Ollama.

That said, there’s a moment that happens to a lot of people who work adjacent to tech – linguists, teachers, researchers – where they think: I’d love to tinker with these AI models properly – and maybe even build them directly into my own tech projects.

This post addresses that tinkering itch. The good news: it’s genuinely easier than you think, and you can get something running in an afternoon.

Why Python?

I ask this a lot, myself, coming from a totally different development background (full-stack and native web app coding). Going back into academia, Python seems to be everywhere.

Python has become the de facto language of AI and data science for a reason. Its syntax is readable almost like pseudocode, its libraries are extraordinarily well-developed and vast, and – linked to that last point – calling an API takes a handful of lines, not pages of custom routines. If you’re coming from a research or humanities background, Python also has the advantage of being widely taught in academic contexts, which means the community, tutorials, and Stack Overflow threads are abundant.

Compare calling an LLM in Python to doing the same in JavaScript or Swift, and you’ll understand immediately why the ‘AI for academia’ world standardised on Python.

And a big plus – it’s probably already installed on your machine. Open your terminal / command prompt interface, and type python --version or python3 --version. If you see a version number come back, you’re good to go. If not, head to python.org/downloads and grab the latest stable release – it’s a straightforward installer on every platform.

Two Ways In: Cloud or Local

Option 1: Hugging Face’s Free Inference API (great for experimenting, zero cost)

Hugging Face is essentially the GitHub of AI models – tens of thousands of open-source models, all in one place. The Serverless Inference API lets you call many of them without setting up any infrastructure, and the free tier is perfectly generous for tinkering and learning. You’ll hit rate limits if you go overboard, but for exploration it’s hard to beat.

Here’s what you need to get started:

  1. Create a free account at huggingface.co
  2. Go to Settings → Access Tokens and generate a token with Read permissions
  3. Install the library: pip install huggingface_hub

Then you can call a model like this:

from huggingface_hub import InferenceClient

client = InferenceClient(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    token="hf_your_token_here"
)

response = client.text_generation("Explain enregisterment in simple terms.")
print(response)

That’s genuinely it for a first experiment. A few lines. No GPU. No cloud bill.

One gotcha: some popular models require you to accept their licence terms on the Hugging Face website before you can access them via the API. If you get a 403 error, that’s almost certainly why — head to the model page, accept the terms, and try again.

Option 2: LM Studio (run models locally, completely private)

If you’d rather not send your data to any external service – which matters for research involving sensitive text – LM Studio is still a brilliant solution. It gives you a clean interface to download and run open-source models on your own machine, with no internet connection required once the model is downloaded.

The local model landscape has improved dramatically. Models like Qwen3 (the 4B and 14B variants especially) are genuinely impressive on a modern laptop or desktop with a decent amount of RAM. You wouldn’t have believed this was possible two years ago.

LM Studio exposes a local API that mimics the OpenAI format, so you can call it from Python the same way:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="not-needed"  # LM Studio doesn't require auth locally
)

response = client.chat.completions.create(
    model="qwen3-14b",  # whatever model you've loaded in LM Studio
    messages=[{"role": "user", "content": "Hello, what can you do?"}]
)

print(response.choices[0].message.content)

The openai library here is just a convenient HTTP client — you’re not actually talking to OpenAI. You’re talking to a model running on your own machine.

Common stumbling block: LM Studio’s server needs to be running and a model needs to be loaded before your script will work. The error message when it’s not running is a bit cryptic (ConnectionRefusedError or similar) — if you see that, it just means you didn’t start the server yet.
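A small pre-flight check can make that failure friendlier. The sketch below assumes LM Studio's default address from the example above (localhost, port 1234). The helper name server_is_up is my own invention, not part of any library.

```python
import socket


def server_is_up(host: str = "localhost", port: int = 1234, timeout: float = 1.0) -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if server_is_up():
        print("Server found - safe to run your script.")
    else:
        print("Nothing is listening on port 1234 - start the LM Studio server first.")
```

Note that this only confirms the port is open. It can't tell you whether a model has actually been loaded, so you may still need to check that in the app.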

Making the Output Actually Readable

Once you’re getting responses back, the next temptation is to do something with them in your terminal – loop through results, display analysis, format comparisons. The default print() approach quickly gets messy.

My namesake, the rich library, is a revelation here (how nice to have a Python library named after me). It adds colour, formatting, tables, and syntax highlighting to terminal output with almost no effort:

pip install rich

from rich.console import Console
from rich.markdown import Markdown

console = Console()

# client is the Hugging Face InferenceClient created in Option 1
response_text = client.text_generation("Write a haiku about Python.")
console.print(Markdown(response_text))

If the model returns markdown (which most do), rich will render it beautifully right in your terminal. Headers, code blocks, bold text — all of it. This is genuinely transformative for readability when you’re doing exploratory work.

Don’t Stop at Chat: Sentence Transformers Are Worth Knowing About

Here’s where it gets interesting for researchers and linguists in particular. Large language models are great for generation — producing text, summarising, answering questions. But there’s a whole other class of model designed for understanding text semantically: sentence transformers.

The Sentence Transformers library (also called sbert) lets you turn text into numerical vectors that capture meaning. Two sentences that mean the same thing will have vectors that are close together; two unrelated sentences won’t. This is called a semantic embedding.

Why does this matter? A few examples:

  • Corpus linguistics for semantics: Automatically cluster dialect examples by semantic similarity rather than just keyword matching
  • Research assistants: Find the most relevant papers or passages from a large collection based on meaning, not just exact words
  • Teaching tools: Build a quiz that detects when a learner’s answer is semantically equivalent to the model answer, even if the wording is different

pip install sentence-transformers

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The dialect features of the Black Country are highly distinctive.",
    "Black Country speech has unique phonological characteristics.",
    "The weather in Edinburgh is famously miserable."
]

embeddings = model.encode(sentences)
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Similarity: {similarity.item():.3f}")  # will be high

This runs entirely locally (the model downloads once and caches), is fast even on a modest laptop, and opens up a whole world of computational approaches to language that go well beyond chatting with an LLM.
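If you're curious what that similarity score actually measures, here is a minimal pure-Python sketch of cosine similarity. The three-number vectors are made up for illustration (real embeddings have hundreds of dimensions), and the function is my own stand-in rather than the library's implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings", invented for this example
dialect_a = [0.9, 0.1, 0.2]
dialect_b = [0.8, 0.2, 0.1]
weather = [0.1, 0.9, 0.3]

print(f"dialect vs dialect: {cosine_similarity(dialect_a, dialect_b):.3f}")
print(f"dialect vs weather: {cosine_similarity(dialect_a, weather):.3f}")
```

Vectors pointing in nearly the same direction score close to 1, and unrelated ones score much lower. That is all "close together" means here.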

Getting Set Up: The Boring-but-Important Bit

Beyond that, there are just a few things I’ve learnt from my initial tinkerings that will save you headaches.

Use a virtual environment. Every time. Before you install anything for a new project, do:

python -m venv venv
source venv/bin/activate  # on Mac/Linux
venv\Scripts\activate     # on Windows

This keeps your project’s dependencies isolated and prevents the infuriating “but it worked yesterday” problem where one project’s libraries silently break another’s.
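If you're ever unsure whether the environment is actually active, you can ask Python itself. This is a small sketch that should work for environments created with python -m venv as above. The function name is my own.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the system-wide Python.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```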

Keep API secrets out of your code. Don’t paste your Hugging Face token directly into a script you might share or commit to GitHub. Use a .env file and the python-dotenv library:

pip install python-dotenv

# .env file (this file stays off GitHub – add it to .gitignore)
HF_TOKEN=hf_your_token_here

# your script
from dotenv import load_dotenv
import os

load_dotenv()
token = os.getenv("HF_TOKEN")
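One thing to watch: os.getenv quietly returns None when the variable is missing, and the resulting error only surfaces later. A small guard like the sketch below (require_env is a name I made up) fails early with a readable message instead.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a readable message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Check that your .env file exists and that "
            "load_dotenv() ran before this line."
        )
    return value
```

In the script above, you would then write token = require_env("HF_TOKEN") after calling load_dotenv().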

Read error messages. This sounds obvious, but: most Python errors from LLM libraries tell you exactly what went wrong. A 401 means authentication failed (wrong or missing token). A 503 means the model is loading on the server side – wait a moment and retry. A ConnectionRefusedError from a local API almost always means LM Studio’s server isn’t running.
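As a memory aid, here are the status codes mentioned in this post gathered into one lookup. It's only a sketch: the wording of the hints is mine, and real libraries wrap these codes in their own exception types.

```python
# Hints based on the status codes discussed in this post
STATUS_HINTS = {
    401: "Authentication failed: the token is wrong or missing.",
    403: "Access denied: you may need to accept the model's licence terms on its Hugging Face page.",
    503: "The model is still loading on the server: wait a moment and retry.",
}


def explain_status(code: int) -> str:
    """Translate an HTTP status code into a plain-English next step."""
    return STATUS_HINTS.get(code, f"Unexpected status {code}: read the full error message.")


print(explain_status(503))
```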

What Next?

Once you’ve got a basic script running, the natural next steps are:

  • Build a simple chat loop that keeps track of conversation history and lets you have a back-and-forth with a model
  • Experiment with system prompts to give the model a persona or set of instructions
  • Try different models on the same prompts and compare the results – it’s illuminating
  • Start combining LLMs with sentence transformers for retrieval-augmented approaches where you search a corpus semantically before feeding results to a generative model
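To give a flavour of the first two ideas, here is a minimal sketch of a chat history with a system prompt, using the same message format as the LM Studio example. The echo_model function is a hypothetical stand-in so the code runs without any server, and the tutor persona is just an example.

```python
from typing import Callable

Message = dict[str, str]


def chat_turn(history: list[Message], user_text: str,
              ask_model: Callable[[list[Message]], str]) -> str:
    """Add the user's message, get a reply, and store the reply in the history."""
    history.append({"role": "user", "content": user_text})
    reply = ask_model(history)  # the model sees the whole conversation so far
    history.append({"role": "assistant", "content": reply})
    return reply


def echo_model(messages: list[Message]) -> str:
    """Stand-in for a real model so the sketch runs anywhere (hypothetical)."""
    return f"You said: {messages[-1]['content']}"


# The system prompt sits first and sets the persona for every turn
history: list[Message] = [
    {"role": "system", "content": "You are a patient German tutor."}
]
print(chat_turn(history, "Hallo!", echo_model))
print(chat_turn(history, "Wie geht's?", echo_model))
print(len(history), "messages stored")
```

To talk to a real model, swap echo_model for a function that passes the messages to client.chat.completions.create and returns response.choices[0].message.content, as in the LM Studio example earlier.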

The Python AI ecosystem is genuinely exciting right now, and the barrier to entry has never been lower. You don’t need a GPU, you don’t need a cloud account, and you don’t need to be a professional developer. You just need an afternoon and a bit of curiosity.

Have questions or want to share what you built? Drop a comment below.

OCR for Historical Newsprint: Four Models Worth Running Locally in LM Studio

If you work with scanned, typeset documents from archives like the British Newspaper Archive, you will know the frustration of running standard OCR tools on material they were never really designed for: degraded print, Victorian column layouts, eccentric typography, and occasionally deliberate non-standard spelling.

You can leverage the power of local AI models, however, to automate this process, and with free inferencing software like LM Studio, the learning curve isn’t at all steep. Below, I take a look at four specialist OCR models you can run entirely locally using the package – and why you might prefer doing so over handing your documents to a web service.

Why Run OCR Locally?

There are some truly excellent web-based OCR services. There’s Transkribus, for instance, which is widely used in the academic community. Tools like this are powerful and convenient, but they come with some real trade-offs:

  • Privacy: Your document images leave your machine and are processed on someone else’s server. For sensitive archival material or unpublished research corpora, that matters.
  • Cost at scale: Processing hundreds or thousands of newspaper pages through a paid API adds up quickly.
  • No customisation: Cloud OCR engines don’t always offer many pipeline options. You cannot instruct them to preserve dialect spellings, flag ambiguous characters, or respect the orthographic conventions of a specific historical variety of English.
  • Reproducibility: Web services update their models silently. A corpus processed in 2024 may produce different output if you re-run it in 2026. A local model stays consistent – important for methodological reproducibility.

Running OCR-trained models in inferencing software like LM Studio removes most of this friction. The program handles downloading and managing multiple models through a clean interface, and also lets you customise model settings, including system prompts that persist across sessions. For historical document work, that means you can instruct the model once about the linguistic conventions of your material and have it apply those rules to every page you send it.

The Four Models

1. OLMOCR 2 (7B) — Best Overall for Documents

Developed by the Allen Institute for AI (Ai2), olmOCR 2 is built on Qwen2.5-VL-7B-Instruct and fine-tuned using reinforcement learning with unit-test rewards specifically targeting document OCR tasks. It is one of the few models designed from the ground up for this use case rather than adapted from a general vision assistant.

Size: 7 billion parameters. Available as a ~4.7 GB GGUF (Q4 quantisation) or ~8.85 GB at Q8. Needs around 5–10 GB RAM depending on quantisation.

Why it works for newspaper archives: Handles multi-column layouts, mixed content (tables, headings, body text), and degraded print reliably. Scores 82.4 on olmOCR-Bench. It responds well to system prompt instructions, making it a strong candidate for dialect-preservation workflows.

LM Studio: There’s a GGUF in the native catalogue – search and download directly in the app.
🔗 lmstudio.ai/models/allenai/olmocr-2-7b-1025

✅ Pros: Best-in-class document OCR accuracy; strong layout understanding; instruction-following is reliable; native LM Studio support.
❌ Cons: 7B means slower inference on modest hardware; not ideal for rapid bulk processing.

2. NANONETS-OCR-S — Clean Catalogue Option

Developed by Nanonets, a document AI company, this model is also based on the Qwen2.5-VL architecture but fine-tuned specifically on structured document extraction tasks including forms, invoices, and archival print.

Size: Approximately 7B parameters, similar footprint to olmOCR 2. Available directly via the LM Studio model catalogue as a GGUF.

Why it works for newspaper archives: Strong on structured layout extraction and clean Markdown output. Useful when you want transcription that preserves document structure (headings, columns, captions) as well as raw text.

LM Studio: Native catalogue – findable by searching “Nanonets” in the model browser.
🔗 lmstudio.ai/models (search: Nanonets-OCR-s)

✅ Pros: Easy one-click setup; good structural output; reliable on clean and moderately degraded scans.
❌ Cons: Less tested on heavily damaged historical material than olmOCR 2; similar hardware demands.

3. DOTS.OCR (1.7B) — Best for Complex Column Layouts

Released by Rednote (小红书) in late 2025, dots.ocr is a compact 1.7B vision-language model that combines layout detection and text recognition in a single pass. Unusually for its size, it explicitly predicts reading order — the sequence in which text blocks should be read — which is critical for Victorian newspaper pages where columns can be irregular and text wraps around illustrations.

Size: 1.7 billion parameters; approximately 2 GB as a GGUF. Runs comfortably on 3 GB VRAM.

Why it works for newspaper archives: Reading order prediction alone makes it worth considering for multi-column broadsheet layouts. Supports over 100 languages, outputs JSON, Markdown, or HTML, and benchmarks show Table TEDS accuracy of 88.6% — ahead of Gemini 2.5 Pro on that metric.

LM Studio: Load via HuggingFace GGUF import (paste the HuggingFace URL into LM Studio’s search bar).
🔗 huggingface.co/dotsdocx/dots.ocr-1.7B-GGUF

✅ Pros: Tiny footprint; reading order detection; fast; strong on multi-column layouts; multilingual.
❌ Cons: Smaller context window means system prompts may drift on very long sessions; can hallucinate on heavily degraded scans; not in the native LM Studio catalogue.

4. GLM-OCR (0.9B) — Best for Bulk Processing on Modest Hardware

Released by Z.ai (Zhipu AI) in early 2026, GLM-OCR is built on the GLM-V encoder–decoder architecture and fine-tuned exclusively for OCR. At under 1 billion parameters it is the smallest model here, yet it scores 94.0 on OCRBench and 93.96% Table TEDS accuracy – results that comfortably outperform much larger general-purpose models.

Size: 0.9 billion parameters; approximately 1 GB quantised (Q8). Needs under 1.5 GB VRAM – it will run on almost any laptop made in the last five years.

Why it works for newspaper archives: Speed and low resource use make it ideal for processing large batches of pages. It is not a chat model — it takes an image and outputs text, triggered by the phrase Text Recognition: — so it is best suited to pure transcription pipelines rather than interactive use.

LM Studio: Load via HuggingFace GGUF import using the ggml-org GGUF repository.
🔗 huggingface.co/ggml-org/GLM-OCR-GGUF

✅ Pros: Tiny; fast; runs on minimal hardware; excellent accuracy for its size; good for bulk workflows.
❌ Cons: Not a chat/instruction model — no system prompt support for dialect customisation; requires a separate layout detection step for complex multi-column pages; not in the native LM Studio catalogue.

Quick Comparison

Model | Size (GGUF) | VRAM | LM Studio Route | Best For
olmOCR 2 (7B) | ~4.7 GB | 5 GB+ | Native catalogue | Best accuracy, complex layouts, dialect workflows
Nanonets-OCR-s | ~4.7 GB | 5 GB+ | Native catalogue | Structured document extraction, clean output
dots.ocr (1.7B) | ~2 GB | 3 GB | HuggingFace GGUF import | Multi-column layouts, reading order, low VRAM
GLM-OCR (0.9B) | ~1 GB | <1.5 GB | HuggingFace GGUF import | Bulk processing, minimal hardware

A Practical Workflow for Newspaper Archives

For a large corpus like material from the British Newspaper Archive, a two-tier approach works well. Use GLM-OCR for the bulk of clean, well-preserved pages – it is fast and accurate enough for standard 20th-century newsprint. Then escalate difficult pages (damaged, illegible columns, unusual typefaces, pre-1880 material) to olmOCR 2 for a more careful second pass. If column order is scrambling your output, switch to dots.ocr for those pages specifically.
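That routing logic can be sketched in a few lines of Python. Everything here is illustrative: the Page fields are hypothetical metadata you would need to record yourself, and the returned strings are just labels, not the exact model identifiers LM Studio uses.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical per-page metadata; the field names are illustrative."""
    year: int
    damaged: bool = False
    columns_scrambled: bool = False


def choose_model(page: Page) -> str:
    """Pick an OCR model for a page using the two-tier rule of thumb."""
    if page.columns_scrambled:
        return "dots.ocr"      # reading-order prediction for awkward layouts
    if page.damaged or page.year < 1880:
        return "olmOCR 2"      # careful second pass for difficult pages
    return "GLM-OCR"           # fast default for clean pages


pages = [Page(1925), Page(1862), Page(1931, damaged=True), Page(1905, columns_scrambled=True)]
for page in pages:
    print(page.year, "->", choose_model(page))
```

In practice you would probably only discover scrambled columns after a first pass, so that flag is something to set when reviewing output rather than up front.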

For dialect writing research – where you need the transcription to preserve non-standard spellings rather than silently normalise them – load olmOCR 2 or Nanonets-OCR-s and write a system prompt that explicitly instructs the model to treat all orthographic choices as intentional. That single step does something no traditional OCR engine is capable of: it makes the tool linguistically aware of your material.
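If you later want to script this rather than use the chat window, the request might look something like the sketch below. It assumes the local server accepts OpenAI-style image messages with a base64 data URL, which is worth checking against the LM Studio documentation for your version. The system prompt wording is my own example.

```python
import base64

SYSTEM_PROMPT = (
    "You are transcribing historical newspaper text. Treat all orthographic "
    "choices as intentional: preserve non-standard and dialect spellings "
    "exactly as printed, and do not normalise or correct them."
)


def build_ocr_messages(image_bytes: bytes, mime_type: str = "image/png") -> list[dict]:
    """Build an OpenAI-style message list: system prompt plus one page image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe this page."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
                },
            ],
        },
    ]


# Placeholder bytes stand in for a real scan read with open(path, "rb").read()
messages = build_ocr_messages(b"not-a-real-image")
print(messages[1]["content"][1]["image_url"]["url"][:40])
```

The resulting list can be passed as the messages argument to a chat completion call against your local server.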

All four models run fully offline once downloaded. No subscription, no API key, no usage limits — just your hardware and your documents.

The GLM-OCR model running in LM Studio, transcribing a 19th-century newspaper article

Screenshot of Cell to Singularity, an immersive casual clicker game available on Steam.

Cell to Singularity : Casual Play for TL Immersion

Osmosis isn’t just for cells – it’s for language learners too! Soaking up target language simply by placing it in your everyday line of sight is one of the most effective strategies for fluency. From your Instagram feed to cosy telly-watching, consolidation can be about throwing more of the things you love in your way.

Gaming is another entz stream that is really easy to target language-ify, since many titles have multiple language options. The Steam platform is a particular goldmine here – a huge multi-platform marketplace, with loads of free-to-play offerings. The trick is to find quite text-heavy games with dialogue and interactions, exposing you to as much content as possible in-play. There’s honestly something for everyone here, from word games to fully-fledged RPGs.

This week, I chanced across a casual clicker on Steam that has been working its quiet way into the hearts of users since its inception in 2018. It’s Cell to Singularity, a game that simulates the blossoming of life on Earth, from eukaryotes, to jellyfish, to humans (and beyond). It’s the kind of game you can have running inconspicuously in the background while you work, slowly developing and growing like a bonsai that needs occasional tending. Very Zen.

Screenshot of Cell to Singularity, an immersive casual clicker game available on Steam.

As you can see from the screenshot, it’s also a great way to revise the building blocks of life. That’s the root educational application the game has been feted for, covering evolutionary biology in a fun, laddered way. Switching my interface to German gives me a ton of fun natural world vocab.

Beyond word level

But the game is also full of conversational exchanges you have with the ‘supercomputer’ running your life simulation, as well as Wikipedia-style descriptions of all your finds. In short, it supports word, sentence and text-level language skills in a rich, engaging environment. What more could you ask for?

Screenshot from Cell to Singularity showing dinosaurs

The range of languages available right now is already impressive. Not only the ‘mainstream’ school ones, but also Korean, Japanese, Polish and Portuguese, amongst others.

Screenshot of the language options in Cell to Singularity, an immersive casual clicker game available on Steam.

Cell to Singularity currently has an 89% positive rating from thousands of Steam users. I wonder how many of them are playing to improve their target language? Hopefully I’ve enticed a few more of you to do just that!

Escaping the Ecosystem : AI Edition

We live in such unexpected, shifting, fracturing geopolitical times just now. A stability taken for granted for decades no longer seems a given. So much so, that many have begun to question the global tech ecosystem we are embedded in, considering the safety of our data and workflows, and seeking less exposed, closer-to-home alternatives.

It’s something we can explore without straying into conspiracy territory, and it goes beyond data security. Tech writer Cory Doctorow has written at length on the downsides to walled garden platforms that make leaving costs high while degrading (or enshittifying – Macquarie Dictionary’s 2024 word of the year) their services. Linguaphiles should know – our own beloved Duo is one of them. It’s a compelling argument, and one that national consumer protection agencies are starting to incorporate into policy. The notion that we can take meaningful steps to decouple from tech monopolies is beginning to take hold.

Ecosystem creep : AI

This leads us to AI firms – arguably the fastest growing of tech behemoths, whose services nonetheless are working their way into many of our workflows. It’s not all doom and gloom here, though; Anthropic in particular has emerged as one US company willing to stand up for an ethical stance in the field.

That said, most European LLM traffic still goes down that American route, collecting on servers users’ states have no jurisdiction over. Users come to rely more and more on these services for key elements of their day-to-day, although they have little control over their place in that ecosystem.

So what to do? LLMs are incredibly useful tools for a number of creative applications. For language teachers, they are particularly good at creating authentic-sounding materials for worksheets. In fact, I’ve often argued that LLMs are a tech almost tailor-made for language learning and teaching – in few other fields is the language structure more important than the actual content! They’re genuinely brilliant at creating copy, often highly nuanced, for learning.

AI Swaps

Well, one quick and easy swap is Le Chat by French AI company Mistral. It’s a ‘full fat’ LLM on a par with the big US names, running your prompts remotely on a multi-billion parameter model. Not so remote, though – their server activity remains within EU jurisdiction.

Then, of course, there is the ‘peak privacy’ option – running your own LLM. That’s a lot easier than it sounds, thanks to easy-setup software like LM Studio or Ollama (both US-based projects, but run locally on your own machine). Install, download a model, and prompt away. While few (to no) people will have the hardware to run full-sized LLMs, small models are getting better and better, rivalling the biggies for everyday use.

Google’s Gemma 4 is a case in point, a new small model (you can get a sub-20GB version) achieving some really impressive benchmark scores. Multi-language support is one of its strengths, and believe me, it does more than a good enough job of worksheet authoring and lesson planning. And it comes with an extra ‘externalities’ bonus, too – the only energy it’s using is your laptop battery, rather than spinning up some red-hot servers on a remote farm somewhere.

That has to be a win-win – using open source releases from the industry leads, without getting trapped inside the matrix.

We may have little control over geopolitics. But there are always choices when it comes to our exposure to it in the tech we use. I’m working on a list of these swaps as part of my own digital hygiene plan, and hope to share much more of this in coming weeks!

Speaking dialect in the mountains

AI and Dialect: Can You Nudge a Language Model Off the Standard Path?

One of my quiet worries about the AI revolution in language is standardisation. When a model trained on billions of documents produces text, it tends to sand everything down to some kind of weighted average – grammatically impeccable, register-appropriate, but utterly devoid of local colour. And where’s the fun in that? You know my thoughts on dialect and language learning.

But here’s the thing: despite the odds, the standard grey slop isn’t a mathematical inevitability. Large Language Models are trained on billions of words. And that material will include plenty of non-standard language, too, simply by dint of the extraordinarily broad sweep of the training brush. Dialect literature, forum posts, subtitles, transcripts – there’s room for non-standard, more colloquial language in every context of the model’s background.

Which raises an interesting question: can we nudge AI output away from the standard, and towards something more authentically regional?

I’ve been playing around with exactly this, and the short answer is: yes – and it’s surprisingly decent at it.

Stepping off the well-trodden path

I’ve said from the start that AI is built for language learning content creation. There aren’t many fields where content accuracy doesn’t matter too much – as long as the language is correct and authentic-sounding. One of my go-to moves is pushing the LLM for a more colloquial register. Standard written language is fine for reading comprehension, but if you’re teaching people how to actually speak, you want contractions, ellipsis, informal vocabulary – the stuff that sounds like a real person. AI handles this pretty well once you prompt for it explicitly.

Create a model German dialogue on the topic “Reality TV”. The language should be contemporary, colloquial and natural-sounding German.

Claude did a pretty good job of this:

  • Lena: Ey, hast du letzte Woche „Love Island” geschaut? Ich konnte gar nicht aufhören – ich war bis halb zwei wach.
  • Kai: Natürlich hab ich das geschaut – wär ich ja bescheuert, das zu verpassen. Diese Tiara und Marco… ich meine, komm schon. Die spielen da nur rum.
  • Lena: Genau! Marco tut total so, als wäre er an ihr interessiert, aber das ist doch alles für die Kamera. Ich find das so zum Fremdschämen.
  • Kai: Ja, aber genau deswegen schaust du’s doch! Gib’s zu. Du liebst den Drama.
  • Lena: Okay, okay, ich geb’s zu. Aber ich find’s trotzdem fragwürdig, wie die Leute da so bloßgestellt werden. Manche wirken echt überfordert.
  • Kai: Absolut. Ich glaub, vielen ist nicht bewusst, wie sehr die Produktion das alles lenkt. Die schneiden das so zusammen, dass du denkst, die Person ist voll die Zicke – aber vielleicht war das gar nicht so gemeint.
  • Lena: Stimmt schon. Und dann wird die Person im Netz fertiggemacht. Ich hab mal gelesen, dass einige danach richtig psychische Probleme hatten.
  • Kai: Das ist so traurig. Und trotzdem melden sich jedes Jahr tausende Leute an. Ich versteh’s irgendwie – du willst berühmt werden, schnelles Geld, Follower… Der Lockruf ist halt stark.
  • Lena: Stimmt. Und seien wir ehrlich – wir beide gucken’s ja auch. Also sind wir Teil des Problems, oder?
  • Kai: Pff… ich würd sagen: wir sind Teil des Unterhaltsystems. Klingt besser. Jetzt mach die nächste Folge an.

Some great non-standard, everyday features in there. Colloquial contractions like -e dropping on verbs, decent use of particles like halt – not a bad job at all. Claude even threw in a bunch of usage notes without asking, too.

But colloquial is still, broadly, floating just beneath a standard umbrella. What if we go further?

Adding Local Flavour

When I’m probing the edges of what AI can do, I like to start with varieties I can actually judge. Austria is my German-speaking region of choice in this regard, being my year abroad destination of old – I still have a huge soft spot for Kärntnerisch. So here’s the tweak:

Create a model German dialogue on the topic “Reality TV” between two young Austrian speakers from Kärnten. The language should be contemporary, colloquial and natural-sounding Austrian German.

Here’s what came back:

  • Magdi: Oida, hast du gestern „Love Island” g’schaut? I kann nimma! Die Lea hat den Stefan komplett ausg’spielt, echt a Wahnsinn.
  • Tobi: Na freilich, i schau des jo seit Wochn! Aber geh, die Lea, die tuat doch nur so. Die weiß ganz genau, wos sie macht — des is alles g’spield.
  • Magdi: Ja eh, ober trotzdem — der Stefan is so a Trottel! Wie kann man so blöd sein, gell? Der glaubt ihr jo alles.
  • Tobi: Najo, i mein, de san jo auch so a bissi animiert zum Streiten, oder? Die Produzenten woin das jo so. Ohne Zoff läuft des ned.
  • Magdi: Des stimmt scho, ober i find des trotzdem scheißn. Die tean so, ois ob’s echt wär, und dann geht’s nur ums Follower kriegen. Danoch san alle auf Instagram mit Werbung zuagekleistert.
  • Tobi: Haha, stimmt! Drei Wochen nach der Sendung verkaufen’s alle irgendwelche Protein-Shakes. Gott sei Dank ned mia. [lacht]
  • Magdi: Ober weißt wos? I schau’s trotzdem weiter. I bin schon voll drin, i muss wissen, ob der Kevin die Kathi kriegt oder ned.
  • Tobi: Na sag i dir: die kriegen sich ned. Die Kathi is viel zu schlau für ihn. Die geht da raus, macht a paar Storys, und dann war’s des.
  • Magdi: Oida, du bist so a Zyniker! Vielleicht is es jo echt?
  • Tobi: Magdi. Des is Reality TV. Des is so echt wie mei Oma beim Wrestling. [grinst]
  • Magdi: Haha, okay okay. Ober schauen tust’s trotzdem, gell?
  • Tobi: … Ja. Natürli. Um neun bin i dabei. Sag’s ned weiter.

If you’ve spent any time in Austria, this is probably just as joyful to you as it is to me. It’s added in so much southern colour, from the tag particle ‘gell’ to the reduced i from ich and past participles in g’. There’s even some fun wordplay in there (as real as my gran at wrestling!), so it’s not only dialect, but character. It’s laid it on thick!

There’s an element of pastiche to this, for sure. Nobody will claim this is indistinguishable from an authentic speaker from Klagenfurt. But it’s genuinely not bad, especially if you’re after a solid introduction to local speech features.
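The two prompts so far differ only in who is speaking and which variety is requested, so if you want to try lots of varieties, a small template saves retyping. This is just a sketch. The function and parameter names are my own.

```python
def build_dialogue_prompt(language: str, topic: str,
                          speakers: str = "", variety: str = "") -> str:
    """Assemble a dialogue prompt, optionally naming the speakers and variety."""
    who = f" between {speakers}" if speakers else ""
    target = variety or language  # fall back to the standard language name
    return (
        f"Create a model {language} dialogue on the topic “{topic}”{who}. "
        f"The language should be contemporary, colloquial and natural-sounding {target}."
    )


print(build_dialogue_prompt("German", "Reality TV"))
print(build_dialogue_prompt(
    "German", "Reality TV",
    speakers="two young Austrian speakers from Kärnten",
    variety="Austrian German",
))
```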

And then there’s Trøndersk…

That’s all very well for ‘mainstream’ foreign languages like German. To push the envelope a little more, it’s worth playing around with some lesser-learnt languages – Norwegian is always a good choice.

For my Norwegian test, I went with Trøndersk, the dialect cluster associated with the Trondheim region and much of central Norway. It’s one of the most immediately recognisable Norwegian dialects, with distinctive pitch accent patterns, characteristic -a verb endings, and a few phonological quirks that mark it out clearly from the Oslo Bokmål you’ll meet in textbooks.

How good a job would Claude do of this challenge?

  • Marte: Oi, har du sett at det e ny sesong av Farmen nå? Æ begynte å se på den i går kveld og klarte itj å slutt.
  • Jonas: Åja? Æ har itte sett mye reality-greier på ei stund, men Farmen e jo noe anna. Det e liksom ordentlig underholdning, da.
  • Marte: Ja, det e noko med den serien! Og den nye dama fra Steinkjer – ho e jo helt vill. Ho klaga på absolutt alt og alle og gjor ingenting sjæl.
  • Jonas: Haha, klassisk. Det e alltid nåkka en sånn type på realitygreier. Produsenten e nødt til å ha med dem, skjønner du? Ellers e det jo kjedelig.
  • Marte: Jaja, men det e jo litt ekkelt å innrømme at man liker det, liksom. Æ satt der og ropte på skjermen som et idiot.
  • Jonas: Haha! Ka sa du da?
  • Marte: Noe sånt som «koffor gjør’u det, din tulling!» Men mora mi satt rett ved sia og hun så bare på mæ som om æ var gal.
  • Jonas: Uff da, haha. Ka syns ho om programmet da?
  • Marte: Ho e heilt avhengig sjæl! Det e liksom heime hos oss at hele familien samles rundt skjermen på tirsdager. Det har vorre sånn siden æ va lita.
  • Jonas: Okei men ka med all den andre reality-søpla da? Eks på stranda og sånt? Det e jo bare tull, syns æ.
  • Marte: Ja nei, det gidder itj æ. Det e jo bare folk som vil bli kjendis og som itte har nåkka å by på. Farmen e annleis fordi det faktisk krev noe.
  • Jonas: Sant nok. Kameraten min i Ålesund e helt hekta på Paradise Hotel derimot. Han ser på det heile tida.
  • Marte: Seriøst? Koss klare han det? Æ holdt ut to episoder og ga opp. Alle folkan der oppe e jo helt… æ veit itte. Tomme, liksom.
  • Jonas: Haha, han sier det e «sosialt eksperiment». Æ tror bare han har lyst på drama heime fra sofaen sjæl.
  • Marte: Hahaha! Ja det e nok det. Uansett — skal du se Farmen med mæ på fredag? Vi kan bestille pizza og le av alle.
  • Jonas: Ja, det høres bra ut! Men du bestemme pizza-topping. Sist gang valde du masse squash og det va itte greit.
  • Marte: Oi, squash e jo godt da! Men okei, du får velge. Bare itte ananas.
  • Jonas: Deal. Vi ses fredag!

As far as I can tell – and I’m not so well placed here, I must admit – it’s done a solid job. The k- question words (rather than hv-), the reduced æ for jeg… Even the cultural references, selecting two actual Norwegian shows for the exchange. I’d trust this as a bit of pre-Trondheim trip prep!

That said, if you speak a Trøndersk dialect, what’s the verdict? Does this read as a reasonable attempt, or is it giving you second-hand embarrassment? I’d genuinely love to know. (Drop a comment below, or find me wherever I am on the internet this week.)

AI and Dialect for Language Learning

AI platforms clearly have some representation of dialect. Awareness is the wrong word for it – anthropomorphising LLMs and all that – but you know what I mean. The patterns are in the training data; the question is whether a prompt can reliably surface them, or whether the model’s default pull towards the standard is too strong.

The two languages I explored here seem to work well. The pull to parody is always something to watch, particularly when written material in a dialect tends to be humorous, rather than in-community. And rarer or more exclusively spoken varieties will no doubt be much more troublesome. But the more linguistic scaffolding you give the model – the more you treat prompting as a kind of sociolinguistic briefing – the better the output is likely to get.
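
To make the ‘sociolinguistic briefing’ idea a bit more concrete, here’s a minimal Python sketch that assembles that kind of prompt from a list of features. The function name build_dialect_prompt and its parameters are my own invention for illustration, not part of any library, and the feature notes are simply the ones picked out above. It only builds the text; you’d still paste the result into whichever model you’re using.

```python
def build_dialect_prompt(language, dialect, topic, features, turns=10):
    """Assemble a dialect-dialogue prompt with explicit linguistic scaffolding.

    All names and wording here are illustrative assumptions, not a fixed recipe.
    """
    lines = [
        f"Write a natural dialogue of about {turns} turns in {language}, "
        f"in the {dialect} dialect, on the topic: {topic}.",
        "Aim for authentic, in-community speech rather than parody.",
        "Features to reflect where they fit naturally:",
    ]
    # One bullet per feature, so the model gets a clear checklist
    lines.extend(f"- {feature}" for feature in features)
    return "\n".join(lines)


prompt = build_dialect_prompt(
    language="Norwegian",
    dialect="Trøndersk",
    topic="reality TV",
    features=[
        "k- question words rather than hv- (ka, koffor)",
        "æ for jeg",
    ],
)
print(prompt)
```

Keeping the features in a plain list makes it easy to add or drop items and compare how the output shifts.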

That standard grey slop, it seems, has more texture than it first appears – you just have to know where to poke.

CEFR and Interactive Language Learning: Bringing the Threads Together

The Council of Europe’s CEFR (Common European Framework of Reference) scale is something we’re almost instinctively aware of as language learners. A1/2, B1/2 and C1/2 are the shorthand we regularly use for fluency. A recent paper by Gökhan Haldun Demirdöven in Frontiers in Education takes a long-overdue fresh look at the framework in light of new immersive, technology-mediated language learning. How does our common language ladder mesh with recent developments?

A lot of recent work on language learning tech focuses on platforms such as XR, AI-driven environments and conversational agents almost as a pedagogical end in themselves. Instead, this paper asks a more structured question: how might these technologies meaningfully align with what the CEFR already – with well-established practical application – conceptualises as language ability?

In other words, can we design immersive digital environments to support the kinds of communicative action, mediation, and interaction that CEFR descriptors actually describe?

CEFR: More Than Just Assessment

The author takes the 2020 revisions to the CEFR as a starting point, identifying features of the updated framework – particularly its emphasis on mediation, multiple competences, and real-world communicative tasks – and considers how these might map onto immersive learning contexts. Examples include simulated environments for task-based interaction, AI-supported conversational agents, and speech-recognition systems that are sensitive to sociolinguistic variation rather than enforcing a single normative model.

One enlightening conclusion is that the CEFR is not simply an assessment tool; it is also a design framework. New language learning technologies are valuable to the extent that they support CEFR-aligned communicative action, not simply because they are innovative. This is a helpful course correction in a space where language-learning technologies can sometimes prioritise novelty over pedagogical coherence. For me, as a language software developer, it’s certainly something that strikes home – language professionals should avoid developing new language tech in a vacuum, and instead build on previous work with a proven track record.

For teachers, curriculum designers, and language-learning app developers, it’s a really timely reminder that innovation in language learning does not have to sit outside existing standards. Frameworks like the CEFR can play an active role in shaping how new technologies are pedagogically grounded.

SingaKids: A Glimpse of Where Multimodal AI Tutoring May Be Headed

A recent pre-print on SingaKids, a multilingual multimodal tutoring system for young learners, offers an interesting look at how AI-supported language learning is evolving. You can read the paper here: SingaKids: A Multilingual Multimodal Dialogic Tutor for Language Learning.

Designed for early primary classrooms, SingaKids is an AI-based system that uses picture-description tasks as the basis for spoken interaction. It combines dense image captioning, multilingual speech recognition, a dialogue model tuned with pedagogical scaffolding, and child-friendly text-to-speech. The system works in English, Mandarin, Malay, and Tamil, with extra attention paid to the lower-resource languages to improve recognition and generation quality.

Flexible Scaffolding

Something that stood out to me in particular was the system’s focus on scaffolding rather than straightforward correction. That approach is flexible; depending on a child’s response, the system shifts between prompts, hints, explanations, and more structured guidance. Higher-performing learners are pushed towards fuller reasoning; less confident learners get clearer cues and more supportive turns. It’s a step away from the rigid “question–answer–score” pattern and closer to the texture of real classroom dialogue.
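
As a toy illustration of that shifting behaviour, here’s a short Python sketch that picks a tutoring move from a score for the learner’s last answer. To be clear, this is not how SingaKids works under the hood (the paper describes a tuned dialogue model). The function choose_scaffold, the 0.0 to 1.0 score, the thresholds and the move names are all hypothetical, chosen only to show the general shape of the idea.

```python
def choose_scaffold(response_quality):
    """Pick a tutoring move from a 0.0-1.0 score of the learner's last answer.

    The thresholds and move names are invented for illustration only.
    """
    if not 0.0 <= response_quality <= 1.0:
        raise ValueError("response_quality must be between 0.0 and 1.0")
    if response_quality >= 0.8:
        # Confident learners are stretched rather than corrected
        return "push for fuller reasoning"
    if response_quality >= 0.5:
        return "open prompt"
    if response_quality >= 0.25:
        return "hint"
    # Least confident learners get the most support
    return "explanation with structured guidance"


for score in (0.9, 0.6, 0.3, 0.1):
    print(score, "->", choose_scaffold(score))
```

A real system would weigh far more than a single number, but even this shows the contrast with a fixed “question–answer–score” loop.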

Although the work is aimed at children, several ideas have wider implications for the rest of us. Picture-guided dialogue isn’t new in ‘grown-up’ resources – think Rosetta Stone, for instance. But it could easily support adult learners practising free production in AI tools, too. Improved multilingual ASR – especially for hesitant, accented, or code-switched speech – would benefit almost every speaking-practice tool. And the flexible scaffolding approach hints at future e-tutors that adapt to the learner’s behaviour dynamically, rather than funnelling everyone down the same path.

The project sits firmly in the research space, but it points towards what the next generation of tools may look like: multimodal, context-aware systems that don’t just respond to learners but actively guide, prompt, and adjust. For anyone keeping an eye on developments in educational AI, it’s a nice indication of the direction of travel (and I’m probably a wee bit envious of those kids getting a chance to try it first!).

Perplexity Tasks for Language Learners

AI techniques to support language learning are pretty well-known now. From structured conversation partners to resource creators, LLM platforms have been embraced by the polyglot community.

Like many of us, I dip in and out of them almost unthinkingly now. Often, I’ll snap in a page from a chapter I’m working on with my Greek teacher, and it’ll help me prepare ahead of a lesson. Sometimes, I’ll get it to reel off a list of useful phrases on a topic I’m studying. LLMs can make great worksheet creators, too. In many ways, it’s simply a very interactive reference tool, giving (mostly) reliable answers but with a big nod to context.

I’d been pretty dogged in my choice of platform, sticking for the most part with ChatGPT Plus. Claude and Gemini were also in the mix, alongside some fun running local models. But for the most part, I thought my tool choices were pretty settled.

But then I gave Perplexity a whirl.

Perplexity – Task Master

Perplexity isn’t an LLM in the sense that ChatGPT, Gemini and Claude are. It uses LLM technology. But it’s actually more of an intelligent, context-sensitive search tool that uses natural language APIs to turbo-boost its web-hunting activities.

I’d clearly not found that prospect very exciting, as I’d not gone near it until now. But thanks to a bundled free upgrade, I got to try the premium tier of late. And one particular feature stands out as potentially transformative for my learning habits: Perplexity Tasks.

Tasks are scheduled searches you set up with natural language instructions. And those instructions can be as rich as your usual LLM prompts in terms of requested formatting and suchlike, so in essence, you can build regular bulletins with up-to-date information in any language you like. Take one of mine, which runs daily:

Search the global news for the biggest world news story of the day. Summarise it in French, German, Modern Greek, Polish, Scottish Gaelic and Swahili at a level appropriate for an intermediate learner, ensuring that the translation is of the highest, native speaker standard quality, idiomatic and natural-sounding. Summaries should be 3-4 sentences long. Highlight key words in bold.

Accompany each summary text with a glossary / vocabulary list detailing all the key / difficult words from it in dictionary format (listing word class, irregular parts if applicable etc.). Hyperlink glossary items to Wiktionary entries where available with further information on them (use the English version en.wiktionary.org).

Lay it all out neatly to make it easy on the eye. Use plenty of emojis for impact too. Make this a fabulous resource for polyglot language learning! 🌍

Now, every morning, I get a wee news digest emailed straight to my inbox in multiple languages. It’s learner-friendly, includes vocab support, and gives me something to talk about in my language meets and lessons. I’ve done the same for academic paper searches in linguistics, and stories on dialect appearing in news outlets.
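
If you fancy swapping languages in and out without retyping the whole thing, a few lines of Python can generate the instruction text for you. This is only a sketch: build_digest_prompt and its parameters are hypothetical names of my own, the wording is a trimmed paraphrase of the prompt above, and you’d still paste the result into your task scheduler by hand.

```python
def build_digest_prompt(languages, level="intermediate", sentences="3-4"):
    """Build a daily news-digest instruction for a scheduled task.

    A sketch only: the wording paraphrases the prompt shown above.
    """
    if not languages:
        raise ValueError("Give at least one language")
    # Join as "A, B and C" so the instruction reads naturally
    if len(languages) == 1:
        language_list = languages[0]
    else:
        language_list = ", ".join(languages[:-1]) + " and " + languages[-1]
    return (
        "Search the global news for the biggest world news story of the day. "
        f"Summarise it in {language_list} for {level}-level learners. "
        f"Summaries should be {sentences} sentences long. "
        "Highlight key words in bold. "
        "Accompany each summary with a glossary of the key words "
        "in dictionary format."
    )


my_languages = [
    "French", "German", "Modern Greek",
    "Polish", "Scottish Gaelic", "Swahili",
]
print(build_digest_prompt(my_languages))
```

Changing the list or the level argument gives you a fresh variant for a different study phase.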

It feels like a proper game changer!

Tasking on Other Platforms

Now, you don’t need Perplexity to do this – it’s just one of the most user-friendly ways I’ve found to do it. If you have ChatGPT, check out Scheduled Tasks. In Gemini, Scheduled Actions will do the trick for Pro members. Copilot is in on the game too. Others will no doubt follow suit shortly – clearly, task scheduling is becoming one of those features AI platforms are expected to have.

What I like about Perplexity, though, is that its whole raison d’être is the search – it feels particularly suited to web-based tasks like news digests. It’s also quite nice to keep the separation between my everyday LLM ramblings, and my more structured, scheduled items (use it for a few weeks and you’ll have clogged your timeline up with dozens of chats!).

If you’ve been looking for a way to make AI genuinely work for your learning rather than distract from it, try setting up a task or two – you might just find it becomes part of your morning ritual as well.

Zotero and Obsidian: A Workflow to Research Anything

If much of your study is electronic – e-books, PDF papers, worksheets and the like – you’ll face the same struggle I have: digital overwhelm. A clear workflow for dealing with mounds of virtual material is essential if you’re not to get lost.

I feel like I’ve tried them all, too. I’ve gone through the gamut of e-readers: GoodReader, PDF Expert, even trusty old Apple Preview (which has great annotation features). All very decent in their own way. On the file system side of things, though, it’s another story. I’ve cobbled together some sort of ‘folders on the Cloud’ system over the years, but it’s seriously creaky. I break my own rules half the time!

Bearing that in mind, I was chuffed to bits to chance upon a whole new system recently – one that had passed me by completely. It seems to be a particularly big hit across North American universities. It also has a large, active community online, sharing performance tweaks. And best of all – it uses completely free software.

Zotero and Obsidian

Zotero is a publications manager that you simply drag your e-material into. The app retrieves bibliographical information, renames files sensibly and stores a copy online for working cross-device. Even better, it’s capable of generating full bibliographies, so it’s a file store, reader and referencing tool all in one.

Obsidian is the note-taking side of this – a sleek, markdown-driven text editor that is beautifully minimalistic. It excels in creating hyperlinked notes, allowing you to build your own Wiki-style knowledge bank. But it dovetails beautifully into Zotero thanks to community plugins that allow you to import your PDF annotations directly into bibliographically pigeon-holed notes.

After resisting the temptation to kick myself for not spotting it sooner, I did a deep-dive into Zotero + Obsidian workflow how-tos, and it’s an academic revelation. A couple of community content creators are real stand-outs here – so much so that it’s best I let them do the talking rather than waffle any more. I’m learning this as I go along, and these are great places to start.

Workflow Training

Here’s where I started, more by chance YouTube search than anything else. Girl in Blue Music namechecks a lot of the other big Z+O content creators here, so it’s a good jumping-off point for newcomers.

From there, it’s worth exploring morganeua’s vast selection of content, including numerous how-to videos and worked examples.

Once you’ve worked through those, you can graduate to full geek mode! Bryan Jenks pushes the system well beyond anything else I’ve seen, and likewise has a huge back catalogue of training vids. He layers styling and advanced templating onto the base, making for a slick, colour-coded, optimally managed research system.

I feel very late indeed to this workflow party. But if you are too, join the club – and let me know if you’ve found this useful!

ElevenLabs Hits the Right Note: A.I. Songwriting for Language Learners

In case you missed it, A.I. text-to-speech leader ElevenLabs is the latest platform to join the generative music scene – so language learners and teachers have another choice for creating original learning songs.

ElevenLabs’ Creative Platform ElevenMusic takes a much more structured approach to music creation than other platforms I’ve tried. Enter your prompt (or full lyrics), and it will build a song from block components – verse, chorus, bridge – just as you might construct one as a human writer. It makes for a much more natural-sounding track.

ElevenLabs music creation

As you’d expect from voice experts ElevenLabs, the service copes with a wide range of languages and the diction is very convincing. A tad more so, I think, than the current iteration of the first big name on the block, Suno AI. No doubt the latter will have some tricks up its sleeve to keep up the pace – but for now, ElevenLabs is the place to go for a quick and catchy learning song.

Anyway, here’s one I made earlier – a rather natty French rock and roll song about the Moon landings. Get those blue suede Moon boots on!

It’s definitely worth having a play on the site to see what you can come up with for you or your classes. ElevenLabs has a free tier, of course, so you can try it out straight away. [Note: that’s my wee affiliate link, so if you do sign up and hop on a higher tier later, you’re helping keep Polyglossic going!]