AI for Language Learners by Rich West-Soley; ChatGPT, Bing and more for your languages study

AI for Language Learners – Book Now Available!

It was a labour of love that happily took up most of my summer, and it’s finally out! I’m very chuffed to announce that my book AI for Language Learners is available on all Amazon stores.

 

The book is the product of months of tweaking, prodding and experimenting with emerging AI chat platforms. If you’re a Polyglossic regular, you’ll have seen some of those nascent techniques appear on the blog as I’ve developed and used them in my own learning. The blog has been a bedding ground for those first book ideas, and I’m thankful to everyone who has followed along with my own AI journey.

What we’ve come to call AI are, strictly speaking, large language models (LLMs). These LLMs arise from billions of words of training material – truly staggering amounts of data. The resulting super-text machines are perfect matches for subjects that benefit from a creative flair with words, and for us language learners, wordplay is our currency. The book contains over 50 rich prompts for getting the absolute most out of AI’s impressive capacity for it.

The process has been huge fun. Of course, that’s thanks largely to the often unintentional humour of our non-sentient friends ChatGPT, Bing and others. I try to get this across in the book, which has its fair share of lighthearted moments.

I hope you have as many smiles trying the recipes out as I did putting them together!

AI for Language Learners is available on Amazon Kindle (UK £2.99, US $2.99) or in paperback (UK £7.99, US $7.99). Even better: if you’re a Kindle Unlimited member, you can download and read it as part of your subscription.

Up the etymology garden path with ChatGPT

This week’s story starts with an instinct. I’ve been learning Swedish, which, as a Norwegian speaker, has advantages and disadvantages. One downside is the need to fight the assumption that the vocabulary of each matches up exactly with an identical etymology, when this is so often patently untrue.

In fact, Norwegian and Swedish have walked separate paths long enough for all sorts of things to happen to their individual vocabularies. For instance, take trist and ledsen, both meaning sad in Norwegian and Swedish respectively. Adding ledsen to my list of Swedish differences (I’m using my Swedish Anki deck just for the differing words), I started wondering about the etymology of both. Norwegian trist, clearly, I thought, is a French borrowing, probably via Danish. On the other hand, ledsen looks like it was inherited from the North Germanic parent language.

ChatGPT Etymology

Since I’m exploring the use of AI for language learning both personally and professionally at the moment, it seemed like a good test case for a chat. I went straight in with it: is the Norwegian word trist a borrowing from French?

But shockingly, ChatGPT was resolute in its rejection of that hypothesis. The AI assistant insisted that it’s from a Nordic root þrjóstr, the same that gives us þrjóstur (stubborn) in Modern Icelandic, with the variant þristr which seems to have evolved into Modern Norwegian trist.

Now, the thing with ChatGPT is that it can be so convincing. That’s entirely thanks to the very adept use of natural language in a conversational format. The bot simply speaks with an authoritative voice like it knows what it’s talking about.

So it must be true, right?

Manual Etymology

At this point, it all felt a bit off. I just had to do some manual digging to check. In Bokmål cases like these, my first port of call is Det Norske Akademis ordbok (NAOB). If there is an authority on Norwegian words, there’s little that comes close.

So I key in trist, and – lo and behold – it is a French borrowing.

The entry for 'trist' in the Norwegian Academy's Dictionary, showing its etymology.


There’s no mention of Danish, just the French and the Latin it comes from. I suspect, with a bit of digging, it might turn out to have been borrowed into Danish first, but NAOB is definitive. Not a hint of Norse etymology.

Now there’s a chance ChatGPT knows something that NAOB doesn’t, although I doubt it. More likely, it’s just the innate talent the emergent AI has for winging it, and making best guesses. That’s what makes it so powerful, but, like human guesses, it’s also what makes it fallible just now. It’s a timely reminder to double-check AI-generated facts for the time being.

And maybe, to just trust your own instinct.

Blue hearts on a blue background - missing someone can make the heart feel blue. Image from freeimages.com.

Missing Me, Missing You : A Typology of “I Miss You”

Amongst the first snippets of foreign language we learn are often those expressing everyday emotional connection. The language of missing is usually somewhere in the mix.

There’s quite an interesting split in how languages express I miss you. I spot two big camps, although there are more for sure. The first of these two biggies has the person doing the missing as the subject of the active verb:

English I miss you
Finnish kaipaan sinua
German ich vermisse dich
Icelandic ég sakna þín
Polish tęsknię za tobą
Spanish te echo de menos
Swahili ninakukosa
Turkish seni özlerim

But in the second camp, the person being missed is the active subject. The person feeling the absence will be in an oblique or dative case:

Albanian më mungon
French tu me manques
Greek μου λείπεις (mou lípis)
Hungarian hiányzol ‘you are missing’ – the ‘me’ is understood
Italian mi manchi
Serbian nedostaješ mi

Who’s Missing Whom?

The split is primarily a semantic one, with verbs tending to express either the emotional work of missing, or the state of being missing or absent. Some languages, of course, use totally different constructions, like the idiomatic Spanish echar de menos, although the doer here is still clear: it’s the person doing the missing. The same goes for other languages that use completely different constructions, like Japanese and Korean, which commonly use some version of I want to see you.

The dividing lines are most interesting because they don’t necessarily follow language family groups. Romance, Finno-Ugric and Slavic languages straddle both tables. There’s some evidence of the Balkan Sprachbund in the second table, perhaps, but it seems largely chance which kind of phrasing a language ends up with.

Whether it is chance or not is hard to say. Surprisingly, it doesn’t appear that many linguists have attempted to answer that question, since a literature search turns up very little. Does anything in particular prompt a language to drift towards the ‘active misser’ or ‘active missed’ route? Is it a cultural difference? And could the construction even impact how we think of missing itself, or is it a chance mapping of syntax onto feelings?

For now, then, it’s just another of those little quirks we have to register when we learn a new foreign language. Perhaps more fundamentally, it’s simply another hue or picture setting to marvel at in the human kaleidoscope of modes of expression.

Have you come across other configurations in the typology of “I miss you”? And do you have your own inklings around an explanation? Let us know in the comments!

The movement of atoms. The morpheme could be called the atom of language. Image from freeimages.com.

Houston, We Have A Morpheme Problem

It was in Greek class that I realised it. I have a morpheme problem.

Yes, those pesky little indivisible chunks of languagey-ness are causing me grief. The exact nature of that grief is a regular mixing up of pronouns and possessives with s- (you) and t- (him/his/her), to the amusement of my teacher.

Πού είναι ο μπαμπάς του… ΣΟΥ; Pou íne o babás tou… SOU?
Where is his… YOUR dad?

The source? Probably the Romance languages I’ve learned, where the correspondence is reversed. French has ton (your) and son (his/her), for example, while Spanish has tu and su. The Romance you/he/she attachment to those tiny little chunks has reasserted itself temporarily (I hope) to wreak happy havoc.

Yes, interference is real, and it’s not just about whole words – it’s a morpheme thing, too.

Morpheme Madness

In reality, it’s nothing to worry about. It’s a natural by-product of a brain built for pattern-spotting, and studies of bilingual infants show that we’re well-equipped to remedy it in the natural course. I can talk about it now because I realised I was doing it, and self-corrected along the way.

But what else can I do about it in the immediate term?

Much of it is to do with voice, at least for me. Cultivating distinct voices for each language you learn is a great way to compartmentalise and separate. But unless you’re a gifted impressionist, your repertoire might be limited, and you might have to double up. I realised my Greek voice was suspiciously like my Spanish one, all faux-masterful and brooding. No doubt a bit of clowning around and trying new accents on might help there.

But it’s an ideal case for mass-sentence training too, which I’d become lax with of late. Glossika has a ton of sentences including those little σου and του, and an extra five or ten minutes of training a day will – I hope – re-cement the little imps into my Hellenic pathways.

Have you noticed interference between your languages at the morpheme level? What are your strategies for reinforcing separation? Let us know in the comments!

False equivalencies - the equation 1+1=3. Image from freeimages.com.

Equivocal Equivalencies : Avoiding the X=Y Trap in Language Learning

When starting out with language learning, it’s tempting to assume a one-on-one correspondence between your native and target language for everything you come across. It seems like a simple game of equivalencies: X equals Y. But you quickly learn that it’s not always as simple as that. Different languages carve the world up in subtly different ways.

It’s most obviously the case with content words. For instance, ‘sad’ in English covers both the person feeling the emotion, and the situation causing it. In Greek, it’s two words: λυπημένος (lipiménos, the former, with a Greek passive adjective ending) and λυπηρός (lipirós, the latter). Now that would have scuppered Elton John’s sad sad situation.

But function words differ, too. Grammatical categories that have lexically crumbled into each other in English remain resolutely separate in other languages. Take the word where. In English, you can use this as an interrogative:

Where is the bank?

And you can use it as a relative:

I know where you are.

Same word, two completely different functions. It leads English monolinguals to assume that they’re equivalent, identical. For sure, their function is related – both referencing place – but they’re performing different jobs, respectively standing in for missing information and joining two clauses.

False Equivalencies

Something that took me a little time to get my head around was the same situation in Scottish Gaelic. The interrogative and the relative are different words here, càit(e) and far:

Càit a bheil e? (Where is he?)
Tha fios agam far a bheil e. (I know where he is.)

Norwegian behaves in a similar way, although with a further complication. Generally, hvor is the interrogative, and der the relative:

Hvor er du? (Where are you?)
Jeg vil være der du er. (I want to be where you are.)

But when a question is implicit, the relative is just hvor, as in English:

Jeg vil vite hvor du kommer fra. (I want to know where you come from.)

Incidentally, it’s the same situation with Norwegian then, which is variously når or da, according to the rule above.

Interesting tidbits of language, for a geek like me / us. But they serve as a reminder to delve a little deeper into usage using a resource like Wiktionary when you learn a word that seems to correspond neatly to one in your native language(s).

It may be less than half the story!

The Study of Language by George Yule. Eighth Edition, Cambridge University Press.

The Study of Language, 8th Edition [Review]

New year, new books. Well, we have to live by some adage, don’t we? And perhaps it’s the time of year, but shiny new tomes in the postbox do have their appeal. Appropriately, this week’s doormat delight was George Yule’s essential Linguistics primer The Study of Language, refreshed and updated in its 8th iteration.

It’s a text with some measure of nostalgia for me, appearing on a preliminary reader list ahead of my own MSc. And it has doubtless done so for many other courses, having become something of a modern classic; it offers a solid and systematic overview of all branches of the field, from historical linguistics to second language acquisition. If your university offers a course on it, there’s probably an introductory chapter on it in The Study of Language. It’s as comprehensive as it is reliable.

An Interactive Text

It’s been a good two years since the last edition, so what’s changed? One key enhancement is a considerable expansion of the end-of-unit study questions and tasks. It’s something that always made the volume perfect for working in tandem with programme instructors, now even more so. Activities range from simple questions to more exploratory project-based tasks, providing ample independent learning opportunities.

An example from one of the sections of study questions in The Study of Language by George Yule (8th Edition, Cambridge University Press).

Extensive study questions cap each of the concise, snappy chapters.

There is additional online support on the Cambridge website, too, which has seen a refresh along with the core text. This includes a substantial, 152-page PDF study guide for students, adding a good deal of value to the course.

Keeping It Current

The commitment of Cambridge University Press to keeping this key text up-to-date is impressive. Several of the chapters have gone through major rewrites to reflect current research. This is immediately evident in the further reading lists, replete with pointers to fresh, new sources.

The chapter on Second Language Acquisition is a case in point. Clearly it’s quite a dear topic to my own heart, and (predictably) one of my first stop-offs. But even I spotted some interesting new references to follow up in the mix, in the form of recent papers and monographs. It’s great to see the last couple of years represented in the lists of publications like this, underscoring the fact that this is a bang up-to-date edition.

The Study of Language is a broad, engaging and highly readable introduction to language sciences. It equips the reader with a robust roadmap to ensure they aren’t overwhelmed by unfamiliar buzzwords and jargon on starting out on a formal Linguistics course. This eighth edition is a very welcome continuation of that, ensuring that students get the very best and most up-to-date start possible.

Christmas is coming! Make it a language learning one.

Christmas Books for Language Lovers : 2022 Edition!

Christmas is coming, and the books are getting fat – with expectations that kindly language learners will come along and buy them.

A strained metaphor, I’ll admit. But if you’re still searching for that special Christmas gift for the linguist in your life – even if that happens to be you – then 2022 saw a few new and updated titles from language course publishers that have always been good to us.

Here are some of my favourite stocking fillers of the year.

Routledge

Ever a mainstay of self-paced language learning, Routledge released a welcome new edition of Colloquial Irish this year. For sure, that made for a quieter year than 2021, which saw new Chinese, Hebrew and Zulu editions, but it’s nonetheless great to see the Irish course with a new lick of paint. MP3 listening material for all courses is available online, too, if you fancy a taster of what they have to offer.

In other news, the publisher also released a couple of brand new titles in its Comprehensive and Essential Grammar series. What makes this particularly exciting for polyglots and language aficionados is the off-the-beaten-track nature of the languages themselves.

Principally, the recently extinct Máku language of Venezuela and Brazil now has a Comprehensive Grammar thanks to the hugely important work of researchers working with the last two speakers. It’s an incredible opportunity to explore a linguistic heritage very nearly lost forever. In the Essential series, Filipino now counts amongst the ranks, along with a brand new edition of the Hindi grammar.

Teach Yourself

It’s been a busy year for Teach Yourself with Olly Richards’ growing set of graded readers. There’s been a flurry of updates and new editions, with Irish added to the beginners’ range (Irish learners are particularly lucky this year, it seems). Japanese gets the intermediate treatment, while Italian and Spanish get a whole new volume of beginners’ stories. All very welcome Christmas stocking fodder.

In Three Months

2022 also saw the reissue of some familiar old friends of the language learning world. In January, DK freshened up its in Three Months range with smart new typesetting and jackets. Under the Hugo banner for several decades, the courses are still solid introductions or refreshers, now with free online audio. And they look pretty nifty in their new clothes – not the most important aspect of course, but we do love a smart new book!

These days, the DK in Three Months range focuses on a few mainstream learning languages rather than the original Hugo set (which you can still pick up for a steal at second-hand outlets). These new editions are available in colourfully-bound Dutch, French, German, Italian, Portuguese and Spanish versions.

What a great little cache of 2022 releases, for Christmas or otherwise. Which titles have I missed? Leave your language learning gifting ideas in the comments!

The Black Country flag. Black Country English has undergone the processes of language change just as any other variety has.

Language Change – Up Close and Personal

The constant churn of language change is an ever-present backdrop for every speaker and learner of a language. Sounds change, words come and go, phraseology shifts. What was common parlance a century ago can be a complete archaism today.

But the classic textbook examples – napron, methinks, thou art and such like – can often seem a bit dry, distant and theoretical on the page. It’s rare that we see it up close and personal, not just in our first languages, but in our particular varieties of them. We’re used to thinking of traditional dialects as timeless, apart from Standard English; somehow they were always just so.

It made a nice change, then, to notice real-time language change in my own family vernacular of late. I’ve been conducting a piece of research into Black Country dialect, which has turned up all sorts of fascinating texts. And some of the forms in there had me boggling at how our local speech has altered in the last 150 years or so.

Vocabulary items, as you’d expect, drop in and out regularly over the course of time. That doesn’t stop them being surprising when they pop up, though. My searches turned up things I’ve never heard a West Midlander say in my life. But there they are, in black and white, as examples of typical Black Country speech in newspaper articles both pillorying and honouring the dialect over time.

Welly Enough

My first ooh – I’ve never come across that before! is the word welly for nearly. Take this lovely example from the story “How Leyvi Crafts Got Rid O’ The Mice”, which appeared in an edition of the popular “Tom Brown’s Black Country annual” in 1889. The narrator, bemoaning the stench from a bottle of poison, exclaims:

“The smell on it’s welly enough to kill a christian, let aloon a mouse!”

And then there’s the sign-off from a humorous letter to the editor in dialect, from the 1890s. After speaking his piece, the writer rounds off:

“well mr Editer i’n sed welly all as i wanted”

So what became of the welly? I’ve asked our resident Black Country expert (aka John, my stepdad), and he hasn’t a clue either.

Bear in mind that sometimes the word remains, but its form is different. Who knew, for example, that Black Country had an Old English -en plural for fleas, just like oxen? As one commentator in 1892 writes:

“…the Black Country mother may be heard to declare that ‘her babby has been peffled all o’er wi’ fleen.’”

I know. Not a nice image.

Better Nor That…

And then there are constructions that have changed a fair bit, too. I grew up around some very broad and proud speakers of Black Country English, who nonetheless formed comparisons as in Standard English: “better than that”, for example. Switch that out for nor, and you end up with the kind of phrase you’d more likely hear in the early 20th Century:

“…yow oughten to ha’ known better nor that…”

One of the most unusual lost phrases I’ve come across yet is the use of without as unless. It was nestling in a local yarn published in 1906:

“I dow know, without yow goo to Dr. Brown’s…”

Again, it’s lost on my family when I press them (no doubt ever nearing their limits of patience) for clarification. A hundred years can do a lot to a language.

How has your local speech changed in the last century or so? Let us know in the comments!

Pop linguistics books

Pop Linguistics Books for Prep or Pleasure

I fulfilled a long-time promise to myself in 2020. I went back to university to do the linguistics masters I never had the chance to do years ago. It’s been a journey (and still is!).

That said, as a long-time language nerd, I wasn’t going in completely blind. Like most linguaphiles, I love reading about languages, as well as learning them. Over the years, I’ve happened across a few pop linguistics titles that prepared the ground (little did I know then) for my return to uni. They’re accessible, fun reads, and nobody needs a formal linguistics background to enjoy them. Just a healthy interest will do. And whether or not you plan to take the same step as I did, they’ll all get you thinking about how languages work, and change, in whole new ways.

Without further ado, here are a few of my favourite pop ling books.

Dying Words

Nicholas Evans

Nicholas Evans is an Australian linguist specialising in endangered languages. Dying Words is first and foremost his impassioned cry to recognise the value of every language to the library of human knowledge.

To drive the point home, he builds his arguments on solid research and extensive field experience; his expertise on Australian languages is worth the price of the book alone.

But it’s all written so accessibly, with each technical term or methodological aspect so carefully explained, that the book doubles as a kind of gentle introduction to historical linguistics. Linguistics primer gold.

The Unfolding of Language

Guy Deutscher

The Unfolding of Language by Guy Deutscher - one of my top recommended linguistics books


This book is pretty special to me. It was the one that first got me thinking language change is cool!

In it, Israeli linguist Guy Deutscher tells the most fascinating stories about how words and grammar develop. The most lasting insight from this, for me, was that of the great churn of language change. It’s truly never-ending, as the results of yesterday’s changes provide the material for tomorrow’s. It’s quite the revelation how French has iterated and iterated from Latin hodie (today) to aujourd’hui – tautologically, on the day of this day.

If you like this one, it’s also worth checking out his Through the Language Glass.

The First Word

Christine Kenneally

Author Christine Kenneally takes perhaps the most speculative of linguistics topics – the evolution of language – and provides an exciting and compelling tour of scholarship in the field. A trained linguist herself, she now works as a journalist, and the combination of the two makes this a real pleasure to read. Even if you find the concept of language evolution too woolly and conjectural, the book is fantastic for simply prompting thoughts on what language is.

The Adventure of English

Melvyn Bragg

Despite being the only book on this list by a non-linguist (at least professionally), the author of The Adventure of English is nonetheless a sharp tool and very well informed – of course, none other than the legendary broadcaster and cultural commentator Melvyn Bragg. His book on the history of the English language, and the emergence of many different global Englishes, made a decent splash in the right circles, in any case. I’ve seen it recommended as pre-reading for a few different English linguistics courses, including a former Open Uni module. As you’d expect from a broadcast journalist, it’s pacy and entertaining – so much so that you might well finish it in a couple of sittings.

Books for Prep or Pleasure

So there you go – a handful of tips for some light linguistics reading. That goes for anyone interested in the field, whether for personal interest or uni prep. Also note that there’s not a Language Instinct in sight, although I do love that one, too. It’s just a bit too obvious as it remains ubiquitously recommended here, there and everywhere!

None of these are really academic texts, of course. Most are written in that chipper, journalistic style familiar from that close cousin to the field, pop science. But for that reason, they’re all a bit of a joy to read. I hope you enjoy them too.

The Language Instinct by Steven Pinker

Just for the sake of completion: my (now very battered) copy of The Language Instinct by Steven Pinker

 

Waves crash against rocks. Over time, contact creates change. Image by FreeImages.com

That’ll Leave A Mark! Language Contact and Change

When languages brush up against each other, they tend to leave a mark. With tongues jostling for existence within the same space, language contact situations serve up some fascinating examples of cross-pollination.

It’s something that you keep spotting as a Gaelic learner, for example. With clockwork regularity, you come across word-for-word calques, or loan translations, lifted straight from English. You cuir air an telebhisean (put on the television). You cuir dheth co-dhùnadh (put off a decision). And I’ve even seen how you can cuir suas le cuideigin (put up with someone).

Wrapped up in Gaelic lexemes, they look indigenous enough. But those prepositions air (on) and dheth (off) are behaving in ways that they might not have done, say, in Classical Gaelic, which might constrain their use more tightly. In effect, English has imported its own phrasal verb construction, which is now becoming an increasingly acceptable category in contemporary Gaelic, too. There’s syntactic change afoot.

It’s gone the other way often enough in the past, of course. The origins of the English progressive (to be -ing) may well lie with the particle + verbal noun structure of Celtic. And contemporary Hiberno-English has a past tense construction to be after doing, roughly equivalent to the perfect tense, which it appears to have nabbed from Irish.

(Un)mutual Contact

But as you might expect, language change through contact isn’t usually happening equally at any one point in time. Many factors, not least social dominance of one language over the other, can make the transference very lop-sided.

Contact linguist Myers-Scotton makes sense of this by asking where two languages meet, fundamentally: in the minds of speakers who have to use them both. Locating the process within bilingual speakers, and how they switch between languages, is a neat way to expose the front line of contact-induced change. For a start, it allows us to evaluate the status of the two parties squaring off. The ‘base’ tongue is the matrix language, forming the main sentence frames of speech. Into that, the embedded language – the outside influence – inserts itself to varying degrees, in the middle of it all.

Sometimes this insertion can come in the form of a single word. Myers-Scotton gives one example from Nairobi Swahili speakers: “ku-appreciate hiyo” (to appreciate it). English, the embedded language, contributes the verb appreciate. But it’s the matrix language, Swahili, giving it a regular infinitival marker ku-.

Elsewhere, larger, deeper syntactic structures can be recruited from the embedded language. The results can drastically alter a language’s syntax; the Balkan Sprachbund is a region where neighbouring languages – from completely separate branches of Indo-European: Albanian, Greek and Slavic – have gradually come to resemble one another grammatically. The most likely driver, again, was the bumping together of different peoples, and the necessary cross-linguistic skills and code-switching that required.

The End of the Road?

For some, this kind of change is the thin end of a wedge that leads to total replacement of the less socially secure language. At some point, the matrix and embedded languages will flip. Social pressure might privilege the outside language for a new generation of speakers, who might start slotting just the odd heritage language word in, here and there, as a cultural nod. A generation on, perhaps even that will peter out.

Is that the fate befalling Gaelic, gradually taking on anglicisms to the point of transformation? Actually, I don’t think that’s the foregone conclusion here. Syntactic convergence doesn’t necessarily spell the end for a language. It can be seen as a strategy to support continued bilingualism, for example; if languages share structures, it’s cognitively less costly to maintain more than one at a time. For sure, borrowed syntax is also a crutch that helps the army of new speakers (thanks to Duolingo et al.) feel a little less lost when getting to grips with Gaelic.

No, death isn’t always the end. Contact outcomes are many, and include paths that lead to sometimes surprising, but very much un-dead extremes. Living proof of that, Media Lengua (literally something like ‘between language’) is the outcome of indigenous Kichwa crashing up against Spanish in Colombia and Ecuador. The resulting mixed language preserves Kichwa grammar, but has been almost entirely re-lexified with Spanish vocabulary. Deep breaths, purists: Gaelic is a long way off from that.

Oceans Collide

As with all things linguistic, bilingual speakers are just one part of a complex picture of contact change. But running through the evidence above – anecdotal and otherwise – it’s easy to appreciate why they are a particularly active site. Bilingual speakers are the point at which two tides crash up against each other and the waters mix. A sort of linguistic Grenen, Skagen, where oceans collide.

It’s also pause for thought for polyglots. What features do we carry over from one language to another? And if we embed into our target language cultures, do we become agents for change?