37.593, Reviews: Lost in Automatic Translation: Vered Shwartz (2025)
The LINGUIST List
linguist at listserv.linguistlist.org
Thu Feb 12 19:05:02 UTC 2026
LINGUIST List: Vol-37-593. Thu Feb 12 2026. ISSN: 1069 - 4875.
Subject: 37.593, Reviews: Lost in Automatic Translation: Vered Shwartz (2025)
Moderator: Steven Moran (linguist at linguistlist.org)
Managing Editor: Valeriia Vyshnevetska
Team: Helen Aristar-Dry, Mara Baccaro, Daniel Swanson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org
Homepage: http://linguistlist.org
Editor for this issue: Helen Aristar-Dry <hdry at linguistlist.org>
================================================================
Date: 12-Feb-2026
From: Carrie A. Ankerstein [c.ankerstein at mx.uni-saarland.de]
Subject: Vered Shwartz (2025)
Book announced at https://linguistlist.org/issues/36-2514
Title: Lost in Automatic Translation
Subtitle: Navigating Life in English in the Age of Language Technologies
Publication Year: 2025
Publisher: Cambridge University Press
http://www.cambridge.org/linguistics
Book URL:
https://www.cambridge.org/universitypress/subjects/languages-linguistics/applied-linguistics-and-second-language-acquisition/lost-automatic-translation-navigating-life-english-age-language-technologies?format=PB&isbn=9781009552332
Author(s): Vered Shwartz
Reviewer: Carrie A. Ankerstein
SUMMARY
Lost in Automatic Translation is for anyone interested in language,
language learning, automatic translators, chatbots and other
language-based AIs. For non-experts, it is an easy-to-understand
introduction to the issues surrounding translation; and for experts,
it is a very readable take with fresh examples and a deviation from
the staid formality of conventional academic texts. The focus here is
on English, but the issues raised are relevant for any other language.
Author Vered Shwartz’s PhD explored English lexical semantics in
natural language processing, in particular the distributional
hypothesis that the meaning of a word is defined by the way it is used
and by the other words that surround it, i.e., context. She interned
at Google in 2017 working on automatic translation methods for
implicit semantic relations, and she is currently an Assistant
Professor of Computer Science at the University of British Columbia.
When asked in a Q&A session with UBC about what motivated her to write
Lost in Automatic Translation, Shwartz said “I went through a parallel
process of improving my English, acquiring anything from vocabulary
through figurative expressions to cultural references after having
moved [from Israel] to the US and then Canada. Having gone through
this process as both a user of language technologies and a researcher
studying them motivated me to explore this question”
(https://tinyurl.com/m2vmcd6z). She uses this stance as a second
language learner to frame her book, presenting linguistic concepts
with anecdotes and personal examples before moving on to discussions
of what they mean for automatic translators.
The introduction poses the usual questions: “What then is the role of
language technologies […]? Will they aid us in the quest to master
foreign languages and better understand one another? Or will they make
language learning obsolete?” It also questions the impact of these
technologies on the job market, education and society in general.
These are open questions, but Shwartz takes a cautiously optimistic
tone, noting that “the technology doesn’t really ‘understand’ language
the way humans do” (p. 2).
The book is divided into three parts: Part 1 “Communicating in
English”; Part 2 “Understanding Cultural Norms and References”; and
Part 3 “Cultural Integration Through Language”. Part 1 sets out the
major lexical and grammatical challenges for translators, human and
otherwise. Shwartz notes that the early translators had two
priorities: faithfulness and fluency. When neural network-based
methods came along, there were huge advances. However, despite these
improvements, some languages remain “low-resource,” meaning that there
is not a lot of training data for the neural networks. English and
German, for example, have numerous parallel text resources, and so
automatic translations between English and German are likely to be
good in terms of faithfulness and fluency. For other language pairs,
such as Igbo and English, where there are few parallel texts,
automatic translations may be poorer. Shwartz gives the example of
handing a human translator a text with 76 repetitions of the letter “i”
separated by spaces – the human translator’s immediate response would
likely be that 76 repetitions of a vowel is not a legitimate sentence.
However, when Shwartz asked Google Translate in 2018 to provide a
translation, she was provided with the following: “As it is written in
the book of the law of Moses, which was in the wilderness, which was
before the man who did the work of the kingdom of Israel” (p. 9),
which is not fluent, nor is it faithful in terms of adhering to the
input, though it may be faithful in the tradition of Abrahamic
religions. Shwartz argues that the religious references in translated
texts are not really unexpected, given that the parallel texts for
low-resource languages are likely to be translations of religious
texts. She also notes that automatic translators suggest a
translation for gibberish input because they tend to operate on
the assumption that the query is valid.
The longest section of the book is Part 2 “Understanding Cultural
Norms and References”. Here Shwartz goes into more detail on the
concept of context – it’s not just the surrounding words, but the
surrounding societal conventions, shared knowledge, predictable
scripts, euphemisms and idioms that contribute to meaning. Shwartz
notes the problems for automatic translators, such as how to deal with
ambiguity as in the following: “olive oil” is oil from olives, but
“cuticle oil” is oil for the cuticles and “snake oil” is something
else entirely. For the sake of propriety, I will not get into “booty
call” and “butt dial” or “hand job” and “manual labor” or even the
more mundane “evening gown” and “night gown” or “red eye” and “pink
eye” – happily Shwartz does get into it. There’s also the issue of the
conventionalized syntax of newspaper headlines such as “Stevie Wonder
announces he’ll be having kidney surgery during London concert” (cited
on p. 42); Shwartz notes that in 2023 ChatGPT misinterpreted this sentence.
When I entered the same headline into ChatGPT in January 2026, I was
disappointed; it now states “It does not mean the surgery will
literally happen on stage during the concert—it’s just the timing of
the announcement that coincides with the London performance”. Advances
in these technologies are fast. In the concluding chapter, Shwartz
presciently notes that “Some claims I make in this book about what AI
can’t currently do may be outdated by the time you read this” (p.
176).
These issues are not just for the translation of the words and phrases
of texts, but also for memes and emojis, and even concepts and
practices. For example, the point at which “afternoon” starts is
culturally dependent. To an American it’s sometime after 12pm; for
others, such as Hebrew-speaking Shwartz, it’s around 4pm. Similarly,
whether tipping is polite is, again, culturally dependent. Shwartz
notes that when she asked ChatGPT about tipping, it had a North
American bias, which, at the time of writing this review in January
2026, has been corrected. ChatGPT now notes the differences in tipping
practices in North American, European and Asian countries. Even emoji
have different meanings depending on culture – the angel emoji in the
West is used to denote innocence whereas in China it is a symbol of
death. Memes too present a challenge for automatic translators. For
example, MiniGPT4, a vision and language model, misinterpreted the
“drowning kid in a pool” meme, describing it as “The meme poster is
trying to convey that the person in the image is having fun in the
pool…” (p. 135).
Part 3 “Cultural Integration Through Language” comments on
accentedness, especially foreign accentedness and automatic speech
recognition (ASR). ASR systems, like neural networks for text, are stronger
for English, especially American English and particularly for male
voices – a training resource bias. Throughout the book, Shwartz
reflects on her personal experience as a Hebrew speaker in America,
with people asking her where she’s from because of her accent; she
even writes about her experiences with an accent coach to make her
English sound more American. Many non-native speakers of any language
will be able to relate, not only to the human-to-human struggles but
also to the human-to-machine struggles. Shwartz also comments on the
language-specific fillers used in spoken language, noting that some
translators and bots have been programmed to sound more human by
inserting filler words and spoken discourse markers into their output
– getting into “uncanny valley” vibes for many of us.
Shwartz also discusses taboo words and topics that are not appropriate
for casual conversation and the fact that second language learners
have to learn the rules for their new culture. Bots too have required
adjustments to prevent the production of offensive, harmful and
illegal content. Shwartz notes that these filters sometimes go too far
because they are too superficial (and can sometimes be bypassed by
translating the prompt). For example, after asking ChatGPT if someone
could invite a Muslim friend over for ramen, she reports it “started
answering but then stopped and said something about hate speech” (p.
157) apparently triggered by the word “Muslim”. In another example,
when asked to produce images of German soldiers in 1943, Gemini
produced an image with a racially diverse group of soldiers – clearly
a result of a patch to correct a white bias in image generation.
Shwartz also discusses body language and tone of voice, noting that
these, too, are not universal. For example, it has been found that
speakers of American English rely on facial expressions more than
Mandarin Chinese speakers. There are also differences in use of eye
contact and hand gestures; some new technologies, including Zoom, are
working to incorporate information from facial expressions and tone of
voice into their systems.
EVALUATION
Lost in Automatic Translation is an enjoyable read – it’s been a while
since I laughed out loud reading a nonfiction book. Anyone who has
learned another language and experienced mistranslations from Google
Translate, their own mistranslations and their own misunderstandings
of linguistic input will be able to relate. Lost in Automatic
Translation will of course also appeal to those interested in
machine translation, and here is perhaps my only true critique of the
book: as a linguist with some knowledge of computational linguistics,
neural networks and LLMs, I would have liked more about these in this
volume, especially given Shwartz’s background as a computer scientist
and not a linguist. In Lost in Automatic Translation, she focuses more
on describing how languages work and how humans relate to language and
a little less on how machines compute our linguistic outpourings.
Shwartz’s focus is largely on text translators, with some comments on
vision and language models and automatic speech recognition apps in
later chapters. Some readers might find inspiration for further
research in language technologies, investigating the issues raised by
Shwartz or using her criteria to evaluate new and existing tools
(though such a task would require the extraction of criteria from the
book – there is no clear list).
In the presentation of these new technologies, there is also
discussion of potential problems such as bias (of all kinds) and
privacy issues, in addition to their tendency to hallucinate. Shwartz
notes too that there’s also the risk that use of these technologies
may erase the author’s voice or they may take away dialectal and
individual differences in writing. She also expresses concern that we
will adapt the way we interact with these apps and that this may
spread into interactions with people. Here I was less convinced that
these are serious issues – we humans have been style-shifting and
adapting to communicative situations since long before the advent of AI, and we
will likely just expand our skill set to include human-computer
interactions.
In her concluding chapter, Shwartz notes that she could have offloaded
the writing of the book onto AI, but she writes “I had my own ideas to
express – and I enjoy the process of writing” (p. 176); and with this
book not only does she convey the content of her message, but also
demonstrates the value of human-produced texts. She writes with a
voice all her own, sprinkled with anecdotes and cheeky asides that can
only come from a situated, lived experience. Lost in Automatic
Translation ends, as most such texts do, with cautious optimism – and
this time I’m actually convinced.
ABOUT THE REVIEWER
Dr Carrie Ankerstein is a Senior Lecturer in Applied Linguistics in
the English Department at Saarland University in Saarbrücken, Germany.
Her research interests include language processing in a first and
second language, academic writing in English as a second language and
AI in the language classroom.