[Ads-l] An initial assessment of ChatGPT's skills in historical linguistics and etymology

Geoffrey Nathan geoffnathan at WAYNE.EDU
Tue Feb 14 16:35:15 UTC 2023

Just for fun, I reran Daphne's test of ChatGPT with
the prompt:

'Explain the relationship between Grimm's law
and Verner's law.'

It got the relationship between the laws correct (i.e. it
mentioned accent as a crucial variable, and mentioned
exceptions to Grimm's law), but
it defined Grimm's Law as consisting only of the changes
to voiceless stops, omitting the voiced stops and voiced aspirates.
I would have given the answer a B- for that reason.


The Internet has been aflutter lately with people trying out ChatGPT, a
supposedly artificially intelligent chatbot which can solve various
tasks posed to it in natural language, answering likewise in fluent
English text. To do this, it has ‘read’ a vast amount of
literature on all subjects from many sources (OpenAI, the company who
made it, don’t say which ones) and combines its knowledge of
these sources to produce new texts. Its actual knowledge does not
go beyond the mere linguistic level: as some have described it, all it
is doing is ‘predicting the next word’ in its answer
using corpus statistics.
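
To make ‘predicting the next word using corpus statistics’ concrete,
here is a deliberately tiny sketch in Python. It only illustrates the
idea: ChatGPT is a far larger neural model, and the function names
(train_bigrams, predict_next) and the toy corpus are invented for this
illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word in the corpus, which words follow it."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = (
    "grimm's law describes a shift and verner's law describes an "
    "exception and grimm's law describes stops"
)
model = train_bigrams(corpus)
print(predict_next(model, "law"))
```

A model of this kind has no notion of whether its continuation is true;
it only knows which word tended to come next in the text it was given.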

It is an interesting experiment, but general consensus seems to be that
it cannot tell what’s true and what isn’t very well,
sometimes just making things up. It is quite good at some creative
tasks, including even writing computer programs, but its answers are
generally very mixed in quality. (Its poetry is about at the level of
William McGonagall.)

I decided to experiment to see how well it can answer questions about
historical linguistics and English etymology.

I started with what I would consider a moderate question for an
introductory historical linguistics exam, with potential for it to
really shine and give a detailed, advanced-level answer:

> Explain the interactions of Grimm's and Verner's laws in Indo-European
> linguistics, giving examples

ChatGPT’s response:

> Grimm's Law and Verner's Law are two sound laws that describe the
> changes in the consonants of the proto-Indo-European language as it
> evolved into various daughter languages. Grimm's Law explains the
> systematic shift of the proto-Indo-European voiceless stops to voiceless
> fricatives in certain environments, while Verner's Law explains the
> alternation between voiced and voiceless fricatives in certain other
>
> Grimm's Law describes how the proto-Indo-European voiceless stops *p,
> *t, and *k shifted to voiceless fricatives *f, *θ, and *x (or *h)
> in the Germanic languages. For example, the proto-Indo-European word
> *pénkʷe (meaning "five") became *fimf in Old High German,
> *fimf in Old English, and *fimm in Old Norse.
>
> Verner's Law explains why some voiceless fricatives became voiced in
> certain environments. Specifically, it describes how an originally
> voiced sound in the preceding syllable, specifically a voiced consonant
> or a vowel, prevented the devoicing of a voiceless fricative in the
> following syllable. For example, the proto-Indo-European word
> *bʰréh₂tēr (meaning "brother") became *brōþēr in Old English due to
> the voiced vowel in the preceding syllable. Similarly, the
> proto-Indo-European word *déḱm̥t (meaning "ten") became *tehun in Old
> High German due to the voiced consonant in the preceding syllable.

An inauspicious start, since, while it explained the relevant laws, it
did so badly (missing all but the (logically) final stage of Grimm's
law), provided a bad example of Grimm's law (I would grudgingly admit
that *pénkʷe >>> five is an example of Grimm's law, but there
are better ones, given the irregular *kʷ > f change), two IE to
Germanic examples which are not actually examples of Verner’s
law, and most crucially, failed to explain what I had asked it for: the
*interaction* between the two laws. So much for my hope that it might
give a truly excellent answer: it fails to give an answer I would hope
from a first semester student.

While disappointing, its example at least hints at where it went wrong:
I think the difference between the developments of *bʰréh₂tēr
and *ph₂tḗr is fairly standard as an example of
Verner's law. If I were to speculate, I’d say its ‘language
model’ saw *bʰréh₂tēr mentioned in a
discussion of Verner’s law but missed the crucial point that
it’s actually a *contrast* to a semantically and phonologically
similar word which underwent a different development, and that this is
the example which *doesn’t* show Verner’s law.
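
The brother/father contrast can be sketched as a toy derivation in
Python. This is a rough illustration under loud assumptions, not a
serious model: the inputs are simplified, traditional (pre-laryngeal)
spellings chosen for convenience, the accent is passed in by hand as the
index of the accented vowel, vowel changes are ignored, and the Verner
condition is reduced to "not word-initial, and the immediately preceding
vowel is unaccented". All names (GRIMM, VERNER, derive) are invented
here.

```python
# Toy model: Grimm's law followed by Verner's law.
# Assumptions: words are pre-segmented lists, `accent` is the index of
# the accented vowel, and vowel changes are not modelled at all.

VOWELS = {"a", "e", "i", "o", "u", "ə", "ā", "ē", "ī", "ō", "ū"}

# Grimm's law: all three stop series shift in one simultaneous step,
# so the output of one shift never feeds another.
GRIMM = {
    "p": "f", "t": "þ", "k": "x", "kʷ": "xʷ",      # voiceless stops
    "b": "p", "d": "t", "g": "k", "gʷ": "kʷ",      # voiced stops
    "bʰ": "b", "dʰ": "d", "gʰ": "g", "gʷʰ": "gʷ",  # voiced aspirates
}

# Verner's law: voiceless fricatives become voiced (written b, d, g, z,
# following the usual Proto-Germanic notation).
VERNER = {"f": "b", "þ": "d", "x": "g", "xʷ": "gʷ", "s": "z"}


def grimm(segments):
    """Apply the Grimm shifts to each segment exactly once."""
    return [GRIMM.get(seg, seg) for seg in segments]


def verner(segments, accent):
    """Voice a fricative unless it is initial or follows the accent."""
    out = []
    for i, seg in enumerate(segments):
        # Index of the nearest vowel to the left, if there is one.
        prev_vowel = next(
            (j for j in range(i - 1, -1, -1) if segments[j] in VOWELS),
            None,
        )
        if seg in VERNER and prev_vowel is not None and prev_vowel != accent:
            out.append(VERNER[seg])
        else:
            out.append(seg)
    return out


def derive(segments, accent):
    """Run Grimm's law, then Verner's law, and join the result."""
    return "".join(verner(grimm(segments), accent))


# Accent on the vowel before *t: the fricative stays voiceless.
brother = derive(["bʰ", "r", "ā", "t", "ē", "r"], accent=2)
# Accent on the vowel after *t: the fricative is voiced.
father = derive(["p", "ə", "t", "ē", "r"], accent=3)
print(brother, father)
```

With these inputs the sketch yields brāþēr and fədēr, which differ from
the usual Proto-Germanic reconstructions *brōþēr and *fadēr only in the
vowels it leaves untouched. The medial consonants differ solely because
of where the accent fell, which is the interaction the question asked
about.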

I decided to give it a little help along:

> In what environments does Verner's law take place?

Its answer:

> Verner's Law is a sound law that describes a specific pattern of sound
> change in the Germanic languages. It explains why certain voiceless
> fricatives in Germanic languages became voiced in certain environments.
> Specifically, Verner's Law applies when a voiceless fricative (such as
> *f, *þ, *s, or *h) appears in the onset of a stressed syllable in a
> word, and the preceding syllable in the word was originally unstressed
> and had an unstressed vowel, a laryngeal consonant (*h1, *h2, or *h3),
> or a voiced obstruent (a stop or a fricative).
>
> In these specific environments, Verner's Law explains that the
> voiceless fricative would become voiced. For example, the proto-Germanic
> word *frijaz (meaning "free") had a voiceless fricative in the onset of
> the stressed syllable, but because the preceding syllable contained an
> unstressed vowel, it underwent Verner's Law and the fricative became
> voiced, resulting in the Old English word frēo (meaning "free").

This time, it gave a correct explanation of the law but added an
extra, and (as far as I can tell) meaningless, condition. Its
example, however, once again fails to illustrate Verner's law. It
doesn’t even give the PIE root. Indeed, it’s only by
interpreting the answer in the context of its previous answer that we
can glean that its talk of stressed and unstressed syllables refers to
the original situation in PIE and not in Proto-Germanic.

I tried something slightly more obscure:

> Explain Pinault's law in Proto-Indo-European

This time, it came up with:

> "Pinault's Law" is a proposed sound change in Proto-Indo-European
> (PIE) linguistics that suggests a regular correspondence between PIE *s
> and Proto-Indo-Iranian *h. Specifically, the law suggests that in
> certain phonological environments, PIE *s regularly shifted to
> Proto-Indo-Iranian *h.
>
> According to Pinault's Law, PIE *s would have shifted to *h in
> Proto-Indo-Iranian in the following environments:
>     • Before voiced stops and voiced aspirates
>     • Before a syllabic resonant (i.e. *r̥, *l̥, *m̥, *n̥)
>     • Before a laryngeal consonant (*h1, *h2, *h3)
> For example, the PIE word *k̑sépṓs ("vine") would be
> expected to become *k̑hépṓs in Proto-Indo-Iranian
> under Pinault's Law, due to the voicing of the following stop *p.
>
> While Pinault's Law has not yet been widely accepted as a sound law in
> PIE linguistics, it remains a topic of ongoing research and discussion.

As far as I can tell, this is completely made up. Its example (again)
doesn’t even illustrate the law it made up/attempted to
describe, and I think is an entirely invented PIE word. But the nature
of the thing is that it might have picked up some obscure paper from
goodness-knows-when and found this PIE reconstruction there. You can
certainly see that it’s probably synthesized some information
about Indo-Iranian sound laws to produce this, possibly indeed from
papers by or citing Pinault.

Satisfied that the jobs of Indo-European scholars are safe from AI
takeover, at least for the immediately foreseeable future, I turned my
attention to questions of English etymology. ChatGPT is known to be at
least somewhat vulnerable to repeating urban legends and myths. I asked

> Explain the origin of the English word Easter

wondering if it would give me the popular story about the Babylonian
goddess Ishtar. It came up with:

> The origin of the English word "Easter" is somewhat uncertain, but it
> is generally believed to come from the Old English word "Ēastre" or
> "Ēostre," which was the name of a pagan festival celebrating the
> spring equinox. This festival was dedicated to the goddess Ēostre,
> who was associated with fertility, new beginnings, and the dawn.
>
> When Christianity began to spread throughout England, missionaries
> sought to incorporate Christian beliefs and practices into the existing
> culture. One strategy was to appropriate pagan festivals and rebrand
> them with Christian meaning. As a result, the festival of Ēostre
> was transformed into a Christian holiday celebrating the resurrection of
> Jesus Christ.
>
> The Christian holiday became known as "Easter" in English, possibly
> borrowing the name from the earlier pagan festival. Another theory
> suggests that the English word "Easter" may have been influenced by the
> Latin word "alba," which means "white," since newly baptized Christians
> would wear white robes during the Easter season.
>
> Overall, while the exact origins of the English word "Easter" are not
> entirely clear, it is clear that the holiday has evolved over time to
> become an important Christian celebration of resurrection and renewal.

Its level of hedging could be considered a parody of an English
etymologist (it claims the association between the pagan Easter festival
and the Christianized Easter festival is certain, but then hedges and
says the name itself is only ‘possibly’ borrowed from
it). I’ve never heard the theory about alba before;
sound-wise it’s obvious nonsense, but perhaps it means some
semantic connection? I’m not sure. Connections between white and
Christian Easter are linguistically fairly common.

I challenged it again along similar lines:

> What's the origin of the English word posh?

Its answer:

> The origin of the English word "posh" is uncertain, but there are a
> few theories about its origins. One theory suggests that "posh" may be
> an acronym for "Port Out, Starboard Home," which refers to the more
> desirable cabins on ships traveling between England and India during the
> 19th century. Passengers who could afford to travel in these cabins
> would avoid the sun's heat by having their cabins on the shaded side of
> the ship during both the outward and return voyages.
>
> However, this theory is not widely accepted by linguists and
> etymologists. Another theory is that "posh" is derived from the Romani
> word "posh-ken," which means "half" or "a portion," and was used by
> Romani people to describe their living conditions. From this, the word
> "posh" may have come to be associated with luxury and wealth.
>
> However, there is little concrete evidence to support this theory,
> either. It is also possible that "posh" simply evolved from a slang word
> meaning "dandy" or "foppish," which was used in England in the early
> 19th century to describe a fashionable or stylish person. Over time, the
> meaning of "posh" may have shifted to refer more broadly to any person
> or thing that is considered luxurious or high-class.

My response to this would be along much more nit-picking lines than any
previous answer it gave. It fails to say up-front and explicitly why the
‘Port Out, Starboard Home’ theory is not widely
accepted, although it later implies it with its ‘little concrete
evidence … either’. I think a responsible etymologist
would also have guarded against lazy readers by saying the theory is
wrong *before* actually describing it.

As one final challenge, I gave it a problem combining Indo-European and
English etymology, again hoping to catch it out:

> What is the Indo-European origin of the English word bone?

Its answer:

> The English word "bone" is believed to have originated from the
> Proto-Indo-European root *os-/*ost- (meaning "bone"), which is also the
> source of many other words in various Indo-European languages.
>
> In Sanskrit, the word for "bone" is "asthi," which can be traced back
> to the same Proto-Indo-European root. In Greek, the word for "bone" is
> "osteon," which is derived from the same root as well. Similarly, in
> Latin, the word for "bone" is "os," also derived from the same
> Proto-Indo-European root.
>
> The Proto-Indo-European root *os-/*ost- likely referred specifically
> to "bone" as opposed to "flesh" or other parts of the body. This is
> reflected in the various daughter languages, where the same root is used
> to refer to bone, but different roots are used to refer to flesh or
>
> Overall, the English word "bone" can be traced back to a common
> Proto-Indo-European root that has cognates in many other Indo-European

Indeed, I did catch it out, but not in the way I expected. It found a
PIE root for bone, but one that has no connection to the English word.
It also gave a pre-laryngeal reconstruction this time.

In summary, ChatGPT seems from this informal and limited test to be
quite good at avoiding uncritically repeating etymological myths, but
its actual skills at resolving etymologies and even explaining basic
facts of historical linguistics are limited. While other fields are
rightly concerned about students using GPT models to write essays for
them,* I can say with moderate confidence that historical linguists can
probably thank the obscurity of their subject for the fact that ChatGPT
is not even yet at the point of being able to write a satisfactory
homework answer on this subject. Perhaps the model will evolve in
future, of course – this has also been the source of much public

Apart from the limited number of questions I asked, one additional point
which limits the validity of this investigation is that I’m not
(yet) very experienced in using ChatGPT. I know that regular users claim
to have learned to formulate their questions such that they get
higher-quality answers.

* I’m not sure how valid this fear is as applied to other
subjects either, at least at the moment. When I first tried it out, a
month or so ago, I asked it to explain some basic facts about the
Crimean War to me, a subject considerably less obscure in the scheme of
general knowledge than Indo-European sound laws. Its answer (which I
unfortunately didn’t save) was full of basic factual errors.

