[Lingtyp] "AI" and linguistics problem sets
Alexander Coupe
ARCoupe at ntu.edu.sg
Fri Nov 7 03:25:22 UTC 2025
Dear Mark and Juergen,
A while ago when I was teaching an undergraduate morphology & syntax course I had the same concerns about students relying on AI to solve problem sets, so I tested ChatGPT (probably v. 3.5) on some fairly obscure data prior to setting assignments. The first task was a grammatical sketch based on ~two dozen sentences in Nagamese with English translations. While it did quite well with identifying word classes, tense marking, and other details of morphology, it struggled to make sense of the postpositional case markers (I had included example sentences of differential marking of P arguments in the data set). Nevertheless, it would have gotten through with a pass. I then tested it on some Dyirbal data with sentences demonstrating the split alignment system in the case marking/pronominals. This time it did extremely poorly and would have earned an F for its attempt. Naturally I shared the findings with my students 😊
This suggests that if there is language data available that an LLM can access and learn from, then it is risky to use a data set of that or a typologically similar language for assessment. At the stage of ChatGPT 3.5 it seemed that it hadn’t had much exposure to head-final languages, and that may explain its inability to identify postpositional case markers. But this may change in the future, and its performance might have already improved vastly.
Alec
--
Assoc. Prof. Alexander R. Coupe, Ph.D. | Associate Chair (Research) | School of Humanities | Nanyang Technological University
48 Nanyang Avenue, SHHK-03-84D, Singapore 639818
Tel: +65 6904 2072 GMT+8h | Email: arcoupe at ntu.edu.sg
Academia.edu: https://nanyang.academia.edu/AlexanderCoupe
ORCID ID: https://orcid.org/0000-0003-1979-2370
Webpage: https://blogs.ntu.edu.sg/arcoupe/
From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> on behalf of "lingtyp at listserv.linguistlist.org" <lingtyp at listserv.linguistlist.org>
Reply to: Juergen Bohnemeyer <jb77 at buffalo.edu>
Date: Friday, 7 November 2025 at 1:09 AM
To: Mark Post <mark.post at sydney.edu.au>, "lingtyp at listserv.linguistlist.org" <lingtyp at listserv.linguistlist.org>
Subject: Re: [Lingtyp] "AI" and linguistics problem sets
Dear Mark — I’m actually surprised to hear that an AI bot is able to adequately solve your problem sets. My assumption, based on my own very limited experience with ChatGPT, has been that LLMs would perform so poorly at linguistic analysis that the results would dissuade students from trying again in the future. Would it be possible at all to share more details with us?
(One recommendation I have, which I however haven’t actually tried out, is to put a watermark of sorts in your assignments, in the form of a factual detail about some lesser-studied language. Even though such engines are of course quite capable of information retrieval, their very nature seems to predispose them toward predicting the answer rather than looking it up, with the results likely being straightforwardly false.)
Best — Juergen
Juergen Bohnemeyer (He/Him)
Professor, Department of Linguistics
University at Buffalo
Office: 642 Baldy Hall, UB North Campus
Mailing address: 609 Baldy Hall, Buffalo, NY 14260
Phone: (716) 645 0127
Fax: (716) 645 3825
Email: jb77 at buffalo.edu
Web: http://www.acsu.buffalo.edu/~jb77/
Office hours Tu/Th 3:30-4:30pm in 642 Baldy or via Zoom (Meeting ID 585 520 2411; Passcode Hoorheh)
There’s A Crack In Everything - That’s How The Light Gets In
(Leonard Cohen)
--
From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> on behalf of Mark Post via Lingtyp <lingtyp at listserv.linguistlist.org>
Date: Tuesday, November 4, 2025 at 18:27
To: typology list <lingtyp at listserv.linguistlist.org>
Subject: [Lingtyp] "AI" and linguistics problem sets
Dear Listmembers,
I trust that most lingtyp subscribers will have engaged with “problem sets” of the type found in Language Files, Describing Morphosyntax, and my personal favourite oldie-but-goodie, the Source Book for Linguistics. Since the advent of ChatGPT, I’ve been migrating away from these (and even edited/obscured versions of them) for assessments, and relying more and more on private/unpublished data sets, mostly from languages with lots of complex morphology and less familiar category types, that LLMs seemed to have a much harder time with. This was not an ideal situation for many reasons, not least of which is that these were not the only types of languages students should get practice working with. But the problem really came to a head this year, when I found that perhaps most off-the-shelf LLMs were now able to solve almost all of my go-to problem sets to an at least reasonable degree, even after I obscured much of the data.
Leaving aside issues around how LLMs work, what role(s) they can or should (not) play in linguistic research, etc., I’d like to ask if any listmembers would be willing to share their experiences, advice, etc., specifically in the area of student assessment in the teaching of linguistic data analysis, and in particular morphosyntax, in the unfolding AI-saturated environment. Is the “problem set” method of teaching distributional analysis irretrievably lost? Can it still be employed, and if so how? Are there different/better ways of teaching more or less the same skills?
Note that I would really like to avoid doomsdayisms if possible here (“the skills traditionally taught to linguists have already been made obsolete by AIs, such that there’s no point in teaching them anymore” - an argument with which I am all-too-familiar), and focus, if possible, on how it is possible to assess/evaluate students’ performance under the assumption that there is at least some value in teaching at least some human beings how to do a distributional analysis “by hand” - such that they are actually able, for example, to evaluate a machine’s performance in analysing a new/unfamiliar data set, and under the further assumption that assessment/evaluation of student performance in at least many institutions will continue to follow existing models.
Many thanks in advance!
Mark