[Lingtyp] "AI" and linguistics problem sets

Tue Nov 4 23:23:35 UTC 2025

Dear Listmembers,

I trust that most lingtyp subscribers will have engaged with “problem sets” of the type found in Language Files, Describing Morphosyntax, and my personal favourite oldie-but-goodie, the Source Book for Linguistics. Since the advent of ChatGPT, I’ve been migrating away from these (and even from edited/obscured versions of them) for assessments, relying more and more on private/unpublished data sets, mostly from languages with lots of complex morphology and less familiar category types, which LLMs seemed to have a much harder time with. This was not an ideal situation for many reasons, not least that these are not the only types of languages students should get practice working with. But the problem really came to a head this year, when I found that perhaps most off-the-shelf LLMs were now able to solve almost all of my go-to problem sets to at least a reasonable degree, even after I had obscured much of the data.

Leaving aside issues around how LLMs work, what role(s) they can or should (not) play in linguistic research, etc., I’d like to ask if any listmembers would be willing to share their experiences, advice, etc., specifically in the area of student assessment in the teaching of linguistic data analysis, and in particular morphosyntax, in the unfolding AI-saturated environment. Is the “problem set” method of teaching distributional analysis irretrievably lost? Can it still be employed, and if so how? Are there different/better ways of teaching more or less the same skills?

Note that I would really like to avoid doomsdayisms here if possible (“the skills traditionally taught to linguists have already been made obsolete by AIs, such that there’s no point in teaching them anymore” - an argument with which I am all too familiar). I would like to focus instead on how to assess/evaluate students’ performance, under the assumption that there is at least some value in teaching at least some human beings how to do a distributional analysis “by hand”, such that they are actually able, for example, to evaluate a machine’s performance in analysing a new/unfamiliar data set. I am further assuming that assessment/evaluation of student performance will, in at least many institutions, continue to follow existing models.

Many thanks in advance!
Mark
