[Lingtyp] Save the Dates! WS Series "Linguistics Meets ChatGPT: From Prompt to Theory"

Wed Nov 5 10:19:02 UTC 2025

Dear colleagues,

I appreciate the exchange of views. The Linguistics Meets ChatGPT series is not intended to “legitimize” any company or model — though different models use different algorithms, and consistency matters — but to enable linguists to examine LLMs critically, from both linguistic and mathematical perspectives.

My background combines general linguistics (theory of grammar and Slavic linguistics), mathematics, and computer programming, and the series is open to all who wish to engage in scientific discussion about LLMs and their relation to language, regardless of institutional affiliation or theoretical persuasion. Participation is, of course, entirely voluntary. Our aim is to foster informed, respectful dialogue about what LLMs reveal — and do not reveal — about language.

I focus on ChatGPT because its tokenizer and published lists of tokens allow linguists to study the model’s internal representation of form directly — a crucial step toward understanding how LLMs process language.

Regarding the “open data” argument, I would note that linguistic transparency is not straightforward in this context: LLMs do not manipulate words or linguistic units but numerical representations. It is therefore unclear what a linguist could gain from raw training data.

Finally, I believe it is time for linguists to move beyond the “stochastic parrot” metaphor. A possible way to test this claim is to ask several LLMs themselves whether they identify as stochastic parrots — and then to ask why. Their answers are remarkably revealing about how information and argumentation are represented in these systems, and about the linguistic biases (including English dominance) that shape both AI models and our own field.

With best regards,

Stela Manova
CEO & Research Lead, MANOVA AI
PI, Gauss:AI Global

> On 05.11.2025, at 00:06, Fabio Meroni via Lingtyp <lingtyp at listserv.linguistlist.org> wrote:
> 
> Dear all,
> 
> I’m glad to see the conversation turning toward what really matters here: linguistics itself. I'm sure the community of this mailing list won't mind if I take the occasion to underline, with the aid of some sources, how this must be of interest to all of us in this particular moment in time.
> 
> Emily, we must thank you in particular for your contributions! Even long before LLMs were a thing, Emily was already warning about the dangers of the opacity of the computational aspects of linguistics: I remember in particular how in Bender (2009) <https://aclanthology.org/W09-0106/> she powerfully challenged the idea that statistical NLP inherently produces universal systems, and also how in Bender (2011) <https://journals.colorado.edu/index.php/lilt/article/view/1239/1077> she argued that anyone analysing language scientifically is by definition a linguist and should be equipped to engage with linguistic complexity. In the same collection of papers as the latter I found another inspiring perspective by Eva Hajičová: "Computational linguists are sometimes regarded as sitting on two chairs: linguists just say “we do not understand”, and therefore they would like to look at computational linguists from a distance and not integrate them into their (i.e., linguistic) domain, and computer scientists tend to say the same from their perspective and behave in the same way. Also institutionally, some Computational Linguistics (CL) institutes, departments, or teams are housed in Arts Faculties, and some are affiliated with Computer Science" (Hajičová, 2011 <https://journals.colorado.edu/index.php/lilt/article/view/1247/1129>, p. 22).
> 
> Opitz, Wein and Schneider (2024) <https://arxiv.org/pdf/2405.05966> is a recent paper I appreciated. Not only does it discuss the difference between what seem to be the intended definitions of NLP and CL, it also shows how CL is sometimes defined broadly, as a synonym of NLP, and sometimes more narrowly, as a field that “focuses on computational formalisation and processing for the end goal of studying how language works”. They adopt “cL” for the latter, emphasizing the prevalence of linguistic theory in it. They use the acronym RELIES (Resources, Evaluation, Low-resource settings, Interpretability, Explanation, and the Study of language) to advocate for their cause, summarised in their title “Natural Language Processing RELIES on Linguistics”.
> 
> Beyond the highlights I have chosen to mention here, there is a rich florilegium of contributions on these issues. My own voice came last, with a position paper (Meroni, 2025 <https://www.scitepress.org/Papers/2025/132591/132591.pdf>) arguing that, as long as the goal remains the modelling and processing of human language, linguistics is not merely helpful but indispensable. This belief in linguistically grounded computational applications to our science is what makes me a member of the NooJ community, and what makes me define myself as someone who uses computers as long as they help with the work of a linguist without replacing the linguist's competence.
> 
> If anyone is interested in papers that concretely explore how LLMs systematically disadvantage speakers of low-resource languages and propose ways to bridge this divide, I would also recommend the white paper published by Stanford University, the Asia Foundation and the University of Pretoria (Pava et al., 2025 <https://hai.stanford.edu/assets/files/hai-taf-pretoria-white-paper-mind-the-language-gap.pdf>) and the position paper by the Indigenous Protocol and Artificial Intelligence Working Group (Lewis et al., 2020 <https://spectrum.library.concordia.ca/id/eprint/986506/7/Indigenous_Protocol_and_AI_2020.pdf>).
> 
> It’s encouraging that we’re engaging critically with these questions, and I look forward to continuing the discussion about how best to study language (and language models) in ways that strengthen our field.
> 
> Best,
> 
> Fabio
> From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> on behalf of Slavomír Čéplö via Lingtyp <lingtyp at listserv.linguistlist.org>
> Sent: Tuesday, 4 November 2025 22:38
> To: Emily M. Bender <ebender at uw.edu>
> Cc: Lingtyp Linguistics Typology <lingtyp at listserv.linguistlist.org>
> Subject: Re: [Lingtyp] Save the Dates! WS Series "Linguistics Meets ChatGPT: From Prompt to Theory"
>  
> Dear all,
> 
> I second everything Emily has said here.
> AI grift should be resisted and shamed wherever one encounters it.
> 
> Best,
> 
> Slavomír
> 
> On Tue, Nov 4, 2025 at 4:16 PM Emily M. Bender via Lingtyp
> <lingtyp at listserv.linguistlist.org> wrote:
> >
> > Dear all,
> >
> > In case anyone missed it: this "workshop series" is put on by a for-profit company, and is not an academic exercise. Participation in this series of events would only serve to legitimate this for-profit company, which positions itself as "a research and consulting company". You have better things to do with your time --- and the field of linguistics surely has enough problems with our empirical foundations without turning to synthetic text extruding machines. (If you are interested in studying LLMs scientifically, then at a bare minimum you have to be working with fully open systems where the training algorithms and training data are all open for inspection.)
> >
> > Emily
> >
> > On Tue, Nov 4, 2025 at 12:11 PM Stela MANOVA via Lingtyp <lingtyp at listserv.linguistlist.org> wrote:
> >>
> >> Dear colleagues,
> >>
> >> The LingTransformer of Gauss:AI Global is happy to announce the workshop series
> >>
> >> Linguistics Meets ChatGPT: From Prompt to Theory
> >>
> >> Website: https://gaussaiglobal.com/LingTransformer
> >> The announcement is also available at: https://www.manova-ai.com/workshops
> >> (Gauss:AI Global is a subsidiary of MANOVA AI.)
> >>
> >> Format: Hybrid (onsite in Vienna, Austria, and online)
> >>
> >> Frequency: Regular series – 8 thematic workshops in 4 cumulative blocks (March–June 2026) and a final conference in October 2026.
> >>
> >> Target group: Linguists, language researchers, and educators interested in understanding and using Large Language Models (LLMs) such as ChatGPT for linguistic inquiry. No preliminary knowledge of computer science is required.
> >>
> >> Organizer: Dr. Stela Manova, CEO & Research Lead, MANOVA AI / PI, Gauss:AI Global. Personal homepage: https://www.stelamanova.com (with a link to ChatGPT papers in the announcement banner)
> >>
> >> ________________________________
> >>
> >> Overview
> >>
> >> The series Linguistics Meets ChatGPT explores how contemporary Large Language Models (LLMs) transform the possibilities and boundaries of linguistic research.
> >>
> >> By engaging with ChatGPT both as a research tool and a research object, the series promotes a new kind of linguistic literacy — one that bridges formal linguistic theory, empirical data handling, and AI-based modeling. The knowledge gained during the workshops can be applied in linguistics and across disciplines in academia and outside it, providing job-relevant skills for linguists and researchers facing limited academic opportunities.
> >>
> >> ________________________________
> >>
> >> 📢 Save the Dates!
> >>
> >> WS Block 1: Mar 23–24, 2026
> >>
> >> WS 1. What Type of Research Can a Linguist Do with ChatGPT?
> >> Linguistic units (phoneme, morpheme, etc.) vs. computational units (bit, byte, token, etc.); what happens to linguistic categories in a subword-based system such as ChatGPT, which operates without words; can subword-based systems serve as a linguistic corpus and assist corpus annotation?
> >>
> >> WS 2. Grammar Without Grammar: How ChatGPT Handles Syntax and Morphology
> >> Are grammatical regularities emergent or encoded? Where does the grammatical knowledge of ChatGPT come from? When and how do words enter the model?
> >>
> >> WS Block 2: Apr 27–28, 2026
> >>
> >> WS 3. Prompting as Experimental Method in Linguistics
> >> How can prompts function as elicitation tools and operationalized hypotheses? Can prompting manipulate the results of linguistic research? Do linguists need a prompt documentation database?
> >>
> >> WS 4. Meaning, Semantics, and Hallucination
> >> How does ChatGPT handle meaning, if there is no explicit encoding of meaning and no embodied cognition? Why can it compare the meaning of sentences? What do AI “hallucinations” reveal about semantic competence, truth conditions, and inference?
> >>
> >> WS Block 3: May 21–22, 2026
> >>
> >> WS 5. Cross-Linguistic Prompting and Multilingual Modeling
> >> Prompt translation versus language-particular prompting? How does ChatGPT represent languages if there are no typological parameters?
> >>
> >> WS 6. Sociolinguistics and Style in the Machine
> >> Can ChatGPT model register, politeness, or identity? Should we be polite with ChatGPT: Do the outcomes of polite and impolite prompting differ?
> >>
> >> WS Block 4: Jun 22–23, 2026
> >>
> >> WS 7. Experimental Design and Data Collection with ChatGPT
> >> How to integrate LLMs responsibly into linguistic research workflows. Documentation and citation of prompts and LLMs’ assistance.
> >>
> >> WS 8. Toward a Theory of AI Language
> >> What does ChatGPT teach us about the nature of “language” itself? Do we need a theory of AI language?
> >>
> >> Conference: Oct 22–23, 2026
> >>
> >> Linguistics Meets ChatGPT: From Prompt to Theory
> >> Participants (and invited speakers) present papers inspired by the workshop series, demonstrating how the discussions and experiments have influenced their research.
> >>
> >> ________________________________
> >>
> >> Structure
> >>
> >> Each workshop includes three parts:
> >> 1.  Introductory Lecture by the organizer (≈ 50 min + 10 min Q&A)
> >> 2.  Hands-on Session (≈ 45 min + 15 min break)
> >> 3.  Participant 5-minute Talks (after abstract selection) & Discussion (≈ 60 min)
> >>
> >> Additional discussions and consultations with the lecturer will be offered — in person for onsite participants and online (via Discourse) for both onsite and online participants.
> >>
> >> A relevant lexicon (list of mathematical and computer science terminology with accessible explanations) will be made available before every workshop.
> >>
> >> ________________________________
> >>
> >>
> >> Requirements
> >>
> >>
> >> A computer (a tablet is less appropriate, a phone is insufficient).
> >>
> >> You do NOT need a paid ChatGPT subscription for workshop participation.
> >>
> >>
> >> ________________________________
> >>
> >> Aims and Learning Outcomes
> >>
> >> Participants will:
> >>
> >> Understand how LLMs like ChatGPT structure, generate, and manipulate linguistic data.
> >>
> >> Learn to design reproducible linguistic experiments using AI-based and prompt-based methods.
> >> Critically evaluate the limits of LLM-based data for linguistic analysis.
> >>
> >> Develop interdisciplinary literacy connecting linguistics, AI, and data science — skills applicable beyond linguistics.
> >>
> >> ________________________________
> >>
> >> Upcoming Workshops (Block 1)
> >>
> >> What Type of Research Can a Linguist Do with ChatGPT?
> >>
> >> Grammar Without Grammar: How ChatGPT Handles Syntax and Morphology
> >>
> >> Details, including registration fees, will be announced soon.
> >>
> >> ________________________________
> >>
> >> Support the Series
> >>
> >> If you think that linguistics needs events like this, please consider a donation.
> >>
> >> You can donate via Stripe or by bank transfer (IBAN).
> >> For clarification of conditions (anonymous, named + inclusion in our list of supporters), contact: office at manova-ai.com.
> >> PayPal donations are possible on request.
> >>
> >> Universities and research institutions may also book the series in full or as individual workshops — please use the same contact email.
> >>
> >>
> >> With best regards,
> >> Dr. Stela Manova
> >> CEO & Research Lead, MANOVA AI
> >> PI, Gauss:AI Global
> >>
> >> Email: manova at manova-ai.com
> >>        manova at gaussaiglobal.com
> >>
> >> ------------------------------------------------------------
> >> www.manova-ai.com | www.gaussaiglobal.com | www.manova-ai.eu
> >>
> >> _______________________________________________
> >> Lingtyp mailing list
> >> Lingtyp at listserv.linguistlist.org
> >> https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/lingtyp
> >
