[Lingtyp] Linguistic Typology with LLMs: A Large-Scale Community Project (CFP)
Stela MANOVA
manova.stela at gmail.com
Fri Feb 6 18:30:05 UTC 2026
Dear colleagues,
In November 2025, in several LingTyp posts related to large language models (LLMs), I demonstrated how a math-gifted person approaches linguistic problems and explained the basic logic of LLMs. We also searched together for real-world activities that separate form from meaning, which is the case in LLMs. Some LingTyp readers wrote to me that they found this line of inquiry interesting but could not see how LLMs relate to linguistic typology. To make this relationship explicit, I would like to initiate a new large-scale scientific experiment with your help. To ensure that this project is robust and impactful, I invite your active participation and expertise.
Description of the Experiment
The goal is to task an LLM with providing typological analyses of various language samples covering “neutral” (non-linguistic) topics. We will use:
an English text;
a text from another well-studied language (e.g., Russian);
a sample from a less-studied language with existing grammars and glossed papers (pre-dating the LLM’s last training cutoff);
a sample from a “low-resource” language with only a single grammar and minimal web presence.
We will analyze these data using four distinct approaches:
LLM-style Deep-Network Analysis
An analysis generated by the LLM relying solely on its internal mechanisms—specifically, learned distributional regularities over linear sequences of tokens.
Descriptive Typological Analysis (WALS-style + Leipzig Glossing Rules)
Since modern LLMs (e.g., ChatGPT, Gemini) are trained on vast amounts of linguistic text (including analyses and corpora) and are often fine-tuned for linguistic tasks, they can typically identify languages and apply discipline-specific conventions. We will test their ability to gloss raw data and to provide WALS-style analyses in varying orders (i.e., glossing first and analysis second, and vice versa) to check for consistency.
Framework-Compatible Analysis
An analysis generated by the LLM within a framework compatible with descriptive typology (e.g., Dependency Grammar or any other well-represented framework online).
Framework-Incompatible Analysis
An analysis generated by the LLM within a framework not entirely compatible with descriptive typology (e.g., the Minimalist Program, as used in my pilot study).
Comparison and Conclusions
Finally, we will ask the LLM to compare these four analyses—both language-specifically and cross-linguistically—in order to draw broader typological conclusions.
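For participants who prefer to prepare their runs programmatically, the design above (four language samples crossed with four analysis conditions, plus the two task orders in the descriptive condition) can be sketched as a small prompt-generation script. Everything in this sketch is illustrative: the sample labels, the prompt wording, and the function names are my own placeholders, not part of the finalized methodology.

```python
# Illustrative labels for the four language samples (actual texts to be decided)
SAMPLES = ["english", "russian", "documented_low_profile", "single_grammar"]

# The four analysis conditions, with placeholder instructions
APPROACHES = {
    "deep_network": "Analyze the following text using only your internal "
                    "distributional knowledge.",
    "descriptive": "Gloss the text following the Leipzig Glossing Rules and "
                   "give a WALS-style typological analysis.",
    "framework_compatible": "Analyze the text within Dependency Grammar.",
    "framework_incompatible": "Analyze the text within the Minimalist Program.",
}

def build_prompts(text: str) -> list[dict]:
    """Return one prompt record per condition for a single text.

    The descriptive condition is generated in both task orders so that
    consistency between 'gloss-then-analyze' and 'analyze-then-gloss'
    can be checked."""
    prompts = []
    for name, instruction in APPROACHES.items():
        if name == "descriptive":
            for order in ("gloss_first", "analysis_first"):
                prompts.append({"condition": name, "order": order,
                                "prompt": f"[{order}] {instruction}\n\n{text}"})
        else:
            prompts.append({"condition": name, "order": None,
                            "prompt": f"{instruction}\n\n{text}"})
    return prompts

def experiment_grid(texts: dict[str, str]) -> list[dict]:
    """Cross the samples with the conditions: 4 samples x 5 prompt variants."""
    grid = []
    for sample, text in texts.items():
        for record in build_prompts(text):
            grid.append({"sample": sample, **record})
    return grid
```

With four texts supplied, the grid yields twenty prompts per model (four samples times five prompt variants), which keeps the manual evaluation workload per participant concrete and countable.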
How You Can Help
To ensure that the experiment is scientifically sound, I am seeking assistance with:
Data Selection: Identifying appropriate text samples.
Methodological Rigor: Formulating procedures that prevent the model from simply reproducing existing online analyses.
Expert Evaluation: Since LLMs are prone to “hallucinations” or errors, independent expert verification of the outputs is crucial.
Participation and Transparency
Once the data sets and the methodology are finalized, I will run the experiment and share the full dialogues with the LLM. However, to ensure replicability, I suggest that each participant also run the experiment independently. I will provide technical guidance so that this can be done at no cost and in comparable AI environments. Collaborative execution will allow us to assess the impact of prompt engineering and to determine whether the typological community needs a shared prompt repository.
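Should the community opt for a shared prompt repository, a minimal machine-readable record per dialogue would make the shared runs comparable. The following schema is only a suggestion of mine; the field names are assumptions, not an agreed standard.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class DialogueRecord:
    """One logged LLM dialogue (illustrative schema; field names are assumptions)."""
    model: str      # e.g. "ChatGPT" or "Gemini"
    sample: str     # which of the four language samples was used
    condition: str  # which of the four analysis approaches was applied
    prompt: str     # the exact prompt sent, verbatim
    response: str   # the full model output, unedited
    date: str       # ISO date of the run, for reproducibility

def to_jsonl(records: list[DialogueRecord]) -> str:
    """Serialize records as JSON Lines, one dialogue per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)
```

A plain-text format like JSON Lines would let participants share full dialogues by email or in a repository without any special tooling.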
By the conclusion of this project, participants should be able to:
conduct methodologically controlled linguistic analyses using LLMs;
ethically reduce research workloads using AI tools;
confidently integrate LLMs into linguistics pedagogy.
Proposed Schedule (2026)
Data Selection & Methodology: End of March
Experiments & Discussion of Prompts / Individual Results: End of April
Final Results & Takeaways: End of June
Preliminary Work
I have already conducted a pilot study using an English text (approx. 1,400 words) generated by Perplexity (to ensure that it was “unseen” by other models) and tested the four types of analysis. In addition, I used Gemini to produce Quechua sentences and asked ChatGPT to gloss them according to the Leipzig Glossing Rules. I then gave the same glossing task to Gemini. Interestingly, ChatGPT correctly identified the language, including the specific Quechua variety, and provided a similar, but not identical, analysis to Gemini’s. The observed differences were minor and occurred primarily in the translations, without affecting the underlying structural analysis. This suggests that the results were not simply copied from a web source but were produced by the models themselves.
Finally, the strength of this experiment depends on expert deliberation. Although an advanced LLM provides near-instantaneous processing, the substantial work of data selection and qualitative verification requires the time and insight of trained typologists.
I look forward to your suggestions for language samples and comments on the proposed methodology, as well as indications of who would be willing to participate in the evaluation phase.
Kind regards,
Stela
—
Dr. Stela MANOVA
Chief Executive Officer (CEO), MANOVA AI
Principal Investigator (PI), Gauss:AI Global
Registration number 655453b
Sterngasse 3/2/6, 1010 Vienna, Austria
Email:
manova at manova-ai.com
manova at gaussaiglobal.com
manova.stela at gmail.com
Web:
https://www.manova-ai.com
https://www.gaussaiglobal.com
https://www.stelamanova.com
Workshop Series (2026) Linguistics Meets ChatGPT: From Prompt to Theory <https://gaussaiglobal.com/LingTransformer/>
Registration for all blocks is now open. Block 1 starts March 23, 2026.