[Lingtyp] L1 English

Stela Manova manova.stela at gmail.com
Fri Aug 9 19:21:42 UTC 2024


Dear Colleagues,

I am writing to ask for your help with the following issue. I have been working on the relationship between LLMs and linguistic theory, and it is now time to examine how language acquisition (LA) proceeds in humans and machines. I am therefore looking for at least the first 100 words produced by children with L1 English (ideally, the data should come from different varieties of the language). For technical reasons, LA comparisons are currently possible only for English.

So far, I have worked with the following British English data: https://childes.talkbank.org/access/Eng-UK/Sekali.html. However, CHILDES turned out to be problematic in several respects: the data are old, the recordings often start too late, and in many cases the parents' input is recorded but not the child's responses, especially at the early stages. (I thank Katharina Korecky-Kröll for invaluable help with CHILDES.) I have therefore decided to turn to the lingtyp community. Please feel free to forward this query to linguists who are not on the list.
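
In case it is useful to others, here is a minimal Python sketch of how a child's earliest word types can be extracted from such a corpus. It is only an illustration, not my actual pipeline: it assumes the pylangacq library and that the Sekali data zip sits at the standard CHILDES path (both are assumptions on my part).

# Minimal sketch: first 100 word types produced by the child (CHI tier).
# Assumes: pip install pylangacq; the zip URL follows the usual CHILDES
# pattern; transcript files sort chronologically by file name.
import pylangacq

reader = pylangacq.read_chat("https://childes.talkbank.org/data/Eng-UK/Sekali.zip")
tokens = reader.words(participants="CHI")  # child-produced tokens only

first_types, seen = [], set()
for tok in tokens:
    w = tok.lower()
    if w.isalpha() and w not in seen:  # skip punctuation marks and repeats
        seen.add(w)
        first_types.append(w)
    if len(first_types) == 100:
        break

print(first_types)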

Thanks in advance for your help. If anything interesting comes out, I will post a note.

[Based on the CHILDES data mentioned above and a comparison with the ChatGPT vocabulary, LA of English appears to proceed in the same way in humans and machines. A short note on linguistic theory, psycholinguistics, and LLMs, and on how I compare things, is available at: https://lingbuzz.net/lingbuzz/008123. Any criticism is welcome!]
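
For a rough first pass at the vocabulary comparison, one can check how many of the early child words are single tokens in a ChatGPT tokenizer. The sketch below is again only an illustration, not the comparison reported in the note: it assumes OpenAI's tiktoken library, the cl100k_base encoding used by the GPT-3.5/GPT-4 family, and the first_types list from the previous sketch.

# Rough check: which early child words are single tokens in the ChatGPT
# vocabulary? " word" (with a leading space) is the typical in-context form.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
single = [w for w in first_types if len(enc.encode(" " + w)) == 1]
print(f"{len(single)}/{len(first_types)} early child words are single tokens")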

Best,
Stela

***
Dr. Stela MANOVA, Gauss:AI
LingTransformer [1, 2] / LearningTransformer / CodeTransformer
Email (default): manova.stela at gmail.com
Email (alternative): manova.stela at proton.me
Web: https://sites.google.com/view/stelamanova
---
1 "A transformer<https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)> model is a neural network that learns context and thus meaning by tracking relationships in sequential data like the words in this sentence.", https://blogs.nvidia.com/blog/what-is-a-transformer-model, Wikipedia link inserted by SM.
---
[2] The LingTransformer: I have a strong background in mathematics and computer science. In 2021, I openly claimed that the syntactic trees used for the formal representation of language (Chomsky's approach) are not hierarchical structures and have an unnatural direction of growth, from the leaves to the root: lingbuzz/006082 (https://lingbuzz.net/lingbuzz/006082); see also lingbuzz/007598 (https://ling.auf.net/lingbuzz/007598). Since then, Noam Chomsky (https://linguistics.mit.edu/user/chomsky/), Matilde Marcolli (https://www.its.caltech.edu/~matilde/), and Robert Berwick (https://idss.mit.edu/staff/robert-c-berwick/) have been looking for new representations of syntactic structures; for their papers, visit: https://ling.auf.net/lingbuzz/_search?q=marcolli. Curious why they will not succeed in their endeavor and what this means for research in linguistics? Join Gauss:AI (for the moment, just drop me an email message)!