[Lingtyp] Call for Papers: Dependency Grammar for Typology; Workshop @ ALT 15 in Zhuhai, China; November 8-10, 2024
Annemarie Verkerk
annemarie.verkerk at uni-saarland.de
Tue Jan 30 20:48:25 UTC 2024
<apologies for cross-posting>
Call for Papers: Dependency Grammar for Typology
Workshop @ ALT 15 in Zhuhai, China; November 8-10, 2024
Large-scale multilingual corpora such as Universal Dependencies (de
Marneffe et al. 2021) have enabled advances in quantitative methods in
morphosyntactic typology, allowing a transition from binary or
multivariate classifications of linguistic features to more nuanced,
continuous classifications. These enable us to capture variation better
than ever before (Levshina et al. 2023) while studying linguistic
variation from a token-based perspective (Haspelmath 2018). Beyond their
direct use in typological research, the Universal Dependencies treebanks
are used to annotate further large-scale multilingual corpora
(Kondratyuk & Straka 2019), to syntactically parse languages that are
not yet covered by the framework, and for zero-shot parsing (Ammar et
al. 2016; Tran & Bisazza 2019; Üstün et al. 2022). Hence, they have
become a valuable tool for multilingual
morphosyntactic analysis, the products of which are indispensable for
typology.
However, large-scale multilingual resources such as the Universal
Dependencies treebanks have also been regarded as problematic. First, a
major concern for typologists has always been language sampling: this
type of resource is typically biased towards WEIRD and especially
European languages. Second, there is (as of yet) no dedicated program to
counter this sampling bias, i.e. any effort to include low-resource and
less-described languages rests on the shoulders of individual language
specialists, whose time and funds are already under pressure. Third, as
with any attempt to construct cross-linguistically
appropriate schemes for tagging and annotation, the universal
applicability of such schemes has been called into question (Croft et
al. 2017).
This workshop aims to bring together typologists who use
dependency-annotated resources for quantitative typological research. We
aim to include both new studies that draw on dependency-annotated
corpora to answer typological questions and more critical contributions
that point to the limitations of ‘dependency grammar for typology’. This
also includes proposals on how quantitative typology can be conducted
using heterogeneous data sources and the development of new resources,
as long as a focus on comparative research is maintained.
Topics of interest include, but are not limited to:
➔ Synchronic comparative studies on variation that can only be accessed
using corpora, such as word order (Levshina 2019, Talamo & Verkerk 2022);
➔ Comparative studies that employ such resources to uncover universal
principles of grammar, including dependency length optimization
(Futrell, Mahowald & Gibson 2015; Liu 2021; Jing, Blasi & Bickel 2022),
word order universals (Choi et al. 2021, Gerdes et al. 2021, Yan & Liu
2023), and the memory-surprisal trade-off (Hahn, Degen & Futrell 2021);
➔ Diachronic studies of language change, such as the evolution rate of
word order in main and subordinate clauses (Jing et al. 2023) or word
order change (Hahn & Xu 2022);
➔ Theoretical challenges in annotation, such as the universality of
syntactic labels, as well as of parts of speech, morpho-syntactic
features, and tokenization (Croft et al. 2017, Osborne & Gerdes 2019,
Sinnemäki & Haakana 2020, Hohn 2021);
➔ Development of new resources, in particular with respect to
low-resource languages, starting from different types of texts (corpora,
fieldwork notes, existing treebanks, Wikipedia, grammars, etc.)
(Zariquiey et al. 2022, Kahane et al. 2023);
➔ Projects that employ such resources to go beyond sentence-level
syntactic dependencies by developing additional layers of annotation for
studying discourse and information structure, among other levels;
➔ Robustness and statistical validity of quantitative typological
measures on the basis of different theoretical approaches and annotation
schemes (Gerdes et al. 2018, Osborne & Gerdes 2019, Yan & Liu 2019);
➔ Limits of dependency grammar for typology: issues such as unbalanced
sampling, limitations of annotation in terms of availability and
quality, ‘missing’ annotation, and heterogeneity of the annotation
across treebanks, both in terms of application and quality.
We envision a worthwhile exchange between more traditional typologists
and typologists who have already worked with these resources. If you
want to join us, please submit your abstract to ALT15, explicitly
indicating that it is intended for the workshop "Dependency Grammar for
Typology". Instructions on how to submit abstracts can be found on the
ALT2024 page:
https://sites.google.com/view/alt2024/call-for-papers
Abstracts are due March 15th!
Organizers: Andrew Dyer, Luigi Talamo, Annemarie Verkerk (Saarland
University), Luca Brigada Villa, and Erica Biagetti (Universities of
Bergamo and Pavia)
References
Ammar, Waleed, George Mulcaire, Miguel Ballesteros, Chris Dyer & Noah A.
Smith. 2016. Many Languages, One Parser. In Transactions of the
Association for Computational Linguistics, edited by Lillian Lee, Mark
Johnson and Kristina Toutanova. 4:431–444.
Choi, Hee-Soo, Bruno Guillaume & Karën Fort. 2021. Corpus-based language
universals analysis using Universal Dependencies. In Proceedings of the
Second Workshop on Quantitative Syntax (Quasy, SyntaxFest 2021), 33–44,
Sofia, Bulgaria. Association for Computational Linguistics.
https://aclanthology.org/2021.quasy-1.3
Croft, William, Dawn Nordquist, Katherine Looney & Michael Regan. 2017.
Linguistic Typology Meets Universal Dependencies. In Proceedings
of the 15th International Workshop on Treebanks and Linguistic Theories
(TLT15), edited by Markus Dickinson, Jan Hajic, Sandra Kübler, and Adam
Przepiórkowski. 63–75. CEUR Workshop Proceedings.
Futrell, Richard, Kyle Mahowald & Edward Gibson. 2015. Quantifying Word
Order Freedom in Dependency Corpora. In Proceedings of the Third
International Conference on Dependency Linguistics (Depling 2015),
edited by Joakim Nivre, Eva Hajičová, 91–100, Uppsala, Sweden. Uppsala
University, Uppsala, Sweden.
Gerdes, Kim, Bruno Guillaume, Sylvain Kahane & Guy Perrier. 2018. SUD or
Surface-Syntactic Universal Dependencies: An annotation scheme
near-isomorphic to UD. Universal Dependencies Workshop 2018. Brussels,
Belgium. doi:10.18653/v1/W18-6008. hal-01930614
Hahn, Michael, Judith Degen & Richard Futrell. 2021. Modeling word and
morpheme order in natural language as an efficient trade-off of memory
and surprisal. Psychological Review, 128(4), 726–756.
https://doi.org/10.1037/rev0000269
Hahn, Michael & Yang Xu. 2022. Crosslinguistic word order variation
reflects evolutionary pressures of dependency and information locality.
Proceedings of the National Academy of Sciences of the United States of
America 119(24), e2122604119. doi:10.1073/pnas.2122604119
Kondratyuk, Dan & Milan Straka. 2019. 75 Languages, 1 Model: Parsing
Universal Dependencies Universally. In Proceedings of the 2019
Conference on Empirical Methods in Natural Language Processing and the
9th International Joint Conference on Natural Language Processing
(EMNLP-IJCNLP), edited by Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun
Wan, 2779–2795.
Haspelmath, Martin. 2018. How Comparative Concepts and Descriptive
Linguistic Categories Are Different. In Aspects of Linguistic Variation,
edited by Daniël Olmen, Tanja Mortelmans, and Frank Brisard, 83–114.
Berlin, Boston: De Gruyter.
Hohn, Georg F. K. 2021. Towards a Consistent Annotation of Nominal Person
in Universal Dependencies. In Proceedings of the Fifth Workshop on
Universal Dependencies (UDW, SyntaxFest 2021), edited by Miryam de
Lhoneux, Reut Tsarfaty, 75–83.
Jing, Yingqi, Damián E. Blasi & Balthasar Bickel. 2022. Dependency-length
minimization and its limits: A possible role for a probabilistic version
of the final-over-final condition. Language 98(3), 397–418.
Jing, Yingqi, Paul Widmer & Balthasar Bickel. 2023. Word order evolves at
similar rates in main and subordinate clauses. Diachronica.
https://doi.org/10.1075/dia.20035.jin
Kahane, Sylvain, Santiago Herrera, Bruno Guillaume & Kim Gerdes. 2023.
Autogramm : développement simultané de treebanks et de grammaires à
partir de corpus. In Actes de CORIA-TALN 2023. Actes de la 30e
Conférence sur le Traitement Automatique des Langues Naturelles (TALN),
volume 6 : projets, 37–42, Paris, France. ATALA.
Levshina, Natalia. 2019. Token-based typology and word order entropy: A
study based on Universal Dependencies. Linguistic Typology 23(3),
533–572. https://doi.org/10.1515/lingty-2019-0025
Levshina, Natalia, Savithry Namboodiripad, Marc Allassonnière-Tang,
Mathew Alex Kramer, Luigi Talamo, Annemarie Verkerk, Sasha Wilmoth et
al. 2023. Why We Need a Gradient Approach to Word Order. Linguistics
61(4), 825–883. https://doi.org/10.31234/osf.io/yg9bf
Liu, Zoey. 2021. The Crosslinguistic Relationship between Ordering
Flexibility and Dependency Length Minimization: A Data-Driven Approach.
In Proceedings of the Society for Computation in Linguistics: Vol. 4,
Article 25. https://doi.org/10.7275/xt42-4282
Marneffe, Marie-Catherine de, Christopher D. Manning, Joakim Nivre &
Daniel Zeman. 2021. Universal Dependencies. Computational Linguistics
47(2), 255–308. https://doi.org/10.1162/coli_a_00402
Osborne, Timothy & Kim Gerdes. 2019. The status of function words in
dependency grammar: A critique of Universal Dependencies (UD). Glossa: a
journal of general linguistics 4(1), 17.
https://doi.org/10.5334/gjgl.537
Sinnemäki, Kaius & Viljami Haakana. 2020. Variation in Universal
Dependencies Annotation: A Token-Based Typological Case Study on
Adpossessive Constructions. In Proceedings of the Fourth Workshop on
Universal Dependencies (UDW 2020), edited by Marie-Catherine de
Marneffe, Miryam de Lhoneux, Joakim Nivre, Sebastian Schuster, 158–167.
Talamo, Luigi & Annemarie Verkerk. 2022. A new methodology for an old
problem. Italian Journal of Linguistics 34(2), 171–226.
Tran, Ke & Arianna Bisazza. 2019. Zero-shot Dependency Parsing with
Pre-trained Multilingual Sentence Representations. In Proceedings of the
2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo),
edited by Colin Cherry, Greg Durrett, George Foster, Reza Haffari,
Shahram Khadivi, Nanyun Peng, Xiang Ren, Swabha Swayamdipta, 281–288.
Üstün, Ahmet, Arianna Bisazza, Gosse Bouma & Gertjan van Noord. 2022.
UDapter: Typology-based Language Adapters for Multilingual Dependency
Parsing and Sequence Labeling. Computational Linguistics 48, 1–37.
Yan, Jianwei & Haitao Liu. 2019. Which annotation scheme is more
expedient to measure syntactic difficulty and cognitive demand? In
Proceedings of the First Workshop on Quantitative Syntax (Quasy,
SyntaxFest 2019), 16–24, Paris, France. Association for Computational
Linguistics.
Yan, Jianwei & Haitao Liu. 2023. Basic word order typology revisited: a
crosslinguistic quantitative study based on UD and WALS. Linguistics
Vanguard. https://doi.org/10.1515/lingvan-2021-0001
Zariquiey, Roberto, Arturo Oncevay & Javier Vera. 2022. CLD² Language
Documentation Meets Natural Language Processing for Revitalising
Endangered Languages. In Proceedings of the Fifth Workshop on the Use of
Computational Methods in the Study of Endangered Languages, 20–30,
Dublin, Ireland. Association for Computational Linguistics.