[Lingtyp] Call for papers ALT 2022 Workshop: Spoken- and signed-language corpus studies in linguistic typology

Danielle Barth Danielle.Barth at anu.edu.au
Thu Feb 3 22:30:16 UTC 2022


ALT 2022 Workshop
Spoken- and signed-language corpus studies in linguistic typology
Organizers: Danielle Barth, Ludger Paschen, François Pellegrino, Matt Stave, Stefan Schnell & Frank Seifart


Recent years have seen a surge in corpus-based research in linguistic typology (cf. Levshina 2021a;
Schnell & Schiborr 2022 for recent overviews). One prominent line of work takes up long-standing
questions in linguistic typology, e.g. in word order typology (Greenberg 1963), which is notoriously
subject to intra-language variation (cf. Futrell et al. 2020; Gerdes et al. 2021; Wälchli 2009), and
where variability is found to be constrained by principles of efficiency (Ying et al. 2021; Levshina 2019;
2021b; Blasi et al. 2019; Futrell et al. 2015; 2020). Another such example is marking asymmetries
(Greenberg 1966), which some corpus studies explain in terms of the form-frequency correspondence
hypothesis (Haspelmath 2021), much in the spirit of Zipf (1935). What most of these studies have in
common is that they draw primarily on multilingual written-language corpora, e.g. Universal
Dependencies (UD 2.8; Zeman et al. 2021), the Parallel Bible Corpus (PBC; Mayer & Cysouw 2014), or
the Universal Declaration of Human Rights (UDHR; cf. Bentz & Ferrer-i-Cancho 2016). This is not ideal,
since their overall conclusions about efficiency often refer to properties of language
usage more typical of online spoken- and signed-language production, e.g. noisy channels, efficiency
of production vis-à-vis security of information transmission, time pressure on structure planning and
recognition, and the role of an altruistic producer.

This predominance of written-corpus studies in typology is largely due to the limited availability of
multilingual spoken-language corpora, which are by nature less accessible, less comparable, and
more complex to process. Consequently, typological studies based on spoken-language corpora are
still comparatively rare. Notable exceptions include Bickel's (2003) and Stoll & Bickel's (2009)
studies of referential density, as well as research into sentence structure and information packaging,
e.g. the recent critical assessments of Du Bois' (1987) seminal hypothesis of preferred argument
structure (cf. Schnell et al. 2021; Haig & Schnell 2016). Some recent research also takes spoken-
language prosodic patterns into account, e.g. alternations in speech rate and disfluencies such as
pausing. While these latter phenomena are typically not in the purview of typology as such, findings
of speech rate alternations and disfluencies are relevant for typology in explaining tendencies of
structural unit formation, e.g. the suffixing preference (Seifart et al. 2018; Seifart & Bickel 2017;
Himmelmann 2014; Bybee et al. 1990; Cutler et al. 1985), syllable complexity (Coupé et al. 2019),
prosodic units (Seifart et al. 2021), as well as information packaging from an interactional perspective
(Ozerov to appear, 2021). Signed-language corpora are to date even more underrepresented, but see
Schembri et al. (2005) for a comparative study of motion verbs and more recently Hodge et al. (2019)
for a study of reference production in Auslan. These recent studies illustrate a turning point in
linguistic typology, following several initiatives aimed at building, documenting, and sharing
multilingual databases consisting of spoken- and signed-language corpora. The stakes of this
fundamental change are at the heart of this workshop proposal.

Objective

Their seminal insights notwithstanding, findings based predominantly on written corpora cannot be
immediately related to general considerations of efficiency of communication. Rather, observed
regularities are more likely to be carried over from habits of spontaneous language production,
mostly in spoken or signed mode. Therefore, the purpose of this workshop is to provide a forum for
discussion of ongoing corpus-based typological work that is primarily based on spoken- or
signed-language corpora, and that addresses questions of efficiency, information-theoretic considerations in
language production, cross-linguistic phonetic-prosodic patterns, the cognitive-articulatory basis of
speech production, etc. Also relevant here are studies that address specific comparisons of mode
differences. A further goal is to bring to the fore the key role of language acquisition studies of
spoken and signed languages, including languages that have received much less attention in
acquisition research than better-studied languages (but see Moran et al. 2016; Stoll & Lieven 2014).
This workshop will also address methodological and epistemic questions such as the bottom-up
perspective of corpus-based typology that it shares with documentary linguistics (Himmelmann
1998) as well as modern distributional typology and its emphasis on fine-grained low-level categories
(Bickel 2015, 2009, 2007; Levinson & Evans 2010). Likewise, the role of external factors and of intra- vs.
inter-language variation, e.g. idiolectal or register variation, is to be addressed (cf. Barth et al. to
appear). Finally, most languages are only spoken or signed, not written. Hence this workshop will
also address new insights from hitherto mostly unknown languages as well as specific challenges of
spoken- and signed-language corpus building and processing.


This workshop will showcase the enormous relevance of spoken- and signed-language production research
for linguistic typology and establish an ongoing exchange between researchers in this area and beyond on
matters of development and analysis of spoken and signed corpora. We welcome contributions of
the following kind and on related topics:

* Comparative studies into any area of linguistic and discourse structure based on spoken- or
signed-language corpora

* Chunking of speech/signing and its relationship with discourse planning and/or the formation
of linguistic units

* Studies in conversation analysis in spoken and/or signed languages

* Corpus-based studies of language acquisition of under-studied spoken or signed languages,
or comparisons thereof

* Comparative studies that explicitly address mode differences and their effects on language
production and evolution

* Comparative corpus-phonetic studies addressing typological variation or putative universals

* Reports on (multilingual) corpus building projects (should include a showcase/proof-of-concept
study)

* We welcome studies using any spoken or signed corpora, including, but not limited to,
corpora from Multi-CAST (https://multicast.aspra.uni-bamberg.de), DoReCo
(http://doreco.info/), or SCOPIC (https://scopicproject.wordpress.com). Note that these
three will be significantly enhanced by the addition of new languages in the course of 2022.
Please contact the organizers for updates.

Abstract specifications (from ALT):


* Abstract submission deadline: 1 April 2022

* Abstracts must be anonymous

* Abstracts should be at most one single-spaced page in 12pt font, with at most one additional page for references and examples.

* Please put this information at the top of your abstract: abstract title; abstract category: oral; workshop title: Spoken- and signed-language corpus studies in linguistic typology

More information on submitting an abstract and the conference can be found on the general ALT 2022 abstract submission page: https://sites.google.com/view/alt2022/call-for-papers


For more information, please contact the organizers:
danielle.barth at anu.edu.au, paschen at leibniz-zas.de, francois.pellegrino at univ-lyon2.fr, matthew.stave at crns.fr, stefan.schnell at uzh.ch, seifart at leibniz-zas.de


References

Barth D, Evans N, Arka I W, Bergqvist H, Forker D, Gipper S, Hodge G, Kashima E, Kasuga Y, Kawakami C, Kimoto Y, Knuchel D, Kogura N, Kurabe K, Mansfield J, Narrog H, Pratiwi D P E, van Putten S, Senge C, Tykhostup O. 2021. Language vs. individuals in cross-linguistic corpus typology. In G Haig, S Schnell, & F Seifart (eds.), Doing corpus-based typology with spoken language data: State of the art, 179-232. Honolulu, HI: University of Hawai'i Press.

Bentz C, Ferrer-i-Cancho R. 2016. Zipf's law of abbreviation as a language universal. In Proceedings of the Leiden Workshop on Capturing Phylogenetic Algorithms for Linguistics, ed. C Bentz, G Jäger, I Yanovich, Univ. Tübingen, Ger., accessed on Jul. 12, 2021. https://publikationen.uni-tuebingen.de/xmlui/handle/10900/68558

Bickel B. 2003. Referential density in discourse and syntactic typology. Language 79(4):708-36

Bickel B. 2007. Typology in the 21st century: major current developments. Linguist. Typology 11(1):239-51

Bickel B. 2009. Typological patterns and hidden diversity. In: Plenary Talk, 8th Association for Linguistic Typology Conference, Berkeley, CA, http://www.unileipzig.de/_bickel/research/presentations/alt2009bickel-plenary.html.

Bickel B. 2015. Distributional typology: statistical inquiries into the dynamics of linguistic diversity. In B Heine, H Narrog (eds.), The Oxford handbook of linguistic analysis, 2nd edition, 901-923. Oxford: Oxford University Press.

Blasi DE, Cotterell R, Wolf-Sonkin L, Stoll S, Bickel B, Baroni M. 2019. On the distribution of deep clausal embeddings: a large cross-linguistic study. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3938-43. Stroudsburg, PA: Assoc. Comput. Linguist.

Coupé C, Oh Y, Dediu D, Pellegrino F. 2019. Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche. Science Advances 5(9): eaaw2594.

Du Bois JW. 1987. The discourse basis of ergativity. Language 63(4):805-55.

Futrell R, Levy RP, Gibson E. 2020. Dependency locality as an explanatory principle for word order. Language 96(2):371-412.

Futrell R, Mahowald K, Gibson E. 2015. Quantifying word order freedom in dependency corpora. In Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015), pp. 91-100, Uppsala, Sweden, Aug. 24-26

Gerdes K, Kahane S, Chen X. 2021. Typometrics: from implicational to quantitative universals in word order typology. Glossa 6(1). 17 pp.

Greenberg JH. 1963. Some universals of grammar with particular reference to the order of meaningful elements. In Universals of Language, ed. J Greenberg, pp. 73-113. Cambridge, MA: MIT Press.

Greenberg JH. 1966. Language Universals, with Special Reference to Feature Hierarchies. The Hague, Neth.: Mouton

Haig G, Schnell S. 2016. The discourse basis of ergativity revisited. Language 92(3):591-618

Haspelmath M. 2021. Explaining grammatical coding asymmetries: form-frequency correspondences and predictability. J. Linguist. 57(3):605-33

Himmelmann NP. 1998. Documentary and descriptive linguistics. Linguistics 36(2):161-95

Himmelmann NP. 2014. Asymmetries in the prosodic phrasing of function words: another look at the suffixing preference. Language 90(4):927-60

Hodge G, Ferrara L, Anible B. 2019. The semiotic diversity of doing reference in a deaf sign language. Journal of Pragmatics 143:33-52. https://doi.org/10.1016/j.pragma.2019.01.025

Levinson SC, Evans N. 2010. Time for a sea change in linguistics: Response to comments on 'The myth of language universals'. Lingua 120, 2733-2758.

Levshina N. 2019. Token-based typology and word order entropy: a study based on universal dependencies. Linguist. Typology 23(3):533-72

Levshina N. 2021a. Corpus-based typology: applications, challenges, and some solutions. Linguistic Typology aop. https://doi.org/10.1515/lingty-2020-0118

Levshina N. 2021b. Cross-linguistic trade-offs and causal relationships between cues to grammatical subject and object, and the problem of efficiency-related explanations. Frontiers in Psychology https://doi.org/10.3389/fpsyg.2021.648200

Mayer T, Cysouw M. 2014. Creating a massively parallel Bible corpus. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), pp. 3148-63, May 26-31, Reykjavik

Ozerov P. 2021. This research topic of yours - Is it a research topic at all? Using comparative interactional data for a fine-grained reanalysis of traditional concepts. In G Haig, S Schnell, & F Seifart (eds.), Doing corpus-based typology with spoken language data: State of the art, 233-280. Honolulu, HI: University of Hawai'i Press.

Ozerov P. 2021. Multifactorial information management (MIM): summing up the emerging alternative to information structure. Linguist. Vanguard 7(1):20200039

Schembri A, Jones C, Burnham D. 2005. Comparing action gestures and classifier verbs of motion: Evidence from Australian Sign Language, Taiwan Sign Language, and nonsigners' gestures without speech. Journal of Deaf Studies and Deaf Education 10(3):272-290. https://doi.org/10.1093/deafed/eni029

Schnell S, Schiborr NN, Haig G. 2021. Efficiency in discourse processing: Does morphosyntax adapt to accommodate new referents? Linguist. Vanguard 7(s3):20190064

Schnell S, Schiborr NN. to appear/2022. Cross-linguistic corpus studies in linguistic typology. Annual Review of Linguistics 8.

Seifart F, Strunk J, Danielsen S, Hartmann I, Pakendorf B, et al. 2018. Nouns slow down speech across structurally and culturally diverse languages. PNAS 115(22):5720-25

Seifart F, Strunk J, Danielsen S, Hartmann I, Pakendorf B, Wichmann S, et al. 2021. The extent and degree of utterance-final word lengthening in spontaneous speech from 10 languages. Linguistics Vanguard 7.

Stoll S, Bickel B. 2009. How deep are differences in referential density? In Crosslinguistic Approaches to the Psychology of Language: Research in the Tradition of Dan Isaac Slobin, ed. J Guo, E Lieven, N Budwig, S Ervin-Tripp, K Nakamura, S Özçalışkan, pp. 543-55. London: Psychol. Press

Wälchli B. 2009. Data reduction typology and the bimodal distribution bias. Linguist. Typology 13:77-94

Zeman D, Nivre J, Abrams M, Ackermann E, Aepli N, et al. 2021. Universal Dependencies 2.8. Prague: Universal Dependencies Consortium. https://universaldependencies.org/

Zipf GK. 1935. The Psycho-Biology of Language: An Introduction to Dynamic Philology. Cambridge, MA: MIT Press

