<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Dear Jürgen and others,<br>
</p>
<p>I think this is one of the major methodological problems of
linguistic typology (which, if I remember correctly, has been
discussed on this list before). There's no single 'correct' way of
analysing a language. Two linguists working on the same language
will often provide very different analyses, and both may be right
in their own ways. It starts with phonology, where you have a lot
of degrees of freedom in, for instance, minimizing or maximizing
phoneme inventories (e.g. by [not] introducing phonological
domains and features operating on these domains), and it gets
worse in morphology, specifically if there is distributed
exponence and other complexities of this type. At the level of
syntax the impact of the specific theoretical background can be
seen, for instance, in publications using the UD corpora. These
corpora were annotated with a specific version of dependency
grammar, I think essentially for pragmatic reasons (dependency
grammar was very popular among computational linguists for a
while). The theoretical assumptions of the annotation model
obviously have an impact on the results (just think of the very
old discussion of what a 'subject' is, represented as the 'nsubj'
relation in the UD annotations).<br>
</p>
<p>For many languages we only have one description, and the linguist
describing it comes from a specific background or 'school' (and
these schools are often associated with particular areas and
particular phylogenetic groupings, introducing further biases of
the type you mention). Again, the effects are visible at the level
of phonology already. For example, the Papuan language Idi could
be described as having just three vowels, or as having nine vowels
(perhaps even more), depending on your assumptions about
phonotactics etc. (There's a published analysis of that language,
by D. Schokkin, N. Evans, C. Döhler and me, but the analysis
really reflects some kind of compromise between the authors, and
it leaves a few non-trivial questions open.)<br>
</p>
<p>The individual linguist, with their school or background, is a
source of statistical non-independence. Even when we rely on exactly
one description per language and have the data coded by several
researchers, inter-annotator agreement is often low, in my
experience.</p>
<p>I think we need to be aware that typological data is behavioural
data at three layers: (i) language is a behavioural activity, (ii)
describing a language is a behavioural activity, and (iii)
extracting information from descriptions is another behavioural
activity. Variance occurs at all levels and is multiplied in the
process from (i) to (iii).</p>
<p>Estimating the amount of variance of that type, even
approximately, would be a major project. For instance, we could have
five undocumented (unstandardized) languages described by five
linguists each, using data from five different speakers per
language. Many will think that this would be a waste of resources,
given the number of languages (and varieties of languages) that
still await description.</p>
<p>What follows from all this, in my view, is that we need to be
careful not to apply statistical analyses "blindly". Linguistics is
not a natural science. Given the large amount of inherent variance
in typological data, we linguists should remain in the driver's
seat and use quantitative typological evidence as an assistance
system, being aware of its limits and possibilities, rather than
take a back seat and let the autopilot drive.</p>
<p>Best,<br>
Volker</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 28.09.2024 at 20:17, Juergen
Bohnemeyer via Lingtyp wrote:<br>
</div>
<blockquote type="cite"
cite="mid:SJ0PR15MB4696A146401E096FD352F9D7DD742@SJ0PR15MB4696.namprd15.prod.outlook.com">
<style>p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Aptos",sans-serif;}</style>
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">Dear
all – I’m wondering whether anybody has attempted to
estimate the size of the following putative effect on
descriptive and typological research:</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">Suppose
there is a particular phenomenon in Language L, the known
properties of which are equally compatible with an analysis
in terms of construction types (comparative concepts) A and
B.</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">Suppose
furthermore that L belongs to a language family and/or
linguistic area such that A has much more commonly been
invoked in descriptions of languages of that family/area
than B.</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">Then
to the extent that a researcher attempting to adjudicate
between A and B with respect to L (whether in a description
of L, in a typological study, or in coding for an evolving
typological database) is aware of the prevalence of
A-coding/analyses for languages of the family/area in
question, that might make them more likely to code/analyze L
as exhibiting A as well.</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">So
for example, a researcher who assumes languages of the
family/area of L to be typically tenseless may be influenced
by this assumption and as a result become (however slightly)
more likely to treat L as tenseless as well. In contrast, if
she assumes languages of the family/area of L to be
typically tensed, that might make her ever so slightly more
likely to analyze L also as tensed.</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">It
seems to me that this is a cognitive bias related to, and
possibly a case of, essentialism. (And just as in the case
of (other forms of) essentialism, the actual cognitive
causes/mechanisms of the bias may vary.)</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">But
regardless, my question is, again, has anybody tried to
guesstimate to what extent the results of current typological
studies may be warped by this kind of researcher bias? (Note
that the bias may be affecting both authors of descriptive
work and typologists using descriptive work as data, so
there is a possible double-whammy effect.)</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">Thanks!
– Juergen</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'CMU Serif'">&nbsp;</span></p>
<div>
<div>
<p class="MsoNormal"><span
style="font-size:9.0pt;font-family:Helvetica;color:black">Juergen
Bohnemeyer (He/Him)<br>
Professor, Department of Linguistics<br>
University at Buffalo</span></p>
</div>
</div>
</div>
<br>
<pre wrap="" class="moz-quote-pre">_______________________________________________
Lingtyp mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Lingtyp@listserv.linguistlist.org">Lingtyp@listserv.linguistlist.org</a>
<a class="moz-txt-link-freetext" href="https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/lingtyp">https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/lingtyp</a>
</pre>
</blockquote>
</body>
</html>