<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Thanks, Mark, for bringing up this concrete example! Indeed, such

      questions often arise in comparative work, both in phonology and

      morphosyntax. But I think the answer is always the same:

      Comparison cannot be in terms of structural CONTRASTS, but must

      ultimately be in terms of (phonetic and conceptual-functional)

      SUBSTANCE.</p>

    <p>I can recommend the following two articles. The first deals with

      meanings and diachrony, though Bybee has also argued for phonetic

      substance as key to understanding phonology. The second is more

      general.<br>

    </p>

    <div class="csl-bib-body"

      style="line-height: 1.35; margin-left: 2em; text-indent:-2em;">

      <div class="csl-entry">Bybee, Joan L. 1988. Semantic substance vs.

        contrast in the development of grammatical meaning. <i>Berkeley

          Linguistics Society</i> 14. 247–264.</div>


    </div>

    <p></p>

    <div class="csl-bib-body"

      style="line-height: 1.35; margin-left: 2em; text-indent:-2em;">

      <div class="csl-entry">Boye, Kasper & Engberg-Pedersen,

        Elisabeth. 2016. Substance and structure in linguistics. <i>Acta

          Linguistica Hafniensia</i> 48(1). 5–6. (doi:<a

          href="https://doi.org/10.1080/03740463.2016.1202014">10.1080/03740463.2016.1202014</a>)</div>


    </div>

    <p></p>

    <p>Specifically for phonology, there is a 2018 book on typology

      edited by Larry Hyman and Frans Plank

      (<a class="moz-txt-link-freetext" href="https://www.degruyter.com/document/doi/10.1515/9783110451931/html">https://www.degruyter.com/document/doi/10.1515/9783110451931/html</a>),

      which includes papers by Kiparsky and Maddieson that discuss the

      conceptual foundations of phonological comparison.</p>

    <p>Kiparsky says that "there are no non-analytic universals of

      language. All universals are analytic, and their validity often

      turns on a set of critical cases where different solutions can be

      and have been entertained", though confusingly, he says

      "descriptive" instead of "non-analytic". The issues are discussed

      further in my 2019 blogpost: <a class="moz-txt-link-freetext" href="https://dlc.hypotheses.org/1817">https://dlc.hypotheses.org/1817</a></p>

    <p>(I admit, however, that I'm not sure what exactly this means for

      the typology of "tone" and "obstruent (breathy) voicing". It may

      ultimately mean that traditional typologies in terms of these

      notions need to be revised quite thoroughly.)<br>

    </p>

    <p>Best,</p>

    <p>Martin (Haspelmath)<br>

    </p>

    <div class="moz-cite-prefix">On 30.09.24 23:04, Mark Donohue wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAKyRuzRCUx1JB7PcP+UFgTNG0dMtX9jmUPcYN5WsJ5yARYAB9A@mail.gmail.com">


      <div dir="ltr">I have to disagree about the point that "the

        *classifications* should not be different if the different

        linguists have access to the same information".

        <div><br>

        </div>

        <div>In many Himalayan languages low tone is associated with

          breathy voice, and voicing is (stochastically) predictable

          from tone.</div>

        <div>The one language can then be analysed (has been analysed)

          by linguists from different descriptive backgrounds as having</div>

        <div><br>

        </div>

        <div>a. ph vs. p consonant manners, with contrastive tone, </div>

        <div>or</div>

        <div>b. ph vs. p vs. b vs. bh consonant manners, with no

          contrastive tone.</div>

        <div><br>

        </div>

        <div>Under a., the language is classified as having tone.</div>

        <div>Under b., the language is not classified as having tone.</div>

        <div><br>

        </div>

        <div>I'm thinking of Tamang.</div>

        <div><br>

        </div>

        <div>

          <p class="MsoNormal"

style="margin:0cm 0cm 7.2pt 17pt;color:rgb(0,0,0);line-height:15pt"><span

              lang="FR"><font face="arial, sans-serif">Mazaudon,

                Martine. 1973. Phonologie Tamang: Etude phonologique du

                dialecte tamang de Risiangku (langue tibéto-birmane du

                Népal). Paris: Centre National de la Recherche

                Scientifique, Société d’Études Linguistiques et

                Anthropologiques de France.</font></span></p>

          <p class="MsoNormal"

style="margin:0cm 0cm 7.2pt 17pt;color:rgb(0,0,0);line-height:15pt"><span

              lang="FR"><font face="arial, sans-serif">Michaud, Alexis,

                and Martine Mazaudon. 2006. Pitch and voice quality

                characteristics of the lexical word-tones of Tamang, as

                compared with level tones (Naxi data) and

                pitch-plus-voice-quality tones (Vietnamese data).<span

                  class="gmail-Apple-converted-space"> </span><i>Proceedings

                  of Speech Prosody 2006, Dresden</i>, 823-826.

                Available online at: <a

href="https://sprosig.org/sp2006/contents/papers/PS7-18_0137.pdf"

                  moz-do-not-send="true" class="moz-txt-link-freetext">https://sprosig.org/sp2006/contents/papers/PS7-18_0137.pdf</a>.</font></span></p>

        </div>

        <div>

          <p class="MsoNormal"

style="margin:0cm 0cm 7.2pt 17pt;color:rgb(0,0,0);line-height:15pt"><font

              face="arial, sans-serif">Poudel, Kedar Prasad. 2006.<span

                class="gmail-Apple-converted-space"> </span><i>Dhankute

                Tamang Grammar</i>. Munich: Lincom Europa.</font></p>

        </div>

        <div><br>

        </div>

        <div>- Mark (Donohue)<br>

        </div>

        <div><br>

        </div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Mon, 30 Sept 2024 at 23:12,

          Martin Haspelmath via Lingtyp &lt;<a

            href="mailto:lingtyp@listserv.linguistlist.org"

            moz-do-not-send="true" class="moz-txt-link-freetext">lingtyp@listserv.linguistlist.org</a>&gt;

          wrote:<br>

        </div>

        <blockquote class="gmail_quote"

style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

          <div>

            <p>Of course, <span

                style="font-family:'CMU Serif'">"areal/phylogenetic

                researcher bias (APRB)" exists, and during the Grambank

                coding, I often heard Hedvig Skirgård talk about it as a

                potential issue. (I don't remember if it was addressed

                in a specific way, though.)</span></p>

            <p><span style="font-family:'CMU Serif'">I don't

                know if it can be measured somehow (given the enormous

                diversity of researcher traditions, I'm a bit

                skeptical), but I think it can be mitigated if we are

                aware that the purpose of comparative concepts in

                typology is NOT to provide *analyses* – rather, it is to

                enable us to *classify* languages.</span></p>

            <p><span style="font-family:'CMU Serif'">Volker

                Gast rightly says: "</span>Two linguists working on the

              same language will often provide very different analyses,

              and both may be right in their own ways."</p>

            <p>But while the *analyses* may well be different (because

              of the well-known non-uniqueness problem first highlighted

              by Yuen-Ren Chao in 1934: <a

                href="https://dlc.hypotheses.org/3381" target="_blank"

                moz-do-not-send="true" class="moz-txt-link-freetext">https://dlc.hypotheses.org/3381</a>),

              the *classifications* should not be different if the

              different linguists have access to the same information.</p>

            <p>I wrote about this in the following blogpost, where I

              note that the "difficulties of classification" that

              typologists talk about are typically due to the unclarity

              of the comparative concepts, not necessarily to lack of

              data: <a href="https://dlc.hypotheses.org/2528"

                target="_blank" moz-do-not-send="true"

                class="moz-txt-link-freetext">https://dlc.hypotheses.org/2528</a>.</p>

            <p>In practice, of course, different linguists do not have

              access to the same kinds of data, and subjectiveness

              cannot be excluded entirely. However, if we are careful to

              distinguish between analyses/descriptions (at the p-level)

              and classifications and cross-linguistic generalizations

              (at the g-level), some problems will go away.</p>

            <p>Best,</p>

            <p>Martin<br>

            </p>

            <div>On 29.09.24 12:41, Volker Gast via Lingtyp wrote:<br>

            </div>

            <blockquote type="cite">

              <p>Dear Jürgen and others,<br>

              </p>

              <p>I think this is one of the major methodological

                problems of linguistic typology (which, if I remember

                correctly, has been discussed on this list before).

                There's no single 'correct' way of analysing a language.

                Two linguists working on the same language will often

                provide very different analyses, and both may be right

                in their own ways. It starts with phonology, where you

                have a lot of degrees of freedom in, for instance,

                minimizing or maximizing phoneme inventories (e.g. by

                [not] introducing phonological domains and features

                operating on these domains), and it gets worse in

                morphology, specifically if there is distributed

                exponence and other complexities of this type. At the

                level of syntax the impact of the specific theoretical

                background can be seen, for instance, in publications

                using the UD corpora. These corpora were annotated with

                a specific version of dependency grammar, I think

                essentially for pragmatic reasons (dependency grammar

                was very popular among computational linguists for a

                while). The theoretical assumptions of the annotation

                model obviously have an impact on the results (just

                think of the very old discussion of what a 'subject' is,

                represented as the 'nsubj' relation in the UD

                annotations).<br>

              </p>

              <p>For many languages we only have one description, and

                the linguist describing it comes from a specific

                background or 'school' (and these schools are often

                associated with particular areas and particular

                phylogenetic groupings, introducing further biases of

                the type you mention). Again, the effects are visible at

                the level of phonology already. For example, the Papuan

                language Idi could be described as having just three

                vowels, or as having nine vowels (perhaps even more),

                depending on your assumptions about phonotactics etc.

                (There's a published analysis of that language, by D.

                Schokkin, N. Evans, C. Döhler and me, but the analysis

                really reflects some kind of compromise between the

                authors, and it leaves a few non-trivial questions

                open.)<br>

              </p>

              <p>The specific linguist and their school or background is

                a source of statistical non-independence. Even relying

                on exactly one description per language, and having the

                data coded by several researchers, often leads to low

                inter-annotator agreement in my experience.</p>

              <p>I think we need to be aware that typological data is

                behavioural data at three layers: (i) language is a

                behavioural activity, (ii) describing a language is a

                behavioural activity, and (iii) extracting information

                from descriptions is another behavioural activity.

                Variance occurs at all levels and is multiplied in the

                process from (i) to (iii).</p>

              <p>Approximately determining the amount of variance of

                that type would be a major project. For instance, we

                could have five undocumented (unstandardized) languages

                described by five linguists each, using data from five

                different speakers per language. Many will think that

                this would be a waste of resources, given the number of

                (varieties of) languages that still await description.</p>

              <p>What follows from all this, in my view, is that we need

                to be careful in applying statistical analyses

                "blindly". Linguistics is not a natural science. Given

                the large amount of inherent variance in typological

                data we linguists should remain in the driver's seat and

                use quantitative typological evidence as an assistance

                system, being aware of its limits and possibilities,

                rather than take a back seat and let the autopilot

                drive.</p>

              <p>Best,<br>

                Volker (Gast)<br>

              </p>

              <p><br>

              </p>

              <div>Am 28.09.2024 um 20:17 schrieb Juergen Bohnemeyer via

                Lingtyp:<br>

              </div>

              <blockquote type="cite">

                <div>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">Dear all – I’m

                      wondering whether anybody has attempted to

                      estimate the size of the following putative effect

                      on descriptive and typological research:</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">Suppose there

                      is a particular phenomenon in Language L, the

                      known properties of which are equally compatible

                      with an analysis in terms of construction types

                      (comparative concepts) A and B.</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">Suppose

                      furthermore that L belongs to a language family

                      and/or linguistic area such that A has much more

                      commonly been invoked in descriptions of languages

                      of that family/area than B.</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">Then to the

                      extent that a researcher attempting to adjudicate

                      between A and B wrt. L (whether in a description

                      of L, in a typological study, or in coding for an

                      evolving typological database) is aware of the

                      prevalence of A-coding/analyses for languages of

                      the family/area in question, that might make them

                      more likely to code/analyze L as exhibiting A as

                      well. </span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">So for example,

                      a researcher who assumes languages of the

                      family/area of L to be typically tenseless may be

                      influenced by this assumption and as a result

                      become (however slightly) more likely to treat L

                      as tenseless as well. In contrast, if she assumes

                      languages of the family/area of L to be typically

                      tensed, that might make her ever so slightly more

                      likely to analyze L also as tensed. </span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">It seems to me

                      that this is a cognitive bias related to, and

                      possibly a case of, essentialism. (And just as in

                      the case of (other forms of) essentialism, the

                      actual cognitive causes/mechanisms of the bias may

                      vary.)</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">But regardless,

                      my question is, again, has anybody tried to

                      guesstimate to what extent the results of current

                      typological studies may be warped by this kind of

                      researcher bias? (Note that the bias may be

                      affecting both authors of descriptive work and

                      typologists using descriptive work as data, so

                      there is a possible double-whammy effect.)</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">Thanks! –

                      Juergen</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <p class="MsoNormal"><span

style="font-size:12pt;font-family:'CMU Serif'">&nbsp;</span></p>

                  <div>

                    <div>

                      <p class="MsoNormal"><span

style="font-size:9pt;font-family:Helvetica;color:black">Juergen

                          Bohnemeyer (He/Him)<br>

                          Professor, Department of Linguistics<br>

                          University at Buffalo </span><span

                        style="white-space:pre-wrap">

</span></p>

                    </div>

                  </div>

                </div>

              </blockquote>

            </blockquote>

            <pre cols="72">-- 

Martin Haspelmath

Max Planck Institute for Evolutionary Anthropology

Deutscher Platz 6

D-04103 Leipzig

<a

href="https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/"

            target="_blank" moz-do-not-send="true"

            class="moz-txt-link-freetext">https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/</a></pre>

          </div>

          _______________________________________________<br>

          Lingtyp mailing list<br>

          <a href="mailto:Lingtyp@listserv.linguistlist.org"

            target="_blank" moz-do-not-send="true"

            class="moz-txt-link-freetext">Lingtyp@listserv.linguistlist.org</a><br>

          <a

href="https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/lingtyp"

            rel="noreferrer" target="_blank" moz-do-not-send="true"

            class="moz-txt-link-freetext">https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/lingtyp</a><br>

        </blockquote>

      </div>

    </blockquote>

    <pre class="moz-signature" cols="72">-- 

Martin Haspelmath

Max Planck Institute for Evolutionary Anthropology

Deutscher Platz 6

D-04103 Leipzig

<a class="moz-txt-link-freetext" href="https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/">https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/</a></pre>

  </body>

</html>