<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Many thanks, Guillaume, for your post and the link to your paper

      (Pellard et al.), which looks very useful.</p>

    <p>As you say, the reliability of these studies hinges on the

      cognate coding, which is done manually, by humans with their

      biases. I'm wondering if there is a way to measure the degree to

      which different linguists agree (e.g. with some kind of kappa

      statistic), and a way to identify or exclude systematic biases

      (which are part of normal human behaviour). Another thing that I

      worry about is that grammatical markers (even demonstratives and

      interrogatives) are ignored (see the list of 170 comparison

      meanings in IE-COR: <a class="moz-txt-link-freetext" href="https://iecor.clld.org/parameters">https://iecor.clld.org/parameters</a>), even

      though we know that these are the most resistant to borrowing.

      Especially in closely related languages, it's very hard to

      distinguish lexical loanwords from inherited words, isn't it? (For

      example, Dutch begrijpen 'understand' is said to have been

      borrowed from German <a class="moz-txt-link-freetext" href="https://wold.clld.org/word/72181920155924122">https://wold.clld.org/word/72181920155924122</a>,

      but without the rich attestation of both languages since the

      Middle Ages, we wouldn't be able to tell.)</p>
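    <p>To make the agreement question concrete, here is a minimal sketch of
      Cohen's kappa for two linguists who judge the same word pairs as
      cognate (1) or not cognate (0). The judgments are invented for
      illustration, and the function name is hypothetical; real cognate
      coding would probably need more categories and more than two coders.</p>

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(coder_a)
    # Observed agreement: share of items that received the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, from each coder's marginal label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented judgments (an assumption, not real data):
# is word pair i cognate (1) or not (0)?
linguist_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
linguist_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]

kappa = cohens_kappa(linguist_1, linguist_2)
print(round(kappa, 3))  # 0.6
```

    <p>Here the two coders agree on 8 of 10 pairs, but half of that
      agreement would be expected by chance, so kappa comes out at 0.6
      rather than 0.8. A statistic of this kind would not by itself
      reveal a bias that all coders share.</p>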

    <p>So it is my feeling that looking at unrelated languages is much

      safer in typology. And I don't understand why Simon Greenhill said

      (about the proposal to sample only one language from a family):<br>

    </p>

    <p><span style="white-space: pre-wrap"><font size="2">"But then what does this mean when you take one language from a family like Austronesian with ~1300 languages and a one from a family like Eastern Trans-Fly with 4 languages. This means that you've sampled 0.0007% of Austronesian but 1/4 of ETF. This feels wrong."</font></span></p>

    <p><span style="white-space: pre-wrap">It doesn't feel wrong to me at all, just as it doesn't feel wrong to treat large languages like Russian in the same way as small languages like Sorbian. Large languages have many more speakers, but these speakers are not independent of each other; in the same way, Austronesian languages are not independent of each other, so a genealogically stratified sample would have only one Austronesian language (one that is at least 30 languages away from Papuan languages).</span></p>
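    <p>As a rough sketch of what such a stratified sample amounts to, the
      snippet below picks one language per family from a toy inventory and
      computes the sampling fractions using the family sizes mentioned in
      the quote (about 1300 and 4). The inventory, the function name and the
      random seed are assumptions made for illustration only, and the
      distance criterion is not modelled.</p>

```python
import random


def one_per_family(languages, seed=0):
    """Pick one language per family from (language, family) pairs."""
    by_family = {}
    for name, family in languages:
        by_family.setdefault(family, []).append(name)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return {
        family: rng.choice(sorted(names))
        for family, names in sorted(by_family.items())
    }


# Toy inventory (an assumption; a real one would come from a catalogue).
toy_inventory = [
    ("Malagasy", "Austronesian"),
    ("Hawaiian", "Austronesian"),
    ("Tagalog", "Austronesian"),
    ("Meriam Mir", "Eastern Trans-Fly"),
    ("Bine", "Eastern Trans-Fly"),
]
sample = one_per_family(toy_inventory)

# Sampling fraction per family, using the approximate sizes from the quote.
family_sizes = {"Austronesian": 1300, "Eastern Trans-Fly": 4}
fractions = {family: 1 / size for family, size in family_sizes.items()}

for family, language in sample.items():
    print(f"{family}: {language} ({fractions[family]:.3%} of the family)")
```

    <p>With these sizes, one language is about 0.077% of Austronesian and
      25% of Eastern Trans-Fly. The fractions differ enormously, but each
      family still contributes exactly one data point to the sample.</p>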

    <p><span style="white-space: pre-wrap">Best,</span></p>

    <p><span style="white-space: pre-wrap">Martin

</span></p>

    <div class="moz-cite-prefix">On 07.11.23 09:33, Guillaume Jacques

      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAAzt3zaW3KULvamMfePqSywjGZ8kXkcoMEXSqYYuiizz7e6xQA@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="ltr">

        <div dir="ltr">

          <div dir="ltr">

            <div dir="ltr">The consensus trees that are published in the

              articles on phylogeny are just the tip of the iceberg of

              the amount of information you can gain from these tree

              distributions, but for now there is no convenient

              interface to explore these data, and some knowledge of R

              or other languages is necessary. This forthcoming chapter

              presents a (hopefully) readable introduction to

              phylogenies for historical linguists: <a

href="https://www.academia.edu/101656989/The_Family_Tree_model"

                moz-do-not-send="true">The Family Tree model |

                Guillaume Jacques and Thomas Pellard - Academia.edu</a></div>

            <div class="gmail_quote">

              <div><br>

              </div>

              <div>In the end, what decides the reliability of these

                studies is the reliability of cognate coding, which

                means that historical linguists specialized in

                meticulous etymologies and sound laws will play a

                crucial part, and should work collectively to produce

                better phylogenies, which typologists can then use to

                study the distribution of structural features through

                time and space.</div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <pre class="moz-signature" cols="72">-- 

Martin Haspelmath

Max Planck Institute for Evolutionary Anthropology

Deutscher Platz 6

D-04103 Leipzig

<a class="moz-txt-link-freetext" href="https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/">https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/</a></pre>

  </body>

</html>