<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Dear Dorothee,<br>
    I just had a brief look at the Akan corpus. I'd be curious what
    guided your decision to come up with a custom XML-based export
    format. The namespace URL <br>
    <pre id="line1"><span><a class="attribute-value">http://typecraft.org/typecraft</a></span></pre>
    doesn't seem to resolve, so I guess there is no schema defining the
    XML, right? We included (very basic) support for IGT in CLDF (see
    <a class="moz-txt-link-freetext" href="https://github.com/cldf/cldf/tree/master/components/examples">https://github.com/cldf/cldf/tree/master/components/examples</a>),
    because<br>
    - the examples we found in databases like WALS could be modeled in
    this simplistic form,<br>
    - CSV is better suited than XML to tools like version control, and<br>
    - we wanted to have IGT data available in the same format framework
    as other linguistic data, to make links between data homogeneous.<br>
    <br>
    We also discussed other IGT formats (see
    <a class="moz-txt-link-freetext" href="https://github.com/cldf/cldf/issues/10">https://github.com/cldf/cldf/issues/10</a>), among them XIGT
    (<a class="moz-txt-link-freetext" href="https://github.com/xigt/xigt">https://github.com/xigt/xigt</a>), which is also an XML format. Did you
    look at XIGT, and if so, why was it not suitable as an export format
    for TypeCraft?<br>
    <br>
    best<br>
    robert<br>
    <br>
    <br>
    <div class="moz-cite-prefix">On 25.03.2018 16:51, Dorothee Beermann
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:e92280a6-770e-0bb6-c4cd-000f8a36cb7e@ntnu.no">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <p>Dear all,</p>
      <p>I have followed the discussion on this thread with interest.
        Let me ask you, would any of what you discuss and suggest here
        also apply to Interlinear Glossed Data?<br>
      </p>
      <p>Sebastian talked about making "typological research more
        replicable". A related issue is reproducible research in
        linguistics. I guess a good starting point for whatever we do as
        linguists is to keep things<br>
      </p>
      <div class="moz-forward-container">
        <p>transparent, and to give public access to data collections.
          Especially for languages with little to no public resources
          (except for what one finds in articles), this seems essential.<br>
        </p>
        <p>Here is an example of what I have in mind:  We just released
          41 Interlinear Glossed Texts in Akan. The data can be
          downloaded as XML from:</p>
        <p><a class="moz-txt-link-freetext"
            href="https://typecraft.org/tc2wiki/The_TypeCraft_Akan_Corpus"
            moz-do-not-send="true">https://typecraft.org/tc2wiki/The_TypeCraft_Akan_Corpus</a><br>
        </p>
        The corpus is described on the download page, and also in the
        notes contained in the download. (Note that we can offer the
        material in several other formats.) <br>
        <br>
        <br>
        Dorothee <br>
        <br>
        <font color="#999999" size="-1">Professor Dorothee Beermann, PhD<br>
          Norwegian University of Science and Technology (NTNU)<br>
          Dept. of Language and Literature<br>
          Surface mail to: NO-7491 Trondheim, Norway/Norge<br>
          <br>
          Visit: Building 4, level 5, room 4512, Dragvoll,<br>
          E-mail:  <a class="moz-txt-link-abbreviated"
            href="mailto:dorothee.beermann@ntnu.no"
            moz-do-not-send="true">dorothee.beermann@ntnu.no</a><br>
          <br>
          Homepage:<a class="moz-txt-link-freetext"
            href="http://www.ntnu.no/ansatte/dorothee.beermann"
            moz-do-not-send="true">http://www.ntnu.no/ansatte/dorothee.beermann</a><br>
          TypeCraft:<a class="moz-txt-link-freetext"
            href="http://typecraft.org/tc2wiki/User:Dorothee_Beermann"
            moz-do-not-send="true">http://typecraft.org/tc2wiki/User:Dorothee_Beermann</a><br>
        </font><br>
        <br>
        <br>
        <br>
        <br>
        -------- Forwarded Message --------
        <table class="moz-email-headers-table" border="0"
          cellspacing="0" cellpadding="0">
          <tbody>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:
              </th>
              <td>Re: [Lingtyp] Empirical standards in typology:
                incentives</td>
            </tr>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date:
              </th>
              <td>Fri, 23 Mar 2018 11:59:18 +1100</td>
            </tr>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">From:
              </th>
              <td>Hedvig Skirgård <a class="moz-txt-link-rfc2396E"
                  href="mailto:hedvig.skirgard@gmail.com"
                  moz-do-not-send="true">&lt;hedvig.skirgard@gmail.com&gt;</a></td>
            </tr>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">To: </th>
              <td>Johanna NICHOLS <a class="moz-txt-link-rfc2396E"
                  href="mailto:johanna@berkeley.edu"
                  moz-do-not-send="true">&lt;johanna@berkeley.edu&gt;</a></td>
            </tr>
            <tr>
              <th nowrap="nowrap" valign="BASELINE" align="RIGHT">CC: </th>
              <td>Linguistic Typology <a class="moz-txt-link-rfc2396E"
                  href="mailto:lingtyp@listserv.linguistlist.org"
                  moz-do-not-send="true">&lt;lingtyp@listserv.linguistlist.org&gt;</a></td>
            </tr>
          </tbody>
        </table>
        <br>
        <br>
        <div dir="ltr">Dear all, 
          <div><br>
          </div>
          <div>I think Sebastian's suggestion is very good. </div>
          <div><br>
          </div>
          <div>Is this something LT would consider, Masja?</div>
          <div><br>
          </div>
          <div>Johanna's point is good as well, but it shouldn't matter
            for Sebastian's suggestion as I understand it. We're not
            being asked to submit the coding criteria prior to the
            survey being completed, but only at the time of publication.
            There are initiatives in STEM that encourage research teams
            to submit what they're planning to do prior to doing it (to
            avoid biases), but that's not baked into what Sebastian is
            suggesting, from what I can tell.</div>
          <div><br>
          </div>
          <div>I would also add a 4-star category which includes
            inter-coder reliability tests, i.e. the original author(s)
            have given different people the same instructions and tested
            how often they do the same thing with the same grammar.</div>
          <div><br>
          </div>
          <div>/Hedvig</div>
        </div>
        <div class="gmail_extra"><br clear="all">
          <div>
            <div class="gmail_signature"
              data-smartmail="gmail_signature">
              <div dir="ltr">
                <div>
                  <div dir="ltr">
                    <div dir="ltr">
                      <div dir="ltr">
                        <div dir="ltr">
                          <div dir="ltr">
                            <div dir="ltr">
                              <div dir="ltr">
                                <div dir="ltr">
                                  <div dir="ltr">
                                    <div dir="ltr">
                                      <p style="margin:0cm 0cm
                                        0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span
                                          style="font-size:9pt"><b><br>
                                          </b></span></p>
                                      <p style="margin:0cm 0cm 0.0001pt"><font
                                          face="arial, helvetica,
                                          sans-serif" size="2"><b>Med
                                            vänliga hälsningar</b><b>,</b><br>
                                        </font></p>
                                      <p style="margin:0cm 0cm 0.0001pt"><b><font
                                            face="arial, helvetica,
                                            sans-serif" size="2">Hedvig
                                            Skirgård</font></b></p>
                                      <p style="margin:0cm 0cm 0.0001pt"><br>
                                      </p>
                                      <p style="margin:0cm 0cm 0.0001pt"><font
                                          size="1"><span
                                            style="font-family:verdana,sans-serif;color:rgb(0,0,0)">PhD
                                            Candidate</span><br>
                                        </font></p>
                                      <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                        0cm 0.0001pt"><span
                                          style="font-family:verdana,sans-serif"><font
                                            size="1">The Wellsprings of
                                            Linguistic Diversity</font></span></p>
                                      <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                        0cm 0.0001pt"><font
                                          face="verdana, sans-serif"
                                          size="1">ARC Centre of
                                          Excellence for the Dynamics of
                                          Language</font></p>
                                      <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                        0cm 0.0001pt"><font
                                          face="verdana, sans-serif"
                                          size="1">School of Culture,
                                          History and Language<br>
                                          College of Asia and the
                                          Pacific</font></p>
                                      <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                        0cm 0.0001pt"><font
                                          face="verdana, sans-serif"
                                          size="1">The Australian
                                          National University</font></p>
                                      <p style="margin:0cm 0cm 0.0001pt"><font
                                          color="#666666" face="arial,
                                          helvetica, sans-serif"
                                          size="1"><a
                                            href="https://sites.google.com/site/hedvigskirgard/"
                                            target="_blank"
                                            moz-do-not-send="true">Website</a><br>
                                        </font></p>
                                      <div><br>
                                      </div>
                                      <p style="margin:0cm 0cm 0.0001pt"><br>
                                      </p>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
          <br>
          <div class="gmail_quote">2018-03-23 0:49 GMT+11:00 Johanna
            NICHOLS <span dir="ltr">&lt;<a
                href="mailto:johanna@berkeley.edu" target="_blank"
                moz-do-not-send="true">johanna@berkeley.edu</a>&gt;</span>:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div dir="ltr">
                <div>What's in the codebook -- the coding categories and
                  the criteria?  That much is usually in the body of the
                  paper.<br>
                  <br>
                </div>
                <div>Also, a minor but I think important point: 
                  Ordinarily the codebook doesn't in fact
                  chronologically precede the spreadsheet.  A draft or
                  early version of it does, and that gets revised many
                  times as you run into new and unexpected things.  (And
                  every previous entry in the spreadsheet gets checked
                  and edited too.)  By the time you've finished your
                  survey the categories and typology can look different
                  from what you started with.  You publish when you're
                  comfortably past the point of diminishing returns.  In
                  most sciences this is bad method, but in linguistics
                  it's common and I'd say normal.  The capacity to
                  handle it needs to be built into the method in
                  advance.  <br>
                </div>
                <span class="HOEnZb"><font color="#888888">
                    <div><br>
                    </div>
                    Johanna<br>
                  </font></span></div>
              <div class="HOEnZb">
                <div class="h5">
                  <div class="gmail_extra"><br>
                    <div class="gmail_quote">On Thu, Mar 22, 2018 at
                      2:10 PM, Sebastian Nordhoff <span dir="ltr">&lt;<a
href="mailto:sebastian.nordhoff@glottotopia.de" target="_blank"
                          moz-do-not-send="true">sebastian.nordhoff@<wbr>glottotopia.de</a>&gt;</span>
                      wrote:<br>
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex">Dear all,<br>
                        taking up a thread from last November, I would
                        like to start a<br>
                        discussion about how to make typological
                        research more replicable, where<br>
                        replicable means "less dependent on the original
                        researcher". This<br>
                        includes coding decisions, tabular data,
                        quantitative analyses etc.<br>
                        <br>
                        Volker Gast wrote (full quote at bottom of
                        mail):<br>
                        > Let's assume that self-annotation cannot be
                        avoided for financial<br>
                        > reasons. What about establishing a standard
                        saying that, for instance,<br>
                        > when you submit a quantitative-typological
                        paper to LT you have to<br>
                        > provide the data in such a way that the
                        coding decisions are made<br>
                        > sufficiently transparent for readers to see
                        if they can go along with<br>
                        > the argument?<br>
                        <br>
                        I see two possibilities for that: Option 1:
                        editors will refuse papers<br>
                        which do not adhere to this standard. That will
                        not work in my view.<br>
                        What might work (Option 2) is a star/badge
                        system. I could imagine the<br>
                        following:<br>
                        <br>
                        - no stars: only standard bibliographical
                        references<br>
                        - *         raw tabular data (spreadsheet)
                        available as a supplement<br>
                        - **        as above, + code book available as a
                        supplement<br>
                        - ***       as above, + computer code in R or
                        similar available<br>
                        <br>
                        For a three-star article, an unrelated
                        researcher could then take the<br>
                        original grammars and the code book and
                        replicate the spreadsheet to see<br>
                        if it matches. They could then run the computer
                        code to see if they<br>
                        arrive at the same results.<br>
                        <br>
                        This will not be practical for every research
                        project, but some might<br>
                        find it easier than others, and, in the long
                        run, it will require good<br>
                        arguments to submit a 0-star (i.e.
                        non-replicable) quantitative article.<br>
                        <br>
                        Any thoughts?<br>
                        Sebastian<br>
                        <br>
                        PS: Note that the codebook would actually
                        chronologically precede the<br>
                        spreadsheet, but I feel that spreadsheets are
                        more easily available than<br>
                        codebooks, so in order to keep the entry barrier
                        low, this order is<br>
                        reversed for the stars.<br>
                        <br>
                      </blockquote>
                    </div>
                  </div>
                </div>
              </div>
              <br>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
Lingtyp mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Lingtyp@listserv.linguistlist.org">Lingtyp@listserv.linguistlist.org</a>
<a class="moz-txt-link-freetext" href="http://listserv.linguistlist.org/mailman/listinfo/lingtyp">http://listserv.linguistlist.org/mailman/listinfo/lingtyp</a>
</pre>
    </blockquote>
    <br>
  </body>
</html>