<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Dear Robert,</p>
    <p>Thanks for the feedback. <span class="Y0NH2b CLPzrc">You can find our XML
        schema definition here<b>:</b></span> <a
        class="moz-txt-link-freetext"
        href="https://typecraft.org/typecraft.xsd">https://typecraft.org/typecraft.xsd
        <span class="Y0NH2b CLPzrc"><br>
        </span></a></p>
    <p><span class="Y0NH2b CLPzrc">We started developing our
        IGT-XML (TC-XML) in 2006/7; at that time XIGT was not around
        yet. As far as I recall, XIGT was first presented in 2014. <br>
      </span></p>
    <p>The most common IGT type is the basic three-line interlinear
      format, which can also be exported from TypeCraft. Our
      Akan data is additionally part-of-speech tagged. The TypeCraft
      editor allows annotations on several tiers, which is also
      reflected in our XML. <br>
    </p>
    <p>I agree with you; it is a good idea to also offer a CSV format.
      We do not do that at the moment, although it is an option, since
      we work with a PostgreSQL database.</p>
    <p>Best,</p>
    <p>Dorothee<br>
    </p>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 4 April 2018 11:19, Robert Forkel
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:1afc924d-5b7f-2bef-ccf1-4dc917b349a1@shh.mpg.de">
      Dear Dorothee,<br>
      I just had a brief look at the Akan corpus. I'd be curious what
      guided your decision to come up with a custom XML-based export
      format. The namespace URL <br>
      <pre id="line1"><span><a class="attribute-value" moz-do-not-send="true">http://typecraft.org/typecraft</a></span></pre>
      doesn't seem to resolve, so I guess there is no schema defining
      the XML, right? We included (very basic) support for IGT in CLDF
      (see <a class="moz-txt-link-freetext"
        href="https://github.com/cldf/cldf/tree/master/components/examples"
        moz-do-not-send="true">https://github.com/cldf/cldf/tree/master/components/examples</a>),
      because<br>
      - the examples we found in databases like WALS could be modeled in
      this simplistic form,<br>
      - CSV is better suited than XML to tools like version control, and<br>
      - we wanted to have IGT data available in the same format
      framework as other linguistic data to make links between data
      homogeneous.<br>
      <br>
      We also discussed other IGT formats (see <a
        class="moz-txt-link-freetext"
        href="https://github.com/cldf/cldf/issues/10"
        moz-do-not-send="true">https://github.com/cldf/cldf/issues/10</a>),
      among them XIGT (<a class="moz-txt-link-freetext"
        href="https://github.com/xigt/xigt" moz-do-not-send="true">https://github.com/xigt/xigt</a>),
      which is also an XML format. Did you look at XIGT, and if so, why
      was it not suitable as an export format for TypeCraft?<br>
      <br>
      best<br>
      robert<br>
      <br>
      <br>
      <div class="moz-cite-prefix">On 25.03.2018 16:51, Dorothee
        Beermann wrote:<br>
      </div>
      <blockquote type="cite"
        cite="mid:e92280a6-770e-0bb6-c4cd-000f8a36cb7e@ntnu.no">
        <p>Dear all,</p>
        <p>I have followed the discussion on this thread with interest.
          Let me ask you, would any of what you discuss and suggest here
          also apply to Interlinear Glossed Data?<br>
        </p>
        <p>Sebastian talked about making "typological research more
          replicable". A related issue is reproducible research in
          linguistics. I guess a good starting point for whatever we do as
          linguists is to keep things transparent, and to give public
          access to data collections.<br>
        </p>
        <div class="moz-forward-container">
          <p>Especially for languages with little to no public resources
            (except for what one finds in articles), this seems
            essential.<br>
          </p>
          <p>Here is an example of what I have in mind:  We just
            released 41 Interlinear Glossed Texts in Akan. The data can
            be downloaded as XML from:</p>
          <p><a class="moz-txt-link-freetext"
              href="https://typecraft.org/tc2wiki/The_TypeCraft_Akan_Corpus"
              moz-do-not-send="true">https://typecraft.org/tc2wiki/The_TypeCraft_Akan_Corpus</a><br>
          </p>
          The corpus is described on the download page, and also in the
          notes contained in the download. (Note that we can offer the
          material in several other formats.) <br>
          <br>
          <br>
          Dorothee <br>
          <br>
          <font color="#999999" size="-1">Professor Dorothee Beermann,
            PhD<br>
            Norwegian University of Science and Technology (NTNU)<br>
            Dept. of Language and Literature<br>
            Surface mail to: NO-7491 Trondheim, Norway/Norge<br>
            <br>
            Visit: Building 4, level 5, room 4512, Dragvoll,<br>
            E-mail:  <a class="moz-txt-link-abbreviated"
              href="mailto:dorothee.beermann@ntnu.no"
              moz-do-not-send="true">dorothee.beermann@ntnu.no</a><br>
            <br>
            Homepage:<a class="moz-txt-link-freetext"
              href="http://www.ntnu.no/ansatte/dorothee.beermann"
              moz-do-not-send="true">http://www.ntnu.no/ansatte/dorothee.beermann</a><br>
            TypeCraft:<a class="moz-txt-link-freetext"
              href="http://typecraft.org/tc2wiki/User:Dorothee_Beermann"
              moz-do-not-send="true">http://typecraft.org/tc2wiki/User:Dorothee_Beermann</a><br>
          </font><br>
          <br>
          <br>
          <br>
          <br>
          -------- Forwarded Message --------
          <table class="moz-email-headers-table" border="0"
            cellspacing="0" cellpadding="0">
            <tbody>
              <tr>
                <th valign="BASELINE" align="RIGHT" nowrap="nowrap">Subject:
                </th>
                <td>Re: [Lingtyp] Empirical standards in typology:
                  incentives</td>
              </tr>
              <tr>
                <th valign="BASELINE" align="RIGHT" nowrap="nowrap">Date:
                </th>
                <td>Fri, 23 Mar 2018 11:59:18 +1100</td>
              </tr>
              <tr>
                <th valign="BASELINE" align="RIGHT" nowrap="nowrap">From:
                </th>
                <td>Hedvig Skirgård <a class="moz-txt-link-rfc2396E"
                    href="mailto:hedvig.skirgard@gmail.com"
                    moz-do-not-send="true">&lt;hedvig.skirgard@gmail.com&gt;</a></td>
              </tr>
              <tr>
                <th valign="BASELINE" align="RIGHT" nowrap="nowrap">To:
                </th>
                <td>Johanna NICHOLS <a class="moz-txt-link-rfc2396E"
                    href="mailto:johanna@berkeley.edu"
                    moz-do-not-send="true">&lt;johanna@berkeley.edu&gt;</a></td>
              </tr>
              <tr>
                <th valign="BASELINE" align="RIGHT" nowrap="nowrap">CC:
                </th>
                <td>Linguistic Typology <a
                    class="moz-txt-link-rfc2396E"
                    href="mailto:lingtyp@listserv.linguistlist.org"
                    moz-do-not-send="true">&lt;lingtyp@listserv.linguistlist.org&gt;</a></td>
              </tr>
            </tbody>
          </table>
          <br>
          <br>
          <div dir="ltr">Dear all, 
            <div><br>
            </div>
            <div>I think Sebastian's suggestion is very good. </div>
            <div><br>
            </div>
            <div>Is this something LT would consider, Masja?</div>
            <div><br>
            </div>
            <div>Johanna's point is good as well, but it shouldn't
              matter for Sebastian's suggestion as I understand it.
              We're not being asked to submit the coding criteria prior
              to the survey being completed, but only at the time of
              publication. There are initiatives in STEM that encourage
              research teams to submit what they're planning to do prior
              to doing it (to avoid biases), but that's not baked into
              what Sebastian is suggesting, from what I can tell.</div>
            <div><br>
            </div>
            <div>I would also add a four-star category which includes
              inter-coder reliability tests, i.e. the original author(s)
              have given different people the same instructions and
              tested how often they do the same thing with the same
              grammar.</div>
            <div><br>
            </div>
            <div>/Hedvig</div>
          </div>
          <div class="gmail_extra"><br clear="all">
            <div>
              <div class="gmail_signature"
                data-smartmail="gmail_signature">
                <div dir="ltr">
                  <div>
                    <div dir="ltr">
                      <div dir="ltr">
                        <div dir="ltr">
                          <div dir="ltr">
                            <div dir="ltr">
                              <div dir="ltr">
                                <div dir="ltr">
                                  <div dir="ltr">
                                    <div dir="ltr">
                                      <div dir="ltr">
                                        <p style="margin:0cm 0cm
                                          0.0001pt;font-size:11pt;font-family:Calibri,sans-serif"><span
                                            style="font-size:9pt"><b><br>
                                            </b></span></p>
                                        <p style="margin:0cm 0cm
                                          0.0001pt"><font size="2"
                                            face="arial, helvetica,
<b>Med">
                                            sans-serif"><b>Kind regards</b><b>,</b><br>
                                          </font></p>
                                        <p style="margin:0cm 0cm
                                          0.0001pt"><b><font size="2"
                                              face="arial, helvetica,
                                              sans-serif">Hedvig
                                              Skirgård</font></b></p>
                                        <p style="margin:0cm 0cm
                                          0.0001pt"><br>
                                        </p>
                                        <p style="margin:0cm 0cm
                                          0.0001pt"><font size="1"><span
style="font-family:verdana,sans-serif;color:rgb(0,0,0)">PhD Candidate</span><br>
                                          </font></p>
                                        <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                          0cm 0.0001pt"><span
                                            style="font-family:verdana,sans-serif"><font
                                              size="1">The Wellsprings
                                              of Linguistic Diversity</font></span></p>
                                        <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                          0cm 0.0001pt"><font size="1"
                                            face="verdana, sans-serif">ARC
                                            Centre of Excellence for the
                                            Dynamics of Language</font></p>
                                        <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                          0cm 0.0001pt"><font size="1"
                                            face="verdana, sans-serif">School
                                            of Culture, History and
                                            Language<br>
                                            College of Asia and the
                                            Pacific</font></p>
                                        <p
style="color:rgb(0,0,0);font-family:Verdana,Helvetica,Arial,sans-serif;margin:0cm
                                          0cm 0.0001pt"><font size="1"
                                            face="verdana, sans-serif">The
                                            Australian
                                            National University</font></p>
                                        <p style="margin:0cm 0cm
                                          0.0001pt"><font
                                            color="#666666" size="1"
                                            face="arial, helvetica,
                                            sans-serif"><a
                                              href="https://sites.google.com/site/hedvigskirgard/"
                                              target="_blank"
                                              moz-do-not-send="true">Website</a><br>
                                          </font></p>
                                        <div><br>
                                        </div>
                                        <p style="margin:0cm 0cm
                                          0.0001pt"><br>
                                        </p>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
            <br>
            <div class="gmail_quote">2018-03-23 0:49 GMT+11:00 Johanna
              NICHOLS <span dir="ltr">&lt;<a
                  href="mailto:johanna@berkeley.edu" target="_blank"
                  moz-do-not-send="true">johanna@berkeley.edu</a>&gt;</span>:<br>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                <div dir="ltr">
                  <div>What's in the codebook -- the coding categories
                    and the criteria?  That much is usually in the body
                    of the paper.<br>
                    <br>
                  </div>
                  <div>Also, a minor but I think important point: 
                    Ordinarily the codebook doesn't in fact
                    chronologically precede the spreadsheet.  A draft or
                    early version of it does, and that gets revised many
                    times as you run into new and unexpected things. 
                    (And every previous entry in the spreadsheet gets
                    checked and edited too.)  By the time you've
                    finished your survey the categories and typology can
                    look different from what you started with.  You
                    publish when you're comfortably past the point of
                    diminishing returns.  In most sciences this is bad
                    method, but in linguistics it's common and I'd say
                    normal.  The capacity to handle it needs to be built
                    into the method in advance.  <br>
                  </div>
                  <span class="HOEnZb"><font color="#888888">
                      <div><br>
                      </div>
                      Johanna<br>
                    </font></span></div>
                <div class="HOEnZb">
                  <div class="h5">
                    <div class="gmail_extra"><br>
                      <div class="gmail_quote">On Thu, Mar 22, 2018 at
                        2:10 PM, Sebastian Nordhoff <span dir="ltr">&lt;<a
href="mailto:sebastian.nordhoff@glottotopia.de" target="_blank"
                            moz-do-not-send="true">sebastian.nordhoff@<wbr>glottotopia.de</a>&gt;</span>
                        wrote:<br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">Dear all,<br>
                          taking up a thread from last November, I would
                          like to start a<br>
                          discussion about how to make typological
                          research more replicable, where<br>
                          replicable means "less dependent on the
                          original researcher". This<br>
                          includes coding decisions, tabular data,
                          quantitative analyses etc.<br>
                          <br>
                          Volker Gast wrote (full quote at bottom of
                          mail):<br>
                          > Let's assume that self-annotation cannot
                          be avoided for financial<br>
                          > reasons. What about establishing a
                          standard saying that, for instance,<br>
                          > when you submit a
                          quantitative-typological paper to LT you have
                          to<br>
                          > provide the data in such a way that the
                          coding decisions are made<br>
                          > sufficiently transparent for readers to
                          see if they can go along with<br>
                          > the argument?<br>
                          <br>
                          I see two possibilities for that: Option 1:
                          editors will refuse papers<br>
                          which do not adhere to this standard. That
                          will not work in my view.<br>
                          What might work (Option 2) is a star/badge
                          system. I could imagine the<br>
                          following:<br>
                          <br>
                          - no stars: only standard bibliographical
                          references<br>
                          - *         raw tabular data (spreadsheet)
                          available as a supplement<br>
                          - **        as above, + code book available as
                          a supplement<br>
                          - ***       as above, + computer code in R or
                          similar available<br>
                          <br>
                          For a three-star article, an unrelated
                          researcher could then take the<br>
                          original grammars and the code book and
                          replicate the spreadsheet to see<br>
                          if it matches. They could then run the
                          computer code to see if they<br>
                          arrive at the same results.<br>
                          <br>
                          This will not be practical for every research
                          project, but some might<br>
                          find it easier than others, and, in the long
                          run, it will require good<br>
                          arguments to submit a 0-star (i.e.
                          non-replicable) quantitative article.<br>
                          <br>
                          Any thoughts?<br>
                          Sebastian<br>
                          <br>
                          PS: Note that the codebook would actually
                          chronologically precede the<br>
                          spreadsheet, but I feel that spreadsheets are
                          more easily available than<br>
                          codebooks, so in order to keep the entry
                          barrier low, this order is<br>
                          reversed for the stars.<br>
                          <br>
                        </blockquote>
                      </div>
                    </div>
                  </div>
                </div>
                <br>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
        <br>
        <fieldset class="mimeAttachmentHeader"></fieldset>
        <br>
        <pre wrap="">_______________________________________________
Lingtyp mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Lingtyp@listserv.linguistlist.org" moz-do-not-send="true">Lingtyp@listserv.linguistlist.org</a>
<a class="moz-txt-link-freetext" href="http://listserv.linguistlist.org/mailman/listinfo/lingtyp" moz-do-not-send="true">http://listserv.linguistlist.org/mailman/listinfo/lingtyp</a>
</pre>
      </blockquote>
      <br>
      <br>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Professor Dorothee Beermann, PhD
Norwegian University of Science and Technology (NTNU)
Dept. of Language and Literature
Surface mail to: NO-7491 Trondheim, Norway/Norge

Visit: Building 4, level 5, room 4512, Dragvoll,
Tel.:    +47 73 596525
E-mail:  <a class="moz-txt-link-abbreviated" href="mailto:dorothee.beermann@ntnu.no">dorothee.beermann@ntnu.no</a>

Homepage:<a class="moz-txt-link-freetext" href="http://www.ntnu.no/ansatte/dorothee.beermann">http://www.ntnu.no/ansatte/dorothee.beermann</a>
TypeCraft:<a class="moz-txt-link-freetext" href="http://typecraft.org/tc2wiki/User:Dorothee_Beermann">http://typecraft.org/tc2wiki/User:Dorothee_Beermann</a>


</pre>
  </body>
</html>