<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class="">Dear Volker,</div><div class=""><br class=""></div><div class="">  I think most typologists are aware that (i) defining categories for coding is very hard, especially across languages -- hence all the discussions about comparative concepts on Lingtyp (some of which have subsequently been published in some form in Linguistic Typology), of which this discussion of ‘word’ is only the latest; and (ii) that typologists must usually operationalize those criteria and make the operationalizations as explicit as possible. I think that (i) and (ii) are fairly common practice in typology, despite my previous comments about essentialism and methodological opportunism (cherry-picking of criteria).</div><div class=""><br class=""></div><div class="">   On the other hand, your point about mono-annotator annotation is well taken. Nevertheless, the operational factor is this one:</div><blockquote type="cite" class=""><div text="#000000" bgcolor="#FFFFFF" class=""><p class="">And I'm not saying that mono-annotator projects are useless, sometimes you just don't have the manpower for multi-annotator projects </p></div></blockquote><div>  I have recently been working on computational projects that involve annotation, and even there, where there is a lot more large-scale funding than in typology, it is very expensive to hire and train annotators, and in the end there are maybe two annotators and a third person acting as adjudicator for a pilot annotation at most. 
(In fact, most of the effort in computational linguistics is towards training classifiers to do the annotation automatically on large corpora, and in my limited experience those are often worse than mono-annotator annotations.)</div><div><br class=""></div><div>    In typology, there is virtually no funding for any sort of multi-annotator annotation whatsoever. This is especially true for graduate students doing typological dissertations, but also for faculty doing typological research. I would guess that many typologists are aware that multi-annotator annotation is preferable, but impractical. But we don’t normally add a statement like “We are aware that engaging multiple annotators would improve the reliability of our coding and hence of the results of our crosslinguistic study; but due to lack of funding, all annotation of the data was performed by the author.” Perhaps we typologists should start adding such statements.</div><div><br class=""></div><div>Best wishes,</div><div>Bill</div><div><br class=""><blockquote type="cite" class=""><div class="">On Nov 18, 2017, at 6:32 AM, Volker Gast <<a href="mailto:volker.gast@uni-jena.de" class="">volker.gast@uni-jena.de</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">


  <div text="#000000" bgcolor="#FFFFFF" class=""><p class=""><br class="">

    </p><p class="">Hi Johanna, even if I could do this diplomatically, I wouldn't,

      and I think it wouldn't make much sense, as my point is not about

      specific publications or authors; it's about common practice (and

      common practice is reflected in the publications of 'major

      authorities'). But I think I get your point; so let me be a bit

      more specific.</p><p class="">A lot of (quantitative) typological work relies on 'coding':

      Information is extracted from grammars and transformed into a data

      matrix. Now, it is common practice (and I'm not excluding myself

      here) for the coding to be done by the analyst him/herself, and by

      no one else. But that's considered bad practice in other fields.

      Ideally, you'd need a team of annotators coding independently, on

      the basis of annotation guidelines. The team codes a sample,

      determines inter-annotator agreement, and adjusts/specifies the

      annotation guidelines where necessary. This is done until the

      inter-annotator agreement is satisfactory. And then you can start

      with the actual coding. Ideally, the analyst shouldn't be involved

      in the coding process, as her annotation decisions might be

      (subconsciously) influenced by her working hypotheses. (Note that

      this might be a viable solution to the question of how comparative

      concepts can reliably be defined, for a given study; you can just

      measure how much inter-annotator variation there is; whether or

      not the operationalizations make sense is a different question, of

      course, one of validity. When you use a set of criteria

      disjunctively, the question is what exactly your

      operationalizations are intended to represent.)<br class="">

    </p><p class="">Note that I'm not saying that there are no multi-annotator

      projects in typology (I'm actually involved in two such projects,

      though one of them is actually a comparative corpus linguistics

      project); but as far as I can tell, it is 'basically' common

      practice for analysts to code the data themselves. And I'm not

      saying that mono-annotator projects are useless, sometimes you

      just don't have the manpower for multi-annotator projects (and one

      of the multi-annotator projects I'm involved in was really

      painful; but it was instructive to see that even for categories

      that we thought we had defined rather clearly, inter-annotator

      agreement was rather low in some cases). But as I said earlier, it

      would be nice to have some standards or at least general

      guidelines for coding typological data. Minimally, I think, the

      data should be published, along with at least some information on

      the operational tests that were applied, even if done by a single

      annotator.<br class="">

    </p><p class="">I hope this clarifies my (too general) remarks in my previous

      post.<br class="">

      Volker<br class="">

    </p>

    <br class="">

    <div class="moz-cite-prefix">Am 18.11.2017 um 13:27 schrieb Johanna

      NICHOLS:<br class="">

    </div>

    <blockquote type="cite" cite="mid:CAHDpjwqRrP9PV8QRwsJJcakZFfzxDHpdHxikP+pCVf6gtd_7Mw@mail.gmail.com" class="">

      <div dir="ltr" class="">

        <div class="">

          <div class="">Volker,  <br class="">

            <br class="">

          </div>

          If there's a way to do this diplomatically, could you cite an

          example or two of  "important publications by major

          authorities of the field where these criteria are simply not

          applied"?   In linguistics we don't have as much technical

          comment on publications as some other fields do, and maybe we

          should.  In journals where I see technical comments sections

          those comments are refereed, edited, brief, and focused on

          factual and methodological matters, i.e. about empirical

          fundamentals and not debate on theoretical frameworks.</div>

        <div class=""><br class="">

        </div>

        <div class="">If there's no way to do it diplomatically, never mind.<br class="">

        </div>

        <div class=""><br class="">

        </div>

        Johanna<br class="">

      </div>

      <div class="gmail_extra"><br class="">

        <div class="gmail_quote">On Sat, Nov 18, 2017 at 12:37 PM,

          Volker Gast <span dir="ltr" class=""><<a href="mailto:volker.gast@uni-jena.de" target="_blank" moz-do-not-send="true" class="">volker.gast@uni-jena.de</a>></span>

          wrote:<br class="">

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div text="#000000" bgcolor="#FFFFFF" class=""><p class="">Matthew -- are you saying that "one cannot rule out

                disjunctively defined comparative concepts" because this

                is what you did?</p><p class="">I am not convinced by "disjunctive comparative

                concepts". Now, that's nothing for you to worry about --

                I'm just one reader (actually, an audience member at your

                ALT 2015 talk) who doesn't buy your conclusions because

                he doesn't accept your operationalizations.</p><p class="">But if we want "to talk TO each other (not only PAST

                each other)", as Martin writes, it would be good to have

                what other fields call "standards of empirical

                research". We have copied a lot of statistical methods

                from fields such as the social sciences and biology. I

                think it would also be beneficial to take a look at

                their standards at the "lower" level -- for instance,

                wrt how data is gathered, processed and classified, how

                hypotheses are operationalized, etc., to make sure that

                the results obtained by somebody are also accepted by

                others (just think of the 5%-threshold for statistical

                significance, which is just a matter of convention).<br class="">

              </p><p class="">I'm aware that this type of remark is annoying for some

                of you. I teach both corpus linguistics and typology. In

                corpus linguistics our students deal with very basic

                questions of empirical research -- like the traditional

                'quality criteria' -- e.g. (external, internal)

                validity, objectivity, reliability -- and then, in

                typology, we read important publications by major

                authorities of the field where these criteria are simply

                not applied, sometimes the statistics are faulty, and

                students do enquire about this. What can I say? There

                are no research standards in typology? There is an

                ongoing discussion about "arbitrary/subjective/random/<wbr class="">disjunctive

                comparative concepts" on the Lingtyp list? I'm afraid

                it wouldn't convince them. What I say is that typology

                still has some way to go in terms of research

                methods. There are many non-trivial problems, as we have

                seen in various discussions on this list, and we should

                be aware that linguistic data is sui generis (for

                instance, I think we can't adopt just any

                method/software package from genetics). But we shouldn't

                use "authority" as a criterion in our methodological

                choices, and the choices shouldn't be made in such a way

                as to legitimize our own research 'ex post'.<br class="">

              </p><p class="">Volker<br class="">

              </p>

              <br class="">

              <div class="m_-3772461755557999611moz-cite-prefix">Am

                18.11.2017 um 07:36 schrieb Dryer, Matthew:<br class="">

              </div>

              <blockquote type="cite" class="">

                <div class=""><p class="MsoNormal">With respect to Martin’s comment</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">“<span style="font-size:14.0pt;font-family:Calibri" class="">It is

                      my impression that such ortho-affixes (= forms

                      written as affixes) are perhaps even more common

                      than “phonologically weak” ortho-affixes, but this

                      is an empirical question (in his 2015 ALT

                      abstract, Matthew mentions 248 languages with weak

                      affixes, but 308 languages with only affixes of

                      the Tauya type, apparently confirming my

                      impression).</span>”</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">I realize that this is a

                    reasonable inference from my abstract, but one often

                    has to simplify things for the purposes of an

                    abstract. My definition of a weak affix is very

                    narrow and many if not most affixes that are not

                    weak affixes by my narrow criteria can still be

                    shown to be attached phonologically by broader

                    criteria. Furthermore, I also treat a morpheme as an

                    affix for the purposes of this study if it triggers

                    phonologically conditioned allomorphy in stems it

                    attaches to, and it is clear from MacDonald’s

                    description of Tauya that some of the ortho-affixes

                    Martin mentions do trigger phonologically

                    conditioned allomorphy in stems they attach to (pp.

                    54, 72, 74, 79). </p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">I counted an affix as weak for

                    the purposes of the study in my 2015 ALT talk only

                    if the description of it in a grammar makes clear

                    that it is nonsyllabic (or has nonsyllabic

                    allomorphs) or that it exhibits phonologically conditioned

                    allomorphy or triggers phonologically conditioned

                    allomorphy in adjacent stems. But in many grammars,

                    it is only in the discussion of phonology that it

                    becomes clear that a given affix exhibits

                    phonologically conditioned allomorphy or that it

                    triggers phonologically conditioned allomorphy in

                    adjacent stems. But because I wanted to include a

                    large sample of languages and because it is often

                    unclear from discussions of phonology whether

                    particular rules apply to particular affixes or

                    stems such affixes combine with, I adopted the

                    procedure of not consulting the discussions of

                    phonology in classifying ortho-affixes as weak. This

                    made sense for my 2015 ALT talk since I was

                    examining whether there is a suffixing preference

                    and restricting attention to weak affixes so defined

                    applies equally to prefixes and suffixes. For a

                    different type of typological study, this would have

                    been inappropriate. This illustrates how comparative

                    concepts are specific to particular typological

                    studies.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Furthermore, there are other

                    factors that I did not examine that are relevant to

                    whether a given ortho-affix is attached

                    phonologically. There may be clear evidence from

                    allophonic rules, but it is often very unclear from

                    grammatical descriptions whether particular

                    allophonic rules apply to particular ortho-affixes

                    or stems to which ortho-affixes are attached. And

                    even if the information is there in the grammatical

                    description, it may take a lot of work to see

                    whether they apply to a particular affix. For

                    example, careful examination of MacDonald’s

                    description of Tauya implies that the benefactive

                    ortho-affix <i class="">-pe</i> that Martin mentions is

                    attached phonologically, since she gives examples of

                    phonetic representations of forms containing this

                    morpheme where it takes the form [-be] after /m/

                    ([tembe] on page 54).</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">There might also be evidence from

                    stress, but it may still be unclear how stress is assigned

                    to forms including ortho-affixes. For example, Tauya

                    has word-final stress, but it is not clear from

                    MacDonald’s description whether this means that

                    nouns bearing the ortho-affixes that Martin mentions

                    take stress on the ortho-affix.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Some of you may have noticed that

                    what I say here contradicts what I said in my

                    earlier email about comparative concepts needing to

                    be exhaustive. The comparative concept I used in my

                    2015 ALT talk was not exhaustive and was in fact

                    disjunctive. Since that seemed appropriate for that

                    study, this suggests that one cannot rule out

                    disjunctively defined comparative concepts. I

                    sympathize with Martin’s objecting to disjunctive

                    comparative concepts as a way to continue to use

                    confusing and ambiguous terms and I agree that there

                    is something odd about arbitrary disjunctive

                    comparative concepts, but it is a mistake to simply

                    rule out disjunctive comparative concepts.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">I should note finally that while

                    it is clear that the ortho-affixes that Martin

                    mentions are attached phonologically, they are

                    actually not affixes by either his criteria or mine

                    since they are clitics that attach to postnominal

                    modifiers. [Martin has written about problems with

                    the use of the term “clitic”. I am in complete

                    agreement with him about this. But I use the term

                    here and elsewhere in my research (including my

                    upcoming ALT talk on the encliticization preference)

                    as a label for a comparative concept for grammatical

                    morphemes that are phonologically attached but

                    attach to stems of more than one stem class.]</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Matthew</p>

                </div>

                <div class=""><br class="">

                </div>

                <span id="m_-3772461755557999611OLK_SRC_BODY_SECTION" class="">

                  <div style="font-family: Calibri; font-size: 11pt; text-align: left; border-width: 1pt medium medium; border-style: solid none none; padding: 3pt 0in 0in; border-top-color: rgb(181, 196, 223);" class="">

                    <span style="font-weight:bold" class="">From: </span>Lingtyp

                    <<a href="mailto:lingtyp-bounces@listserv.linguistlist.org" target="_blank" moz-do-not-send="true" class="">lingtyp-bounces@listserv.<wbr class="">linguistlist.org</a>>

                    on behalf of Martin Haspelmath <<a href="mailto:haspelmath@shh.mpg.de" target="_blank" moz-do-not-send="true" class="">haspelmath@shh.mpg.de</a>><br class="">

                    <span style="font-weight:bold" class="">Date: </span>Thursday,

                    November 16, 2017 at 7:14 PM<br class="">

                    <span style="font-weight:bold" class="">To: </span>"<a href="mailto:lingtyp@listserv.linguistlist.org" target="_blank" moz-do-not-send="true" class="">lingtyp@listserv.<wbr class="">linguistlist.org</a>"

                    <<a href="mailto:lingtyp@listserv.linguistlist.org" target="_blank" moz-do-not-send="true" class="">lingtyp@listserv.<wbr class="">linguistlist.org</a>><br class="">

                    <span style="font-weight:bold" class="">Subject: </span>Re:

                    [Lingtyp] wordhood: bonded vs. bound<br class="">

                  </div>

                  <div class=""><br class="">

                  </div>

                  <div class="">

                    <div bgcolor="#FFFFFF" text="#000000" class=""> Matthew Dryer

                      thinks that wordhood is generally understood by

                      grammar authors in terms of <b class="">bondedness</b> (=

                      phonological weakness, as shown by nonsyllabicity

                      and phono-conditioned allomorphy), not in terms of

                      <b class="">boundness</b> (= inability to occur in

                      isolation).

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      I don’t know if this is true, but Matthew actually

                      recognizes that grammars often describe

                      grammatical markers as “affixes” even when they do

                      not show the two “phonological weakness” (or

                      bondedness) features.

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      For example, Tauya (a language of New Guinea) is

                      said to have (syllabic) case suffixes, but these

                      never show any allomorphy, e.g.

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      fena’a-ni [woman-ERG]<br class="">

                      na-pe [you-BEN]<br class="">

                      wate-’usa [house-INESS]<br class="">

                      Aresa-nani [Aresa-ALL]<br class="">

                      Tauya-sami [Tauya-ABL] (MacDonald 1990: 119-126)

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      It is my impression that such ortho-affixes (=

                      forms written as affixes) are perhaps even more

                      common than “phonologically weak” ortho-affixes,

                      but this is an empirical question (in his 2015 ALT

                      abstract, Matthew mentions 248 languages with weak

                      affixes, but 308 languages with only affixes of

                      the Tauya type, apparently confirming my

                      impression).

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      For this reason, I have suggested that the

                      stereotypical “affix” notion should perhaps be

                      captured in terms of boundness together with

                      single-root-class adjacency. Since the Tauya

                      case-markers attach only to nouns, they count as

                      affixes; by contrast, if a bound role marker

                      attaches to both nouns (English “for children”)

                      and adjectives (“for older children”) as well as

                      to other elements (“for many children”), we do not

                      regard it as an affix (but as a preposition), even

                      if it is bound (= does not occur in isolation;

                      English "for" does not).

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      Matthew quite rightly points out that this notion

                      of boundness (which goes back at least to

                      Bloomfield 1933: §10.1) implies that most function

                      words in English are bound, and in fact most

                      function words in most languages are bound – but

                      this is exactly what we want, I feel, because the

                      best way to define a “function word” is as a bound

                      element that is not an affix. Linguists often

                      think of function words (or “functional

                      categories”) as defined semantically, but it is

                      actually very hard to say what is the

                      semantic(-pragmatic) difference between a plural

                      marker and a word like “several”, between a dual

                      marker and the word “two”, between a past-tense

                      marker and the expression “in the past”, or

                      between a comitative marker and the word

                      “accompany”. It seems to me that these

                      distinctions are best characterized in terms of

                      boundness, i.e. inability to occur in isolation.

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      It may be true that occurrence in isolation is a

                      feature of an element that is not easy to elicit

                      from speakers, but in actual language use, there

                      are a very large number of very short utterances,

                      so at least positive evidence for free status

                      (=non-bound status) is not difficult to obtain.

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      In any event, it seems clear to me that some key

                      concepts of grammatical typology such as “flag” (=

                      bound role marker on a nominal) and “person index”

                      (= bound person marker, generally on a verb)

                      require the Bloomfieldian boundness notion, and

                      that these concepts are much easier to work with

                      in typology than the traditional stereotypical

                      notions of “case”, “adposition”, “agreement

                      marker”, and “pronominal clitic”. (For bound

                      person forms, this was a major lesson of Anna

                      Siewierska’s 2004 book “Person”.)

                      <div class=""> <br class="webkit-block-placeholder"></div>

                      Best,<br class="">

                      Martin<br class="">

                      <br class="">

                      <div class="m_-3772461755557999611moz-cite-prefix">On

                        14.11.17 07:02, Dryer, Matthew wrote:<br class="">

                      </div>

                      <blockquote type="cite" class="">

                        <div class=""><p class="MsoNormal" style="font-size:14px">I

                            have a number of problems with Martin’s

                            proposal:</p><div style="font-size: 14px;" class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px"><span style="font-size:14pt" class="">"<b class="">Here’s a

                                proposal for defining a notion of

                                “affix”, in such a way that the results

                                do not go too much against our

                                intuitions or stereotypes:</b></span></p><div style="font-size: 14px;" class=""><span style="font-size:14pt" class=""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px"><b class=""><span style="font-size:14pt" class="">An affix is a

                                bound form that always occurs together

                                with a root of the same root-class and

                                is never separated from the root by a

                                free form or a non-affixal bound form."</span></b><span style="font-size:14pt" class=""></span></p><div style="font-size: 14px;" class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px">If

                            one examines the notion of “bound” from his

                            2013 paper, I believe it implies a

                            comparative concept of affix that differs

                            greatly from what most linguists (at least

                            most non-generative linguists) understand by

                            the term. That’s not a problem for it as a

                            comparative concept, but it is a comparative

                            concept that differs considerably from the

                            stereotype.</p><div style="font-size: 14px;" class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px">Martin’s

                            definition of “free” and “bound” from his

                            2013 paper is as follows:</p><div style="font-size: 14px;" class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal"><b class=""><span style="font-family:Times;font-size:18px" class="">"But

                                distinguishing in a general way between

                                bound elements and free elements is

                                quite straightforward, because there is

                                a single criterion: Free forms are forms

                                that can occur on their own, i.e. in a

                                complete (possibly elliptical) utterance

                                (Bloomfield 1933: 160). This criterion

                                correlates very highly with the

                                criterion of contrastive use: Only free

                                forms can be used contrastively."</span></b></p><div style="font-size: 14px;" class=""><span style="font-size:13.0pt;font-family:Times" class=""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px"><span style="font-family:Times" class="">First, I find

                              the notion of complete utterance

                              ambiguous. Does it mean utterances in

                              normal speech or does it include

                              metalinguistic uses (like “What is the

                              last word in the sentence “Who are you

                              going with”?” Answer: “with”)? I would

                              assume that it does not include such

                              metalinguistic uses. But then many if not

                              most so-called function words in English

                              would count as bound since they cannot be

                              used as complete utterances. Perhaps other

                              speakers of English would have different

                              intuitions, but if so that only indicates

                              the lack of clarity in the notion.

                              Furthermore, for many function words in

                              English, I am not sure how to judge

                              whether they can occur alone as

                              utterances. Many such so-called function

                              words would appear to count as bound by

                              Martin’s definition, though they would not

                              count as affixes since they lack other

                              properties in his definition of “affix”.</span></p><div style="font-size: 14px;" class=""><span style="font-family:Times" class=""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px"><span style="font-family:Times" class="">Second, many

                              languages have grammatical morphemes that

                              must occur adjacent to an open class word

                              but which behave as separate words

                              phonologically. These would all apparently

                              count as affixes by Martin’s definition.

                              Again, I have no problem with this as a

                              comparative concept, only that it means

                              his notion of affix deviates considerably

                              from the stereotype.</span></p><div style="font-size: 14px;" class=""><span style="font-family:Times" class=""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px"><span style="font-family:Times" class="">Third, Martin

                              says that his criterion “</span><span style="font-family:Times" class="">correlates very

                              highly with the criterion of contrastive

                              use</span><span style="font-family:Times" class="">”.

                              But by my intuitions, the ability to occur

                              as complete utterances does not correlate

                              closely with the criterion of contrastive

                              use, since most so-called function words

                              CAN occur with contrastive use (such as

                              <i class="">can</i> in this sentence!), as can some

                              morphemes that are conventionally treated

                              as affixes, like <i class="">un-</i> in “I’m not

                              happy, I’m UNhappy”. Of course, Martin

                              might argue that <i class="">un-</i> is more like

                              so-called function words and less like

                              morphemes conventionally treated as

                              affixes. But the fact remains that <i class="">un-</i>

                              is easily the locus of contrast but cannot

                              be used as a complete utterance. I thus

                              see no evidence of a close correlation

                              between the ability to occur as a complete

                              utterance and the ability to be the locus

                              of contrast.</span></p><div style="font-size: 14px;" class=""><span style="font-family:Times" class=""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px"><span style="font-family:Times" class="">Finally, it is

                              my experience that languages differ in

                              their conventions regarding what can be a

                              complete utterance. Imagine two closely

                              related languages that differ in their

                              grammatical rules governing what is a

                              complete utterance. By Martin’s

                              definition, there might be a large number

                              of morphemes that count as separate words

                              in one language but as affixes in the

                              other language. It seems odd to have a

                              criterion for what is a word and what is

                              an affix depend so heavily on the

                              grammatical rules in the language

                              for what constitutes a complete utterance.</span></p><div style="font-size: 14px;" class=""><span style="font-family:Times" class=""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="font-size:14px"><span style="font-family:Times" class="">Matthew</span></p>

                        </div>

                        <div style="font-size:14px" class=""><br class="">

                        </div>

                        <span id="m_-3772461755557999611OLK_SRC_BODY_SECTION" style="font-size:14px" class="">

                          <div style="font-family: Calibri; font-size: 11pt; text-align: left; border-width: 1pt medium medium; border-style: solid none none; padding: 3pt 0in 0in; border-top-color: rgb(181, 196, 223);" class=""> <span style="font-weight:bold" class="">From: </span>Lingtyp

                            <<a href="mailto:lingtyp-bounces@listserv.linguistlist.org" target="_blank" moz-do-not-send="true" class="">lingtyp-bounces@listserv.<wbr class="">linguistlist.org</a>>

                            on behalf of Martin Haspelmath <<a href="mailto:haspelmath@shh.mpg.de" target="_blank" moz-do-not-send="true" class="">haspelmath@shh.mpg.de</a>><br class="">

                            <span style="font-weight:bold" class="">Date: </span>Sunday,

                            November 12, 2017 at 10:47 PM<br class="">

                            <span style="font-weight:bold" class="">To: </span>"<a href="mailto:lingtyp@listserv.linguistlist.org" target="_blank" moz-do-not-send="true" class="">lingtyp@listserv.<wbr class="">linguistlist.org</a>"

                            <<a href="mailto:lingtyp@listserv.linguistlist.org" target="_blank" moz-do-not-send="true" class="">lingtyp@listserv.<wbr class="">linguistlist.org</a>><br class="">

                            <span style="font-weight:bold" class="">Subject: </span>Re:

                            [Lingtyp] wordhood<br class="">

                          </div>

                          <div class=""><br class="">

                          </div>

                          <div class="">

                            <div bgcolor="#FFFFFF" text="#000000" class=""><p class="MsoNormal">Mattis List and

                                Balthasar Bickel rightly emphasize that

                                “word” is not a Platonic entity (a

                                natural kind) that exists in advance of

                                language learning or linguistic analysis

                                – few linguists would disagree here, not

                                even generativists (who otherwise

                                liberally assume natural-kind

                                categories).</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">But I think many

                                linguists still ACT AS IF there were

                                such a natural kind, because the “word”

                                notion is a crucial ingredient to a

                                number of other notions that linguists

                                use routinely – e.g. “gender”, which is

                                typically defined in terms of

                                “agreement” (which is defined in terms

                                of inflectional marking on targets; and

                                inflection is defined in terms of

                                “word”).</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">So is it possible to

                                define a comparative concept ‘word’ that

                                applies to all languages equally, and

                                that accords reasonably with our

                                stereotypes? Note that I didn’t deny

                                this in my 2011 paper; I just said that

                                nobody had come up with a satisfactory

                                definition (that could be used, for

                                instance, in defining “gender” or

                                “polysynthesis”). So I’ll be happy to

                                contribute to a discussion on how to

                                make progress on defining “word”.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Larry Hyman notes

                                that other notions like “syllable” and

                                “sentence” are also problematic in that

                                they also “leak”. However, I think it is

                                important to distinguish two situations

                                of “slipperiness”:</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">(1) “Leakage” of

                                definitions due to vague defining

                                notions</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">(2) Incoherence of

                                definitions due to the use of different

                                criteria in different languages</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">The first can be

                                addressed by tightening the defining

                                notions, but the second is fatal.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">To take up Östen

                                Dahl’s example of the “family” notion:

                                In one culture, a family might be said

                                to be a set of minimally three living

                                people consisting of two adults

                                (regardless of gender) living in a

                                romantic relationship plus all their

                                descendants. In another culture, a

                                family might be defined as a married

                                couple consisting of a man and a woman

                                plus all their living direct ancestors,

                                all their (great) uncles and (great)

                                aunts, and all the descendants of all of

                                these.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">With two family

                                concepts as different as these, it is

                                obviously not very interesting to ask

                                general cross-cultural questions about

                                “families” (e.g. “How often do all

                                family members have meals together?”).

                                So the use of different criteria for

                                different cultures is fatal here.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">What I find worrying

                                is that linguists often seem to accept

                                incoherent definitions of comparative

                                concepts (this was emphasized especially

                                in my 2015 paper on defining vs.

                                diagnosing categories). Different

                                diagnostics in different languages would

                                not be fatal if “word” were a Platonic

                                (natural-kind) concept, but if we are

                                not born with a “word” category,

                                typologists need to use the SAME

                                criteria for all languages.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">So here’s a proposal

                                for defining a notion of “simple

                                morphosyntactic word”:</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal"><b class="">A simple

                                  morphosyntactic word is a form that

                                  consists of (minimally) a root, plus

                                  any affixes.</b></p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Here’s a proposal for

                                defining a notion of “affix”, in such a

                                way that the results do not go too much

                                against our intuitions or stereotypes:</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal"><b class="">An affix is a

                                  bound form that always occurs together

                                  with a root of the same root-class and

                                  is never separated from the root by a

                                  free form or a non-affixal bound form.</b></p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">These definitions

                                make use of the notions of “root” and

                                “root-class” (defined in Haspelmath

                                2012) and “bound (form)”

                                vs. “free (form)” (defined in Haspelmath

                                2013). All these show leakage as in (1)

                                above, but they are equally applicable

                                to all languages, so they are not

                                incoherent. (I thank Harald Hammarström

                                for a discussion that helped me

                                to come up with the above definitions,

                                which I had not envisaged in 2011.)</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">(What I don’t know at

                                the moment is how to relate “simple

                                morphosyntactic word” to

                                “morphosyntactic word” in general,

                                because I cannot distinguish compounds

                                from phrases comparatively; and I don’t

                                know what to do with “phonological

                                word”.)</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Crucially, the

                                definitions above make use of a number

                                of basic concepts that apply to ALL

                                languages in the SAME way. David Gil’s

                                proposal, to measure “bond strength” by

                                means of a range of language-particular

                                phenomena, falls short of this

                                requirement (as already hinted by Eitan

                                Grossman). Note that the problem I have

                                with David’s proposal is not that it

                                provides no categorical contrasts

                                (recall my acceptance of vagueness in

                                (1) above), but that there is no way of

                                telling which phenomena should count as

                                measuring bond strength.</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">David’s approach

                                resembles Keenan’s (1976) attempt at

                                defining “subject” (perhaps not by

                                accident, because Ed Keenan was David’s

                                PhD supervisor), but I have a similar

                                objection to Keenan: If different

                                criteria are used for different

                                languages, how do we know that we are

                                measuring the same phenomenon across

                                languages? Measuring X by means of Y

                                makes sense only if we know

                                independently that X and Y are very

                                highly correlated. But do we know this,

                                for subjects, or for bond strength?</p><div class=""> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Best,</p><p class="MsoNormal">Martin</p><div class=""> <br class="webkit-block-placeholder"></div>

                              <br class="">

                              <pre class="m_-3772461755557999611moz-signature" cols="72">-- 
Martin Haspelmath (<a class="m_-3772461755557999611moz-txt-link-abbreviated" href="mailto:haspelmath@shh.mpg.de" target="_blank" moz-do-not-send="true">haspelmath@shh.mpg.de</a>)
Max Planck Institute for the Science of Human History
Kahlaische Strasse 10
D-07745 Jena
&amp;
Leipzig University
IPF 141199
Nikolaistrasse 6-10
D-04109 Leipzig
</pre>

                            </div>

                          </div>

                        </span></blockquote>

                      <br class="">


                    </div>

                  </div>

                </span> <br class="">

                <fieldset class="m_-3772461755557999611mimeAttachmentHeader"></fieldset>

                <br class="">

                <pre class="">______________________________<wbr class="">_________________

Lingtyp mailing list

<a class="m_-3772461755557999611moz-txt-link-abbreviated" href="mailto:Lingtyp@listserv.linguistlist.org" target="_blank" moz-do-not-send="true">Lingtyp@listserv.linguistlist.<wbr class="">org</a>

<a class="m_-3772461755557999611moz-txt-link-freetext" href="http://listserv.linguistlist.org/mailman/listinfo/lingtyp" target="_blank" moz-do-not-send="true">http://listserv.linguistlist.<wbr class="">org/mailman/listinfo/lingtyp</a><span class="HOEnZb"><font color="#888888" class="">

</font></span></pre>

                <span class="HOEnZb"><font color="#888888" class=""> </font></span></blockquote>

              <span class="HOEnZb"><font color="#888888" class=""> <br class="">

                  <pre class="m_-3772461755557999611moz-signature" cols="72">-- 
Prof. Volker Gast
English and American Studies
Ernst-Abbe-Platz 8
D-07743 Jena
Fon: ++49 3641 9-44546
Fax: ++49 3641 9-44542</pre>

                </font></span></div>

            <br class="">


            <br class="">

          </blockquote>

        </div>

        <br class="">

      </div>

    </blockquote>

    <br class="">


  </div>

</div></blockquote></div><br class=""></body></html>