Dear Colleagues,

Martin’s paper and the discussions on the list are all thought-provoking. I
would like to share a few thoughts from the perspective of Mandarin. As
Mandarin does not use spaces to separate words in writing (unless one
uses pinyin or some other romanization system), the notion “word” is not as
intuitive to Chinese speakers as it is to English speakers. As the Chinese writing
system centers around characters, earlier linguistic studies of the Chinese
language written in Chinese, to a large extent, also centered around the
discussion of characters. In fact, “word”, as a linguistic term, was not
introduced into Chinese until the early 20th century (although there were
discussions of Chinese words in writings published in languages other than
Chinese, e.g. in Georg von der Gabelentz’s *Chinesische Grammatik*,
published in 1881).

While the notion of word is not as intuitive to Chinese speakers as it is
to English speakers, Chinese speakers still largely agree on what can be
considered a word (thanks to Paolo for his quote of Saussure, particularly
the insight about the word as “a unity that is evident for the mind,
something pivotal in the language mechanism”). For example, there is
little or no disagreement as to the status of *中国 Zhōngguó* ‘China’ as a
word in Chinese. On the issue of word segmentation in Chinese, Sproat et
al. (1996) [“A stochastic finite-state word-segmentation algorithm for
Chinese”] report that native speakers of Chinese show about 75% agreement
on word segmentation. The result was obtained by simply
instructing the subjects to “mark all places they might plausibly pause if
they were reading the text aloud” (p. 393). As Martin points out in his
paper, the reliance on prosodic pause to segment words can be problematic.
It would be interesting to know whether the result would be better if
clearer and more informative instructions were given. Clearer and more
informative instructions require a clear and reasonable definition of word.
In this respect, I fully agree with Eitan and Martin about the importance
of having such a definition. (With respect to Martin’s definition of “a
simple morphosyntactic word” as “a form that consists of (minimally) a
root, plus any affixes”, I wonder whether “free” needs to be added before

Best regards,


> May I quote Saussure’s words concerning ‘wordhood’ which I cited in my
> speech held at the Conference on ‘Word-Formation Theories’, II (Košice,
> 2016, No. 2 Special Number: Selected papers from the Word-Formation
> Theories conference): *Cours de linguistique générale* (p. 154): le
> mot, malgré la difficulté qu’on a à le définir, est une unité qui s’impose
> à l’esprit, quelque chose de central dans le mécanisme de la langue [the
> word, in spite of the difficulties of its definition, is a unity that is
> evident for the mind, something pivotal in the language mechanism].
> Endnote 1:
> Immediately after the sentence quoted in the text Saussure adds the
> following comment: “mais c’est là un sujet qui remplirait à lui seul un
> volume” [but this is a matter which could ‘per se’ fill up a whole book].
> In fact, in the *Notes personnelles*, which Saussure never published and
> which were published by Simon Bouquet as *Écrits de linguistique
> générale*,
> Gallimard, Paris 2002, we read (p. 24) under the title ‘Linguistique et
> phonétique’: “... il m'est impossible de voir que le mot, au milieu de tous
> les usages qu'on en fait, soit quelque chose de donné, et qui s'impose à
> moi comme la perception d'une couleur. Le fait est que, tant que l'on parle
> du mot *a*, du mot *b*, on reste fondamentalement dans le donné
> MORPHOLOGIQUE, en dépit de tous les points de vue qu'on prétend introduire,
> parce que le mot est *une distinction qui relève de l’ordre d’idées
> morphologiques* [my italics: P.R.]” [… it is impossible for me to see the
> word, amid the many uses we make of it, as something given that imposes
> itself on me like the perception of a colour. The fact is that as long as
> we speak of the word *a* or the word *b*, we remain fundamentally within
> the MORPHOLOGICAL given, in spite of all the viewpoints we claim to
> introduce, because the word is a distinction that belongs to the
> morphological domain].
> Best,
> Paolo
>
> Dear all,
> Honestly, I cannot see why an essentialist approach to wordhood would
> contradict a typological, cross-linguistically valid approach to wordhood.
> This just shows how different our backgrounds are...
> This discussion sounds just perfect for the Morphology Meeting in Budapest
> next May (the deadline is tomorrow), or as a workshop for the next SLE in
> Tallinn.
> Best,
> Anne
>
> I am not arguing for an extreme position like writing grammars without
> word boundaries either. I am just trying to bring to people’s attention
> that wordhood is problematic, and to persuade someone to look at wordhood
> without presupposing an essentialist concept of ‘word’; that would get us
> past appealing to intuitions which are actually rather unclear on closer
> inspection. There might be a common core, i.e. a set of crosslinguistically
> valid criteria which form universal patterns like a typological prototype
> (as the latter is defined in my “Typology and Universals” textbook). But I
> don’t know what the criteria are or what their typological relationships
> are. I would really like to know.
> Actually, I *don’t* know what a family is, in a cross-cultural sense, and
> even in my own culture, given the notions of immediate, nuclear and
> extended family, foster children, adoption, divorce etc. I don’t even know
> if ‘family’ makes sense cross-culturally, given the variety of kin systems
> and the organization of society they reflect.
> Bill
> >
> > I am sorry if I gave the impression that I'm arguing for an extreme
> > position (such as writing grammars without word boundaries). I'm rather
> > trying to see what the ultimate consequences are of Martin's proposals.
> > But what I am wondering about is whether there isn't a common core to
> > the language-specific concepts of "word", although it need not involve
> > precise criteria. I think "word" may be a concept rather much like
> > "family". Consider Wikipedia's definition of "family", which hardly
> > provides any criteria that can be used to identify families
> > cross-culturally:
> >
> > "In the context of human society, a family (from Latin: familia) is a
> > group of people affiliated either by consanguinity (by recognized
> > birth), affinity (by marriage or other relationship), or co-residence
> > (as implied by the etymology of the English word "family") or some
> > combination of these. Members of the immediate family may include
> > spouses, parents, brothers, sisters, sons, and daughters. Members of the
> > extended family may include grandparents, aunts, uncles, cousins,
> > nephews, nieces, and siblings-in-law. Sometimes these are also
> > considered members of the immediate family, depending on an
> > individual's specific relationship with them."
> >
> > Still, we think we know what a family is.
> >
> > Östen
> >
> >
> > The problem that we need to guard against is using language-specific
> > definitions for a supposedly crosslinguistic (comparative) concept of
> > ‘word’. One has to use a crosslinguistically valid criterion for
> > wordhood, and apply the same criterion across languages. I have yet to
> > see anyone do this.
> >
> > As usual, the problem is the belief that linguistic units have
> > essences like ‘noun’, ‘verb’, ‘word’ etc., and that all we linguists
> > need to do is “discover” this essence through some accidental
> > linguistic fact of a particular language (using ‘essence’ and
> > ‘accident’ in the philosophical sense); and it doesn’t matter if the
> > facts are different from one language to the next, or are defined in a
> > way that works only for that language. Until, of course, someone else
> > comes along and decides that the essence is different from what the
> > first person thought, even by looking at the same accidental facts; or
> > maybe that they don’t even believe in the essence.
> >
> > The solution, in my opinion, is to look at the “accidental” facts, that
> > is, the different criteria for wordhood (defined in a
> > crosslinguistically valid fashion), and find out what the typological
> > universals are that govern those facts. I would expect that (a) the
> > criteria won’t match, within or across languages, as with parts of
> > speech etc.; but (b) the criteria would pattern typologically in such a
> > way that most of the morpheme strings that we would intuitively call
> > “words” would have a fairly high degree of syntagmatic unity most of
> > the time. (Yes, “morpheme” raises some of the same issues -- but if we
> > don’t address these issues, we can’t really trust our results.)
> >
> > Bill
> >
> >>
> >> I agree with Fritz (if I interpret his message correctly). As far as I
> >> can see, we can work with any definition of "word" in crosslinguistic
> >> research and then see if that definition is useful or not - i.e.,
> >> whether it does or does not yield typological correlates. If we try
> >> this approach, I cannot see that we could go wrong; or is there any
> >> possible problem that we need to guard against?
> >>
> >> Edith Moravcsik
> >>
> >> Let's say that there are no rigid consistent criteria that distinguish
> >> words, prefixes, and suffixes. I don't see why that would necessarily
> >> prevent us from making valid generalizations about prefixes and
> >> suffixes. Consider an analogy. We can make valid generalizations about
> >> men and women (their preferences for whatever, their likelihood to do
> >> whatever, etc.) even though gender is to a certain extent fluid. There
> >> are adults who consider themselves neither male nor female and others
> >> who consider themselves both. Different criteria lead to different
> >> assignments for being a man or for being a woman. It seems like an
> >> analogous issue would come up for virtually any 'natural' category.
> >> What is the essential problem here?
> >>
> >> --fritz
> >>
> >>
> >>
> >>> As far as I'm aware, only one typologist has taken up the challenge
> >>> of my 2011 paper: Matthew Dryer in his 2015 ALT talk at Albuquerque (I
> >>> have copied his abstract below, as it seems to be no longer available
> >>> from the UNM website).
> >>>
> >>> Otherwise, the reaction has generally been that this is old news (for
> >>> those with no stake in the syntax-morphology distinction), or that
> >>> the distinction is fuzzy, like almost all distinctions in language.
> >>> But the latter reaction misses the point that it's not clear whether
> >>> there are any cross-linguistic regularities to begin with (apart from
> >>> orthographic conventions) that point to the cross-linguistic
> >>> relevance of something like a "word" notion. (The results of the
> >>> recent work by Jim Blevins and colleagues do seem to point in this
> >>> direction, but it is only based on four European languages.)
> >>>
> >>> An interesting case is OUP's recent handbook on polysynthesis: While
> >>> all definitions of polysynthesis make reference to the "word" notion,
> >>> almost none of the authors and editors try to justify it, instead
> >>> simply presupposing that there is such a thing as polysynthesis.
> >>>
> >>> (The one paper that addresses the issue, by Bickel & Zúñiga, agrees
> >>> with my skepticism in that it finds that "polysynthetic "words" are
> >>> often not unified entities defined by a single domain on which all
> >>> criteria would converge". OUP's handbook is hard to access, but a
> >>> manuscript version of Bickel & Zúñiga can be found here:
> >>> http://www.comparativelinguistics.uzh.ch/en/bickel/publications/in-press.html)
> >>>
> >>> Best,
> >>> Martin
> >>>
> >>> ***********************************
> >>>
> >>> Evidence for the suffixing preference
> >>>
> >>> Matthew S. Dryer
> >>>
> >>> University at Buffalo
> >>>
> >>> Haspelmath (2011) argues that there are no good criteria for
> >>> distinguishing affixes from separate words, so that claims that make
> >>> reference to a distinction between words and affixes are suspect. He
> >>> claims that there is therefore no good evidence for the suffixing
> >>> preference (Greenberg 1957), since that assumes that one can
> >>> distinguish affixes from separate words. He implies that decisions
> >>> that linguists describing languages make in terms of what they
> >>> represent as words may at best be based on inconsistent criteria and
> >>> he has suggested that we have no way of knowing whether the apparent
> >>> suffixing preference reflects anything more than the fact that the
> >>> orthography of European languages far more often represents
> >>> grammatical morphemes as suffixes than as prefixes.
> >>>
> >>> In this paper, I provide evidence that the suffixing preference is
> >>> unlikely to be an artifact of orthographic conventions, at least as it
> >>> applies to tense-aspect affixes. I examined the phonological
> >>> properties of tense-aspect affixes in a sample of over 500 languages,
> >>> distinguishing two types on the basis of their phonological
> >>> properties. Type 1 affixes are either ones that are nonsyllabic,
> >>> consisting only of consonants, or ones that exhibit allomorphy that is
> >>> conditioned phonologically by verb stems. Type 2 affixes are those
> >>> that exhibit neither of these two properties. The reason that this
> >>> distinction is relevant is that grammatical morphemes of the first
> >>> sort are almost always represented as affixes rather than as separate
> >>> words in grammatical descriptions, so that we can safely assume that
> >>> in the vast majority of cases, grammatical morphemes of this sort that
> >>> are represented as affixes really are such. Haspelmath’s suggestion
> >>> that the suffixing preference might be an artifact of orthographic
> >>> conventions thus predicts that we should not find a significant
> >>> difference in the relative frequency of Type 1 prefixes and suffixes,
> >>> but only with Type 2 prefixes and suffixes.
> >>>
> >>> The results of my study show that this prediction is not confirmed.
> >>> They show that for both types of affixes, suffixes outnumber prefixes
> >>> by a little over 2.5 to 1. The languages in my sample with Type 1
> >>> suffixes outnumber the languages with Type 1 prefixes by 181 to 67, or
> >>> around 2.7 to 1, while the languages with only Type 2 suffixes
> >>> outnumber the languages with only Type 2 prefixes by 223 to 85,
> >>> approximately 2.6 to 1. Thus the prediction that the suffixing
> >>> preference should be found primarily with Type 2 affixes is not borne
> >>> out. To the contrary, we find the same suffixing preference among both
> >>> types of affixes.
> >>>
> >>> This provides evidence that, at least for tense-aspect affixes, the
> >>> suffixing preference is real and not an artifact of orthographic
> >>> conventions.
> >>>
> >>> References
> >>>
> >>> Haspelmath, Martin. 2011. The indeterminacy of word segmentation and
> >>> the nature of morphology and syntax. Folia Linguistica 45: 31-80.
> >>>
> >>> I am writing a paper about wordhood - has anyone responded to
> >>> Haspelmath's 2011 Folia Linguistica paper on the topic?
> >>>
> >>> I have only found two sources that mention the paper and seem to put
> >>> forward an argument against its conclusions, but it's mostly in an en
> >>> passant fashion.
> >>>
> >>> One is Blevins (2016) Word and Paradigm Morphology and another is
> >>> Geertzen, Jeroen, James P. Blevins & Petar Milin. ‘Informativeness of
> >>> unit boundaries’. Italian Journal of Linguistics 28(2), 1–24.
> >>>
> >>> Any correspondence in this regard would be greatly appreciated,
> >>>
> >>> Adam
