[Lingtyp] wordhood: bonded vs. bound

Jan Rijkhoff linjr at cc.au.dk
Fri Nov 17 10:07:00 UTC 2017

This is an interesting discussion and I was particularly pleased to see Mark Post mention ‘function’ (“…, it [wordhood] must be functionally motivated, …”) and briefly address the fuzzy nature of a ‘comparative concept’. It seems that one of the main obstacles to making progress in this area, and in cross-linguistic research generally, is the term ‘comparative concept’. This post goes a bit beyond definitional problems concerning wordhood, of course, but to go back into the stratosphere again (to quote Bill Croft), I think using the term ‘comparative concept’ ultimately won't help us to come up with reliable cross-linguistic categories, because (i) no one seems to know what a concept is (Machery 2009, Malt 2010), and (ii) ‘comparative concepts’ seem to allow for categories that may contain a rather wide variety of (incomparable) forms and constructions.

     In morphosyntactic typology, for example, the almost exclusive focus on ‘meaning’ that comes (came initially? - cf. Haspelmath 2007*) with the conceptual approach has resulted in categories that contain rather different morphosyntactic forms and constructions. But how can a collection of different morphosyntactic forms and constructions constitute a single (valid) morphosyntactic category (Rijkhoff 2009, 2016)? The fuzzy nature of (mostly) meaning-based definitions has been a problem since Greenberg’s groundbreaking work on universals and continues to be a problem with ‘comparative concepts’, which are essentially more explicit versions of Greenbergian categories (1966: 74).

     Furthermore, as long as ‘(communicative or discourse) function’ is not recognized as a relevant parameter in cross-linguistic research (apparently these days ‘comparative concepts’ are ultimately supposed to be exclusively based on “phonetic substance or semantic substance”), any attempt to arrive at a valid cross-linguistic category is bound to fail. For example, many words (however defined) can be used in various functions (e.g. as different kinds of modifiers, or as a modifier and a predicate; Rijkhoff 2014).

     It may also be good to remember that we cannot compare everything in all languages. If we always want to include all languages in a typological investigation, we will often end up with a mixed bag, a category whose members have (too) little in common with each other. One way to avoid this is to employ distinct functional, semantic and formal criteria (not simultaneously, but sequentially) to arrive at a cross-linguistic category whose members are truly comparable (Rijkhoff 2016).
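     The idea of applying criteria sequentially rather than simultaneously can be sketched as follows. This is only a minimal illustration: the function name `sequential_filter`, the order functional > semantic > formal, and the four candidate constructions with their property values are all hypothetical and are not taken from Rijkhoff (2016).

```python
def sequential_filter(candidates, criteria):
    """Apply criteria one after another, recording what survives each stage."""
    stages = []
    remaining = list(candidates)
    for name, test in criteria:
        remaining = [c for c in remaining if test(c)]
        stages.append((name, [c["id"] for c in remaining]))
    return stages

# Purely hypothetical constructions with made-up property values.
candidates = [
    {"id": "A", "function": "modifier", "meaning": "quality", "form": "word"},
    {"id": "B", "function": "modifier", "meaning": "quality", "form": "clause"},
    {"id": "C", "function": "predicate", "meaning": "quality", "form": "word"},
    {"id": "D", "function": "modifier", "meaning": "location", "form": "word"},
]

# Assumed order of application: functional, then semantic, then formal.
criteria = [
    ("functional", lambda c: c["function"] == "modifier"),
    ("semantic", lambda c: c["meaning"] == "quality"),
    ("formal", lambda c: c["form"] == "word"),
]

stages = sequential_filter(candidates, criteria)
for name, ids in stages:
    print(name, ids)
```

     Because each stage is recorded separately, others can see at which step a given construction was excluded, which is the kind of transparency the sequential approach is meant to provide.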

     In sum, in order to get ahead, it seems we need (i) to replace ‘concept’ with a less abstract term like ‘category’ (as some of the contributors to this discussion already did), and (ii) to establish cross-linguistic categories in a systematic, transparent way, for example by using a procedure in which the various kinds of criteria are clearly distinguished and applied in a certain order. Only then can others assess the validity of a proposed cross-linguistic category and decide to follow the same procedure (or not, of course). This would also make it possible to compare the outcome of typological investigations, which is currently usually not possible. Not to mention the fact that one needs a theory to make sense of data (eloquently stated by Charles Darwin in a letter to Henry Fawcett in 1861), and many typologists seem to be working outside comprehensive theoretical frameworks.

* Haspelmath (2007: 119, 126-127): “meaning”: “substance (unlike categories) is universal”, “comparison must be semantically based”, “we must hold the meaning constant – at least this must be universal”. On problems with such a meaning-based approach to cross-linguistic research, see also Willems (2016a, 2016b).

Best, Jan


     Greenberg, Joseph H. 1966. Some universals of grammar with particular reference to the order of meaningful elements. In Joseph H. Greenberg (ed.), Universals of language (2nd edition), 73-113. Cambridge: MIT.

     Haspelmath, Martin. 2007. Pre-established categories don’t exist: Consequences for language description and typology. Linguistic Typology 11-1, 119-132.

     Machery, Edouard. 2009. Doing without concepts. Cambridge: Cambridge University Press.

     Malt, Barbara C. 2010. Why we should do without concepts. Mind and Language 25(2). 622-633.

     Rijkhoff, Jan. 2009. On the (un)suitability of semantic categories. Linguistic Typology 13-1, 95‑104.

     Rijkhoff, Jan. 2014. Modification as a propositional act. In M. de los Ángeles Gómez González, F. José Ruiz de Mendoza Ibáñez & F. Gonzálvez-García (eds.), Theory and Practice in Functional-Cognitive Space, 129-150. Amsterdam: Benjamins.

     Rijkhoff, Jan. 2016. Crosslinguistic categories in morphosyntactic typology: Problems and prospects. Linguistic Typology 20-2, 333-363.

     Willems, Klaas. 2016a. Empirische, essentiële en mogelijke universalia: Unzeitgemäße Betrachtungen bij het ‘categoriale particularisme’ in de moderne taaltypologie. Leuvense Bijdragen 99-100, 170-187.

     Willems, Klaas. 2016b. The universality of categories and meaning: a Coserian perspective. Acta Linguistica Hafniensia, DOI: 10.1080/03740463.2016.1141565.

J. Rijkhoff - Associate Professor, Linguistics
School of Communication and Culture, Aarhus University
Jens Chr. Skous Vej 2, Building 1485-621
DK-8000 Aarhus C, DENMARK
Phone: (+45) 87162143
E-mail: linjr at cc.au.dk
URL: http://person.au.dk/en/linjr@hum

From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> on behalf of Mark Post <markwpost at gmail.com>
Sent: Friday, November 17, 2017 2:10 AM
To: lingtyp at listserv.linguistlist.org
Subject: Re: [Lingtyp] wordhood: bonded vs. bound

Having been away for some time, I've had some difficulty reconstructing the chronology of this discussion and some posts containing HTML have not been fully replicated in the LINGTYP archives, so I sincerely apologize in advance if I have missed or misconstrued anything that has been said. That said, it seems to me that an essential problem underlying the exchange in these threads that may not have been fully articulated yet is that the concept "word" has multiple personalities. It is both two things, and one thing:

That is, we know that "word" requires independent grammatical and phonological definitions, leading to the notions "grammatical (or morphosyntactic) word" and "phonological (or prosodic) word". And we know that these two independently-defined units very frequently fail to coincide: certain forms may be good grammatical words but bad phonological words (clitics), while other forms may be good phonological words but bad grammatical words (oddly, I don't think that there is an accepted term for this perhaps less commonly-identified phenomenon - please correct me if I'm wrong). And as Hyman, Bickel and others <https://www.academia.edu/197257/The_phonology_and_grammar_of_Galo_words_A_case_study_in_benign_disunity> have pointed out, the concepts "grammatical word" and "phonological word" themselves might be easier or more difficult (or perhaps impossible?) to define in terms of unified sets of criteria in particular languages.

But at the same time, we can see that these two independently-defined types of unit very often do align, in many languages, much of the time. That is to say, we very often get cases in which there is a type of unit in a particular language between "morpheme" (i.e., whose constituents are morphemes) and "syntactic phrase" (i.e., which is a constituent of a syntactic phrase), and a type of unit between "foot" and "phonological phrase" (although for my own part, I'm less confident about the latter), and we very often find correspondence between these units. This fact also seems to merit recognition, and some sort of explanation (and no, I don't believe that orthography has very much to do with it in an overall sense - Sinitic and Thai grammarians have more or less the same sorts of problems in terms of defining wordhood, despite the fact that the traditional orthographies for these language groups are in a sense on opposite ends of the segmentation spectrum).

The clear implication is that wordhood is an emergent phenomenon, as Ross suggested<http://listserv.linguistlist.org/pipermail/lingtyp/2017-November/005829.html> early in the first thread. That being the case, it must be functionally motivated, although I confess I have a hard time trying to characterize what that motivation would be - perhaps it's as simple as the (eminently violable) isomorphism principle? All things being equal, a language should want one phonological gesture to correspond to one semantic unit, expressed as a grammatical form, and vice versa? Whatever the case might be, I feel that unless typologists are prepared to accept fuzzy "comparative concepts" along the above lines ("words" tend to exhibit the sets of grammatical and phonological characteristics a, b, c... and p, q, r...), as opposed to those based on all-or-nothing criteria (an extension of Dryer's definition of affixes here<http://listserv.linguistlist.org/pipermail/lingtyp/2017-November/005867.html>, for example, wouldn't seem to account for endoclitics<https://global.oup.com/academic/product/endoclitics-and-the-origins-of-udi-morphosyntax-9780199246335?cc=au&lang=en&>, although to some extent that might depend on analyses), we'll continue to go around in circles. To take Bickel's statement <http://listserv.linguistlist.org/pipermail/lingtyp/2017-November/005843.html> one step further, the problem is not just that there are no Platonic "words" out there, although that's no doubt true, but also that the use of would-be-watertight definitions for comparative concepts is incommensurate with the nature of the object under definition.


PS - can I also point out that there is a session <https://cloudstor.aarnet.edu.au/plus/index.php/s/l8zu11lidrfJgc1#pdfviewer> in the coming ALT conference on "wordhood"? It would be nice to be able to continue some of this discussion there!

------ Original Message ------
From: "Larry M. HYMAN" <hyman at berkeley.edu>
To: "Plank" <frans.plank at uni-konstanz.de>
Cc: "lingtyp at listserv.linguistlist.org" <lingtyp at listserv.linguistlist.org>
Sent: 17/11/2017 5:06:33 AM
Subject: Re: [Lingtyp] wordhood: bonded vs. bound

I of course agree with Frans (as I usually do), and not only because we are both great fans of phonology!

I do have to add a couple of things:

1. It is well-known to phonologists that the "(phonological) word" can not only differ from what is needed for grammatical purposes, but in fact can be inconsistent, requiring one parsing for vowel harmony, another for stress etc. In fact, I distinguish 8 different logical "phonological words" on pp.335-6 of this paper:

Hyman, Larry M. 2008. “Directional asymmetries in the morphology and phonology of words, with special reference to Bantu.” Linguistics 46(2), 309-349.

2. The article that Frans cites by Joan Bybee in that great journal (what is it called again? oh! LT!) is a very stimulating and provocative paper, as I can attest (I guess no harm in revealing that I was the associate editor that recommended publication). In it Joan looks at a small set of languages (ca. 20) in her GramCats database and doesn't find the expected degree of skewing of phonological properties according to root vs. affix. However, I believe that a wider search would show this is much more widespread than her survey revealed. Granted that Africa has relatively few genetic stocks, and that root-affix (or rather stem-initial vs. non-stem-initial) asymmetries are in many cases an areal phenomenon, but it is typical of African languages that (1) affixes do not have as many contrasts in vowels, consonants or (sometimes) tones as roots; (2) the realization of the consonants, vowels and tones can be different on affixes vs. root. This is also documented in the above article.

I would be very interested in whether non-phonology-oriented typologists find this interesting, uninteresting, or irrelevant to their/your work.

Thanks, Larry

On Thu, Nov 16, 2017 at 1:41 AM, Plank <frans.plank at uni-konstanz.de> wrote:
Doesn’t PHONOLOGY give you away when you’re an affix?

See, among many others, Roman Jakobson:
“Affixes, particularly inflectional suffixes, in the languages where they exist, habitually differ from the other morphemes by a restricted and selected use of phonemes and their combinations” (Jakobson 1966 [1990: 414])

Translated into OT:
There is a universal ordering priority of faithfulness in roots over faithfulness in affixes (McCarthy & Prince 1995: 116–117, Alderete 1999, Ussishkin 2000)

And the reason, as always, is grammaticalization:
The phonological segments in affixes “are drawn from a progressively shrinking set” (Hopper & Traugott 2003: 154), whose members are the universally unmarked segments

Alas, here is Joan Bybee, debunking this wonderful idea in that spoilsport of a journal:
Restrictions on phonemes in affixes: A crosslinguistic test of a popular hypothesis.
Linguistic Typology 9. 165-222, 2005.

Isn’t phonology useless …


On 16. Nov 2017, at 10:14, Martin Haspelmath <haspelmath at shh.mpg.de> wrote:

Matthew Dryer thinks that wordhood is generally understood by grammar authors in terms of bondedness (= phonological weakness, as shown by nonsyllabicity and phono-conditioned allomorphy), not in terms of boundness (= inability to occur in isolation).

I don’t know if this is true, but Matthew actually recognizes that grammars often describe grammatical markers as “affixes” even when they do not show the two “phonological weakness” (or bondedness) features.

For example, Tauya (a language of New Guinea) is said to have (syllabic) case suffixes, but these never show any allomorphy, e.g.

fena’a-ni [woman-ERG]
na-pe [you-BEN]
wate-’usa [house-INESS]
Aresa-nani [Aresa-ALL]
Tauya-sami [Tauya-ABL] (MacDonald 1990: 119-126)

It is my impression that such ortho-affixes (= forms written as affixes) are perhaps even more common than “phonologically weak” ortho-affixes, but this is an empirical question (in his 2015 ALT abstract, Matthew mentions 248 languages with weak affixes, but 308 languages with only affixes of the Tauya type, apparently confirming my impression).

For this reason, I have suggested that the stereotypical “affix” notion should perhaps be captured in terms of boundness together with single-root-class adjacency. Since the Tauya case-markers attach only to nouns, they count as affixes; by contrast, if a bound role marker attaches to both nouns (English “for children”) and adjectives (“for older children”) as well as to other elements (“for many children”), we do not regard it as an affix (but as a preposition), even if it is bound (= does not occur in isolation; English "for" does not).
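The classification just described (boundness together with single-root-class adjacency) can be sketched roughly as below. This is a simplified illustration, not a published procedure: the function name `classify_marker`, the label "bound non-affix", and the host-class label "other" are assumptions made for the sketch, and the non-separability condition is left out. The two markers are the ones discussed above.

```python
def classify_marker(is_bound, host_classes):
    """Classify a marker by boundness and by how many root classes it attaches to.

    is_bound: True if the form cannot occur in isolation.
    host_classes: set of root classes the marker is found adjacent to.
    """
    if not is_bound:
        return "free form"
    if len(host_classes) == 1:
        # Bound and always adjacent to roots of a single class: affix.
        return "affix"
    # Bound but not tied to one root class: e.g. an adposition.
    return "bound non-affix"

# Tauya ergative -ni: bound, said above to attach only to nouns.
tauya_ni = classify_marker(True, {"noun"})

# English "for": bound, but adjacent to nouns, adjectives and other elements.
english_for = classify_marker(True, {"noun", "adjective", "other"})

print(tauya_ni, "|", english_for)
```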

Matthew quite rightly points out that this notion of boundness (which goes back at least to Bloomfield 1933: §10.1) implies that most function words in English are bound, and in fact most function words in most languages are bound – but this is exactly what we want, I feel, because the best way to define a “function word” is as a bound element that is not an affix. Linguists often think of function words (or “functional categories”) as defined semantically, but it is actually very hard to say what is the semantic(-pragmatic) difference between a plural marker and a word like “several”, between a dual marker and the word “two”, between a past-tense marker and the expression “in the past”, or between a comitative marker and the word “accompany”. It seems to me that these distinctions are best characterized in terms of boundness, i.e. inability to occur in isolation.

It may be true that occurrence in isolation is a feature of an element that is not easy to elicit from speakers, but in actual language use, there are a very large number of very short utterances, so at least positive evidence for free status (=non-bound status) is not difficult to obtain.

In any event, it seems clear to me that some key concepts of grammatical typology such as “flag” (= bound role marker on a nominal) and “person index” (= bound person marker, generally on a verb) require the Bloomfieldian boundness notion, and that these concepts are much easier to work with in typology than the traditional stereotypical notions of “case”, “adposition”, “agreement marker”, and “pronominal clitic”. (For bound person forms, this was a major lesson of Anna Siewierska’s 2004 book “Person”.)


On 14.11.17 07:02, Dryer, Matthew wrote:
I have a number of problems with Martin’s proposal:

"Here’s a proposal for defining a notion of “affix”, in such a way that the results do not go too much against our intuitions or stereotypes:

An affix is a bound form that always occurs together with a root of the same root-class and is never separated from the root by a free form or a non-affixal bound form."

If one examines the notion of “bound” from his 2013 paper, I believe it implies a comparative concept of affix that differs greatly from what most linguists (at least most non-generative linguists) understand by the term. That’s not a problem for it as a comparative concept, but it is a comparative concept that differs considerably from the stereotype.

Martin’s definition of “free” and “bound” from his 2013 paper is as follows:

"But distinguishing in a general way between bound elements and free elements is quite straightforward, because there is a single criterion: Free forms are forms that can occur on their own, i.e. in a complete (possibly elliptical) utterance (Bloomfield 1933: 160). This criterion correlates very highly with the criterion of contrastive use: Only free forms can be used contrastively."

First, I find the notion of complete utterance ambiguous. Does it mean utterances in normal speech, or does it include metalinguistic uses (like “What is the last word in the sentence ‘Who are you going with’?” Answer: “with”)? I would assume that it does not include such metalinguistic uses. But then many if not most so-called function words in English would count as bound since they cannot be used as complete utterances. Perhaps other speakers of English would have different intuitions, but if so that only indicates the lack of clarity in the notion. Furthermore, for many function words in English, I am not sure how to judge whether they can occur alone as utterances. Many such so-called function words would appear to count as bound by Martin’s definition, though they would not count as affixes since they lack other properties in his definition of “affix”.

Second, many languages have grammatical morphemes that must occur adjacent to an open class word but which behave as separate words phonologically. These would all apparently count as affixes by Martin’s definition. Again, I have no problem with this as a comparative concept, only that it means his notion of affix deviates considerably from the stereotype.

Third, Martin says that his criterion “correlates very highly with the criterion of contrastive use”. But by my intuitions, the ability to occur as complete utterances does not correlate closely with the criterion of contrastive use, since most so-called function words CAN occur with contrastive use (such as can in this sentence!), as can some morphemes that are conventionally treated as affixes, like un- in “I’m not happy, I’m UNhappy”. Of course, Martin might argue that un- is more like so-called function words and less like morphemes conventionally treated as affixes. But the fact remains that un- is easily the locus of contrast but cannot be used as a complete utterance. I thus see no evidence of a close correlation between the ability to occur as a complete utterance and the ability to be the locus of contrast.

Finally, it is my experience that languages differ in their conventions regarding what can be a complete utterance. Imagine two closely related languages that differ in their grammatical rules governing what is a complete utterance. By Martin’s definition, there might be a large number of morphemes that count as separate words in one language but as affixes in the other language. This strikes me as odd. It seems odd to have a criterion for what is a word and what is an affix so dependent on the grammatical rules in the language for what constitutes a complete utterance.


From: Lingtyp <lingtyp-bounces at listserv.linguistlist.org> on behalf of Martin Haspelmath <haspelmath at shh.mpg.de>
Date: Sunday, November 12, 2017 at 10:47 PM
To: "lingtyp at listserv.linguistlist.org" <lingtyp at listserv.linguistlist.org>
Subject: Re: [Lingtyp] wordhood

Mattis List and Balthasar Bickel rightly emphasize that “word” is not a Platonic entity (a natural kind) that exists in advance of language learning or linguistic analysis – few linguists would disagree here, not even generativists (who otherwise liberally assume natural-kind categories).

But I think many linguists still ACT AS IF there were such a natural kind, because the “word” notion is a crucial ingredient to a number of other notions that linguists use routinely – e.g. “gender”, which is typically defined in terms of “agreement” (which is defined in terms of inflectional marking on targets; and inflection is defined in terms of “word”).

So is it possible to define a comparative concept ‘word’ that applies to all languages equally, and that accords reasonably with our stereotypes? Note that I didn’t deny this in my 2011 paper, I just said that nobody had come up with a satisfactory definition (that could be used, for instance, in defining “gender” or “polysynthesis”). So I’ll be happy to contribute to a discussion on how to make progress on defining “word”.

Larry Hyman notes that other notions like “syllable” and “sentence” are also problematic in that they also “leak”. However, I think it is important to distinguish two situations of “slipperiness”:

(1) “Leakage” of definitions due to vague defining notions

(2) Incoherence of definitions due to the use of different criteria in different languages

The first can be addressed by tightening the defining notions, but the second is fatal.

To take up Östen Dahl’s example of the “family” notion: In one culture, a family might be said to be a set of minimally three living people consisting of two adults (regardless of gender) living in a romantic relationship plus all their descendants. In another culture, a family might be defined as a married couple consisting of a man and a woman plus all their living direct ancestors, all their (great) uncles and (great) aunts, and all the descendants of all of these.

With two family concepts as different as these, it is obviously not very interesting to ask general cross-cultural questions about “families” (e.g. “How often do all family members have meals together?”). So the use of different criteria for different cultures is fatal here.

What I find worrying is that linguists often seem to accept incoherent definitions of comparative concepts (this was emphasized especially in my 2015 paper on defining vs. diagnosing categories). Different diagnostics in different languages would not be fatal if “word” were a Platonic (natural-kind) concept, but if we are not born with a “word” category, typologists need to use the SAME criteria for all languages.

So here’s a proposal for defining a notion of “simple morphosyntactic word”:

A simple morphosyntactic word is a form that consists of (minimally) a root, plus any affixes.

Here’s a proposal for defining a notion of “affix”, in such a way that the results do not go too much against our intuitions or stereotypes:

An affix is a bound form that always occurs together with a root of the same root-class and is never separated from the root by a free form or a non-affixal bound form.

These definitions make use of the notions of “root” and “root-class” (defined in Haspelmath 2012) and  “bound (form)” vs. “free (form)” (defined in Haspelmath 2013). All these show leakage as in (1) above, but they are equally applicable to all languages, so they are not incoherent. (I thank Harald Hammarström for a helpful discussion that helped me to come up with the above definitions, which I had not envisaged in 2011.)

(What I don’t know at the moment is how to relate “simple morphosyntactic word” to “morphosyntactic word” in general, because I cannot distinguish compounds from phrases comparatively; and I don’t know what to do with “phonological word”.)

Crucially, the definitions above make use of a number of basic concepts that apply to ALL languages in the SAME way. David Gil’s proposal, to measure “bond strength” by means of a range of language-particular phenomena, falls short of this requirement (as already hinted by Eitan Grossman). Note that the problem I have with David’s proposal is not that it provides no categorical contrasts (recall my acceptance of vagueness in (1) above), but that there is no way of telling which phenomena should count as measuring bond strength.

David’s approach resembles Keenan’s (1976) attempt at defining “subject” (perhaps not by accident, because Ed Keenan was David’s PhD supervisor), but I have a similar objection to Keenan: If different criteria are used for different languages, how do we know that we are measuring the same phenomenon across languages? Measuring X by means of Y makes sense only if we know independently that X and Y are very highly correlated. But do we know this, for subjects, or for bond strength?


Martin Haspelmath (haspelmath at shh.mpg.de)
Max Planck Institute for the Science of Human History
Kahlaische Strasse 10
D-07745 Jena
Leipzig University
IPF 141199
Nikolaistrasse 6-10
D-04109 Leipzig

Lingtyp mailing list
Lingtyp at listserv.linguistlist.org

Larry M. Hyman, Professor of Linguistics & Executive Director, France-Berkeley Fund
Department of Linguistics, University of California, Berkeley
President, Linguistic Society of America

Support the LSA’s efforts to advance the scientific study of language with every Amazon Smile<http://www.linguisticsociety.org/content/lsa-amazon-smile-contribute-today> purchase you make throughout the year.