[Lingtyp] Folk definition of “word”

Mon Nov 29 08:33:58 UTC 2021

Thank you, that's an important point! The morphology is based around the
triconsonantal roots (along with the vocalic patterns that create
fully-formed words), although it is somewhat unexpected (and a challenge
for someone learning these languages) that they actually do not write
morphological contrasts such as passives that are specified only by vowels.
What I meant is that the writing system clearly reflects a particular
perspective, i.e. structural analysis, for how Semitic languages work.

I don't think it's coincidental that both (1) abjads originally developed
in these languages, and (2) abjads remain in use for (most of) these
languages, but other languages (Greek, etc.) have added vowels to make
alphabets.

Daniel

On Mon, Nov 29, 2021 at 12:27 AM Peter Arkadiev <peterarkadiev at yandex.ru>
wrote:

> Dear Danny,
>
> > Phoenician follows the well-known triliteral root system of Semitic
> > languages, and those languages typically use abjads (consonant-only
> > writing systems), and I have to assume that this is due to the central
> > importance of consonants over vowels in their morphology.
>
> This is just to the contrary, I'm afraid: most of the Semitic morphology
> is performed by vowels only (cf. kataba 'he wrote' vs. kutiba 'it was
> written'), with the role of consonants (apart from the root) being limited
> -- although important enough, of course.
>
> Best regards,
>
> Peter
>
>
> 29.11.2021, 06:38, "Daniel Ross" <djross3 at gmail.com>:
>
> This is the topic of the next lecture in my Morphology class (and my
> students are currently reading your 2011 paper, Martin), so thank you
> everyone for this timely and interesting discussion.
>
> I would like to look at your conclusion from a different perspective,
> though: I agree that spaces may not directly tell us about word boundaries
> in languages, but for another reason.
>
> Japanese is a very interesting example because there are three
> (sub)scripts working together: kanji (from Chinese characters) for most
> lexical items, hiragana for function morphemes, and katakana for borrowings
> and onomatopoeia. The first response from most people (i.e. students)
> learning about this for the first time is that Japanese sounds hard to
> write, with the impression that the system may be redundant. But remember
> that Japanese does not use spaces. And it simply does not need to: kanji
> and hiragana very clearly mark the morphosyntactic structure of a sentence,
> so it is easy to skim and identify word boundaries, or at least equivalent
> information to what word boundaries do for us. This is an extremely
> efficient and transparent system, reflecting how Japanese works
> grammatically, not just orthographically.
>
> My suggestion then is to not look at when spaces are used in an
> orthography, but to look at what different orthographies do instead of
> spaces, or otherwise in a way that reflects specific morphosyntactic
> properties of languages.
>
> Looking back at the history of our familiar alphabet, Greek vowels were
> basically an accident when adapting the Phoenician writing system to Greek.
> Phoenician follows the well-known triliteral root system of Semitic
> languages, and those languages typically use abjads (consonant-only writing
> systems), and I have to assume that this is due to the central importance
> of consonants over vowels in their morphology. This can be traced back to
> the origins of this writing in Ancient Egyptian hieroglyphs, where a direct
> iconic representation of a meaning shifted to take on a specific consonant
> value, and this was codified for only consonants, with vowels unwritten.
> That is still the case today in Arabic, Hebrew, etc. (Aside: I prefer to
> think of the so-called "long vowels" as vowel-holding consonants, i.e.
> semivowels, etc., similar to how /i/ and /j/ or /u/ and /w/ may be
> represented with the same letter in alphabets, such as the letter "V" in
> Latin.) Consonant-only writing isn't such an obvious fit for another kind
> of language where the vowels are equally important morphologically. The
> Arabic script has been adapted for a number of other languages, so I'm not
> suggesting it is impossible or that it won't work, but that it probably
> wouldn't arise naturally, and due to this borrowing, it probably doesn't
> tell us much about the structure of those languages. (On the other hand,
> the Arabic script might have been a good fit for Turkish given that vowel
> harmony means there are few contrasts to represent within vowels, so that's
> another topic to look at.)
>
> We might also ask what the introduction of spaces can tell us about the
> structure of European languages. Perhaps the highly fusional inflection of
> Latin and Greek was in itself enough to signal word boundaries and
> morphological structure in general, given that certain typical orthotactic(?)
> forms were much more frequent than others (similar to Japanese hiragana
> marking functional morphemes).
>
> But I don't think that adapting the existing traditions of
> English/European orthography to a new, previously unwritten language
> necessarily tells us very much about the morphological structure of that
> language, because it will more likely be heavily influenced by the norms of
> English. This might be less likely in cases where the speakers of the
> language are illiterate before writing their own language, rather than
> biliterate(?) with English or another European language (or Indonesian,
> etc., following similar conventions). Where they write spaces might give us
> some suggestions about word boundaries, of course. But I think it is even
> more interesting to see what non-alphabetic scripts can tell us about the
> languages that they represent.
>
> Unfortunately we don't have a substantial number of truly independent
> writing systems around the world to really test these ideas, but it's
> certainly interesting to think about. There are a few more relevant
> examples, like how Chinese simply has no need for a word "word" because it
> has almost exclusively monosyllabic morphemes and characters, as well as
> some idiomatic combinations of them (i.e. compounds). That tells us
> something about the morphological structure of Chinese too, I think.
>
> Whether any of this is really about "wordhood" is not yet clear to me, but
> I do think that different orthographic traditions can give insights into
> morphological structure in general. One way of looking at it is that
> orthographies are a kind of formal analysis for morphological structure,
> and as we all know, analyses are informed by but do not determine
> linguistic organization. So if we think about writers as linguists, that
> may be helpful in this discussion. In fact, just like there are different
> grammatical theories, it may be that different orthographies are different
> theories of wordhood or similar levels of structure. If so, it may be that
> "words" (in the European sense) are just one way of looking at languages,
> and that they are an analysis, but not necessarily a fundamental part of
> linguistic structure. Or more interestingly, it may be that different
> languages have different units on par with "words", often reflected by
> orthographic systems. This is also why it's so interesting to look at
> proposals for writing signed languages, which introduce other kinds of
> challenges. I think this is generally in line with the conclusions of your
> 2011 paper, Martin.
>
> Last week I assigned my students a paper about questions of polysynthetic
> wordhood in Cree and Dakota (https://doi.org/10.1075/cilt.174.08rus), and
> the paper emphasized that speakers of these languages would often write
> much shorter words (with spaces between them) than expected according to
> the traditional polysynthetic analysis of linguists. But I am suspicious
> that they may be constrained by what they expect written "words" to look
> like due to familiarity with English, and I was left wondering, most
> importantly, what an original, indigenous script for Cree or Dakota would
> look like: what is the ideal way to write these languages, not how English
> writing can be borrowed for them. (I should add that Cree is often written
> in Canadian syllabics, but I think that is a general writing system, and
> according to Wikipedia designed by a linguist, so it may have other biases.
> But perhaps a syllabary has other advantages suitable for "polysynthetic"
> languages, however their structure is best analyzed; one option is that
> there are multiple word-like levels in their structure, rather than a
> unique level, and in that case a syllabary seems like a nice compromise to
> divide it into iterated units.)
>
> Daniel
>
> On Sun, Nov 28, 2021 at 8:29 AM Martin Haspelmath <
> martin_haspelmath at eva.mpg.de> wrote:
>
> This is a really interesting thread! It still seems to me that the term
> "word" has a well-understood orthographic sense, but no well-understood
> general phonological or morphosyntactic sense. Writing is now almost
> universal, but it does appear that most unwritten languages did not have a
> word for 'word' (as opposed to 'speech' or 'what someone said').
>
> I agree with Ian that "the emergence of spaces is sufficient evidence of
> wordhood", in the sense of orthographic wordhood – because spaces define
> orthographic words.
>
> As the fascinating discussion of the history of reading has made clear,
> reading is by no means a straightforward or natural activity. It's more
> like riding a bike – extremely useful, but dependent on highly specific
> cultural traditions and practices.
>
> It may well be that orthographic spaces are primarily an autonomous device
> to facilitate reading, like punctuation, paragraphs, section headings, and
> typographical ascenders/descenders in Latin script – but with no direct
> relationship to anything in the spoken language. As our grammatical
> investigations began with written language (*gram-matica* originally
> means 'study of writing', cf. *graph-* 'write'), it is natural that it
> was based on the study of written language. *Scriptio continua* may simply
> be a bit harder to read than spaced writing (just as I find Cyrillic a bit
> harder to read than Latin, because there are fewer ascenders/descenders).
>
> So I'm not sure if we can presuppose that spaces between words tell us
> anything about non-written language structure.
>
> Best,
> Martin
>
> Am 26.11.21 um 11:54 schrieb JOO, Ian [Student]:
>
> Dear David,
>
> thank you for introducing your interesting paper, which I’ll look into
> soon.
> But, I don’t think speakers not employing spaces necessarily indicates the
> absence of wordhood.
> In many traditional orthographies, there are no spaces at all: Thai,
> Tibetan, Khmer, Japanese, pre-modern Korean, etc.
> But that wouldn’t necessarily mean that Thai speakers don’t perceive words.
> Many orthographies only transcribe consonants - but that wouldn’t mean
> that the speakers don’t perceive vowels as phonological units.
> So I think the emergence of spaces is sufficient, but not necessary,
> evidence of wordhood.
>
> Regards,
> Ian
> On 26 Nov 2021, 6:45 PM +0800, David Gil <gil at shh.mpg.de> wrote:
>
> Following on Nikolaus' comment, it is also an experiment that is performed
> whenever speakers of an unwritten language decide to introduce an
> orthography for the first time:  Do they insert spaces, and if so where?
>
> I wrote about this in Gil (2020), with reference to a naturalistic
> corpus of SMS messages in Riau Indonesian, produced in 2003, which was the
> year everybody in the village I was staying in got their first mobile
> phones and suddenly had to figure out how to write their language.  In the
> 2020 article, my focus was more on the presence or absence of evidence for
> bound morphology, and less on whether they introduce spaces in the first
> place. What I did not mention there, but which is most germane to Ian's
> query, is the latter question, whether they use spaces at all.  In fact, my
> corpus contains lots of messages that were written without spaces at all.
> Within a couple of years the orthography became more conventionalized, and
> everybody started using spaces, but to begin with, at least, it seemed like
> many speakers were not entertaining any (meta-)linguistic notion of 'word'
> whatsoever.
>
> (BTW, in Riau and many other dialects of Indonesian, the word for 'word',
> *kata*, also means 'say'.)
>
> David
>
> Gil, David (2020) "What Does It Mean to Be an Isolating Language? The Case
> of Riau Indonesian", in D. Gil and A. Schapper eds., *Austronesian
> Undressed: How and Why Languages Become Isolating*, John Benjamins,
> Amsterdam, 9-96.
>
>
> On 26/11/2021 12:11, Nikolaus P Himmelmann wrote:
>
> Hi
> On 26/11/2021 10:17, JOO, Ian [Student] wrote:
>
>
> The question would be, when one asks a speaker of a given language to
> divide a sentence into words, would the number of words be consistent
> throughout different speakers?
> It would be an interesting experiment. I’d be happy to be informed of any
> previous study that conducted such an experiment.
>
> Yes, indeed. And it is an experiment, though largely uncontrolled, that is
> carried out whenever someone carries out fieldwork on an undocumented lect.
> In this context, speakers provide evidence for word units in two ways: a)
> in elicitation when prompted by pointing or with a word from a contact
> language; b) when chunking a recording into chunks that can be written down
> by the researcher.
>
> In my experience, speakers across a given community are pretty consistent
> in both activities, though one may distinguish two basic types of speakers. One
> group provides word-like units, so when you ask for "stone" you get a
> minimal form for stone. The other primarily provides utterance-like units.
> So you do not get "stone" but rather "look at this stone", "how big the
> stone is", "stones for building ovens" or the like.
>
> Depending on the language, there is some variation in the units provided
> in both activities but this is typically restricted to the kind of
> phenomena that later on cause the main problems in the analytical
> reconstruction of a word unit, i.e. mostly phenomena that come under the
> broad term of "clitics". In my view, one should clearly distinguish between
> these analytical reconstructions, which are basic building blocks of
> grammatical descriptions, and the "natural" units provided by speakers,
> which are primary data providing the basis for the description.
>
> Best
>
> Nikolaus
>
>
>
> --
> David Gil
>
> Senior Scientist (Associate)
> Department of Linguistic and Cultural Evolution
> Max Planck Institute for Evolutionary Anthropology
> Deutscher Platz 6, Leipzig, 04103, Germany
>
> Email: gil at shh.mpg.de
> Mobile Phone (Israel): +972-526117713
> Mobile Phone (Indonesia): +62-81344082091
>
>
> _______________________________________________
> Lingtyp mailing list
> Lingtyp at listserv.linguistlist.org
> http://listserv.linguistlist.org/mailman/listinfo/lingtyp
>
>
>
> --
> Martin Haspelmath
> Max Planck Institute for Evolutionary Anthropology
> Deutscher Platz 6
> D-04103 Leipzig
> https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/
>
>
> --
> Peter Arkadiev, PhD Habil.
> Institute of Slavic Studies
> Russian Academy of Sciences
> Leninsky prospekt 32-A 119334 Moscow
> peterarkadiev at yandex.ru
> http://inslav.ru/people/arkadev-petr-mihaylovich-peter-arkadiev
>
>