[Lingtyp] Folk definition of “word”

Daniel Ross djross3 at gmail.com
Mon Nov 29 09:26:43 UTC 2021


I believe in those languages, they are obligatory, though, and although the
history is complicated, these historical developments in orthography are
not independent, so I find the development of a Semitic orthography within
Semitic languages to be more convincing than possible Semitic influence on
other orthographies.

Again, I don't mean to suggest any clear conclusions from this! I'm not
sure what we should be learning from these orthographies, but I think there
is something to be learned. It's a thought I've had previously, but this
conversation on the list has made me think about exploring it more
thoroughly in the future, so I appreciate all of the ideas here, and your
cautious approach is also warranted.

On Mon, Nov 29, 2021 at 1:18 AM Peter Arkadiev <peterarkadiev at yandex.ru>
wrote:

> Well, it is perhaps telling, but we should be cautious not to extrapolate
> this link too broadly, because the whole family of Asian alphabetical
> scripts write vowels as diacritics regardless of the morphological
> structure of the languages thus written.
>
> 29.11.2021, 12:08, "Daniel Ross" <djross3 at gmail.com>:
>
> Yes, it can optionally be specified, but it still highlights the
> morphological type of the languages: the vowels are written as diacritics,
> rather than as full letters equivalent to the consonants.
>
> I think that is telling. I'm not sure precisely what it is telling us
> (about "wordhood", etc.), but I think it is somehow related to
> morphological structure.
>
> On Mon, Nov 29, 2021 at 12:37 AM Peter Arkadiev <peterarkadiev at yandex.ru>
> wrote:
>
> Well, note that both Arabic and Hebrew invented intricate notation for
> vowels quite early on, which is optional in some types of texts, but does
> exist.
>
> 29.11.2021, 11:34, "Daniel Ross" <djross3 at gmail.com>:
>
> Thank you, that's an important point! The morphology is based around the
> triconsonantal roots (along with the vocalic patterns that create
> fully-formed words), although it is somewhat unexpected (and a challenge
> for someone learning these languages) that they actually do not write
> morphological contrasts such as passives that are specified only by vowels.
> What I meant is that the writing system clearly reflects a particular
> perspective, i.e. structural analysis, for how Semitic languages work.
>
> I don't think it's coincidental that both (1) abjads originally developed
> in these languages, and (2) abjads remain in use for (most of) these
> languages, but other languages (Greek, etc.) have added vowels to make
> alphabets.
>
> Daniel
>
> On Mon, Nov 29, 2021 at 12:27 AM Peter Arkadiev <peterarkadiev at yandex.ru>
> wrote:
>
> Dear Danny,
>
> >Phoenician follows the well-known triliteral root system of Semitic
> >languages, and those languages typically use abjads (consonant-only
> >writing systems), and I have to assume that this is due to the central
> >importance of consonants over vowels in their morphology.
>
> It is quite the contrary, I'm afraid: most of the Semitic morphology
> is performed by vowels only (cf. kataba 'he wrote' vs. kutiba 'it was
> written'), with the role of consonants (apart from the root) being limited
> -- although important enough, of course.
>
> Best regards,
>
> Peter
>
>
> 29.11.2021, 06:38, "Daniel Ross" <djross3 at gmail.com>:
>
> This is the topic of the next lecture in my Morphology class (and my
> students are currently reading your 2011 paper, Martin), so thank you
> everyone for this timely and interesting discussion.
>
> I would like to look at your conclusion from a different perspective,
> though: I agree that spaces may not directly tell us about word boundaries
> in languages, but for another reason.
>
> Japanese is a very interesting example because there are three
> (sub)scripts working together: kanji (from Chinese characters) for most
> lexical items, hiragana for function morphemes, and katakana for borrowings
> and onomatopoeia. The first response from most people (i.e. students)
> learning about this for the first time is that Japanese sounds hard to
> write, with the impression that the system may be redundant. But remember
> that Japanese does not use spaces. And it simply does not need to: kanji
> and hiragana very clearly mark the morphosyntactic structure of a sentence,
> so it is easy to skim and identify word boundaries, or at least equivalent
> information to what word boundaries do for us. This is an extremely
> efficient and transparent system, reflecting how Japanese works
> grammatically, not just orthographically.
>
> My suggestion then is to not look at when spaces are used in an
> orthography, but to look at what different orthographies do instead of
> spaces, or otherwise in a way that reflects specific morphosyntactic
> properties of languages.
>
> Looking back at the history of our familiar alphabet, Greek vowels were
> basically an accident when adapting the Phoenician writing system to Greek.
> Phoenician follows the well-known triliteral root system of Semitic
> languages, and those languages typically use abjads (consonant-only writing
> systems), and I have to assume that this is due to the central importance
> of consonants over vowels in their morphology. This can be traced back to
> the origins of this writing in Ancient Egyptian hieroglyphs, where a direct
> iconic representation of a meaning shifted to take on a specific consonant
> value, and this was codified for only consonants, with vowels unwritten.
> That is still the case today in Arabic, Hebrew, etc. (Aside: I prefer to
> think of the so-called "long vowels" as vowel-holding consonants, i.e.
> semivowels, etc., similar to how /i/ and /j/ or /u/ and /w/ may be
> represented with the same letter in alphabets, such as the letter "V" in
> Latin.) Consonant-only writing isn't such an obvious fit for another kind
> of language where the vowels are equally important morphologically. The
> Arabic script has been adapted for a number of other languages, so I'm not
> suggesting it is impossible or that it won't work, but that it probably
> wouldn't arise naturally, and due to this borrowing, it probably doesn't
> tell us much about the structure of those languages. (On the other hand,
> the Arabic script might have been a good fit for Turkish given that vowel
> harmony means there are few contrasts to represent within vowels, so that's
> another topic to look at.)
>
> We might also ask what the introduction of spaces can tell us about the
> structure of European languages. Perhaps the highly fusional inflection of
> Latin and Greek was in itself enough to signal word boundaries and
> morphological structure in general, given that certain typical orthotactic(?)
> forms were much more frequent than others (similar to Japanese hiragana
> marking functional morphemes).
>
> But I don't think that adapting the existing traditions of
> English/European orthography to a new, previously unwritten language
> necessarily tells us very much about the morphological structure of that
> language, because it will more likely be heavily influenced by the norms of
> English. This might be less likely in cases where the speakers of the
> language are illiterate before writing their own language, rather than
> biliterate(?) with English or another European language (or Indonesian,
> etc., following similar conventions). Where they write spaces might give us
> some suggestions about word boundaries, of course. But I think it is even
> more interesting to see what non-alphabetic scripts can tell us about the
> languages that they represent.
>
> Unfortunately we don't have a substantial number of truly independent
> writing systems around the world to really test these ideas, but it's
> certainly interesting to think about. There are a few more relevant
> examples, like how Chinese simply has no need for a word "word" because it
> has almost exclusively monosyllabic morphemes and characters, as well as
> some idiomatic combinations of them (i.e. compounds). That tells us
> something about the morphological structure of Chinese too, I think.
>
> Whether any of this is really about "wordhood" is not yet clear to me, but
> I do think that different orthographic traditions can give insights into
> morphological structure in general. One way of looking at it is that
> orthographies are a kind of formal analysis for morphological structure,
> and as we all know, analyses are informed by but do not determine
> linguistic organization. So if we think about writers as linguists, that
> may be helpful in this discussion. In fact, just like there are different
> grammatical theories, it may be that different orthographies are different
> theories of wordhood or similar levels of structure. If so, it may be that
> "words" (in the European sense) are just one way of looking at languages,
> and that they are an analysis, but not necessarily a fundamental part of
> linguistic structure. Or more interestingly, it may be that different
> languages have different units on par with "words", often reflected by
> orthographic systems. This is also why it's so interesting to look at
> proposals for writing signed languages, which introduce other kinds of
> challenges. I think this is generally in line with the conclusions of your
> 2011 paper, Martin.
>
> Last week I assigned my students a paper about questions of polysynthetic
> wordhood in Cree and Dakota (https://doi.org/10.1075/cilt.174.08rus), and
> the paper emphasized that speakers of these languages would often write
> much shorter words (with spaces between them) than expected according to
> the traditional polysynthetic analysis of linguists. But I am suspicious
> that they may be constrained by what they expect written "words" to look
> like due to familiarity with English, and I was left wondering, most
> importantly, what an original, indigenous script for Cree or Dakota would
> look like: what is the ideal way to write these languages, not how English
> writing can be borrowed for them. (I should add that Cree is often written
> in Canadian syllabics, but I think that is a general writing system, and
> according to Wikipedia designed by a linguist, so it may have other biases.
> But perhaps a syllabary has other advantages suitable for "polysynthetic"
> languages, however their structure is best analyzed; one option is that
> there are multiple word-like levels in their structure, rather than a
> unique level, and in that case a syllabary seems like a nice compromise to
> divide it into iterated units.)
>
> Daniel
>
> On Sun, Nov 28, 2021 at 8:29 AM Martin Haspelmath <
> martin_haspelmath at eva.mpg.de> wrote:
>
> This is a really interesting thread! It still seems to me that the term
> "word" has a well-understood orthographic sense, but no well-understood
> general phonological or morphosyntactic sense. Writing is now almost
> universal, but it does appear that most unwritten languages did not have a
> word for 'word' (as opposed to 'speech' or 'what someone said').
>
> I agree with Ian that "the emergence of spaces is sufficient evidence of
> wordhood", in the sense of orthographic wordhood – because spaces define
> orthographic words.
>
> As the fascinating discussion of the history of reading has made clear,
> reading is by no means a straightforward or natural activity. It's more
> like riding a bike – extremely useful, but dependent on highly specific
> cultural traditions and practices.
>
> It may well be that orthographic spaces are primarily an autonomous device
> to facilitate reading, like punctuation, paragraphs, section headings, and
> typographical ascenders/descenders in Latin script – but with no direct
> relationship to anything in the spoken language. As our grammatical
> investigations began with written language (*gram-matica* originally
> means 'study of writing', cf. *graph-* 'write'), it is natural that it
> was based on the study of written language. *Scriptio continua* may simply
> be a bit harder to read than spaced writing (just as I find Cyrillic a bit
> harder to read than Latin, because there are fewer ascenders/descenders).
>
> So I'm not sure if we can presuppose that spaces between words tell us
> anything about non-written language structure.
>
> Best,
> Martin
>
> Am 26.11.21 um 11:54 schrieb JOO, Ian [Student]:
>
> Dear David,
>
> thank you for introducing your interesting paper, which I’ll have a look
> at soon.
> But, I don’t think speakers not employing spaces necessarily indicates the
> absence of wordhood.
> In many traditional orthographies, there are no spaces at all: Thai,
> Tibetan, Khmer, Japanese, pre-modern Korean, etc.
> But that wouldn’t necessarily mean that Thai speakers don’t perceive words.
> Many orthographies only transcribe consonants - but that wouldn’t mean
> that the speakers don’t perceive vowels as phonological units.
> So I think the emergence of spaces is sufficient, but not necessary,
> evidence of wordhood.
>
> Regards,
> Ian
> On 26 Nov 2021, 6:45 PM +0800, David Gil <gil at shh.mpg.de>,
> wrote:
>
> Following on Nikolaus' comment, it is also an experiment that is performed
> whenever speakers of an unwritten language decide to introduce an
> orthography for the first time:  Do they insert spaces, and if so where?
>
> I wrote about this in Gil (2020), with reference to a naturalistic
> corpus of SMS messages in Riau Indonesian, produced in 2003, which was the
> year everybody in the village I was staying in got their first mobile
> phones and suddenly had to figure out how to write their language.  In the
> 2020 article, my focus was more on the presence or absence of evidence for
> bound morphology, and less on whether they introduce spaces in the first
> place. What I did not mention there, but which is most germane to Ian's
> query, is the latter question, whether they use spaces at all.  In fact, my
> corpus contains lots of messages that were written without spaces at all.
> Within a couple of years the orthography became more conventionalized, and
> everybody started using spaces, but to begin with, at least, it seemed like
> many speakers were not entertaining any (meta-)linguistic notion of 'word'
> whatsoever.
>
> (BTW, in Riau and many other dialects of Indonesian, the word for 'word',
> *kata*, also means 'say'.)
>
> David
>
> Gil, David (2020) "What Does It Mean to Be an Isolating Language? The Case
> of Riau Indonesian", in D. Gil and A. Schapper eds., *Austronesian
> Undressed: How and Why Languages Become Isolating*, John Benjamins,
> Amsterdam, 9-96.
>
>
> On 26/11/2021 12:11, Nikolaus P Himmelmann wrote:
>
> Hi
> On 26/11/2021 10:17, JOO, Ian [Student] wrote:
>
>
> The question would be, when one asks a speaker of a given language to
> divide a sentence into words, would the number of words be consistent
> throughout different speakers?
> It would be an interesting experiment. I’d be happy to be informed of any
> previous study that conducted such an experiment.
>
> Yes, indeed. And it is an experiment, though largely uncontrolled, that is
> carried out whenever someone carries out fieldwork on an undocumented lect.
> In this context, speakers provide evidence for word units in two ways: a)
> in elicitation when prompted by pointing or with a word from a contact
> language; b) when chunking a recording into chunks that can be written down
> by the researcher.
>
> In my experience, speakers across a given community are pretty consistent
> in both activities, though one may distinguish two basic types of speakers. One
> group provides word-like units, so when you ask for "stone" you get a
> minimal form for stone. The other primarily provides utterance-like units.
> So you do not get "stone" but rather "look at this stone", "how big the
> stone is", "stones for building ovens" or the like.
>
> Depending on the language, there is some variation in the units provided
> in both activities but this is typically restricted to the kind of
> phenomena that later on cause the main problems in the analytical
> reconstruction of a word unit, i.e. mostly phenomena that come under the
> broad term of "clitics". In my view, one should clearly distinguish between
> these analytical reconstructions, which are basic building blocks of
> grammatical descriptions, and the "natural" units provided by speakers,
> which are primary data providing the basis for the description.
>
> Best
>
> Nikolaus
>
>
>
> --
> David Gil
>
> Senior Scientist (Associate)
> Department of Linguistic and Cultural Evolution
> Max Planck Institute for Evolutionary Anthropology
> Deutscher Platz 6, Leipzig, 04103, Germany
>
> Email: gil at shh.mpg.de
> Mobile Phone (Israel): +972-526117713
> Mobile Phone (Indonesia): +62-81344082091
>
>
> _______________________________________________
> Lingtyp mailing list
> Lingtyp at listserv.linguistlist.org
> http://listserv.linguistlist.org/mailman/listinfo/lingtyp
>
>
>
> --
> Martin Haspelmath
> Max Planck Institute for Evolutionary Anthropology
> Deutscher Platz 6
> D-04103 Leipzig
> https://www.eva.mpg.de/linguistic-and-cultural-evolution/staff/martin-haspelmath/
>
>
> --
> Peter Arkadiev, PhD Habil.
> Institute of Slavic Studies
> Russian Academy of Sciences
> Leninsky prospekt 32-A 119334 Moscow
> peterarkadiev at yandex.ru
> http://inslav.ru/people/arkadev-petr-mihaylovich-peter-arkadiev
>