[Lingtyp] Utterance boundaries as a universal concept?

John DuBois dubois at ucsb.edu
Thu Dec 15 17:34:11 UTC 2022

I agree with David that "utterance" is far from a trivial unit to reliably
identify from continuous discourse.

Far more reliable is the intonation unit. Intonation unit boundaries are
universally and reliably recognizable based on prosodic cues alone, even in
a language you don't know. (See Himmelmann, and Troiani.) Intonation units
thus have a well-defined beginning, middle, and end (unlike the other

New work by Giorgia Troiani and myself on Kazakh presents a solid
methodology for confirming inter-transcriber reliability for intonation

Further, new work by Ryan Ka Yau Lai and myself on English shows in detail
what it means for a word to be initial, medial or final in the intonation
unit. (We'll present this next month at LSA.) This has consequences for
typological concepts like so-called "sentence-final particle", some of
which are probably actually intonation-unit-final.

We are also developing a prosodic operationalization of the utterance,
based on a sequence of one or more intonation units.

In addition to the very interesting work by Kibrik and colleagues on basic
discourse units, there is equally interesting work along the same lines by
Liesbeth Degand and her students. These two initiatives are closer to
operationalizing something like "utterance" in a meaningful way. Still,
it's not clear whether the reliability of these units can reach the level
of the intonation unit, or of the prosodically defined utterance.

John W. Du Bois
Professor of Linguistics
University of California, Santa Barbara
Santa Barbara, California 93106
dubois at ucsb.edu

On Thu, Dec 15, 2022, 1:36 AM David Gil <gil at shh.mpg.de> wrote:

> Ian, and everybody,
> My impression is that the notion of "utterance" is every bit as
> problematical as that of "word", though it seems like there has not been
> as much discussion about utterances as there has been about words.
> I was particularly struck by the lack of clarity of the notion of
> utterance when developing our Max Planck Institute naturalistic corpora
> in Jakarta.  When transcribing our naturalistic data, our goal was to
> enter each utterance into a separate field in our database; however, we
> had no clear set of principles for parsing a continuous, say hour-long,
> text into such utterances.  While for many purposes it didn't really
> matter, for some it most clearly did.  Ian's proposed generalizations
> might be a case in point, but the case that struck me as most cogent was
> in the field of 1st language acquisition, for which we compiled a large
> corpus.  In child language studies, a central role is played by the
> notion of MLU, or Mean Length of Utterance, so obviously we wanted to
> examine our data in terms of MLU.  But it was patently clear that our
> parsings into utterances were arbitrary and problematical in many ways,
> which got me to wondering whether this was due to our own ignorance, or
> alternatively a more general problem that should perhaps be addressed.
> I must confess I haven't thought much about this recently, but I'm now
> wondering:  Are there any go-to references on how to parse a text into
> utterances, or is this indeed a lacuna that still needs to be filled?
> David
> On 15/12/2022 07:31, Ian Joo wrote:
> > Dear typologists,
> >
> > many grammars employ the terms “word-initial”, “word-final”, and
> > “word-medial”, without specifying what a “word” is.
> > And, as we have discussed earlier, there is no consensus on what a
> > “word” is, or whether it is a cross-linguistically valid concept.
> > But can we at least agree that the following concepts are universal:
> > “utterance-initial”, “utterance-final”, and “utterance-medial”?
> > As all human utterances are finite (signed or spoken), the corollary is
> > that there is a beginning, an ending, and phases in between.
> > For example, instead of saying that “a lect does not allow /r/
> > word-initially”, can we say that it does not allow /r/ utterance-initially?
> > Would it save us from the conceptual ambiguity of wordhood?
> >
> > From Hong Kong,
> > Ian
> > _______________________________________________
> > Lingtyp mailing list
> > Lingtyp at listserv.linguistlist.org
> > https://listserv.linguistlist.org/cgi-bin/mailman/listinfo/lingtyp
> --
> David Gil
> Senior Scientist (Associate)
> Department of Linguistic and Cultural Evolution
> Max Planck Institute for Evolutionary Anthropology
> Deutscher Platz 6, Leipzig, 04103, Germany
> Email: gil at shh.mpg.de
> Mobile Phone (Israel): +972-526117713
> Mobile Phone (Indonesia): +62-082113720302