[Lingtyp] Utterance boundaries as a universal concept?

Thu Dec 15 09:36:18 UTC 2022

Ian, and everybody,

My impression is that the notion of "utterance" is every bit as 
problematical as that of "word", though it seems that there has not been 
as much discussion about utterances as there has been about words.

I was particularly struck by the lack of clarity of the notion of 
utterance when developing our Max Planck Institute naturalistic corpora 
in Jakarta.  When transcribing our naturalistic data, our goal was to 
enter each utterance into a separate field in our database; however, we 
had no clear set of principles for parsing a continuous, say, hour-long 
text into such utterances.  While for many purposes it didn't really 
matter, for some it most clearly did.  Ian's proposed generalizations 
might be a case in point, but the case that struck me as most cogent was 
in the field of 1st language acquisition, for which we compiled a large 
corpus.  In child language studies, a central role is played by the 
notion of MLU, or Mean Length of Utterance, so obviously we wanted to 
examine our data in terms of MLU.  But it was patently clear that our 
parsings into utterances were arbitrary and problematical in many ways, 
which got me wondering whether this was due to our own ignorance, or 
alternatively a more general problem that should perhaps be addressed.  
I must confess I haven't thought much about this recently, but I'm now 
wondering:  Are there any go-to references on how to parse a text into 
utterances, or is this indeed a lacuna that still needs to be filled?
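
To make the MLU point concrete, here is a minimal sketch of how the 
figure shifts with the segmentation alone.  The sample tokens and both 
segmentations are invented for illustration (they are not from the 
Jakarta corpora), and length is counted in whitespace-separated words 
rather than morphemes, which is a simplifying assumption.

```python
def mean_length_of_utterance(utterances):
    """Mean number of whitespace-separated tokens per non-empty utterance."""
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        raise ValueError("no non-empty utterances")
    return sum(lengths) / len(lengths)


# One and the same hypothetical stretch of speech (7 tokens),
# cut into utterances in two different ways.
coarse = ["look doggie doggie run", "there he go"]
fine = ["look", "doggie", "doggie run", "there", "he go"]

# Sanity check: both segmentations contain exactly the same tokens.
assert " ".join(coarse).split() == " ".join(fine).split()

mlu_coarse = mean_length_of_utterance(coarse)  # 7 tokens / 2 utterances = 3.5
mlu_fine = mean_length_of_utterance(fine)      # 7 tokens / 5 utterances = 1.4

print(mlu_coarse, mlu_fine)
```

The tokens are identical in both cases; only the boundary decisions 
differ, and the resulting MLU differs by a factor of 2.5.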

David

On 15/12/2022 07:31, Ian Joo wrote:
> Dear typologists,
>
> many grammars employ the terms “word-initial”, “word-final”, and “word-medial”, without specifying what a “word” is.
> And, as we have discussed earlier, there is no consensus on what a “word” is, or whether it is a cross-linguistically valid concept.
> But can we at least agree that the following concepts are universal: “utterance-initial”, “utterance-final”, and “utterance-medial”?
> As all human utterances are finite (signed or spoken), the corollary is that there is a beginning, an ending, and phases in between.
> For example, instead of saying that “a lect does not allow /r/ word-initially”, can we say that it does not allow /r/ utterance-initially?
> Would it save us from the conceptual ambiguity of wordhood?
>
>  From Hong Kong,
> Ian

-- 
David Gil

Senior Scientist (Associate)
Department of Linguistic and Cultural Evolution
Max Planck Institute for Evolutionary Anthropology
Deutscher Platz 6, Leipzig, 04103, Germany

Email: gil at shh.mpg.de
Mobile Phone (Israel): +972-526117713
Mobile Phone (Indonesia): +62-082113720302