number of utterances

Brian MacWhinney macw at cmu.edu
Tue Jul 15 21:30:11 UTC 2014


Dear Rui,

      Back in the 1970s, people thought that it was wrong to separate out sentences.  Instead, they focused on units such as turns and utterances.  Roger Brown did not take this approach, but Bloom and some others did.  It makes little sense to relive the debates of the 1970s on this.  Instead, we just need to fix the corpora.

The corpora with the most extreme problems are not Peter, but Kuczaj and Belfast, along with some of the corpora in the MPI collections that are not (yet) available through CHILDES.

I try to fix the most obvious cases, and this is certainly one.  However, I would only break this into two utterances and leave the "okay" as a final communicator on the second utterance.
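
To see how such a split changes an utterance count, here is a minimal Python sketch. It is not CLAN's implementation: it simply tallies main-tier lines per speaker, assuming that every line beginning with "*" and a speaker code holds exactly one utterance. The function name count_utterances is hypothetical, and the split version of the transcript (including the period that ends the first utterance) is only a guess at what the corrected file might look like.

```python
import re
from collections import Counter

# Main-tier lines in CHAT start with "*", a speaker code, and a colon.
MAIN_TIER = re.compile(r"^\*([A-Za-z0-9]+):")

def count_utterances(chat_text):
    """Count main-tier lines per speaker (assumes one utterance per line)."""
    counts = Counter()
    for line in chat_text.splitlines():
        match = MAIN_TIER.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The utterance as currently transcribed in Peter01.
original = "*PAT:\tyou mustn't touch it (.) you just look at it (.) okay ?\n"

# Assumed result of the split: "okay" stays on the second utterance.
split = (
    "*PAT:\tyou mustn't touch it .\n"
    "*PAT:\tyou just look at it (.) okay ?\n"
)

print(count_utterances(original)["PAT"])
print(count_utterances(split)["PAT"])
```

Run as a script, this prints 1 for the original transcription and 2 for the split version. Dependent tiers such as %mor are ignored because they do not start with "*".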

The difference between the comma and the pause marker (.) is that the comma only indicates an intonational contour, whereas (.) indicates an actual pause in speech.


Best regards,

-- Brian MacWhinney

On Jul 15, 2014, at 11:14 PM, Rui Huang <huang3740 at gmail.com> wrote:

> Hi Leonid,
> 
> I am counting the total utterances in Peter01 (from Bloom70), and I find that three utterances are transcribed as one:
> 
> *PAT:	you mustn't touch it (.) you just look at it (.) okay ?
> %mor:	pro|you mod|must~neg|not v|touch pro|it pro|you adv:int|just v|look prep|at pro|it co|okay ?
> 
> When I look at the XML file, these three sentences have just one utterance ID: <u who="PAT" uID="u56">
> 
> "you mustn't touch it" and "you just look at it" are complete sentences, to my knowledge, so they could be two utterances. Why do you put them together?
> 
> Another thing is, what's the difference between "," and "(.)"?
> e.g.:
> *CHI: no, Mommy no go. 
> *CHI: no (.) Mommy go. 
> (I got the example from CHAT manual, page 57.)
> 
> Thank you!
> Rui
> 
> 
> 
> 
> On Thursday, July 19, 2012 7:49:45 PM UTC-4, Spektor, Leonid: CMU wrote:
> Misha, 
> 
>         The command 'freq +y +s"\**:" [filename]' gives you a breakdown of how many utterances each speaker has in a file, and the total number of utterances in the file appears next to the label "Total number of items (tokens)". The +s"\**:" option instructs freq to look for speaker tier names only. This utterance count depends on three conditions: 
> 
> 1. there aren't any words inside any utterance that start with the '*' character and end with the ':' character. 
> 2. no utterance has been interrupted and then continued by the same speaker, as indicated by the "+," and "+." codes. 
> 3. every speaker tier has only one utterance; if your data file passes CLAN's CHECK, this condition holds. 
> 
> 
> If some utterances have been interrupted and then continued, or if you have more than one utterance per speaker tier, then, as Nan suggested in her reply, you should use MLT. For a stricter count following Brown, use MLU. 
> 
> Leonid. 
> 
> 
> 
> On Jul 19, 2012, at 16:13 , Misha Becker wrote: 
> 
> > I'm wondering how to calculate the number of utterances a given speaker produces in each file (I will be searching for *MOT and *FAT). I have a note from many years ago that the way to do this is with the following command: 
> > 
> > freq +y +s\** [filename] 
> > 
> > But this doesn't actually do what I want. It seems to give the number of words produced by each speaker in a file. How do I find out the number of *utterances*? I've looked through the latest version of the Clan manual but haven't found the answer. 
> > 
> > Many thanks, 
> > Misha 
> > 
> > -- 
> > You received this message because you are subscribed to the Google Groups "chibolts" group. 
> > To view this discussion on the web visit https://groups.google.com/d/msg/chibolts/-/Fxz3IFqq9XoJ. 
> > To post to this group, send email to chib... at googlegroups.com. 
> > To unsubscribe from this group, send email to chibolts+u... at googlegroups.com. 
> > For more options, visit this group at http://groups.google.com/group/chibolts?hl=en. 
> > 
> > 
> 
> 
