Filled pauses and parts of speech

Brian MacWhinney macw at cmu.edu
Mon Nov 4 19:46:13 UTC 2013


Dear Shelley,

    In the CHAT files that work with MOR in the database, all uses of filled pauses are marked as non-words using the ampersand, as in &um or &eh.  To change this behavior, you would remove the ampersand and then change the extension on the fil.txt file in the /lex folder to fil.cut.  Then you can rerun MOR, which will give you forms like fil|um on the %mor line, and you can run COMBO to check the syntactic environment.  The downside is that the MOR grammar was not trained to handle these erratic insertions inside syntactic structures, so the accuracy of the tagging will go down.  That was the reason for removing them in the first place.
    There may be other ways of doing this that rely on the XML structure, but none that are all that easy.
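
    As a rough illustration of the two manual steps above (removing the ampersand before rerunning MOR, then reading off the part of speech that follows each fil| item), here is a minimal Python sketch.  It is not part of CLAN.  The filler list, the function names, and the assumption that each main line and %mor line sits on a single tab-separated line are all hypothetical, and COMBO remains the tool to use for real searches.

```python
import re

# Hypothetical filler inventory; in practice use whatever forms fil.cut lists.
FILLERS = ("um", "uh", "eh")

# Matches &um, &uh, &eh only when they stand alone as whole tokens.
_AMP_FILLER = re.compile(r"(?<!\S)&(" + "|".join(FILLERS) + r")(?!\S)")


def unmark_fillers(chat_line):
    """Drop the ampersand from filled pauses on main (speaker) lines only."""
    if not chat_line.startswith("*"):
        return chat_line  # leave headers and dependent tiers untouched
    return _AMP_FILLER.sub(r"\1", chat_line)


def pos_after_fillers(mor_line):
    """Return (filler, POS of the next item) pairs from one %mor line."""
    # Everything after the first tab is the tier content (assumed layout).
    items = mor_line.split("\t", 1)[-1].split()
    pairs = []
    for current, following in zip(items, items[1:]):
        # Punctuation such as "." has no "|" and is skipped.
        if current.startswith("fil|") and "|" in following:
            pairs.append((current[len("fil|"):], following.split("|", 1)[0]))
    return pairs


if __name__ == "__main__":
    print(unmark_fillers("*CHI:\tI want &um the ball ."))
    print(pos_after_fillers("%mor:\tpro|I v|want fil|um det|the n|ball ."))
```

    The first function is only meant to run before MOR; the second only makes sense on %mor lines produced after fil.cut is active.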

—Brian MacWhinney

On Nov 4, 2013, at 1:55 PM, Shelley Brundage <shelley.brundage at gmail.com> wrote:

> Dear ChiBolts,
> One of my students is studying the location of filled pauses in Spanish and English speech samples.  We are trying to find a command or series of commands that would let us relate the location of each filled pause (coded with @fp) to the part of speech that occurs directly after it.  We have used MOR to get a MOR tier on these files.
> 
> So far, we've come up with using
> KWAL +s"*@fp*"
> to get us a list of the utterances that contain @fp
> 
> After this, is there any other command that would 'link' the part of speech info in the MOR tier with the @fp codes, even though we understand that the MOR tier does not recognize @fp (by design), or do we just use the MOR tier to ascertain the part of speech directly after the @fp?  
> 
> Thanks for your help.  
> 
> Shelley 
> 
> -- 
> Shelley B. Brundage, Ph.D., CCC-S
> Associate Professor and Graduate Program Director
> Board Recognized Specialist and Mentor-Fluency Disorders
> Speech and Hearing Science department
> George Washington University
> 2115 G St NW Suite 201
> Washington, D.C. 20052
> (202) 994-5008 office
> (202) 994-2205 lab
> (202) 994-2589 fax
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
> To post to this group, send email to chibolts at googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/CAH2afvJJ1xsmYvxCTeJWy-Pgf-0DxSGbZXNmSLuyit4F5DhhHg%40mail.gmail.com.
> For more options, visit https://groups.google.com/groups/opt_out.
