check errors and adding own tags

Brian MacWhinney macw at cmu.edu
Thu Jun 23 17:12:19 UTC 2011


On Jun 23, 2011, at 6:58 PM, Shelley Brundage wrote:

> First, compound words (french_fry) do not check.  They end up in the output file.  Does this mean that they would not be included in subsequent MOR analyses?  If we want them to be part of the MOR analyses, how do we get it to check?  We checked the depfile, and *_* was there under the %syn line.  We are hesitant to change the depfile unless we need to; can someone tell us how to get compound words into the MOR analyses?
> 
> 

Dear Shelley,
       Words like french_fry and mmhm will not have any problem with CHECK.  Perhaps you mean to say that they are not recognized by English MOR?  That is true. For MOR, french fries must be french+fries and mmhm must be m:hm.  Words have to be entered in forms that MOR can recognize.  When you say that %mor tiers were not created, I am not sure exactly what you mean.  Usually, MOR enters something like ?|mhm if it can't recognize something.  
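
As a rough illustration of getting compounds into a form MOR can recognize, here is a minimal Python sketch that rewrites underscore-joined words on main tiers to the plus-joined form. The function name `underscore_to_plus` and the sample utterance are invented for this example, and it assumes every underscore between letters marks a compound. The resulting forms would still need to be checked against the MOR lexicon.

```python
import re

# An underscore with a letter on both sides (assumed to mark a compound).
JOIN = re.compile(r"(?<=[A-Za-z])_(?=[A-Za-z])")

def underscore_to_plus(line):
    """Rewrite french_fry as french+fry on main tier lines only."""
    if not line.startswith("*"):
        return line  # leave headers (@...) and dependent tiers (%...) alone
    speaker, tab, utterance = line.partition("\t")
    return speaker + tab + JOIN.sub("+", utterance)

print(underscore_to_plus("*CHI:\tI want a french_fry ."))
```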
>  
> Second, non-words such as ‘mmhm’ also do not pass check.  We attempted to run MOR even with these in the main tier, and the %MOR tiers were not created. Any idea why not?
> 
>  
> Third, our collaborator needs to run MLT and TTR on English, Spanish, and ‘mixed’ utterances (those that include code switches).  She has been using pre-codes to identify the language used in the utterance, and post-codes to identify utterances that she wants omitted from the MLT analyses.  However, we cannot get transcripts with pre-codes to run in MOR.  Is there a way to calculate MLT that does not involve the use of precodes?
> 
> 

I ran these two commands on a bilingual file from the YipMatthews corpus and they worked fine:

mlt +s"[- eng]" ac020610.cha
mlt -s"[- eng]" ac020610.cha
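
To make the selection concrete, here is a small Python sketch of what the two commands above appear to do with the precode: the first keeps only utterances marked `[- eng]`, the second keeps everything else. This is an illustration under that assumption, not CLAN's actual implementation, and it does not compute MLT. The function name `select_by_precode` and the sample lines are invented.

```python
def select_by_precode(lines, precode, include=True):
    """Return main tier lines that carry (or lack) the given precode."""
    marker = "[- %s]" % precode
    selected = []
    for line in lines:
        if not line.startswith("*"):
            continue  # skip headers (@...) and dependent tiers (%...)
        _speaker, _tab, utterance = line.partition("\t")
        has_marker = utterance.lstrip().startswith(marker)
        if has_marker == include:
            selected.append(line)
    return selected

# Invented sample data, not taken from the YipMatthews corpus.
sample = [
    "@Begin",
    "*CHI:\t[- eng] I want juice .",
    "*CHI:\tanother utterance .",
    "@End",
]
print(select_by_precode(sample, "eng", include=True))
print(select_by_precode(sample, "eng", include=False))
```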
>  
> Fourth, more generally, does CLAN treat pre-codes and post-codes similarly?  Or are they unique in how they influence subsequent analyses? 
> 

They are totally different in concept and often in effect.  Precodes are only for marking the language of the utterance.  Postcodes are for speech acts and other coding categories.  
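
The difference in position can be sketched in Python: a precode of the form `[- lang]` sits at the start of the utterance, while postcodes of the form `[+ code]` follow it. The function name `split_codes` and the code `exc` below are made up for illustration, and the patterns are a simplification rather than a full CHAT parser.

```python
import re

PRECODE = re.compile(r"^\[- ([^\]]+)\]\s*")   # e.g. [- spa] at the start
POSTCODE = re.compile(r"\s*\[\+ ([^\]]+)\]")  # e.g. [+ exc] after the utterance

def split_codes(utterance):
    """Split an utterance into (precode, text, list of postcodes)."""
    utterance = utterance.strip()
    precode = None
    match = PRECODE.match(utterance)
    if match:
        precode = match.group(1)
        utterance = utterance[match.end():]
    postcodes = POSTCODE.findall(utterance)
    text = POSTCODE.sub("", utterance).strip()
    return precode, text, postcodes

print(split_codes("[- spa] quiero agua . [+ exc]"))
```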

>  
> Finally, we have been able, with Leonid and Brian’s wonderful assistance, to get MOR to run in both SPA and ENG when precodes are deleted from the transcripts.  I have a set of directions for this, and can share if folks want them. 
> 

I'm not sure why you would want to delete precodes from transcripts.

Good luck,

-- Brian MacWhinney

>  
>  Shelley Brundage, 
> 
> George Washington University
> 
> 
> On Thu, Jun 23, 2011 at 11:46 AM, Leonid Spektor <spektor at andrew.cmu.edu> wrote:
> Rasmus,
> 
> 
> 1. The form "bird+house-s" is incorrect. If the manual says otherwise, then it needs to be changed. The correct form is "bird+houses".
> 
> 2. We don't support this feature at this time. We are in the process of implementing a generic "@x:" tag that will allow users to specify anything they want. Your example below will then look like "Toronto@x:geo". This should be implemented in a day or so.
> 
> 3. We discourage people from editing the "depfile.cut". Adding anything to that file without additional changes to other supporting files and/or programs will cause problems with other programs like mor and post. We also no longer support the "depadd.cut" file. CHECK simply ignores it.
> 
> We no longer distribute the "depadd.cut" file. You must have a very old installation of CLAN and new updates do not delete any old files present there.
> 
> Leonid.
> 
> 
> 
> 
> On Jun 23, 2011, at 09:20, RSteinkrauss wrote:
> 
> > Hi,
> >
> > I have some questions regarding checking files in CLAN and would be
> > very happy if someone could help me. I am using the Win version from
> > 30-Apr-2011.
> >
> > 1. If I understood the manual correctly, English compounds should be
> > transcribed with a + sign. Also, plurals in compounds should be marked
> > with a dash-s. However, when I run CHECK on my transcript, a form such
> > as bird+house-s gives me an "undeclared suffix" error. Same with
> > bird+house-s@s if it is a codeswitch in an otherwise German text.
> >
> > 2. We would like to use some @-tags in our transcripts that are not
> > specified in the manual (such as @geo for place names, e.g.
> > Toronto@geo). I understand that this is no longer possible using a
> > depadd-file, so I wondered if the way to go is to change the
> > depfile.cut-file.
> >
> > 3. Assuming that this was indeed what should be done, I changed the
> > depfile.cut so that it would include the tags we need. After that,
> > CHECK worked fine (with the exception of the errors described in
> > question 1). However, if I now change the depadd.cut file again, CLAN
> > will not react to the changes. E.g., if I remove a previously allowed
> > @-tag such as @geo from my depadd.cut, CHECK will not complain when it
> > encounters that tag in a transcript. Strangely, that behaviour persists
> > even after de- and reinstalling CLAN. I have set the lib path
> > correctly, to a folder containing my changed depadd.cut. Also after
> > changing the lib path back to the original depadd.cut that came /
> > installed with CLAN, CHECK still allows my own @-tags.
> >
> > Does anyone have an idea how to go about this? Any feedback would be
> > greatly appreciated!
> >
> > Thank you in advance,
> > Rasmus
> >
> > --
> > You received this message because you are subscribed to the Google Groups "chibolts" group.
> > To post to this group, send email to chibolts at googlegroups.com.
> > To unsubscribe from this group, send email to chibolts+unsubscribe at googlegroups.com.
> > For more options, visit this group at http://groups.google.com/group/chibolts?hl=en.
> >
> >
> 
> -- 
> Shelley B. Brundage, Ph.D., CCC-S
> Associate Professor and Graduate Program Director
> Board Recognized Specialist and Mentor-Fluency Disorders
> Speech and Hearing Science department
> George Washington University
> 2115 G St NW Suite 201
> Washington, D.C. 20052
> (202) 994-5008 office
> (202) 994-2205 lab
> (202) 994-2589 fax
> 
