Some problems with extracting error-free utterances and verbs from CHAT files

Leonid Spektor spektor at andrew.cmu.edu
Thu Jun 28 17:37:47 UTC 2018


Li,

1. Codes like [*] or [* aux] refer to the word before them. If you want your codes to refer to the whole utterance, they need to start with "[+ ". You can change your codes to [+ *], [+ *aux], [+ *wh], and then trim those utterances with the command: trim -s"[+ \**]".

2. If you used the latest MOR grammar on your data, then the comprehensive command option for all verbs is: +sm|v,|cop,|aux,|mod,|mod:*,|part
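
For anyone who wants to cross-check the CLAN output outside CLAN, here is a rough Python sketch of both points. It is not how CLAN does it. It assumes each utterance sits on a single line, that a %mor word has the form POS|stem (optionally with a prefix such as un# and with clitics joined by ~ or $), and that "mod:*" means any POS starting with "mod:". The function names and the sample lines are made up for illustration.

```python
import re

# Matches word-level codes such as [*], [* aux], [* wh] and
# utterance-level postcodes such as [+ *], [+ *aux], [+ *wh].
ERROR_CODE = re.compile(r"\[\+? ?\*[^\]]*\]")

# www and yyy as whole words (the two forms mentioned in the question).
UNTRANSCRIBED = re.compile(r"\b(www|yyy)\b")

# POS tags taken from the +s option above; "mod:*" is handled separately.
VERB_POS = ("v", "cop", "aux", "mod", "part")


def is_clean(utterance):
    """True if the line has no error code and no www/yyy."""
    return not ERROR_CODE.search(utterance) and not UNTRANSCRIBED.search(utterance)


def count_clean_utterances(chat_lines, speaker="*CHI:"):
    """Count the speaker's lines that pass is_clean.

    Assumption: one utterance per line, no continuation lines.
    """
    return sum(1 for line in chat_lines
               if line.startswith(speaker) and is_clean(line))


def is_verb(mor_word):
    """True if a single %mor word such as 'part|walk-PRESP' has a verb POS."""
    pos = mor_word.split("|", 1)[0]   # text before the first "|"
    pos = pos.split("#")[-1]          # drop a prefix such as "un#"
    return pos in VERB_POS or pos.startswith("mod:")


def verbs_in_mor(mor_line):
    """Return the verb-like words found in the text of one %mor tier."""
    found = []
    for item in mor_line.split():
        # Clitics are joined with ~ or $; check each part on its own.
        for part in re.split(r"[~$]", item):
            if "|" in part and is_verb(part):
                found.append(part)
    return found
```

For example, verbs_in_mor("pro:sub|he~aux|be&3S part|walk-PRESP .") picks out the auxiliary and the participle, which is the same set of categories the +s option above asks for.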


Leonid.

> On Jun 28, 2018, at 12:48, Li Zeng <zlmailhouse at 163.com> wrote:
> 
> Hi there, 
> 
> I have encountered some problems with extracting utterances and verbs from CHAT files.
> 
> Firstly, I have tagged ungrammatical utterances of *CHI with [*], [* aux], or [* wh]. Now I want to count the utterances, excluding those with these tags ([*], [* aux], [* wh]) as well as those containing www or yyy. I tried the following command: trim -s"[*_ ]" +1, but it did not work.
> 
> Secondly, I would like to extract all the verbs of *CHI (including copulas, modals, and auxiliaries as well as regular verbs) in the file. I found that on the %mor tier, "walking" is coded not as a verb but as "PART|". In that case, I guess I also need to include "PART|", right? What would be the comprehensive command for extracting all the verbs mentioned above?
> 
> Thank you. 
> 
> Li 
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com <mailto:chibolts+unsubscribe at googlegroups.com>.
> To post to this group, send email to chibolts at googlegroups.com <mailto:chibolts at googlegroups.com>.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/addb310b-f4ed-497a-bd48-e1f91c045f53%40googlegroups.com <https://groups.google.com/d/msgid/chibolts/addb310b-f4ed-497a-bd48-e1f91c045f53%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout <https://groups.google.com/d/optout>.
