Freq in a bilingual conversation
kevin at dotmon.com
Thu Jun 23 12:05:07 UTC 2011
I'm trying to run basic freq commands on a bilingual conversation marked up
with the current CLAN default (i.e. with precodes). What I'm trying to do is
get the total number of words in each language. This would be:
eng: words marked @s:eng, and unmarked words where the precode is [- eng];
spa: unmarked words, and words marked @s:spa where the precode is [- eng];
indeterminate: words marked @s:eng&spa.
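
For reference, the counting rule above can be sketched outside CLAN in a few lines of Python. This is only an illustration of the intended totals, not a substitute for freq: the function name `count_languages`, the sample utterances, and the simplifications (only main tiers starting with `*` are read, all other bracketed codes are dropped, and any token without a letter is treated as punctuation) are assumptions made for the example.

```python
import re
from collections import Counter

# Utterance-level precode such as "[- eng]" at the start of a main tier.
PRECODE = re.compile(r"^\[-\s*(\w+)\]\s*")
# Any other bracketed code, dropped in this simplified sketch.
BRACKETED = re.compile(r"\[[^\]]*\]")


def count_languages(lines, default="spa"):
    """Count words per language following the rule described above."""
    counts = Counter()
    for line in lines:
        if not line.startswith("*"):
            continue  # skip @ headers and % dependent tiers
        # Drop the speaker code (everything up to the first colon).
        text = line.partition(":")[2].strip()
        base = default
        match = PRECODE.match(text)
        if match:
            base = match.group(1)  # e.g. "eng" for "[- eng]"
            text = text[match.end():]
        text = BRACKETED.sub(" ", text)
        for token in text.split():
            if not any(ch.isalpha() for ch in token):
                continue  # utterance terminators and other punctuation
            if "@s:" in token:
                tag = token.split("@s:", 1)[1]
                # "@s:eng&spa" is counted as indeterminate.
                counts["indeterminate" if "&" in tag else tag] += 1
            else:
                counts[base] += 1  # unmarked word: utterance language
    return counts


# Invented sample data, for illustration only.
sample = [
    "@Begin",
    "*MAR:\tquiero un coffee@s:eng .",
    "*JON:\t[- eng] I like the playa@s:spa .",
    "*MAR:\tvamos al mall@s:eng&spa .",
    "%com:\tthis tier is ignored",
]
print(count_languages(sample))
```

With the invented sample, the three categories come out as 4 (eng), 5 (spa) and 1 (indeterminate), which are the figures the freq commands below are meant to produce.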
clan/unix/bin/freq +s"@s:eng" clan/chats/myfile.cha
gets the ones marked @s:eng, but also includes the ones marked @s:eng&spa.
clan/unix/bin/freq +s"@s:eng&spa" clan/chats/myfile.cha
produces no results. I assume & has to be escaped, but \& doesn't work.
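
A quick check on the escaping question, assuming a POSIX-style shell such as bash: inside double quotes `&` is not special, and a backslash before it is kept literally. Python's `shlex.split` follows the same quoting rules, so it can show the arguments freq would receive. The helper name `argv_seen_by_freq` is made up for this sketch.

```python
import shlex


def argv_seen_by_freq(command_line):
    """Split a command line the way a POSIX shell would quote-process it."""
    return shlex.split(command_line)


# The ampersand survives untouched inside double quotes.
print(argv_seen_by_freq('freq +s"@s:eng&spa" myfile.cha'))

# The backslash is NOT removed inside double quotes, so freq would
# receive a literal backslash followed by the ampersand.
print(argv_seen_by_freq(r'freq +s"@s:eng\&spa" myfile.cha'))
```

If that assumption about the shell holds, the search string reaches freq intact in the first form, which suggests the empty result comes from how freq itself interprets the string rather than from shell escaping.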
clan/unix/bin/freq +s"@s:eng" +s"[- eng]" clan/chats/myfile.cha
(to try and get all the English words, including the ones with precodes) also
produces no results.
I'd be grateful if someone could tell me the magic switches here. In more
general terms, the question is how far standard regular expressions can be
used on the CLAN command line: is there a special syntax, or are they not
really expected to be used there?
Pob hwyl / Best wishes