Freq in a bilingual conversation

Kevin Donnelly kevin at dotmon.com
Thu Jun 23 12:05:07 UTC 2011


Hi

I'm trying to run basic freq commands on a bilingual conversation marked up 
with the current CLAN default (i.e. with precodes).  I want to get the total 
number of words in each language.  That would be:
eng: words marked @s:eng, plus unmarked words where the precode is [- eng];
spa: unmarked words where there is no precode, plus words marked @s:spa where the precode is [- eng];
indeterminate: words marked @s:eng&spa.
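
To make the counting rule concrete, here is a minimal sketch of it in Python, outside CLAN altogether. It is not a CLAN feature and only handles a simplified view of a CHAT file: the function name count_by_language, the sample utterances, and the assumption that Spanish is the base language of the transcript are all made up for illustration.

```python
import re
from collections import Counter

# Utterance-level precode such as "[- eng]" (assumed to appear on the main tier).
PRECODE = re.compile(r"\[-\s*(\w+)\]")
# Any bracketed code, so that codes are not counted as words.
BRACKET = re.compile(r"\[[^\]]*\]")


def count_by_language(lines, base="spa"):
    """Count words per language on main tiers (lines starting with '*').

    Hypothetical helper: 'base' is the assumed default language of the
    transcript; a precode overrides it for that one utterance.
    """
    counts = Counter()
    for line in lines:
        if not line.startswith("*"):
            continue  # skip @ headers and % dependent tiers
        _, _, utterance = line.partition(":")  # drop the speaker code
        match = PRECODE.search(utterance)
        default = match.group(1) if match else base
        utterance = BRACKET.sub(" ", utterance)
        for token in utterance.split():
            if not any(ch.isalpha() for ch in token):
                continue  # utterance terminators and other punctuation
            _, marker, tag = token.partition("@s:")
            tag = tag.rstrip(".,?!")
            if not marker:
                counts[default] += 1          # unmarked word
            elif "&" in tag:
                counts["indeterminate"] += 1  # e.g. @s:eng&spa
            else:
                counts[tag] += 1              # e.g. @s:eng or @s:spa
    return counts


# Invented sample lines, not taken from a real transcript.
sample = [
    "@Begin",
    "*MAR:\tquiero un sandwich@s:eng por favor .",
    "*JON:\t[- eng] I want the tortilla@s:spa now .",
    "*MAR:\tbueno okay@s:eng&spa .",
    "%com:\tnot counted",
]

if __name__ == "__main__":
    for language, total in sorted(count_by_language(sample).items()):
        print(language, total)
```

These are the three totals I would like freq itself to report for the real file.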

The command:
clan/unix/bin/freq +s"@s:eng" clan/chats/myfile.cha 
gets the ones marked @s:eng, but also includes the ones marked @s:eng&spa.
Using:
clan/unix/bin/freq +s"@s:eng&spa" clan/chats/myfile.cha 
produces no results.  I assume & has to be escaped, but \& doesn't work.
Using 
clan/unix/bin/freq +s"@s:eng" +s"[- eng]" clan/chats/myfile.cha 
(to try and get all the English words, including the ones with precodes) also 
produces no results.
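
As a check on the shell side of the escaping question, the sketch below uses Python's shlex module, which follows POSIX shell quoting rules, to show what argument a program would receive in each case. It assumes the shell in use quotes the way shlex does, and it says nothing about how freq interprets the pattern once it has it.

```python
import shlex

# Tokenise the two command lines the way a POSIX shell would.
plain = shlex.split('freq +s"@s:eng&spa" myfile.cha')
escaped = shlex.split(r'freq +s"@s:eng\&spa" myfile.cha')

# Inside double quotes "&" is an ordinary character, so it arrives unchanged.
print(plain[1])

# Inside double quotes a backslash before "&" is kept, so the program
# receives the backslash as part of the pattern.
print(escaped[1])
```

So within double quotes the & reaches freq untouched, and adding \ only changes the pattern by inserting a literal backslash. If that holds for my shell, the problem lies in how CLAN reads the search string rather than in shell escaping.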

I'd be grateful if someone could tell me the magic switches here.  I suppose 
in more general terms the question is, how far can standard regular 
expressions be used in the CLAN command line - is there a special syntax, or 
are they not really expected to be used there?

Thanks.

-- 
Pob hwyl / Best wishes

Kevin Donnelly
kevindonnelly.org.uk

