Freq in a bilingual conversation

Kevin Donnelly kevin at
Thu Jun 23 12:05:07 UTC 2011


I'm trying to run basic freq commands on a bilingual conversation marked up 
with the current CLAN default (i.e. with precodes).  What I'm trying to do is 
get the total number of words in each language.  This would be:
eng: words marked @s:eng, and unmarked words where the precode is [- eng];
spa: unmarked words, and words marked @s:spa where the precode is [- eng];
indeterminate: words marked @s:eng&spa.
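
To make the counting rule above concrete, here is a rough Python sketch of it, 
independent of CLAN.  It is only an illustration under stated assumptions: that 
main tiers start with "*", that a precode such as [- eng] opens the utterance, 
that Spanish is the default language of the file, and that any token starting 
with a letter is a word.  The function name count_languages and the sample 
lines are made up for the example.

```python
import re
from collections import Counter

# Assumed markup: a precode like "[- eng]" opens the utterance,
# and a word-level marker like "@s:eng" or "@s:eng&spa" ends the word.
PRECODE = re.compile(r"^\[- (\w+)\]\s*")
MARKER = re.compile(r"@s:([a-z&]+)$")


def count_languages(lines, default="spa"):
    """Count words per language on main tiers (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        if not line.startswith("*"):
            continue  # skip headers (@...) and dependent tiers (%...)
        _, _, utterance = line.partition(":")
        utterance = utterance.strip()
        base = default
        precode = PRECODE.match(utterance)
        if precode:
            # The precode switches the language of unmarked words.
            base = precode.group(1)
            utterance = utterance[precode.end():]
        for token in utterance.split():
            if not token[0].isalpha():
                continue  # skip terminators and tokens opening with "["
            marker = MARKER.search(token)
            if marker is None:
                counts[base] += 1
            elif "&" in marker.group(1):
                counts["indeterminate"] += 1
            else:
                counts[marker.group(1)] += 1
    return counts


# Made-up sample lines, not taken from any real transcript.
sample = [
    "@Begin",
    "*MAR:\tyo quiero un sandwich@s:eng .",
    "*ANA:\t[- eng] I like the playa@s:spa .",
    "*MAR:\tvamos al mall@s:eng&spa .",
    "%com:\tnot counted",
]
print(count_languages(sample))
```

On the sample this gives 6 spa, 4 eng and 1 indeterminate.  The sketch ignores 
continuation lines and multi-word bracketed codes, so it is a check on the 
logic, not a replacement for freq.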

The command:
clan/unix/bin/freq +s"@s:eng" clan/chats/myfile.cha 
gets the ones marked @s:eng, but also includes the ones marked @s:eng&spa.
clan/unix/bin/freq +s"@s:eng&spa" clan/chats/myfile.cha 
produces no results.  I assume & has to be escaped, but \& doesn't work.
clan/unix/bin/freq +s"@s:eng" +s"[- eng]" clan/chats/myfile.cha 
(to try and get all the English words, including the ones with precodes) also 
produces no results.
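
One thing that can be checked outside CLAN is what the shell actually hands to 
freq.  The Python sketch below uses shlex.split, which approximates POSIX shell 
word splitting, on the command strings above.  It assumes a POSIX-style shell 
and says nothing about how freq itself interprets the pattern.

```python
import shlex

# The commands as typed, with and without the backslash before "&".
plain = 'clan/unix/bin/freq +s"@s:eng&spa" clan/chats/myfile.cha'
escaped = 'clan/unix/bin/freq +s"@s:eng\\&spa" clan/chats/myfile.cha'
precode = 'clan/unix/bin/freq +s"@s:eng" +s"[- eng]" clan/chats/myfile.cha'

# shlex.split mimics POSIX word splitting and quote removal.
plain_argv = shlex.split(plain)
escaped_argv = shlex.split(escaped)
precode_argv = shlex.split(precode)

print(plain_argv[1])     # the "&" arrives unchanged inside double quotes
print(escaped_argv[1])   # the backslash is kept, so freq sees it literally
print(precode_argv[1:3]) # the space inside "[- eng]" stays in one argument
```

If this matches the shell in use, "&" inside double quotes needs no escaping at 
the shell level, and adding a backslash only puts a literal backslash into the 
search string.  That would point to freq's own pattern syntax, not shell 
quoting, as the place to look.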

I'd be grateful if someone could tell me the magic switches here.  I suppose 
in more general terms the question is, how far can standard regular 
expressions be used in the CLAN command line - is there a special syntax, or 
are they not really expected to be used there?


Pob hwyl / Best wishes

Kevin Donnelly
