combo hangs on negation sometimes

paul paul.feitzinger at gmail.com
Tue Aug 6 01:29:36 UTC 2013


I'm trying to rerun some combo searches that ran successfully a year ago but
haven't been used since then. I've observed identical behavior on Windows XP,
Ubuntu 12.04, and Arch Linux with CLAN 05-Aug-2013 and the version before it.

It seems that combo is hanging when encountering the negation operator "!" in
certain contexts. For example:

  combo +t'*CHI' +t%mor +t%xgra +s'!*:wh|*^!*?' +d1 ~/corpora/childes/Valian/01a.cha

is intended to filter out utterances containing wh-questions, although it's
unclear to me exactly how to parse that search string (I didn't write it).

The same thing happens on a simpler combo line like

  combo @ +t'*CHI' +t%mor +t%xgra +s'!xxx' +d1 ~/corpora/childes/Valian/01a.cha

though I realize this could be rewritten with kwal.

In both cases combo never gets past

  combo +t*CHI +t%mor +t%xgra +s!xxx +d1 /home/paul/corpora/childes/Valian/01a.cha
  Mon Aug  5 20:43:10 2013
  combo (05-Aug-2013) is conducting analyses on:
    ONLY speaker main tiers matching: *CHI;
      and those speakers' ONLY dependent tiers matching: %MOR; %XGRA;
  ****************************************
  From file <01a.cha>

After poking around a little with gdb and enabling the debug print statement in
combo.cpp:findmatch I get

  combo +t*CHI +t%mor +t%xgra +s!xxx +d1 /home/paul/corpora/childes/Valian/01a.cha
  Mon Aug  5 20:54:54 2013
  combo (05-Aug-2013) is conducting analyses on:
    ONLY speaker main tiers matching: *CHI;
      and those speakers' ONLY dependent tiers matching: %MOR; %XGRA;
  ****************************************
  From file <01a.cha>
  1; pat=xxx;wild=0;origmac->neg=1;txt=tape it up and two tape players .       %mor: v|tape pro|it adv:loc|up coord|and det:num|two n|tape n|play&dv-agt-pl   .  %xgra: 1|4|coord 2|1|obj 3|1|jct 4|0|root 5|6|quant 6|4|coord 7|6|jct  8|4|punct
  1; pat=xxx;wild=0;origmac->neg=1;txt=tape it up and two tape players .       %mor: v|tape pro|it adv:loc|up coord|and det:num|two n|tape n|play&dv-agt-pl   .  %xgra: 1|4|coord 2|1|obj 3|1|jct 4|0|root 5|6|quant 6|4|coord 7|6|jct  8|4|punct
  1; pat=xxx;wild=0;origmac->neg=1;txt=tape it up and two tape players .       %mor: v|tape pro|it adv:loc|up coord|and det:num|two n|tape n|play&dv-agt-pl   .  %xgra: 1|4|coord 2|1|obj 3|1|jct 4|0|root 5|6|quant 6|4|coord 7|6|jct  8|4|punct
  ... and so on until I kill the process.

It appears that at some point in the file it stops moving across word
boundaries/consuming input tokens and gets stuck. Note that "tape it up and
two tape players" is not the first utterance in the file.

Searches like +s'!xxx^yyy' and +s'xxx^!yyy' run to completion.
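
In case it helps anyone reproduce this, here is a rough sketch of how the four search strings above could be run under a time limit so that a hang is reported instead of blocking the terminal. It assumes GNU coreutils `timeout` is installed (so Linux only, not the Windows XP machine) and that combo is on the PATH; `report_hang` and the 30-second limit are just names and values I made up for illustration, not anything from CLAN.

```shell
#!/bin/sh
# report_hang LIMIT CMD [ARGS...]
# Hypothetical helper: runs CMD under GNU `timeout` and prints HANG if the
# time limit was hit, otherwise DONE with the command's exit status.
report_hang() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    # GNU timeout exits with 124 when the command had to be killed.
    if [ "$status" -eq 124 ]; then
        echo "HANG"
    else
        echo "DONE (exit $status)"
    fi
}

file=~/corpora/childes/Valian/01a.cha

# The two search strings that hang for me, then the two that complete.
for pat in '!*:wh|*^!*?' '!xxx' '!xxx^yyy' 'xxx^!yyy'; do
    printf '%s\t' "$pat"
    report_hang 30 combo +t'*CHI' +t%mor +t%xgra +s"$pat" +d1 "$file"
done
```

An exit status of 127 in the DONE column would mean combo itself was not found, so that case should not be read as a successful search.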

Anyway, I'm not sure if this is a bug or maybe an abuse of deprecated syntax or
something, but any advice would be appreciated.
