Outputting KWAL search by keywords

Mits Ota mits at ling.ed.ac.uk
Sat Jun 19 23:04:06 UTC 2010


Dear Leonid,

Thanks very much for the suggestion. I think this would indeed work
if the data were all like the hypothetical examples I used in my
description of the problem. However, I failed to mention in my
previous posting that in the corpora I'm looking at (e.g.,
Providence), a large proportion of these multi-keyword cases actually
involve the same keyword repeated on the main tier, e.g.:

snake, snake    *CHI: snake snake . %xpho: kneɪk Sneɪk
sleepy, sleepy  *CHI: sleepy sleepy . %xpho: sLipi slipi

I don't think we can get KWAL to return separate outputs for these
hits even if we run the command separately for every keyword. Am I
right?
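
One possible workaround for the repeated-keyword case is to post-process
the KWAL output instead of relying on KWAL to separate the hits. The
Python sketch below is only an illustration under stated assumptions: it
assumes the +d4 output has been saved as tab-separated text with the
comma-separated keyword list in the third column (as in the examples in
this thread), which may not match the actual files exactly. The helper
names read_rows and split_keyword_rows are hypothetical. Each listed
keyword, including a repeated one such as "snake, snake", becomes its own
row, with the rest of the row duplicated.

```python
def read_rows(text, delimiter="\t"):
    """Split saved KWAL output into rows of columns (assumed tab-separated)."""
    return [line.split(delimiter) for line in text.splitlines() if line.strip()]


def split_keyword_rows(rows, keyword_col=2):
    """Return one row per keyword hit.

    keyword_col is the index of the column holding the comma-separated
    keyword list (assumed here to be the third column). Repeated keywords
    are kept, so "snake, snake" produces two rows.
    """
    out = []
    for row in rows:
        keywords = [k.strip() for k in row[keyword_col].split(",") if k.strip()]
        for keyword in keywords:
            new_row = list(row)          # duplicate the main and %xpho columns
            new_row[keyword_col] = keyword
            out.append(new_row)
    return out


if __name__ == "__main__":
    # Made-up sample rows in the assumed layout; "file.cha" is a placeholder.
    sample = (
        "file.cha\t1520\tplease, plate\t*CHI:\tplease pass me the plate .\n"
        "file.cha\t1600\tsnake, snake\t*CHI:\tsnake snake .\n"
    )
    for row in split_keyword_rows(read_rows(sample)):
        print("\t".join(row))
```

This only duplicates rows; it does not work out which phonetic form on the
%xpho tier belongs to which repetition, so that alignment would still need
to be done separately.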

Mits

On Jun 19, 3:40 pm, Leonid Spektor <spek... at andrew.cmu.edu> wrote:
>  Mits,
>
>         Unfortunately, there is no way to do this in one pass. The KWAL program is designed to output every matched tier only once, no matter how many keywords are found on that tier. The only thing I can recommend is that you run a separate kwal command for every keyword you are looking for and later merge all the outputs into one file. For example, your commands would look like this:
>
> kwal +sgrapes +d4 +o%xpho *.cha +f +u
> kwal +sspoon +d4 +o%xpho *.cha +f +u
> kwal +splease +d4 +o%xpho *.cha +f +u
> kwal +splate +d4 +o%xpho *.cha +f +u
>
> You can also store all those commands in a batch file, like the one I have created and am attaching to this email, and then run the batch file with this command:
>
> batch batch.cut
>
> Leonid.
>
> On Jun 18, 2010, at 19:46, Mits Ota wrote:
>
> > I'm trying to extract productions of target words with onset clusters
> > (e.g., /pl/) and organize the output in a wide dataframe with age,
> > keyword (target), main tier and the %pho tier as column variables.
> > I've created a list of all relevant words in the child's lexicon using
> > FREQ, and run the list through KWAL with +d4 and +o%xpho, which gives
> > me something like the following.
>
> > <filename>   1475    grapes  *CHI:   Mommy eat grapes . %xpho: m6mI it grepsˈ
> > <filename>   1485    spoon   *CHI:   spoon . %xpho: ˈpun
> > <filename>   1520    please, plate   *CHI:   please pass me the plate . %xpho:
> > piz paes mI d6 plet
> > ...
>
> > This is more or less what I'm looking for and I can convert the
> > filenames to age info. The only problem is that, by default, KWAL
> > organizes the output by main tiers, so if there are two or more hits
> > in a single utterance (like the 'please' and 'plate' example above),
> > the multiple hits are listed in one output. What I want is to have
> > these treated separately. In other words, I want each keyword hit
> > forming a row, like this example below (with the main tier and %pho
> > tier duplicated, as both hits come from the same utterance):
>
> > <filename>   1520    please  *CHI:   please pass me the plate . %xpho: piz
> > paes mI d6 plet
> > <filename>   1520    plate   *CHI:   please pass me the plate . %xpho: piz
> > paes mI d6 plet
>
> > Is there a simple way to do this?
>
> > Thanks,
>
> > Mits
>
> > --
> > You received this message because you are subscribed to the Google Groups "chibolts" group.
> > To post to this group, send email to chibolts at googlegroups.com.
> > To unsubscribe from this group, send email to chibolts+unsubscribe at googlegroups.com.
> > For more options, visit this group at http://groups.google.com/group/chibolts?hl=en.




More information about the Chibolts mailing list