verb lemmas and their frequencies

Naomi Shin naomilshin at gmail.com
Mon Mar 12 22:15:52 UTC 2018


Hi,
Sorry for being opaque. I DID get a frequency for each lemma, but the
frequencies are calculated for *each file* separately, so I got output like
what I've pasted below. What I'm asking is how to get the frequency ACROSS
all the speakers/files. So, for example, there are 3 tokens of verb stem
*abri* in the first file and then 1 token from the child, but from a later
file. Imagine that for all the files I'm looking at there's a total of 50
tokens of the *abri* verb stem. Is there a way to automatically extract that
number (for each verb stem) without having to manually go through and count
how many *abri* stems there are in each file (i.e. 3+1+ ...)? In other
words, what I want is the TOTAL number of tokens of verb stem *abri*,
including all speakers and all the Spanish files that have %mor tiers. That
is a few hundred files, since there is often more than one file per child
(I've put them all in one folder).
I hope this clarifies the question.
Thanks!
-Naomi


small portion of current output:

From file <diegoU030614a.cha>
Speaker: *MOT:
  3 v|abri
  5 v|cabe
  3 v|cerra
  1 v|coge
  1 v|da
  1 v|dormi
  1 v|empuja
  1 v|entra
  1 v|falta
  6 v|gusta
  4 v|habe
  2 v|hace
 12 v|i
  1 v|importa
  2 v|junta
  1 v|marcha
  3 v|mira
  1 v|monta
  2 v|move
  1 v|necesita
  2 v|parece
  8 v|pode
  4 v|pone
  1 v|prepara
  4 v|quere
  1 v|regala
  3 v|sabe
  2 v|saca
  1 v|sali
 16 v|tene
  1 v|tira
  1 v|toca
  4 v|trae
  3 v|ve
  1 v|veni
------------------------------
   35  Total number of different item types used
  104  Total number of items (tokens)
0.337  Type/Token ratio

Speaker: *CHI:
  1 v|abri
  1 v|aparca
  3 v|baja
  6 v|cabe
  5 v|cerra
  1 v|coge
  3 v|come
  1 v|deja
  1 v|desperta
  1 v|entra
  3 v|espera
  7 v|habe
  2 v|hace
 31 v|i
  1 v|mira
  2 v|oí
  1 v|parece
  3 v|pode
  2 v|pone
  2 v|queda
  1 v|queja
  1 v|sabe
  1 v|saca
  2 v|senta
  4 v|tene
  1 v|tira
  1 v|toca
  1 v|trae
  2 v|vale
  1 v|ve
------------------------------
   30  Total number of different item types used
   92  Total number of items (tokens)
0.326  Type/Token ratio

From file <diegoU030614b.cha>
Speaker: *CHI:
  2 v|dispara
  1 v|escapa
  2 v|espera
  2 v|habe
 16 v|i
  1 v|lanza
  1 v|mete
  4 v|mira
  2 v|oí
  4 v|parece
 14 v|pode
  4 v|pone
  1 v|quere
  1 v|saca
  1 v|senta
  1 v|tira
------------------------------
   16  Total number of different item types used
   57  Total number of items (tokens)
0.281  Type/Token ratio

Speaker: *GUI:
  4 v|apreta
  5 v|da
  1 v|deja
  1 v|echa
  3 v|espera
  1 v|explica
  2 v|habe
  2 v|hace
  7 v|i
  4 v|mira
  1 v|oí
  1 v|pode
  6 v|pone
  1 v|sali
  1 v|sujeta
  2 v|tene
  2 v|tira
  2 v|veni
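
To illustrate the total being asked for, here is a minimal sketch in Python that sums the per-file, per-speaker counts from FREQ output like the sample above. It assumes the output has been saved to a text file and that every count line has the shape "count, whitespace, item containing |", as in "  3 v|abri". The function name total_frequencies and the script and file names in the usage note are hypothetical, and this is a post-processing workaround rather than a built-in CLAN feature.

```python
import re
import sys
from collections import Counter

# A count line looks like "  3 v|abri": an integer, whitespace, then one
# item containing "|". Speaker headers, separator lines, and the summary
# lines ("Total number of ...", "Type/Token ratio") do not match.
COUNT_LINE = re.compile(r"^\s*(\d+)\s+(\S+\|\S+)\s*$")


def total_frequencies(lines):
    """Sum the counts of each item across all files and speakers."""
    totals = Counter()
    for line in lines:
        match = COUNT_LINE.match(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals


# Run only when a path to the saved FREQ output is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as handle:
        totals = total_frequencies(handle)
    # Print in the same "count item" layout, sorted by item.
    for item, count in sorted(totals.items()):
        print(f"{count:5d} {item}")
```

Assuming the script is saved as sum_freq.py and the FREQ output as freq_output.txt (both names made up for this example), it would be run as "python sum_freq.py freq_output.txt". Counts from every speaker are pooled, so *MOT and *CHI tokens of the same stem are added together.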



On Mon, Mar 12, 2018 at 3:38 PM, Leonid Spektor <spektor at andrew.cmu.edu>
wrote:

> Naomi,
>
> I don't understand exactly what you want, because the command you are
> using does calculate frequency for each verb lemma. Perhaps you want the
> following command:
>
> freq +sm|v,;* *.cha
>
>
> Leonid.
>
> On Mar 12, 2018, at 17:32, Naomi Shin <naomilshin at gmail.com> wrote:
>
> Thank you thank you thank you, Leonid!!!! This is terrific!
> Is there any way to calculate frequencies for each verb form (or even
> better, verb lexeme) based on the output file? I was able to get
> frequencies of each verb stem for EACH file using:
>
> freq +sm|v,;*,o-% *.cha
>
> Thank you again!
> -Naomi
>
> On Mon, Mar 12, 2018 at 2:56 PM, Leonid Spektor <spektor at andrew.cmu.edu>
> wrote:
>
>> Naomi,
>>
>> The "r-v" tells FREQ to find all stems that are "v".
>>
>> To find all verbs you need this command:
>> freq +sm|v,o-% *.cha
>>
>> To find all verbs and stems you need this command:
>> freq +sm|v,;*,o-% *.cha
>>
>> For more explanation of "+sm" option and more examples please type
>> command "freq +sm"
>>
>>
>> Leonid.
>>
>> On Mar 12, 2018, at 16:11, Naomi Shin <naomilshin at gmail.com> wrote:
>>
>> Hi all,
>>
>> I'm quite new to working with CHILDES.
>>
>> I am trying to extract all verb lexemes and their associated frequencies
>> from all Spanish files on CHILDES.  So far, I created a folder with only
>> the Spanish files that have the %mor tier. I have been able to run
>>
>> freq +sm"r-v,o-%" *.cha
>>
>> and the program runs, but the output is 0 for each file even though when
>> I open files randomly, I do see examples of verbs coded as verbs in the
>> %mor tier.
>>
>> I'd be so grateful for any suggestions you have. I'd also be very happy
>> to hire a tutor if you know of anyone who might be interested and who has
>> the relevant expertise.
>>
>> Thank you so much,
>> Naomi Shin
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "chibolts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to chibolts+unsubscribe at googlegroups.com.
>> To post to this group, send email to chibolts at googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/ms
>> gid/chibolts/ddd49f13-0526-423c-b219-3f754a4f1eb6%40googlegroups.com
>> <https://groups.google.com/d/msgid/chibolts/ddd49f13-0526-423c-b219-3f754a4f1eb6%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>>
>
>
