verb lemmas and their frequencies
Naomi Shin
naomilshin at gmail.com
Mon Mar 12 22:15:52 UTC 2018
Hi,
Sorry for being opaque. I DID get frequencies for each lemma, but the
frequencies are computed for *each file*, so I got output like what I've pasted
below. What I'm asking is how to get the frequency ACROSS all the
speakers/files. So, for example, there are 3 tokens of the verb stem *abri* in
the first file and then 1 token from the child, but from a later file.
Imagine that for all the files I'm looking at there's a total of 50 tokens
of the verb stem *abri*. Is there a way to just automatically extract that
number (for each verb stem) without having to manually go through and count
how many *abri* stems there are in each file (i.e. 3+1+ ...)? In
other words, what I want is the TOTAL number of tokens of the verb stem *abri* --
including all speakers and all the Spanish files that have %mor
tiers. That's a few hundred files, since there is often more than one file per
child (I've put them all in one folder).
I hope this clarifies the question.
Thanks!
-Naomi
Small portion of current output:
From file <diegoU030614a.cha>
Speaker: *MOT:
3 v|abri
5 v|cabe
3 v|cerra
1 v|coge
1 v|da
1 v|dormi
1 v|empuja
1 v|entra
1 v|falta
6 v|gusta
4 v|habe
2 v|hace
12 v|i
1 v|importa
2 v|junta
1 v|marcha
3 v|mira
1 v|monta
2 v|move
1 v|necesita
2 v|parece
8 v|pode
4 v|pone
1 v|prepara
4 v|quere
1 v|regala
3 v|sabe
2 v|saca
1 v|sali
16 v|tene
1 v|tira
1 v|toca
4 v|trae
3 v|ve
1 v|veni
------------------------------
35 Total number of different item types used
104 Total number of items (tokens)
0.337 Type/Token ratio
Speaker: *CHI:
1 v|abri
1 v|aparca
3 v|baja
6 v|cabe
5 v|cerra
1 v|coge
3 v|come
1 v|deja
1 v|desperta
1 v|entra
3 v|espera
7 v|habe
2 v|hace
31 v|i
1 v|mira
2 v|oí
1 v|parece
3 v|pode
2 v|pone
2 v|queda
1 v|queja
1 v|sabe
1 v|saca
2 v|senta
4 v|tene
1 v|tira
1 v|toca
1 v|trae
2 v|vale
1 v|ve
------------------------------
30 Total number of different item types used
92 Total number of items (tokens)
0.326 Type/Token ratio
From file <diegoU030614b.cha>
Speaker: *CHI:
2 v|dispara
1 v|escapa
2 v|espera
2 v|habe
16 v|i
1 v|lanza
1 v|mete
4 v|mira
2 v|oí
4 v|parece
14 v|pode
4 v|pone
1 v|quere
1 v|saca
1 v|senta
1 v|tira
------------------------------
16 Total number of different item types used
57 Total number of items (tokens)
0.281 Type/Token ratio
Speaker: *GUI:
4 v|apreta
5 v|da
1 v|deja
1 v|echa
3 v|espera
1 v|explica
2 v|habe
2 v|hace
7 v|i
4 v|mira
1 v|oí
1 v|pode
6 v|pone
1 v|sali
1 v|sujeta
2 v|tene
2 v|tira
2 v|veni
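
One possible way to get totals across all files and speakers is to post-process the per-file output itself. The following is only a minimal sketch, assuming the FREQ output has been saved as plain text and that every count line has the form shown above (a number, whitespace, then an item containing "|"). The function name `total_counts` and the `sample` string are illustrative and not part of CLAN; the sample reuses a few of the counts from the output above.

```python
import re
from collections import Counter

# A count line looks like "3 v|abri". Summary lines such as
# "35 Total number of different item types used" or
# "0.337 Type/Token ratio" do not match this pattern.
COUNT_LINE = re.compile(r"^\s*(\d+)\s+(\S+\|\S+)\s*$")


def total_counts(freq_output: str) -> Counter:
    """Sum the counts of each item over every file and speaker block."""
    totals = Counter()
    for line in freq_output.splitlines():
        match = COUNT_LINE.match(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals


# A few lines taken from the output above, for illustration only.
sample = """\
From file <diegoU030614a.cha>
Speaker: *MOT:
  3 v|abri
  8 v|pode
------------------------------
   35  Total number of different item types used
Speaker: *CHI:
  1 v|abri
  3 v|pode
From file <diegoU030614b.cha>
Speaker: *CHI:
 14 v|pode
Speaker: *GUI:
  1 v|pode
"""

totals = total_counts(sample)
for item, count in totals.most_common():
    print(count, item)
# prints:
# 26 v|pode
# 4 v|abri
```

On real data, the same function could be applied to the text of the saved output file instead of `sample`.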
On Mon, Mar 12, 2018 at 3:38 PM, Leonid Spektor <spektor at andrew.cmu.edu>
wrote:
> Naomi,
>
> I don't understand exactly what you want, because the command you are
> using does calculate frequency for each verb lemma. Perhaps you want the
> following command:
>
> freq +sm|v,;* *.cha
>
>
> Leonid.
>
> On Mar 12, 2018, at 17:32, Naomi Shin <naomilshin at gmail.com> wrote:
>
> Thank you thank you thank you, Leonid!!!! This is terrific!
> Is there any way to calculate frequencies for each verb form (or even
> better, verb lexeme) based on the output file? I was able to get
> frequencies of each verb stem for EACH file using
> freq +sm|v,;*,o-% *.cha
> Thank you again!
> -Naomi
>
> On Mon, Mar 12, 2018 at 2:56 PM, Leonid Spektor <spektor at andrew.cmu.edu>
> wrote:
>
>> Naomi,
>>
>> The "r-v" tells FREQ to find all stems that are "v".
>>
>> To find all verbs you need this command:
>> freq +sm|v,o-% *.cha
>>
>> To find all verbs and stems you need this command:
>> freq +sm|v,;*,o-% *.cha
>>
>> For more explanation of "+sm" option and more examples please type
>> command "freq +sm"
>>
>>
>> Leonid.
>>
>> On Mar 12, 2018, at 16:11, Naomi Shin <naomilshin at gmail.com> wrote:
>>
>> Hi all,
>>
>> I'm quite new to working with CHILDES.
>>
>> I am trying to extract all verb lexemes and their associated frequencies
>> from all Spanish files on CHILDES. So far, I created a folder with only
>> the Spanish files that have the %mor tier. I have been able to run
>> freq +sm"r-v,o-%" *.cha
>>
>> and the program runs, but the output is 0 for each file even though when
>> I open files randomly, I do see examples of verbs coded as verbs in the
>> %mor tier.
>>
>> I'd be so grateful for any suggestions you have. I'd also be very happy
>> to hire a tutor if you know of anyone who might be interested and who has
>> the relevant expertise.
>>
>> Thank you so much,
>> Naomi Shin
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "chibolts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to chibolts+unsubscribe at googlegroups.com.
>> To post to this group, send email to chibolts at googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/chibolts/ddd49f13-0526-423c-b219-3f754a4f1eb6%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>