using CLAN to find the frequency of nouns and verbs

Leonid Spektor spektor at andrew.cmu.edu
Wed Oct 16 20:33:53 UTC 2013


Stephanie,

	If you want to analyze data from our server, many of the corpora there have already been tagged with the MOR grammar. Our data is located on two servers, at these URLs:

http://childes.talkbank.org/data/
http://talkbank.org/data/local.html

If you look at the "http://childes.talkbank.org/data/" web page, you will see data names containing the string "-MOR". That data has MOR tags. You just need to download it and run FREQ commands to compute the frequency of nouns and verbs. I will give examples of the FREQ commands below. First you need to decide exactly which words you consider to be nouns and verbs. To make this clearer, I recommend that you download the grammar for English, or for the language of your choice, from our server at this URL:

http://childes.talkbank.org/morgrams/

I will use English data as an example, because you did not specify which language you are interested in. After you download the MOR grammar from the link above, unzip it. For English you will get an "eng" folder, which you should move to your hard disk, preferably into the "CLAN" folder. On a Mac that is the "/Applications/CLAN" folder, and on a Windows PC it is the "c:\TalkBank\CLAN" folder. If you installed CLAN in a custom location, use that location instead.

Now open the folder "eng/lex". Here you will see files that group words by part of speech, including a number of files whose names contain "n-" and "v-". This is where you decide which words count as nouns and which as verbs. For example, absolutely all verbs and all nouns can be found with this FREQ command:

freq +s"@|-n,|n:*,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part"  *.cha

This search includes nouns, pronouns, verbs, auxiliary and participle verbs, and other variations of nouns and verbs. You can open each file in the "eng/lex" folder to see a list of all words for each part of speech.

If you do not want to include pronouns or other variations of nouns in your count, use this command:

freq +s"@|-n,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part"  *.cha

In their purest form, nouns and verbs are counted with this command:

freq +s"@|-n,|-v"  *.cha

But if you want to count auxiliary and participle verbs along with basic verbs, use this command:

freq +s"@|-n,|-v,|-aux,|-part"  *.cha

As you can see, you can fine-tune your search to your particular specifications. All of the FREQ commands above output the whole form of each word. If you want only the count for each part of speech, replace the four commands above with the following four:

freq +s"@|-n,|n:*,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part,o-%"  *.cha
freq +s"@|-n,|-v,|-cop,|-aux,|-mod,|-mod:*,|-part,o-%"  *.cha
freq +s"@|-n,|-v,o-%" *.cha
freq +s"@|-n,|-v,|-aux,|-part,o-%" *.cha

There are simpler ways to look for verbs and nouns with the FREQ command, but the more complex search patterns in the commands above give you the most precision. If you want to see the meaning of the "|-" and "o-" symbols, just type the "freq +s@" command, or for more explanation look in the CLAN manual.
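
For readers who want to cross-check FREQ counts outside CLAN, the selection logic of the search patterns above can be sketched in Python. This is only a rough illustration under stated assumptions, not a replacement for FREQ: it assumes a simplified %mor tier in which each item looks like "pos|stem-SUFFIX", clitics are joined with "~", and prefixes are attached with "#". The function names and the sample transcript are hypothetical and made up for this example.

```python
from collections import Counter
from fnmatch import fnmatchcase


def mor_tokens(chat_text):
    """Yield (pos, rest) pairs from every %mor tier in CHAT-formatted text."""
    in_mor = False
    for line in chat_text.splitlines():
        if line.startswith("%mor:"):
            in_mor = True
            body = line[len("%mor:"):]
        elif in_mor and line.startswith("\t"):
            body = line  # continuation line of the same %mor tier
        else:
            in_mor = False
            continue
        for item in body.split():
            # Assumption: clitics are joined with "~"; count each part separately.
            for part in item.split("~"):
                if "|" not in part:
                    continue  # punctuation such as "." has no part of speech
                pos, rest = part.split("|", 1)
                # Assumption: a prefix, if any, precedes the POS as "prefix#pos".
                pos = pos.rsplit("#", 1)[-1]
                yield pos, rest


def count_mor(chat_text, pos_patterns, pos_only=False):
    """Count %mor items whose POS matches any pattern ("n" exact, "n:*" wildcard)."""
    counts = Counter()
    for pos, rest in mor_tokens(chat_text):
        if any(fnmatchcase(pos, pattern) for pattern in pos_patterns):
            key = pos if pos_only else f"{pos}|{rest}"
            counts[key] += 1
    return counts


# Hypothetical two-utterance sample, invented for illustration.
SAMPLE = (
    "*CHI:\tthe dogs want cookies .\n"
    "%mor:\tdet|the n|dog-PL v|want n|cookie-PL .\n"
    "*CHI:\tMommy is eating .\n"
    "%mor:\tn:prop|Mommy aux|be&3S part|eat-PRESP .\n"
)

# Word-level counts for plain nouns and verbs only.
print(count_mor(SAMPLE, ["n", "v"]))
# Per-part-of-speech counts, including noun variants, auxiliaries and participles.
print(count_mor(SAMPLE, ["n", "n:*", "v", "aux", "part"], pos_only=True))
```

The first call keeps the full %mor item as the key, so "n:prop|Mommy" is left out because "n" is matched exactly. The second call adds the "n:*" pattern and collapses each item to its part of speech, which is the kind of summary the commands ending in "o-%" are meant to give.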


	If you want to analyze your own data, or data that doesn't have MOR tags, then after you download and unzip the MOR grammar you need to set the "mor lib" directory to the location on your hard drive where you placed the grammar folder. In my example above that is the "/Applications/CLAN/eng" folder on a Mac and the "c:\TalkBank\CLAN\eng" folder on a PC. In CLAN's "Commands" window, click the "mor lib" button, navigate to the location of the language grammar on your computer's hard drive, and select that folder. Now you need to run the following two commands:

mor +1 *.cha
post +1 *.cha

This will add a "%mor" tier to all your data files, and you will be ready to run your analyses. If you have any CLAN questions, please post them to the chibolts at googlegroups.com address and someone will be able to help you.
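
To see which transcripts in a folder still lack a "%mor" tier, before or after running the two commands above, a small check along these lines may help. It is a minimal sketch, assuming the files are UTF-8 text with the ".cha" extension and that a tier is recognised by a line starting with "%mor:". The helper names are hypothetical and are not part of CLAN.

```python
from pathlib import Path


def has_mor_tier(chat_text):
    """Return True if any line of the transcript starts a %mor tier."""
    return any(line.startswith("%mor:") for line in chat_text.splitlines())


def files_missing_mor(folder):
    """List the names of .cha files in `folder` that have no %mor tier yet."""
    missing = []
    for path in sorted(Path(folder).glob("*.cha")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if not has_mor_tier(text):
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    # Check the current working directory, mirroring the "*.cha" in the commands.
    for name in files_missing_mor("."):
        print(name)
```

Any file names it prints are the ones that still need to be run through the two commands above.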


Leonid.



On Oct 16, 2013, at 11:27, stephanie.ciappara at gmail.com wrote:

> Hi,
> 
> I'm new to the CHILDES community. I've been reading the CLAN Manual but I would like to ask a question. I would like to analyse the frequency of nouns and verbs. I do not seem to be using the correct command and I cannot seem to find it. Could you please tell me where I should look?
> 
> Could you also tell me where I should download MOR, please? If I understood correctly, I need to download it separately, and it is needed to analyse the nouns and verbs.
> 
> Thank you for your help
> 
> Stephanie
> 
