list of tags used by MOR

Leonid Spektor spektor at andrew.cmu.edu
Tue May 6 17:08:32 UTC 2014


Rui,

	I forgot to mention one more exception for tags. Any word that ends with an "@..." marker is converted to a MOR tag using the "sf.cut" file located in the grammar's root folder. For English, for example, this file is in the "eng" folder.

Tags that are separated by '+' characters mark compound words, that is, words made up of two or more other words. For example, the English word "hopscotch" consists of the two words "hop" and "scotch" and is listed in the lexicon file "n+v+n.cut". The MOR command tags this word as "n|+v|hop+n|scotch". The first tag, "n|", is a noun and is the overall tag for the word; the second tag, "v|", indicates that "hop" is a verb; and the third tag, "n|", indicates that "scotch" is a noun. Thus in FREQ output you would see the tag "n|+v+n". Other compound tags are more complicated. For example, the word "iceskating" consists of the two words "ice" and "skating". "ice" is a noun, "n|", and "skating" consists of the parts "skate" and "ing", i.e. "n|" and "n:gerund|". Thus the resulting tag for "iceskating" is "n:gerund|+n+n". Compound words can either be listed literally in a lex file, like "hopscotch", or be built from tags representing their components, like "iceskating".
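
To make the compound layout above concrete, here is a minimal Python sketch that splits a tag such as "n|+v|hop+n|scotch" into its overall tag and its components, and then rebuilds the collapsed form that FREQ is described as showing. The function names parse_compound and collapse are made up for illustration and are not part of CLAN; the sketch assumes only the format of the "hopscotch" example and does not attempt the more complicated cases.

```python
def parse_compound(mor_word):
    """Split a MOR compound such as 'n|+v|hop+n|scotch'.

    Returns the overall tag and a list of (tag, stem) pairs.
    Assumes the layout shown in the hopscotch example only.
    """
    overall, sep, rest = mor_word.partition("|+")
    if not sep:
        raise ValueError(f"not a compound tag: {mor_word!r}")
    components = []
    for part in rest.split("+"):
        pos, _, stem = part.partition("|")
        components.append((pos, stem))
    return overall, components


def collapse(mor_word):
    """Drop the stems and keep only the tags, e.g. 'n|+v+n'."""
    overall, components = parse_compound(mor_word)
    return overall + "|+" + "+".join(pos for pos, _ in components)


overall, components = parse_compound("n|+v|hop+n|scotch")
print(overall)                        # n
print(components)                     # [('v', 'hop'), ('n', 'scotch')]
print(collapse("n|+v|hop+n|scotch"))  # n|+v+n
```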

So, all the tags in MOR come from the lex files, the sf.cut file, and the $part-of-speech tag on the main speaker tier. But how those tags are arranged together in the end is a function of the MOR command.
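
Since the final arrangement is produced by MOR, one way to see which tags actually occur is to collect them from the %mor tiers of already analyzed files. The Python sketch below is a rough illustration of that idea, not a replacement for FREQ. The function name pos_tags is hypothetical, the first sample line is the Valian example from this thread, and the second sample line is constructed from the "hopscotch" tag. It assumes that "~" joins clitics and that a compound follows the "n|+v|hop+n|scotch" layout; other %mor notation is not handled.

```python
from collections import Counter


def pos_tags(mor_line):
    """Return the part-of-speech tags found on one %mor tier line."""
    tags = []
    for item in mor_line.split():
        for piece in item.split("~"):      # '~' joins clitics
            if "|" not in piece:
                continue                   # tier label or punctuation
            pos, _, rest = piece.partition("|")
            if rest.startswith("+"):       # compound: keep component tags
                parts = [p.partition("|")[0] for p in rest[1:].split("+")]
                pos = pos + "|+" + "+".join(parts)
            tags.append(pos)
    return tags


lines = [
    "%mor:\tn:prop|Child~poss|s !",        # example from this thread
    "%mor:\tn|+v|hop+n|scotch .",          # constructed for illustration
]
counts = Counter(tag for line in lines for tag in pos_tags(line))
for tag, n in sorted(counts.items()):
    print(n, tag)
```

For real transcripts the list of lines would come from reading the .cha files and keeping only the lines that start with "%mor:".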

For more information I strongly encourage you to read chapter "11 MOR - Morphosyntactic Analysis" in CLAN's manual located at URL:
http://childes.talkbank.org/manuals/CLAN.pdf


Leonid.

On May 6, 2014, at 11:41, Rui Huang <huang3740 at gmail.com> wrote:

> Hi Leonid,
> 
> I am sorry to say that the command 'freq -y +s"[scat *]" +u *.cut' does not list all the tags. I ran 'freq +s@|*,o% filenames' on the Eve and Valian corpora and collected all the tags that appear in the output, and I found that some tags are not in the 'freq -y +s"[scat *]" +u *.cut' output. (Attached is my output. Tags that appear in orange are not in the 'freq -y +s"[scat *]" +u *.cut' output.)
> 
>   In addition, I do not know the meaning of some tags that are linked by the '+' symbol, like 'adv|+adj+n', 'adj+adj+adj', 'n+n+n', 'n+n+adj', and so on. Do you know what they mean?
> Thank you. 
> 
> 
> 
> On Tuesday, April 29, 2014 5:46:37 PM UTC-4, Spektor, Leonid: CMU wrote:
> Hi Rui,
> 
> 	"n:prop" is the ONLY tag in all MOR grammars that is hardwired into CLAN itself. If MOR sees a capitalized word, it tags it with "n:prop". There is also a way in CHAT to specify any tag on the speaker tier. For example, the tier:
> 
> *CHI:		word$foo .
> 
> will result in the %mor tier:
> 
> %mor:	foo|word .
> 
> Besides the above exceptions, the command "freq -y +s"[scat *]" +u *.cut" will list all the tags.
> 
> Leonid.
> 
> On Apr 29, 2014, at 12:01, Rui Huang <huan... at gmail.com> wrote:
> 
>> Hi Leonid,
>> 
>> Thank you for answering my question. The first command pulls out all tags in a file. It works very well. 
>> But the second command does not pull out all the tags that the eng MOR grammar has, and this is what I need to find.
>> For example, in the Valian corpus:
>> 
>> *MOT:	Child's !
>> %mor:	n:prop|Child~poss|s !
>> %gra:	1|2|MOD 2|0|ROOT 3|2|PUNCT
>> 
>> The tag 'n:prop' should appear in the output, but it did not. Hope you can take a look at it.
>> Thank you again!
>> 
>> Rui
>> 
>> 
>> On Thursday, April 24, 2014 7:49:02 PM UTC-4, Spektor, Leonid: CMU wrote:
>> Hi Rui,
>> 
>> 	You can try the command:
>> 
>> freq +s@|*,o% filenames
>> 
>> If you are interested in all the tags that a particular MOR grammar has, then you need to download that grammar and set CLAN's "working" directory to the "<grammar name>/lex" folder. For the English grammar that would be "eng/lex". Then run the command:
>> 
>> freq -y +s"[scat *]" +u *.cut
>> 
>> 
>> Leonid.
>> 
>> On Apr 24, 2014, at 18:08, Rui Huang <huan... at gmail.com> wrote:
>> 
>>> Hello everyone, 
>>> 
>>>   I have a question that Erin asked before but that did not get a reply. (https://groups.google.com/forum/#!searchin/chibolts/erin/chibolts/5N8m43WrCZs/jSkHAm6aSY8J)
>>>   Is there a comprehensive list of tags used by MOR?  How can I tell CLAN to give me a list of all the part-of-speech tags it used in a certain file?
>>> 
>>> Thank you.
>>> Rui
>>> 
>>> 
>>> 
>>> -- 
>>> You received this message because you are subscribed to the Google Groups "chibolts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u... at googlegroups.com.
>>> To post to this group, send email to chib... at googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/407a8ece-ff8d-4603-a934-63b32abc9e01%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
> 
> 
> <tagsinEve&Valian.rtf>
