[Corpora-List] Tags in Word Smith

Mark Davies mdavies at ilstu.edu
Mon Feb 17 18:22:13 UTC 2003


Randall Jones asked:

> I want to run WordList in Word Smith Tools in a way that the tags will
> differentiate homographs, e.g. sein (verb and pronoun), da (adverb and
> conjunction), etc.  I would think that because the words have different
> tags that they appear differently in the list.  However, thus far I have
> been successful in ignoring the tags or having them treated as separate
> words.

I just tried it, and you're right -- it groups together cases like da_categ1
and da_categ2 (whatever categ1 and categ2 are).  I think that it treats the
underscore as a word separator, and I suspect that whatever character you're
using to separate the lemma and the category is being treated the same way.
I tried two or three other characters, and they were all seen as word
separators as well.

I then replaced lemma + word separator + category with lemma1, lemma2 (just
a number right after the lemma, e.g. da1 and da2 instead of da_categ1 and
da_categ2) and it worked fine.  You do need to adjust the settings for
WordSmith so that WordList treats numbers as part of a word, though.
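
For what it's worth, here is a rough sketch of how that replacement could be
scripted before loading the text into WordSmith.  It is only an illustration,
not anything WordSmith itself provides: it assumes tokens look like
word_TAG with an underscore separator, and the tag names (ADV, KON, etc.),
the function name number_homographs and the sample sentence are all made up
for the example.  Numbers are handed out per word in order of first
appearance, so every tagged word gets a number, not just the homographs.

```python
import re
from collections import defaultdict

# Assumed token shape: letters, an underscore, then a letters-only tag,
# e.g. "da_ADV". Change the pattern if your corpus uses another separator.
TAGGED = re.compile(r"\b([^\W\d_]+)_([A-Za-z]+)\b")


def number_homographs(text):
    """Replace word_TAG with word1, word2, ... (one number per distinct tag).

    Returns the rewritten text and the word -> {tag: number} mapping, so the
    numbers in the resulting word list can be translated back to categories.
    """
    seen = defaultdict(dict)  # lowercased word -> {tag: number}

    def repl(match):
        word, tag = match.group(1), match.group(2)
        tags = seen[word.lower()]
        if tag not in tags:
            # First time this word appears with this tag: next free number.
            tags[tag] = len(tags) + 1
        return f"{word}{tags[tag]}"

    converted = TAGGED.sub(repl, text)
    return converted, {word: dict(tags) for word, tags in seen.items()}


if __name__ == "__main__":
    sample = "da_ADV er da_KON sein_VV sein_PPOS da_ADV"  # hypothetical tags
    converted, mapping = number_homographs(sample)
    print(converted)  # da1 er da2 sein1 sein2 da1
    print(mapping)
```

Keeping the mapping around matters because the numbers themselves carry no
meaning: da1 is only "adverb" because ADV happened to turn up first in this
particular text.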

Mark Davies

=======================================
Mark Davies, Associate Professor, Spanish Linguistics
http://mdavies.for.ilstu.edu/
4300 Foreign Languages / Illinois State University
Normal, IL 61790-4300
309-438-7975 (voice) / 309-438-8038 (fax)
=======================================


