[Corpora-List] Tags in Word Smith
Mark Davies
mdavies at ilstu.edu
Mon Feb 17 18:22:13 UTC 2003
Randall Jones asked:
> I want to run WordList in Word Smith Tools in a way that the tags will
> differentiate homographs, e.g. sein (verb and pronoun), da (adverb and
> conjunction), etc. I would think that because the words have different
> tags that they appear differently in the list. However, thus far I have
> been successful in ignoring the tags or having them treated as separate
> words.
I just tried it, and you're right -- WordList groups together cases like
da_categ1 and da_categ2 (whatever categ1 and categ2 are). I think that it
treats the underscore as a word separator, and I suspect that whatever
character you're using to separate the lemma from the category is being
treated as a word separator as well. I tried two or three other characters,
and they were all treated as word separators too.
I then replaced lemma + word separator + category with lemma1, lemma2 (just
a number right after the lemma, one number per category) and it worked fine.
You do need to adjust the WordSmith settings so that WordList treats numbers
as part of a word, though.
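
If the corpus is large, the replacement can be scripted before the text is
loaded into WordSmith. What follows is only a minimal sketch, not part of
WordSmith itself: it assumes the tags are attached with an underscore
(word_TAG), the function name number_tags and the tags VERB, PRON, ADV and
CONJ are made up for illustration, and numbers are handed out per word in
the order in which each tag is first seen.

```python
import re
from collections import defaultdict

# Assumed token shape: word_TAG, where neither part contains an underscore.
# Change the separator here if your corpus uses a different character.
TOKEN = re.compile(r"([^\W_]+)_([^\W_]+)")


def number_tags(text):
    """Replace word_TAG with word1, word2, ... (one number per distinct tag).

    Returns the rewritten text and the word -> {tag: number} mapping, so the
    numbers in the resulting word list can be traced back to their tags.
    """
    tag_ids = defaultdict(dict)  # lowercased word -> {tag: number}

    def repl(match):
        word, tag = match.group(1), match.group(2)
        ids = tag_ids[word.lower()]
        # Reuse the number if this tag was seen before, else take the next one.
        number = ids.setdefault(tag, len(ids) + 1)
        return f"{word}{number}"

    new_text = TOKEN.sub(repl, text)
    return new_text, dict(tag_ids)


sample = "da_ADV war sein_VERB da_CONJ sein_PRON da_ADV"
converted, mapping = number_tags(sample)
print(converted)  # da1 war sein1 da2 sein2 da1
print(mapping)
```

Keep the printed mapping next to the word list, since da1 and da2 mean
nothing on their own. Words that already end in a digit would become
ambiguous under this scheme, so check for those first.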
Mark Davies
=======================================
Mark Davies, Associate Professor, Spanish Linguistics
http://mdavies.for.ilstu.edu/
4300 Foreign Languages / Illinois State University
Normal, IL 61790-4300
309-438-7975 (voice) / 309-438-8038 (fax)
=======================================