[Corpora-List] Meaning of Semcor annotations
Jose Maria Gomez Hidalgo
jmgomez at dinar.esi.uem.es
Thu May 22 21:53:55 UTC 2003
Dear all
I am running some experiments with the semantic concordance SemCor, and
I am having difficulty interpreting the available documentation. Words in
SemCor are labelled according to their meaning in WordNet, for example:
<wf cmd=done pos=VB lemma=say wnsn=1 lexsn=2:32:00::>said</wf>
The word "said" has the part of speech VB (verb), its lemma is "say", and
the corresponding meaning in WordNet can be found by looking up "say" and
selecting the first sense (attribute wnsn). According to the documentation,
the attribute lexsn, when appended to the lemma, identifies the WordNet
synset for that meaning.
However, the lexsn attribute value on its own is not unique to a synset.
Many other words in SemCor have the same value:
<wf cmd=done pos=VB lemma=consider wnsn=4 lexsn=2:32:00::>considering</wf>
<wf cmd=done pos=VB lemma=revise wnsn=1 lexsn=2:32:00::>revised</wf>
(all three extracted from brown1/tagfiles/br-a01)
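
To make the problem concrete, here is a minimal Python sketch that parses the
three annotations above and compares the bare lexsn values with the lemma and
lexsn joined together. The function names (parse_wf, sense_key) are made up
for illustration, and the "%" separator is an assumption based on the WordNet
sense key convention (lemma%lexsn). Note that such a key distinguishes
(lemma, sense) pairs only; it does not by itself say which keys share a
synset.

```python
import re

# The three annotations quoted above, from brown1/tagfiles/br-a01.
WF_LINES = [
    '<wf cmd=done pos=VB lemma=say wnsn=1 lexsn=2:32:00::>said</wf>',
    '<wf cmd=done pos=VB lemma=consider wnsn=4 lexsn=2:32:00::>considering</wf>',
    '<wf cmd=done pos=VB lemma=revise wnsn=1 lexsn=2:32:00::>revised</wf>',
]

# SemCor attributes are unquoted, so a generic XML parser may reject them;
# simple regular expressions are enough for this sketch.
WF_RE = re.compile(r'<wf\s+([^>]*)>([^<]*)</wf>')
ATTR_RE = re.compile(r'(\w+)=(\S+)')


def parse_wf(line):
    """Return the attributes of one <wf> element as a dict (hypothetical helper)."""
    match = WF_RE.search(line)
    if match is None:
        raise ValueError(f'not a <wf> element: {line!r}')
    attrs = dict(ATTR_RE.findall(match.group(1)))
    attrs['token'] = match.group(2)
    return attrs


def sense_key(attrs):
    """Join lemma and lexsn with '%' (assumed WordNet sense key convention)."""
    return f"{attrs['lemma']}%{attrs['lexsn']}"


parsed = [parse_wf(line) for line in WF_LINES]

# All three words carry the same bare lexsn value ...
print({attrs['lexsn'] for attrs in parsed})

# ... but the combined lemma%lexsn strings differ from one another.
for attrs in parsed:
    print(attrs['token'], sense_key(attrs))
```

The combined strings are distinct for the three words, which is consistent
with the observation that lexsn alone cannot serve as a synset identifier.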
Those words or lemmata do not belong to the same synset. It is important to
know when word senses belong to the same synset, because that is how
synonymous words __in the SemCor collection__ can be identified. The only
way to know this, apart from consulting WordNet itself, is to have unique
synset identifiers in SemCor. Is the information in the SemCor annotations
enough to obtain that unique identification? If so, how can it be done?
Thank you
_______________________________________________________________________________
Jose Maria Gomez Hidalgo
Departamento de Inteligencia Artificial
Universidad Europea de Madrid
28670 - Villaviciosa de Odon - MADRID
(+34) 912115670
jmgomez at dinar.esi.uem.es
http://www.esi.uem.es/~jmgomez/
_______________________________________________________________________________