[Corpora-List] Sense-tagged corpora

phil.edmonds at sharp.co.uk
Wed Aug 14 18:59:55 UTC 2002


Dear CORPORA List Members,

We are preparing the Introduction to a Special Issue of the
Journal of Natural Language Engineering on Evaluating WSD Systems
and would like to include details of as many word-sense-tagged corpora
as possible.  If you have any such resource, for any language, we
would be interested in hearing about it, ideally including details of:
   language
   size (total words, tagged words, and tagged word-types)
   text type
   date of collection
   purpose of collection
   source of the sense inventory
   availability

No need to report on the following, which we are already aware of:

   SEMCOR
   HECTOR
   'line' corpus
   DSO corpus, Singapore
   Dutch children's books corpus
   Italian PAROLE corpus
   all datasets prepared for SENSEVAL 1 or 2

We have also heard rumours of a large-scale picture library with
sense-tagged captions.  More information most welcome.

All leads and details of further sense-tagged corpora most welcome,

   Thank you in anticipation,


	 Adam Kilgarriff and Phil Edmonds



--
Philip Edmonds                          (    phil at sharp.co.uk
Sharp Laboratories of Europe Ltd         )   www.sle.sharp.co.uk
Edmund Halley Road, Oxford Science Park (    +44 1865 747711 phone
Oxford OX4 4GB, United Kingdom           )   +44 1865 714170 fax


