[Corpora-List] Sense-tagged corpora
phil.edmonds at sharp.co.uk
Wed Aug 14 18:59:55 UTC 2002
Dear CORPORA List Members,
We are preparing the Introduction to a Special Issue of the
Journal of Natural Language Engineering on Evaluating WSD Systems
and would like to include details of as many word-sense-tagged corpora
as possible. If you have any such resource, for any language, we
would be interested in hearing about it - including, ideally, details
of:
  language
  size (total words, tagged words, and tagged word-types)
  text type
  date of collection
  purpose of collection
  source of the sense inventory
  availability
No need to report on the following, which we are already aware of:
  SEMCOR
  HECTOR
  'line' corpus
  DSO corpus, Singapore
  Dutch children's books corpus
  Italian PAROLE corpus
  all datasets prepared for SENSEVAL 1 or 2
We have also heard rumours of a picture library with sense-tagged captions
on a large scale. More information most welcome.
All leads and details of further sense-tagged corpora are most welcome.
Thank you in anticipation,
Adam Kilgarriff and Phil Edmonds
--
Philip Edmonds
Sharp Laboratories of Europe Ltd
Edmund Halley Road, Oxford Science Park
Oxford OX4 4GB, United Kingdom
Email: phil at sharp.co.uk
Web: www.sle.sharp.co.uk
Phone: +44 1865 747711
Fax: +44 1865 714170