[Corpora-List] Resources for evaluating term extraction
Adam Kilgarriff
adam at lexmasterclass.com
Wed Feb 19 11:34:36 UTC 2014
Dear all,
The Sketch Engine now supports term extraction for many languages - and we
want to evaluate it.
For that, we need domain corpora in which somebody has gone through
identifying all the 'true' terms. Then we can compute our system's
precision and recall.
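To make the evaluation concrete, here is a minimal sketch of that computation. It assumes exact matching of terms after lowercasing and trimming whitespace, which is only one possible matching policy. The function name and both term lists are made up for illustration; they are not taken from GENIA or from Sketch Engine output.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted terms against gold-standard terms.

    Assumption: a term counts as correct only on an exact match
    after lowercasing and stripping surrounding whitespace.
    """
    extracted_set = {t.strip().lower() for t in extracted}
    gold_set = {t.strip().lower() for t in gold}
    true_positives = len(extracted_set & gold_set)
    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical gold-standard terms and system output, for illustration only.
gold_terms = ["transcription factor", "T cell", "NF-kappa B", "gene expression"]
system_terms = ["transcription factor", "gene expression", "cell line", "T cell", "human"]

p, r = precision_recall(system_terms, gold_terms)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here three of the five extracted terms are in the gold list, and three of the four gold terms were found. Recall is only meaningful if the annotation really covers all the terms in the corpus, which is why exhaustively annotated corpora matter.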
We are aware of GENIA, for English, and are using that already (key
citation: Z Zhang, J Iria, CA Brewster, F Ciravegna, 2008, "A comparative
evaluation of term recognition algorithms"
<http://scholar.google.co.uk/citations?view_op=view_citation&hl=en&user=VsRwsN8AAAAJ&citation_for_view=VsRwsN8AAAAJ:u5HHmVD_uO8C>).
Any corpus with "the terms it contains", conscientiously produced, will
help us.
Pointers please!
Adam
--
========================================
Adam Kilgarriff <http://www.kilgarriff.co.uk/>
adam at lexmasterclass.com
Director, Lexical Computing Ltd <http://www.sketchengine.co.uk/>
Visiting Research Fellow, University of Leeds <http://leeds.ac.uk>
*Corpora for all* with the Sketch Engine <http://www.sketchengine.co.uk>
*DANTE: a lexical database for English* <http://www.webdante.com>
========================================