[Corpora-List] Resources for evaluating term extraction

Sophia Sophia.ananiadou at manchester.ac.uk
Wed Feb 19 16:29:52 UTC 2014


TerMine (http://www.nactem.ac.uk/software/termine/) would give you candidate terms, but these would also have to be evaluated.
C-value, upon which TerMine is based, has been implemented for several languages, e.g. Spanish, Japanese and Chinese.

Sophia 



On 19 Feb 2014, at 16:00, Kevin B. Cohen wrote:

> Hi, Adam,
> 
> I would recommend talking with Sophia Ananiadou, the creator of TerMine.
> 
> Kev
> 
> 
> 
> On Wed, Feb 19, 2014 at 4:34 AM, Adam Kilgarriff <adam at lexmasterclass.com> wrote:
> Dear all,
> 
> The Sketch Engine now supports term extraction for many languages - and we want to evaluate it.
> 
> For that, we need domain corpora in which somebody has gone through identifying all the 'true' terms.  Then we can compute our system's precision and recall.
> 
> We are aware of GENIA, for English, and are using that already (key citation: Z Zhang, J Iria, CA Brewster, F Ciravegna, 2008, "A comparative evaluation of term recognition algorithms").
> 
> Any corpus with "the terms it contains", conscientiously produced, will help us.
> 
> Pointers please!
> 
> Adam
> 
> -- 
> ========================================
> Adam Kilgarriff                  adam at lexmasterclass.com
> Director                                    Lexical Computing Ltd
> Visiting Research Fellow                 University of Leeds
> Corpora for all with the Sketch Engine
>                         DANTE: a lexical database for English
> ========================================
> 
> 
> 
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
> 
> 
> 
> 
> -- 
> Kevin Bretonnel Cohen, PhD
> Biomedical Text Mining Group Lead, Computational Bioscience Program, 
> U. Colorado School of Medicine
> 303-916-2417
> http://compbio.ucdenver.edu/Hunter_lab/Cohen
> 
> 
> 

----------
Professor Sophia Ananiadou, School of Computer Science,
Director, National Centre for Text Mining
Manchester Institute of Biotechnology
University of Manchester
131 Princess Street, M1 7DN
www.nactem.ac.uk
sophia.ananiadou at manchester.ac.uk
http://www.nactem.ac.uk/staff/sophia.ananiadou/
tel: +44 (0)161 306 3092
