[Corpora-List] named entity recognition

Jose Maria Gomez Hidalgo jmgomez at dinar.esi.uem.es
Wed Mar 26 14:48:12 UTC 2003

At 14:57 26/03/2003 +0100, Friederike Schmidt wrote:
>I'm working on a tool for named entity recognition for English broadcast
>Does anybody know of any freely available NE-tagged corpora for testing?

Try the datasets used in the most recent CoNLL workshops:


For 2002 and 2003, the shared task is language-independent NER. For 2002, there are 
Spanish and Dutch datasets; for 2003, there is an English dataset of newswire 
articles from the Reuters Corpus.

>Thanks for your help,


Jose Maria Gomez Hidalgo
Departamento de Inteligencia Artificial
Universidad Europea de Madrid
28670 - Villaviciosa de Odon - MADRID
(+34) 912115670
jmgomez at dinar.esi.uem.es

Spanish law guarantees privacy in electronic communications. This 
electronic transmission is strictly confidential and intended solely for 
the addressee. If you are not the intended addressee, you are kindly 
requested not to disclose nor to copy this transmission and to notify us as 
soon as possible.
