Corpora: Need for texts to evaluate named entity recognition software in En, Fr, De and Es

Ralf Steinberger ralf.steinberger at jrc.it
Mon Mar 18 17:14:46 UTC 2002


Hello,

we are looking for texts containing many named entities such as people's
names, company names, names of organisations/authorities and geographical
places in the languages English, French, German and Spanish.

The texts will be used for the evaluation of named entity recognition
software. Parallel texts (texts and their translations) would be preferred
as they would make the evaluation easier. It is not strictly necessary that
the named entities be marked up in the text.

The evaluation will be carried out by a student, who is writing her Master's
thesis on this subject, in collaboration with the EC's Joint Research
Centre. The thesis will be made publicly available.

Any hints are welcome. Thanks in advance.


Ralf Steinberger (ralf.steinberger at jrc.it)
European Commission, Joint Research Centre (http://www.jrc.it/langtech/)
Institute for the Protection and Security of the Citizen (IPSC)


