[Corpora-List] Tree-Structured Named Entities corpora ?

Agata Savary agata.savary at univ-tours.fr
Thu Dec 12 14:30:21 UTC 2013


Dear Yoann,

The National Corpus of Polish has this kind of annotation for named entities.
The corpus itself is distributed under an open license (GNU GPL v3) - see the first entry at: http://clip.ipipan.waw.pl/LRT

The principles of the named entity annotation, including its TEI P5 format, are described in this paper:
http://www.lrec-conf.org/proceedings/lrec2010/summaries/879.html

More details are available on request.

Best regards,

Agata Savary

On 12/09/2013 11:29 AM, Yoann Dupont wrote:
> Greetings Corpora-List,
> I am currently looking for corpora with tree-structured named entities.
> A simple example of a tree structure would be a person with a first and a last name: "Barack Obama" is a person whose first name is "Barack"
> and last name is "Obama". A parse would then be: (PER (NAME.FIRST Barack) (NAME.LAST Obama))
> Another example would be geographical addresses.
> I know of some corpora that fit this definition: the SemEval-2007 Task 9 corpora (tree-structured NE in Spanish and Catalan) and the GENIA
> corpus (tree-structured NE for biomedical entities in English).
> Do any of you know of other tree-structured NE corpora?
> Thank you kindly in advance,
>
> -- 
> Yoann DUPONT
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
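
For readers unfamiliar with the bracketed notation used in the quoted message, (PER (NAME.FIRST Barack) (NAME.LAST Obama)), here is a minimal sketch in Python of how such a string could be read into a nested structure. The function name parse_ne_tree and the (label, children) tuple representation are illustrative assumptions only; they are not taken from any of the corpora or tools mentioned in this thread, and the actual corpora use their own formats (TEI P5 in the case of the National Corpus of Polish).

```python
import re


def parse_ne_tree(text):
    """Parse a bracketed named-entity tree into nested (label, children) tuples.

    Hypothetical helper for illustration: leaves (the words) stay plain strings.
    """
    # Tokens are "(", ")", or any run of characters without spaces or brackets.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        if pos >= len(tokens) or tokens[pos] in ("(", ")"):
            raise ValueError("expected a label after '('")
        label = tokens[pos]
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())      # nested entity
            else:
                children.append(tokens[pos])  # plain word
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        pos += 1  # consume ")"
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the tree")
    return tree


tree = parse_ne_tree("(PER (NAME.FIRST Barack) (NAME.LAST Obama))")
print(tree)
```

The outer tuple carries the entity type (PER) and each child tuple carries one component (NAME.FIRST, NAME.LAST), which mirrors the nesting described in the original question.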

