[Corpora-List] Annotation of internal structure of named entities

Adam Przepiorkowski adamp at ipipan.waw.pl
Thu Jan 13 17:14:46 UTC 2011


Dear Grzegorz,

In the National Corpus of Polish (NKJP), we are planning to make freely
available a 1-million-word, manually annotated, balanced subcorpus in
which one of the annotation levels is nested named entities (NEs),
including the kinds of nesting you're interested in.  See:

http://nlp.ipipan.waw.pl/~adamp/Papers/2010-lrec-as/

The project ends in June 2011; the corpus should be available around
that time.

Best regards,

Adam P.


Michał Marcińczuk <marcinczuk at gmail.com>:

> Hi Grzegorz,
>
> I can suggest two corpora for Polish (both are under development).
> The first one is being developed at Wrocław University of Technology
> and can be accessed here: http://nlp.pwr.wroc.pl/inforex/index.php?corpus=7&page=browse
> In this corpus all nested mentions are annotated. In the case of person
> names, first names, surnames and given names are annotated.
> The corpus will be published under a CC license.
>
> The other corpus is NKJP http://nkjp.pl/.
>
> Regards,
>   Michał Marcińczuk
>
> --
> www.czuk.eu
>
> 2011/1/13 Grzegorz Chrupała <pitekus at gmail.com>
>
>     Hi all,
>    
>     Does anyone know of a dataset where the internal structure of named
>     entities is annotated? What I have in mind is for example parts of
>     person names such as title, first name, initial, surname etc. I would
>     be interested mostly in English, German, Spanish or French resources
>     but other languages could also be OK.
>    
>     Best,
>     --
>     Grzegorz Chrupała
>     Saarland University
>     FR 7.4 Spoken Language Systems
>     Building C7 1, Room 0.04
>     66041 Saarbrücken
>     +49 681 302 58126
>     gchrupala at lsv.uni-saarland.de
>    
>     _______________________________________________
>     Corpora mailing list
>     Corpora at uib.no
>     http://mailman.uib.no/listinfo/corpora

-- 
Adam Przepiórkowski                             ˈadam ˌpʃɛpjurˈkɔfskʲi
http://nlp.ipipan.waw.pl/ ___________ Zespół Inżynierii Lingwistycznej
http://tnij.org/pling _____________________ Polska Lista Językoznawcza
http://korpus.pl/ _____________________________________ Korpus IPI PAN
http://nkjp.pl/ _________________________________________________ NKJP
