Hi Grzegorz,<br><br>I can suggest two corpora for Polish (both are under development). <br>The first one is being developed at Wrocław University of Technology <br>and can be accessed here: <a href="http://nlp.pwr.wroc.pl/inforex/index.php?corpus=7&page=browse">http://nlp.pwr.wroc.pl/inforex/index.php?corpus=7&page=browse</a><br>

In this corpus, all nested mentions are annotated. In the case of person names, <br>first names, surnames and given names are annotated. <br>The corpus will be published under a CC license.<br><br>The other corpus is NKJP: <a href="http://nkjp.pl/">http://nkjp.pl/</a>.<br>

<br>Regards,<br>  Michał Marcińczuk<br><br><br clear="all">--<br><a href="http://www.czuk.eu">www.czuk.eu</a><br>

<br><br><div class="gmail_quote">2011/1/13 Grzegorz Chrupała <span dir="ltr">&lt;<a href="mailto:pitekus@gmail.com">pitekus@gmail.com</a>&gt;</span><br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

Hi all,<br>

<br>

Does anyone know of a dataset where the internal structure of named<br>

entities is annotated? What I have in mind is for example parts of<br>

person names such as title, first name, initial, surname etc. I would<br>

be interested mostly in English, German, Spanish or French resources<br>

but other languages could also be OK.<br>

<br>

Best,<br>

--<br>

Grzegorz Chrupała<br>

Saarland University<br>

FR 7.4 Spoken Language Systems<br>

Building C7 1, Room 0.04<br>

66041 Saarbrücken<br>

+49 681 302 58126<br>

<a href="mailto:gchrupala@lsv.uni-saarland.de">gchrupala@lsv.uni-saarland.de</a><br>

<br>

_______________________________________________<br>

Corpora mailing list<br>

<a href="mailto:Corpora@uib.no">Corpora@uib.no</a><br>

<a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>

</blockquote></div><br>