[Corpora-List] Tree-Structured Named Entities corpora ?
Kathrin Beck
kathrin.beck at uni-tuebingen.de
Thu Dec 12 15:39:57 UTC 2013
Dear Yoann,
The TüBa-D/Z (Tübingen Treebank of Written German; http://www.sfs.uni-tuebingen.de/en/ascl/resources/corpora/tueba-dz.html) is a manually annotated treebank of approximately 85,000 sentences. It contains five subclasses of Named Entities; nested Named Entities are annotated as well:
17,386 GPE (geo-political entities)
5,380 LOC (locations)
30,181 PER (persons)
18,262 ORG (organisations)
3,594 OTH (other, e.g. movie titles)
Examples of the annotation scheme: [PER Bill Clinton]; [ORG [GPE New York] Times]
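For readers who want to work with the nested annotations, here is a minimal Python sketch of how the bracketed notation used in the examples above could be read into a tree. It assumes only the inline notation shown in this message (a label right after each opening bracket, whitespace-separated tokens); it is not the TüBa-D/Z export format, and the names `Entity`, `parse_bracketed` and `iter_entities` are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple, Union


@dataclass
class Entity:
    """One named entity: a label plus child tokens and/or nested entities."""
    label: str
    children: List[Union["Entity", str]] = field(default_factory=list)

    def text(self) -> str:
        """Surface string covered by this entity, nested entities included."""
        return " ".join(
            c.text() if isinstance(c, Entity) else c for c in self.children
        )


def parse_bracketed(s: str) -> Entity:
    """Parse a string such as '[ORG [GPE New York] Times]' into an Entity tree."""
    # Make each bracket its own token, then split on whitespace.
    tokens = s.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def parse_entity() -> Entity:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "[":
            raise ValueError("expected '['")
        pos += 1
        if pos >= len(tokens) or tokens[pos] in ("[", "]"):
            raise ValueError("expected an entity label after '['")
        node = Entity(label=tokens[pos])
        pos += 1
        while pos < len(tokens) and tokens[pos] != "]":
            if tokens[pos] == "[":
                node.children.append(parse_entity())  # nested entity
            else:
                node.children.append(tokens[pos])     # plain token
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        pos += 1  # consume the closing ']'
        return node

    root = parse_entity()
    if pos != len(tokens):
        raise ValueError("unexpected material after the outermost entity")
    return root


def iter_entities(node: Entity, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, label, text) for the entity and every entity nested in it."""
    yield depth, node.label, node.text()
    for child in node.children:
        if isinstance(child, Entity):
            yield from iter_entities(child, depth + 1)


if __name__ == "__main__":
    tree = parse_bracketed("[ORG [GPE New York] Times]")
    for depth, label, text in iter_entities(tree):
        print("  " * depth + f"{label}: {text}")
```

Run on the second example, the sketch lists the ORG entity "New York Times" at depth 0 and the GPE entity "New York" nested inside it at depth 1.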
Kind regards,
Kathrin Beck
On 09.12.2013 at 12:00, corpora-request at uib.no wrote:
> Message: 7
> Date: Mon, 9 Dec 2013 11:29:54 +0100
> From: Yoann Dupont <yoa.dupont at gmail.com>
> Subject: [Corpora-List] Tree-Structured Named Entities corpora ?
> To: corpora at uib.no
>
> Greetings Corpora-List,
>
> I am currently looking for corpora with tree-structured named entities.
>
> A simple example of tree structure would be a person who has a first
> and last name: "Barack Obama" is a person whose first name is "Barack" and
> last name is "Obama". A parse would then be: (PER (NAME.FIRST Barack)
> (NAME.LAST Obama))
> Another example would be geographical addresses.
>
> I know of some corpora that could fit this definition: the SemEval-2007 Task
> 9 corpora (tree-structured NE in Spanish and Catalan) and the GENIA corpus
> (tree-structured NE for biomedical entities in English).
>
> Do any of you know of other tree-structured NE corpora?
>
> Thank you kindly in advance,
>
> --
> Yoann DUPONT
-----------------
Kathrin Beck
Project Administrator CLARIN-D
Dept. of Computational Linguistics
University of Tübingen
Wilhelmstr. 19/ 2.22
72074 Tübingen
Germany
Tel.: +49-7071-29-73970
Fax: +49-7071-29-5214
E-Mail: kbeck at sfs.uni-tuebingen.de,
kathrin.beck at uni-tuebingen.de