[Corpora-List] corpora annotated with pronominal anaphors

Orasan, Constantin C.Orasan at wlv.ac.uk
Fri Oct 7 11:04:21 UTC 2011


Hello,

The Research Group in Computational Linguistics has been developing
corpora annotated with anaphora and coreference information for many
years. We have the following corpora available:

1. You can download the NP4E corpus from
http://clg.wlv.ac.uk/projects/NP4E/ This corpus is annotated with full
coreferential chains, and all markables are annotated even if they are
singletons. You can find more details in the annotation guidelines
available on the site and in our paper:

Laura Hasler, Constantin Orasan and Karin Naumann (2006) NPs for Events:
Experiments in Coreference Annotation. In Proceedings of the 5th edition
of the International Conference on Language Resources and Evaluation
(LREC2006), 24 -- 26 May, Genoa, Italy, pp. 1167 - 1172
http://clg.wlv.ac.uk/projects/NP4E/539_pdf.pdf 

2. A precursor of this corpus is a corpus of technical manuals that is
described in 

Mitkov, R., Evans, R., Orasan, C., Barbu, C., Jones, L. and Sotirova, V.
(2000) Coreference and anaphora: developing annotating tools, annotated
resources and annotation strategies. In Proceedings of the Discourse
Anaphora and Anaphora Resolution Colloquium (DAARC'2000), pp. 49-58,
Lancaster, UK
http://clg.wlv.ac.uk/papers/DAARC2000annotatedcorpora2.pdf   

This corpus also annotates full coreferential chains and uses MUC-like
SGML encoding (but it should not be too difficult to convert it to XML).
I should have this corpus somewhere on my hard disk, so please drop me an
email if you are interested in it. 
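
As a rough illustration of what such an SGML-to-XML conversion could
involve, here is a minimal Python sketch. It is not the group's own tool.
It assumes MUC-style markup in which every element has an explicit
closing tag and tag names are cased consistently; the COREF, ID, TYPE and
REF names follow the MUC convention and may differ in the actual corpus.
The function name sgml_to_xml and the root element name "corpus" are
hypothetical. The sketch quotes unquoted attribute values, escapes bare
ampersands and wraps the text in a single root element:

```python
import re
import xml.etree.ElementTree as ET

# Opening tag that carries attributes, e.g. <COREF ID=2 REF=1>
OPEN_TAG = re.compile(r"<([A-Za-z][\w.-]*)(\s[^<>]*)>")
# NAME=value pairs; the value may be double-quoted, single-quoted or bare
ATTR = re.compile(r"([A-Za-z][\w.-]*)\s*=\s*(\"[^\"]*\"|'[^']*'|[^\s\"'>]+)")
# Ampersands that are not one of XML's predefined or numeric references
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9A-Fa-f]+);)")


def _quote_attrs(match):
    """Rewrite one opening tag so that every attribute value is quoted."""
    name, attrs = match.group(1), match.group(2)

    def fix(m):
        key, value = m.group(1), m.group(2)
        if value[0] in "\"'":          # already quoted: keep as is
            return f"{key}={value}"
        return f'{key}="{value}"'      # bare value: add double quotes

    return f"<{name}{ATTR.sub(fix, attrs)}>"


def sgml_to_xml(sgml_text, root_tag="corpus"):
    """Convert MUC-like SGML to well-formed XML (hypothetical helper).

    Assumes explicit closing tags and consistent tag casing.
    """
    text = BARE_AMP.sub("&amp;", sgml_text)
    text = OPEN_TAG.sub(_quote_attrs, text)
    return f"<{root_tag}>{text}</{root_tag}>"


if __name__ == "__main__":
    sample = ('<COREF ID=1>The printer</COREF> is on. Ask R&D about '
              '<COREF ID=2 TYPE="IDENT" REF=1>it</COREF>.')
    root = ET.fromstring(sgml_to_xml(sample))
    for coref in root.iter("COREF"):
        print(coref.get("ID"), coref.get("REF"), coref.text)
```

Parsing the result with a standard XML parser, as in the last lines, is a
simple check that the output is well-formed. Real SGML files may also use
omitted end tags or custom entities, which this sketch does not handle.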

3. An English-French parallel corpus annotated with anaphora information
(i.e. only pronoun-antecedent links). You can find more information
in:

Tutin, A., Haddara, M., Mitkov, R. and Orasan, C. (2004) Annotation of
anaphoric expressions in an aligned bilingual corpus. In Proceedings of
the 4th Language Resources and Evaluation Conference (LREC2004), Lisbon,
Portugal
http://www.lrec-conf.org/proceedings/lrec2004/summaries/482.htm 

Again, I should have this corpus somewhere and I am happy to share it
with the community.

4. I have a corpus of scientific articles where I marked links between
pronouns and their antecedents in the current paragraph. The use of the
corpus and a few details are described in my PhD thesis and in

Orasan, C. (2007) Pronominal anaphora resolution for text summarisation.
In Proceedings of RANLP2007, 27 - 29 September, Borovets, Bulgaria
http://clg.wlv.ac.uk/papers/show_paper.php?ID=138 

The corpus is a bit messy, but I am happy to share it.

Regards,

Constantin

------
Constantin Orasan, PhD
Senior Lecturer in Computational Linguistics
Research Group in Computational Linguistics
University of Wolverhampton
http://www.wlv.ac.uk/~in6093/

-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf
Of Muhammad Muhammad
Sent: 27 September 2011 06:40
To: corpora at uib.no
Subject: [Corpora-List] corpora annotated with pronominal anaphors

Hi

I am comparing various corpora available (English or other languages)
where pronouns are tagged with their antecedents. 
From my initial searches I know about (Ge, Hale and Charniak 1998),
which has 2477 instances of tagged pronouns, and (Modjeska 2003), which
has 500 instances of other-anaphor.

what else?

best regards,

Abdul-Baquee M. Sharaf
PhD Student
Language Technologies Group
School of Computing
University of Leeds
UK
_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora