Corpora: Cross Document Coreference

Einat Amitay einat at ics.mq.edu.au
Thu May 18 00:11:36 UTC 2000


Hi Daniel,

Try these URLs and then follow the references they make.

interesting:
http://www.cs.duke.edu/~amit/acl99-wkshp.html#program1
http://www.dcs.shef.ac.uk/~robertg/publications/papers_by_topic.html#Coreference Resolution
http://www.dcs.shef.ac.uk/~kwh/auto_papers.html
http://citeseer.nj.nec.com/gaines92integrated.html
http://citeseer.nj.nec.com/gaines95concept.html
http://citeseer.nj.nec.com/kehler97probabilistic.html
http://citeseer.nj.nec.com/did/216478

system:
http://www.dcs.shef.ac.uk/research/groups/nlp/gate/

Daniel Winchester wrote:

> Dear All,
>
> I have recently undertaken an NLP PhD with the working title of
> 'Cross-Document Coreference' in the computer science department of the
> University of Birmingham.   To get to the point, I am using the term
> cross-document coreference to denote multiple, and often variant,
> references to the same entity from different texts.  This usage follows
> from the handful of papers from the NLP community that outline systems
> designed to disambiguate such references (e.g. the work of Breck
> Baldwin and Amit Bagga).
>
> Thus, in different documents, 'Clinton', 'William Clinton', 'William
> Jefferson Clinton', etc., when referring to the president, could all be
> said to 'corefer', but 'Bill Clinton', the New York policeman, or
> 'Clinton', the town in Arizona, would not.
>
> I am aware that this 'coreference' is profoundly different from that
> found within documents, and that the terminology itself is
> problematic.  Coreference within a discourse/text relies on
> relationships that are intended to allow the reader to resolve any
> ambiguity; this is obviously not the case for references in unrelated
> texts to the same entity.  Nevertheless, for the time being I will use
> the term cross-document coreference.
>
> I am hoping for some help on the following:
>
> 1.  Are there any corpora available that are marked for cross-document
> coreference?
>
>        I know that this is unlikely but anything where all references in
> the corpus to the same entity are related in some way would be very
> useful.
>
> 2.  Does anyone know if this sort of work is being done or has been done
> elsewhere under a different name or in a different discipline?
>
>         It seems the sort of task that Information Retrieval (IR) would
> be interested in, but, to date, I have found no equivalent work.
>         I'm basically after any suggestions that people might have for
> where this is already being looked at, for other newsgroups that I
> should post a query on, or for alternative disciplines and terminology
> that might be relevant.
>
> Hope that you will be able to help.
>
> Kind Regards
>
> Daniel Winchester
>
> Research Student
> Computer Science Dept
> University of Birmingham

--
Einat Amitay
einat at ics.mq.edu.au
http://www.ics.mq.edu.au/~einat


