Corpora: Cross Document Coreference

Ruslan Mitkov R.Mitkov at wlv.ac.uk
Fri May 19 16:39:54 UTC 2000


On a more general note, visit
http://www.wlv.ac.uk/~le1825/download.htm
for a selection of papers on coreference/anaphora resolution.
This selection is updated on a regular basis.

Ruslan Mitkov


At 10:11 18/05/00 +1000, Einat Amitay wrote:
>Hi Daniel,
>
>Try these URLs and then follow the references they make.
>
>interesting:
>http://www.cs.duke.edu/~amit/acl99-wkshp.html#program1
>http://www.dcs.shef.ac.uk/~robertg/publications/papers_by_topic.html#Coreference Resolution
>http://www.dcs.shef.ac.uk/~kwh/auto_papers.html
>http://citeseer.nj.nec.com/gaines92integrated.html
>http://citeseer.nj.nec.com/gaines95concept.html
>http://citeseer.nj.nec.com/kehler97probabilistic.html
>http://citeseer.nj.nec.com/did/216478
>
>system:
>http://www.dcs.shef.ac.uk/research/groups/nlp/gate/
>
>Daniel Winchester wrote:
>
>> Dear All,
>>
>> I have recently undertaken an NLP PhD with the working title of
>> 'Cross-Document Coreference' in the computer science department of the
>> University of Birmingham.   To get to the point, I am using the term
>> cross-document coreference to denote multiple, and often variant,
>> references to the same entity from different texts.  This usage follows
>> from the handful of papers from the NLP community that outline systems
>> designed to disambiguate such references (e.g. the work of Breck
>> Baldwin and Amit Bagga).
>>
>> Thus, in different documents, 'Clinton', 'William Clinton', 'William
>> Jefferson Clinton', etc., when referring to the president, could all be
>> said to 'corefer', but 'Bill Clinton', the New York policeman, or
>> 'Clinton', the town in Arizona, would not.
>>
>> I am aware that this 'coreference' is profoundly different from that
>> found within documents, and that the terminology itself is
>> problematic.   Coreference within a discourse/text relies on
>> relationships that are intended to allow the reader to resolve any
>> ambiguity; this is obviously not the case for references in unrelated
>> texts to the same entity.  Nevertheless, for the time being I will use
>> the term cross-document coreference.
>>
>> I am hoping for some help on the following:
>>
>> 1.  Are there any corpora available that are marked for cross-document
>> coreference?
>>
>>        I know that this is unlikely but anything where all references in
>> the corpus to the same entity are related in some way would be very
>> useful.
>>
>> 2.  Does anyone know if this sort of work is being done or has been done
>> elsewhere under a different name or in a different discipline?
>>
>>         It seems the sort of task that Information Retrieval (IR) would
>> be interested in, but, to date, I have found no equivalent work.
>>         I'm basically after any suggestions that people might have for
>> where this is already being looked at, for other newsgroups that I
>> should post a query on, or for alternative disciplines and terminology
>> that might be relevant.
>>
>> Hope that you will be able to help.
>>
>> Kind Regards
>>
>> Daniel Winchester
>>
>> Research Student
>> Computer Science Dept
>> University of Birmingham
>
>--
>Einat Amitay
>einat at ics.mq.edu.au
>http://www.ics.mq.edu.au/~einat
>
>
