[Corpora-List] a new member

Christopher Brewster C.Brewster at dcs.shef.ac.uk
Sat Feb 26 09:25:14 UTC 2005


I suggest you look at the METER project, which concerned the re-use of news
feeds in journalism:
http://nlp.shef.ac.uk/research/areas/reuse.html

Although not directly connected with repetition in your sense, it may help
you understand the techniques needed.

Christopher Brewster

*****************************************************
Natural Language Processing Group,
Department of Computer Science, University of Sheffield
Tel: +44(0)114-22.21967  Fax: +44 (0)114-22.21810
Regent Court, 211 Portobello Street
Sheffield   S1 4DP   UNITED KINGDOM
Web: http://www.dcs.shef.ac.uk/~kiffer/
*****************************************************
A definition is the enclosing a wilderness of an
idea within a wall of words.---  Samuel Butler






> -----Original Message-----
> From: owner-corpora at lists.uib.no
> [mailto:owner-corpora at lists.uib.no] On Behalf Of Mai Zaki
> Sent: 26 February 2005 04:24
> To: corpora at uib.no
> Subject: [Corpora-List] a new member
>
> Hello everyone,
>
> It is a pleasure to join your group.
>
> I am a PhD student at Middlesex University and I am just
> starting my research to put together a formal proposal. My
> aim is to do a corpus-based study of repetition, comparing
> the various fiction and non-fiction, written and spoken text
> categories all within the framework of Relevance Theory. I am
> kind of a beginner in this field of corpus linguistics. I
> just did a small scale corpus-based study of the modals in my
> MA thesis using a corpus I compiled myself and concordance
> software. Now I am hoping I can use one of the big English
> corpora like the ICE-GB or the BNC. But I am basically
> worried about the range of examples a one-million word corpus
> or a 2000-word text collections corpus would generate. I was
> also wondering if it would be feasible for such a study just
> to go through the whole corpus looking for repeated words or
> phrases since no search tool would be particularly useful,
> and whether the layout of the data in either corpus would
> allow me to detect cases of repetition both on sentence and
> discourse levels easily. I would really appreciate it if
> anyone could provide me with useful information in this
> regard, especially from those who have actually worked with these
> corpora before. Recommendations of other corpora
> for such a study would also be most welcome.
>
> Thank you all.
>
> Mai Zaki
>
>


