[Corpora-List] Wordsmith concordance
Lou Burnard
lou.burnard at computing-services.oxford.ac.uk
Thu Dec 19 09:59:00 UTC 2002
If you are indeed working on texts derived from the BNC, then a fairly
obvious thing to check would be whether the lines are in fact duplicated in
the BNC itself. Go to http://sara.natcorp.ox.ac.uk/lookup.html and type one
of your repeated phrases into the box.
There are (still) a few erroneous text duplications. More interestingly,
there are several cases of genuine repetition-with-variants caused by
different newspapers (or the same newspaper at different times) re-using the
same agency material.
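As a rough illustration of that kind of check, here is a minimal Python sketch that flags concordance lines occurring more than once. It assumes the concordance has been exported as plain text, one line per hit; the function names and the sample lines are invented for illustration and are not part of Wordsmith or the BNC tools. It only catches repeats that are identical after case and whitespace are normalised, so repetition-with-variants would need a fuzzier comparison.

```python
import re
from collections import defaultdict


def normalise(line: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", line).strip().lower()


def find_duplicates(lines):
    """Return {normalised text: [line indices]} for text seen more than once."""
    seen = defaultdict(list)
    for i, line in enumerate(lines):
        seen[normalise(line)].append(i)
    return {text: idxs for text, idxs in seen.items() if len(idxs) > 1}


# Hypothetical concordance lines, invented for illustration only.
sample = [
    "the minister said the talks had been useful",
    "The minister  said the talks had been useful",
    "a spokesman said the talks would continue",
]

duplicates = find_duplicates(sample)
for text, idxs in duplicates.items():
    print(f"lines {idxs}: {text}")
```

If the same indices turn up when the source texts themselves are searched, the repetition is in the data rather than in the concordancer.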
If you're not using the BNC, of course, this is irrelevant, except insofar as
it illustrates the general principle that one should *always* suspect the
data!
Lou
-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Anne Harrap
Sent: 17 December 2002 10:52
To: corpora list - messages to list
Subject: [Corpora-List] Wordsmith concordance
Does anyone else get a lot of duplicated entries when doing a
concordance in Wordsmith?
Not sure if this is a bug or we are doing something wrong...
Anne Harrap
Languages Centre Documentalist
School of Languages
Oxford Brookes University
Oxford (UK)
Tel: +44 865 483723
Fax: +44 865 483791
Email: anneh at sol.brookes.ac.uk