[Corpora-List] Is corpora of texts an object?

Angus B. Grieve-Smith grvsmth at panix.com
Mon Oct 8 18:03:00 UTC 2012


On 10/08/2012 09:15 AM, Trevor Jenkins wrote:
> At the moment we can't even measure the completeness of corpora for 
> Dickens and Hemingway. This past year has been the 200th anniversary 
> of Dickens' birth and it is only now that much of his ephemera has become 
> available through the Dickens Journals Online project 
> http://www.djo.org.uk/ (to which I have no real connection other than 
> being one of the team of volunteer proof-readers/copy-editors that 
> worked on correcting the OCR errors in the online texts). Until that 
> project we pretty much had only his fiction to analyse; now we have his 
> social observations too.
>
> Do we have a */complete/* corpus for Hemingway?

     We certainly have complete corpora of the widely published 
fictional works of Dickens and Hemingway.  Do we need to take Dickens' 
social observations into account?  Maybe, maybe not. Completeness and 
representativeness both depend on your purpose.

     Yuri asked about homogeneity.  What are the implications of "more 
homogeneous" versus "less homogeneous"?  Could it just mean that Dickens 
had more careful (or scrupulous, rigid, or anal-retentive) editors than 
Hemingway?  I think "homogeneous" is too vague a term to be useful 
without further context.

-- 
Angus B. Grieve-Smith
grvsmth at panix.com
