[Corpora-List] Is corpora of texts an object?
Angus B. Grieve-Smith
grvsmth at panix.com
Mon Oct 8 18:03:00 UTC 2012
On 10/08/2012 09:15 AM, Trevor Jenkins wrote:
> At the moment we can't even measure the completeness of corpora for
> Dickens and Hemingway. This past year has been the 200th anniversary
> of his birth and it is only now that much of his ephemera has become
> available through the Dickens Journals Online project
> http://www.djo.org.uk/ (to which I have no real connection other than
> being one of the team of volunteer proof-readers/copy-editors that
> worked on correcting the OCR errors in the online texts). Until that
> project we pretty much had only his fiction to analyse now we have his
> social observations too.
>
> Do we have a */complete/* corpus for Hemingway?
We certainly have complete corpora of the widely published
fictional works of Dickens and Hemingway. Do we need to take Dickens'
social observations into account? Maybe, maybe not. Completeness and
representativeness all depend on your purpose.
Yuri asked about homogeneity. What are the implications of "more
homogeneous" versus "less homogeneous"? Could it just mean that Dickens
had more careful (or scrupulous, rigid, or anal-retentive) editors than
Hemingway? I think "homogeneous" is too vague a term to be useful
without further context.
--
Angus B. Grieve-Smith
grvsmth at panix.com