[Corpora-List] What is corpora and what is not?
Graham White
graham at eecs.qmul.ac.uk
Wed Oct 3 17:56:59 UTC 2012
So the Corpus Iuris Civilis is not a corpus? This seems an unusual way
to define things, firstly because it unduly privileges the medium of
representation (and, as a computer scientist, that seems to me to be a
mistake), and, secondly, because it rather orphans corpora which happen
to be machine-readable: it ignores the considerable continuities between
what scholars do with machine-readable texts and what scholars do with
non-machine-readable texts. Why, after all, do we have machine-readable
corpora, if not because we are interested in human linguistic practices?
Machine-readable corpora don't drop from space, after all.
Graham
On 03/10/12 18:43, WILLIAMS Geoffrey wrote:
> Are we not slightly reinventing the wheel?
>
> The nature of corpora has been discussed for years; EAGLES was about
> defining it. In 2005, John Sinclair enlarged upon the 1996 definition
> when he wrote:
>
>> A corpus is a collection of pieces of language text in electronic
>> format, selected according to external criteria to represent, as far
>> as possible, a language or language variety as a source of data for
>> linguistic research.
> Sinclair, J. McH. 2005. ‘Corpus and Text: Basic Principles’. In Wynne,
> M. (ed.), Developing Linguistic Corpora: A Guide to Good Practice,
> pp. 1-16. Oxford: AHDS.
>
> It is also on the web!
>
> Surely anyone involved in corpora has read the seminal works and does
> not need reminding that corpora are machine-readable, maybe samples or
> whole works etc. What has changed is the rise of internet corpora, but
> here too Kilgarriff and others have commented on the situation in a way
> that both NLP and corpus linguistic users can feel at home with.
>
> Best
>
> Geoffrey
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora