[Corpora-List] What I came away with from the "What is a Corpus" discussion
John F Sowa
sowa at bestweb.net
Sat Oct 6 16:33:53 UTC 2012
On 10/6/2012 12:08 PM, amsler at cs.utexas.edu wrote:
> The simplest summary I came away with is that a corpus is a set of
> texts that has a proposed purpose of study. At least one person must
> have an intention for the collection to serve a purpose.
I agree. This summary is very close to Adam's definition:
AK
> a corpus is a collection of texts/speech. We call it a corpus when
> we view it as an object of linguistics or literary research.
And the following point is true of many (most?) words in NLs:
RA
> This definition of a corpus means that it may not be recognized as
> a corpus by anyone else other than its collector/creator.
Yes. That's why nearly every reference to a corpus on this email list
puts some name or other qualifier in front of the word 'corpus'.
RA
> How to make a corpus that adheres to "best practices" would be more
> useful than deciding on whether someone's purposeful collection of text
> qualified to be called a corpus by everyone.
I agree. Then the name given to the rules for those practices
could be placed in front of the word 'corpus'.
John Sowa