Corpora: What is a corpus

Mike Maxwell mike_maxwell at sil.org
Wed Feb 2 20:44:45 UTC 2000


I just got back from a trip, and perhaps this issue has blown over, in which
case just press <delete>.

If you haven't deleted this yet--it strikes me that this discussion of whether
there can be a "corpus" of proverbs has been rather English-specific.  (Of
course, the message which prompted the discussion was looking for English
proverbs, but...)  In many cultures--modern English not being one of
them--proverbs are very much a central part of language use.  Ancient Hebrew
culture, as shown by the Biblical book of Proverbs (already mentioned in this
thread), is presumably one example.  Apparently some West African cultures are
examples as well; I talked to a Catholic priest who, during his years in Ghana
(if I recall correctly), collected a book of proverbs from one of the local cultures.  These
proverbs were for the most part not stand-alone sentences, but rather short
stories that ended in a pithy saying (the proverb proper, I suppose).  (If
anyone is interested, I can probably track down the author.)

At any rate, if you take a corpus to be some kind of collection of texts,
then a collection of proverbs would be a corpus of short, semi-standardized
texts, perhaps with its own unique linguistic characteristics.  The texts would
often be sentence-length (as in the Biblical book), but sometimes longer (as in
the Ghanaian book).

Of course, the whole idea of linguists arguing over the proper definition of
"corpus" strikes me as odd...

                        Mike Maxwell
                        Summer Institute of Linguistics
                        Mike_Maxwell at sil.org


