[Corpora-List] What is corpora and what is not?
evalacroix at free.fr
Fri Oct 5 05:32:37 UTC 2012
Readers of French can find a chapter about the history of corpora and the definition of "corpus" in my Ph.D. thesis.
Answer to Michal:
in my Ph.D. thesis, page 18: "Grouping texts with similar characteristics is a longstanding practice, as shown for example by the Codex Manesse, which contains songs by German poets collected in the 14th century" ("Le regroupement de textes ayant des caractéristiques communes, est une pratique ancienne, comme en témoigne par exemple l'existence du Codex Manesse comportant des chansons de poètes allemands rassemblées au 14ème siècle."). By the way, concordancing was not invented by corpus linguists either. Bible scholars have been doing it for at least 500 years. I am referring once again to my Ph.D. thesis (pages 36/37):
"The TLFi (Trésor de la Langue Française, an online dictionary of French) tells us that in 1564, 'concordance' is defined as an alphabetical list of the words used in the Bible, with an indication of the texts which contain them (...). In 1680, its application to language teaching is mentioned: Concordance. A short primer with a syntax [...] to instruct children who are beginning to learn Latin."
"Le TLFi (Trésor de la Langue Française) nous apprend qu'en 1564, on définit la concordance comme
table alphabétique des mots employés dans la Bible, avec indication des textes qui les contiennent.
En 1680, une application pour l'enseignement des langues est évoquée:
Concordance. Petit rûdiment avec un sintaxe [...] pour instruire les enfans qui commençent à aprendre le latin."
Best regards, Eva.
----- Original message -----
From: "Michal Ptaszynski" <ptaszynski at media.eng.hokudai.ac.jp>
To: corpora at uib.no, corpora-request at uib.no
Cc: ptaszynski at hgu.jp
Sent: Friday, 5 October 2012 02:58:04
Subject: Re: [Corpora-List] What is corpora and what is not?
I was wondering if someone knows/remembers when the word "corpus" was
first used in the context of linguistics, and by whom.
Best,
Michal
----------------
From: Piotr Pezik <pezik at uni.lodz.pl>
Cc: "CORPORA at hd.uib.no" <CORPORA at hd.uib.no>
To: Trevor Jenkins <trevor.jenkins at suneidesis.com>
Date: Thu, 4 Oct 2012 11:38:43 +0200
Subject: Re: [Corpora-List] What is corpora and what is not?
Having been involved in the process of acquiring both conversational and
"on-air" spoken language data for the National Corpus of Polish (NKJP),
I'd have to strongly agree with Trevor's remarks.
I think the American Soap Operas Corpus, although a very valuable resource
in its own right, represents written-to-be-spoken rather than spoken
language. Soap opera scripts are essentially their authors' impressions
of casual spoken language, not that much different from linguistically
realistic dialogues you might find in a novel or a play. They often are an
accurate reflection of (a particular breed of) spoken language and
sometimes they are even an exaggerated impression, which is why you might
find them to be more spoken than the conversational part of the BNC (the
plus-catholique-que-le-pape effect), but they're not the real thing simply
because they are written and edited and not produced with the real time
constraints of casual spoken discourse.
Live TV shows are closer to casual spoken discourse, although still very
different, if you consider their pragmatic discourse structure among other
dimensions of comparison. For example, it is fairly obvious that while
speaking to anyone in the studio, politicians and celebrities generally
tend to "communicate" to their viewers/voters. On-air spoken language is
different from what you get when the cameras and microphones are switched
off.
Regards,
Piotr
_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora