<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: times new roman,new york,times,serif; font-size: 12pt; color: #000000'><font face="times new roman, new york, times, serif">Readers of French can find a chapter about the history of corpora and the definition of "corpus" in</font><a href="http://tel.archives-ouvertes.fr/tel-00439095/" style="font-family: 'times new roman', 'new york', times, serif; "> my Ph.D. thesis</a><font face="times new roman, new york, times, serif">. </font><div style="font-family: 'times new roman', 'new york', times, serif; "><br></div><div style="font-family: 'times new roman', 'new york', times, serif; ">Answer to Michal: </div><div style="font-family: 'times new roman', 'new york', times, serif; ">In my Ph.D. thesis, page 18: "<span style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 150%; ">Grouping texts with similar characteristics is a longstanding practice, as shown for example by the Codex Manesse, which contains songs by German poets collected in the 14th century" (</span><span style="font-size: 12pt; ">"</span><span style="font-size: 12pt; line-height: 24.545454025268555px; font-family: 'Times New Roman', serif; ">Le regroupement de textes ayant des caractéristiques communes, est une pratique ancienne, comme en témoigne par exemple l'existence du <i>Codex Manesse</i> comportant des chansons de poètes allemands rassemblées au 14<sup>ème</sup> siècle."). By the way, concordancing was not invented by corpus linguists either. Bible researchers have been doing it for at least 500 years. I am referring once again to my Ph.D. 
thesis (pages 36/37): </span></div><div style="font-family: 'times new roman', 'new york', times, serif; "><span style="font-size: 12pt; line-height: 24.545454025268555px; font-family: 'Times New Roman', serif; ">"The TLFi (Trésor de la Langue Française, an online dictionary of French) tells us that in 1564 'concordance' was defined as an alphabetical list of the words used in the Bible, with an indication of the texts that contain them (...). </span><span style="line-height: 24.545454025268555px; font-family: 'Times New Roman', serif; font-size: 12pt; ">In 1680, an application to language teaching is mentioned: <i>Concordance. Small essentials with a syntax [...] for children beginning to study Latin.</i>"</span></div><div style="font-family: 'times new roman', 'new york', times, serif; "><span style="font-size: 12pt; line-height: 24.545454025268555px; font-family: 'Times New Roman', serif; "><br></span></div><div style="font-family: 'times new roman', 'new york', times, serif; "><p class="MsoNormal">"Le <i>TLFi</i> (<i>Trésor de la Langue Française</i>) nous
apprend qu'en 1564, on définit la concordance comme <o:p></o:p></p>
<p class="MsoQuote">table alphabétique des mots employés dans la Bible, avec
indication des textes qui les contiennent. <o:p></o:p></p>
<p class="MsoNormal">En 1680, une application pour l'enseignement des langues est
évoquée:<o:p></o:p></p>
<p class="MsoQuote"><i>Concordance. Petit rûdiment avec un sintaxe [...] pour instruire
les enfans qui commençent à aprendre le latin.</i>"<o:p></o:p></p></div><div><font face="Times New Roman, serif"><span style="line-height: 24.545454025268555px;"><br></span></font></div><div><font face="Times New Roman, serif"><span style="line-height: 24.545454025268555px;">Best regards, Eva.<br></span></font><div style="font-family: 'times new roman', 'new york', times, serif; "><br><br><hr id="zwchr"><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><b>From: </b>"Michal Ptaszynski" &lt;ptaszynski@media.eng.hokudai.ac.jp&gt;<br><b>To: </b>corpora@uib.no, corpora-request@uib.no<br><b>Cc: </b>ptaszynski@hgu.jp<br><b>Sent: </b>Friday, 5 October 2012 02:58:04<br><b>Subject: </b>Re: [Corpora-List] What is corpora and what is not?<br><br>I was wondering if someone knows/remembers when the word "corpus" was first <br>used in the context of linguistics, and by whom.<br><br>Best,<br>Michal<br><br>----------------<br>From: Piotr Pezik &lt;pezik@uni.lodz.pl&gt;<br>Cc: "CORPORA@hd.uib.no" &lt;CORPORA@hd.uib.no&gt;<br>To: Trevor Jenkins &lt;trevor.jenkins@suneidesis.com&gt;<br>Date: Thu, 4 Oct 2012 11:38:43 +0200<br>Subject: Re: [Corpora-List] What is corpora and what is not?<br><br>Having been involved in the process of acquiring both conversational and <br>"on-air" spoken language data for the National Corpus of Polish (NKJP), <br>I'd have to strongly agree with Trevor's remarks.<br>I think the American Soap Operas Corpus, although a very valuable resource <br>in its own right, represents written-to-be-spoken rather than spoken <br>language. Soap opera scripts are essentially their authors' impressions <br>of casual spoken language, not that much different from linguistically <br>realistic dialogues you might find in a novel or a play. 
They are often an <br>accurate reflection of (a particular breed of) spoken language and <br>sometimes they are even an exaggerated impression, which is why you might <br>find them to be more spoken than the conversational part of the BNC (the <br>plus-catholique-que-le-pape effect), but they're not the real thing simply <br>because they are written and edited and not produced with the real-time <br>constraints of casual spoken discourse.<br>Live TV shows are closer to casual spoken discourse, although still very <br>different, if you consider their pragmatic discourse structure among other <br>dimensions of comparison. For example, it is fairly obvious that while <br>speaking to anyone in the studio, politicians and celebrities generally <br>tend to "communicate" to their viewers/voters. On-air spoken language is <br>different from what you get when the cameras and microphones are switched <br>off.<br>Regards,<br>Piotr<br><br>_______________________________________________<br>UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora<br>Corpora mailing list<br>Corpora@uib.no<br>http://mailman.uib.no/listinfo/corpora<br></div><br></div></div></div></body></html>