<div dir="ltr"><div>Dear Michaël,<br></div><div><br></div><div>You may want to look at and try to emulate the approach of Mike Thelwall's "Fk yea I swear", summarized with links at </div><a href="http://academiclogbook.blogspot.com/2011/09/mikethelwallfk-yea-i-swear2008.html">http://academiclogbook.blogspot.com/2011/09/mikethelwallfk-yea-i-swear2008.html</a><br>
<div><br></div><div>He looked at UK MySpace (raise your hand if you still remember MySpace) profiles and made use of the demographic data there. </div><div><br></div><div>Regards,</div><div>Bill Fletcher</div></div><div class="gmail_extra">
<br><br><div class="gmail_quote">On Tue, Feb 25, 2014 at 9:21 AM, Michaël GAUTHIER <span dir="ltr">&lt;<a href="mailto:mic.gauthier@hotmail.fr" target="_blank">mic.gauthier@hotmail.fr</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div dir="ltr">
<div dir="ltr">Dear all,<br><br>I am writing to the whole CORPORA list to ask about a corpus that could suit my needs, because so far all my efforts to find a suitable one have been in vain.<br><br>
I am a PhD student investigating the use and perception of profanity among British speakers. One immediate difficulty is that instances of profanity are not easy to record, but there are other factors I need to take into consideration. Ideally, the corpus would have to:<br>
<br>- Be very recent (after 2000), since the phenomenon on which I focus is a relatively new one<br>- Focus on the U.K.<br>- Be composed of naturally occurring conversations to be able to grasp instances of profanity<br>- Provide at least basic information on the informants (such as age, gender, location, socio-economic situation, ethnic origin...)<br>
- Provide contextual information regarding the conversation and the link(s) between speakers<br><br>I know this is a lot to ask, but these are my requirements in the ideal case. As I said, none of the corpora I have reviewed so far meets them. A short list of the main corpora I have reviewed would be: the BNC, Bank of English, Collins Corpus (this one seems great, with 5 billion words, but it is apparently only available to the lexicographers at Collins; I contacted them but got no answer...), COLT, CANCODE, Longman British Spoken Corpus, Limerick Corpus, Scottish Corpus of texts and speech, IViE, London-Lund Corpus of Spoken English, Cambridge English Corpus (same thing as the Collins Corpus...), International Corpus of English, Diachronic Corpus of Present-day Spoken English, British English Speech Dat. <br>
<br>Those are the main ones, but as I said, none of them matched perfectly. I would therefore be more than happy if someone could point me to a corpus I may have missed, even if it is not a perfect match. At this point, any new hint would be very welcome. If nothing comes up, I think I will have to “sacrifice” some of my requirements to be able to carry out this study. It is a pilot study, so that would not be too tragic, but if I have the opportunity to find something that matches perfectly, even better!<br>
<br>Sorry for the length of this email; I just tried to be as clear as possible, and I hope I was.<br><br>Thank you in advance for any idea/hint/plan/solution/revelation any of you may have!<br><br>Best regards<span class="HOEnZb"><font color="#888888"><br>
<br>Michaël GAUTHIER<br>Université Lumière Lyon 2<br>France<br></font></span></div>
</div></div>
<br></blockquote></div><br></div>