<html><body><div style="color:#000; background-color:#fff; font-family:tahoma, new york, times, serif;font-size:12pt"><div style="font-family: tahoma, 'new york', times, serif; font-size: 12pt;"><span>Dear Stephen and Angus,</span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span><br></span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span>Please forgive me for replying to you in particular, but I'd already deleted the original request when I remembered that a friend at Warwick U. is an expert in Conversation Analysis. I contacted him, and he sent me the links below, saying that the site is the best one he knows of. So I'm piggybacking on your message and hoping whoever made the original request will see
it. </span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span><br></span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span>The link below is to a website dedicated to conversation analysis, with numerous links to research. Those links, or their reference pages, may lead to work in which the problem has been addressed. </span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span style="background-color: transparent;"><br></span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span
style="background-color: transparent;">http://www.paultenhave.nl/resource.htm</span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span style="background-color: transparent;"><br></span></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span style="background-color: transparent;">The home page is: </span><span class="yshortcuts" id="lw_1391011377_0" style="color: purple; outline: 0px; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 15px;"><a rel="nofollow" target="_blank" href="http://www.paultenhave.nl/EMCA.htm" style="color: purple; outline: 0px; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 15px;">http://www.paultenhave.nl/EMCA.htm</a></span></div><div style="font-family: tahoma, 'new
york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><br></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;">Links to a listserv for the field and other such resources: http://www.paultenhave.nl/lists.htm</div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><br></div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;">Kindest regards,</div><div style="font-family: tahoma, 'new york', times, serif; font-size: 16px; color: rgb(0, 0, 0); background-color: transparent; font-style: normal;">Linda Bawcom</div><div style="font-family: tahoma, 'new york', times, serif; font-size: 12pt;"><blockquote
style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; margin-top: 5px; padding-left: 5px;"> <div style="font-family: tahoma, 'new york', times, serif; font-size: 12pt;"> <div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"> <div dir="ltr"> <hr size="1"> <font size="2" face="Arial"> <b><span style="font-weight:bold;">From:</span></b> Stephen Wattam &lt;stephenwattam@gmail.com&gt;<br> <b><span style="font-weight: bold;">To:</span></b> Angus Grieve-Smith &lt;grvsmth@panix.com&gt; <br><b><span style="font-weight: bold;">Cc:</span></b> corpora@uib.no <br> <b><span style="font-weight: bold;">Sent:</span></b> Wednesday, January 29, 2014 5:41 AM<br> <b><span style="font-weight: bold;">Subject:</span></b> Re: [Corpora-List] Testing how representative a particular corpus is<br> </font> </div> <div class="y_msg_container"><br>
> Right. Here's what I don't get: Why hasn't anyone followed even a<br>> single speaker around, let alone a representative sample, to see what<br>> proportion of registers and genres they're exposed to on a daily basis? Or<br>> has this been done?<br><br>I did exactly this (to myself) for two weeks---slides from CL'13 are up at:<br>http://stephenwattam.com/misc/?p=/pc<br><br>The approach we used was to gather a census, so (aside from<br>methodological errors), there should be no scope for errors relating<br>to representativeness.<br><br>The data reinforces others' points from this thread. The concept of<br>representativeness is only useful with respect to a given research<br>question.<br><br>This style of sample constitutes a single (very rich) data point in a<br>conventional corpus, and thus cannot tell us much about the<br>representativeness of something such as the BNC. At most it is
a<br>heuristic.<br><br>It would be possible to extract data from the BNC matching my<br>demographic details, and compare my corpus to that. If that is<br>similar, then the larger corpus is (somewhat) representative for at<br>least that portion of society, with representativeness becoming less<br>assured the less similar one is to myself. There are so many external<br>variables covered by larger corpora that doing detailed 'verification<br>samples' like this would only be statistically valuable with a<br>colossal number of participants, at which point one may as well just<br>use their data for the main corpus.<br><br>Further, it's not even possible to take that sample as representative<br>of myself for many uses, because the two-week recording period fails<br>to cover many events (even obvious periodic ones like Christmas).<br>Technology is helping to defeat this limitation to some degree though<br>by making sampling less
intrusive.<br><br>Regards,<br>-- <br>Steve Wattam<br><br>Contact details and availability:<br>http://ɯɐʇʇɐʍuǝɥdǝʇs.com<br><br></div> </div> </div> </blockquote></div> </div></body></html>