<div dir="ltr">This is a bit of a digression, but it also underlines why building a start-up on Twitter data (much like doing academic Social Media research on it) is a very risky business. As a community we should try to identify other Social Media streams so that we are not so dependent upon one company.<div>
<br></div><div>Adam: Privacy is key, I agree, and it is something that I am working on now. Mechanisms for distributing data, whilst making guarantees about which information can be inferred from it, should be the next step. Whether society as a whole allows research using this data is a different question, however, and out of my control.</div>
<div><br></div><div>Miles</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 18 July 2013 09:33, Miguel Almeida <span dir="ltr"><<a href="mailto:miguelbalmeida@gmail.com" target="_blank">miguelbalmeida@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Adam, Miles,<div><br></div><div>I think another reason is so that Twitter can "black out" everyone else at any time in the future. It's a great (and very selfish and narrow-minded) idea: let the research community publish papers with your data, showing you how to find interesting stuff in your data (using taxpayer money!), and then if at some point you want to black them out, use the kill switch.</div>
<div><br></div><div>I don't think Twitter's owners care that much about reproducible research. ;)</div><span class="HOEnZb"><font color="#888888"><div><br></div><div>Miguel</div></font></span></div><div class="HOEnZb">
<div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jul 18, 2013 at 9:26 AM, Adam Kilgarriff <span dir="ltr"><<a href="mailto:adam@lexmasterclass.com" target="_blank">adam@lexmasterclass.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Miles,<div><br></div><div><div>> acts as a barrier to research. Additionally one could argue that preventing people from having access to static Tweet corpora <div>
> undermines doing reproducible research. </div><div><br>
</div></div><div>You can argue all you like, but it's a bit irrelevant: the data privacy battleground is the whole wide world, with hi-tech companies, politicians and the media playing for big prizes, and they really won't care one jot what we worker ants think (or whether they trample us).<br>
<br>adam</div><div><br><div class="gmail_quote"><div><div>On 18 July 2013 08:55, Miles Osborne <span dir="ltr"><<a href="mailto:miles@inf.ed.ac.uk" target="_blank">miles@inf.ed.ac.uk</a>></span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
<div dir="ltr">Basically Twitter's insistence on distributing IDs and not raw Tweets stems from the fact that third parties need to honour deletion requests.<div><br></div><div>If you pass around raw Tweets then there is no way for Twitter to argue that a deleted Tweet is deleted. If instead you force people to recrawl them each time then Tweets can be deleted at source and all subsequent access requests will not return that deleted Tweet.</div>
<div><br></div><div>Personally I think this way of distributing Tweets in bulk is not scalable and acts as a barrier to research. Additionally one could argue that preventing people from having access to static Tweet corpora undermines doing reproducible research. </div>
<span><font color="#888888">
<div><br></div><div>Miles<br clear="all"><div><br></div>-- <br>The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
</div></font></span></div>
<br></div></div><div>_______________________________________________<br>
UNSUBSCRIBE from this page: <a href="http://mailman.uib.no/options/corpora" target="_blank">http://mailman.uib.no/options/corpora</a><br>
Corpora mailing list<br>
</div><a href="mailto:Corpora@uib.no" target="_blank">Corpora@uib.no</a><br>
<a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>
<br></blockquote></div><span><font color="#888888"><br><br clear="all"><div><br></div>-- <br>========================================<br><a href="http://www.kilgarriff.co.uk/" target="_blank">Adam Kilgarriff</a> <a href="mailto:adam@lexmasterclass.com" target="_blank">adam@lexmasterclass.com</a> <br>
Director <a href="http://www.sketchengine.co.uk/" target="_blank">Lexical Computing Ltd</a> <br>Visiting Research Fellow <a href="http://leeds.ac.uk" target="_blank">University of Leeds</a> <div>
<i><font color="#006600">Corpora for all</font></i> with <a href="http://www.sketchengine.co.uk" target="_blank">the Sketch Engine</a> </div><div> <i><a href="http://www.webdante.com" target="_blank">DANTE: <font color="#009900">a lexical database for English</font></a><font color="#009900"> </font> </i><div>
========================================</div></div>
</font></span></div></div>
<br></blockquote></div><br></div>
</div></div></blockquote></div><br>
</div>