<div dir="ltr">- but some countries are bullies.  As Darren Cook noted,<div><br></div><div><span style="font-size:12.7272720336914px">> Data on servers has to follow the rules of the country where the server</span><br style="font-size:12.7272720336914px"><span style="font-size:12.7272720336914px">> is. But, from certain points of view, the data is owned by the</span><br style="font-size:12.7272720336914px"><span style="font-size:12.7272720336914px">>  corporation owning the server:</span><br></div><div><span style="font-size:12.7272720336914px"><br></span></div><div><span style="font-size:12.7272720336914px">Guess which country takes the view that, if the company is its company, then its law controls access to the server?  Yes, the U S of A - which just happens to be the country where all the companies owning those servers live.  We'll trample on your laws if they don't suit us.  (cf. Mr Assange, Snowden - when the stakes get high enough, it's only geopolitical enemies of the USA (Russia, Venezuela) who refuse to be trampled.)</span></div><div><span style="font-size:12.7272720336914px"><br></span></div><div><span style="font-size:12.7272720336914px">Coming from a small company based partly in a mid-Atlantic island and partly in a small central-European state, </span><span style="font-size:12.7272720336914px">collecting data from all over the world and </span><span style="font-size:12.7272720336914px">with customers from around sixty countries, it is scarcely worth asking 'what the law says', as it could be any of sixty legal systems (another critical consideration being, as Khalid points out, that it's way outside our territory to pay one, let alone sixty, sets of lawyers).</span></div><div><span style="font-size:12.7272720336914px"><br></span></div><div><span style="font-size:12.7272720336914px">Adam</span></div><div><span style="font-size:12.7272720336914px"><br></span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 6 January 2015 at 18:46, Janne 
Bondi Johannessen <span dir="ltr"><<a href="mailto:jannebj@iln.uio.no" target="_blank">jannebj@iln.uio.no</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Every country has its own laws. <br></div>Janne<br></div><div class="gmail_extra"><div><div class="h5"><br><div class="gmail_quote">2015-01-06 16:56 GMT+01:00 Djamé Seddah <span dir="ltr"><<a href="mailto:djame.seddah@free.fr" target="_blank">djame.seddah@free.fr</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Dear everyone,<div>I’ve heard that shuffling a corpus, so that its original sentence order cannot be retrieved, is enough and counts as a transformation, thus alleviating the risk of potential copyright infringement.  </div><div>Can anyone confirm this?</div><div><br></div><div>Best and happy new year,</div><div><br></div><div>Djamé </div><div><br></div><div><br><div><blockquote type="cite"><div>Le 6 janv. 2015 à 16:04, Mcenery, Tony <<a href="mailto:a.mcenery@lancaster.ac.uk" target="_blank">a.mcenery@lancaster.ac.uk</a>> a écrit :</div><br><div><div><div><div style="font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);direction:ltr;font-family:Tahoma;font-size:10pt"><font face="Tahoma, Geneva, sans-serif">Thanks to all who have contributed to this thread - I have really enjoyed it. Khalid made a passing reference to the UK position - this has recently become quite permissive for non-commercial text mining research, but we have been debating back and forth in Lancaster exactly what this means for corpus linguists. 
Due to the case-law nature of English Law we won't really know until some cases have been brought forward and we are able to see how the laws/regulations are to be interpreted, hence Khalid's comment about the situation being unclear, I assume. Anyway, for those of you interested in the new exceptions to copyright in the UK, you can read all about it here:</font><div style="font-family:Tahoma,Geneva,sans-serif;font-size:10pt"><br></div><div><font face="Tahoma, Geneva, sans-serif"><a href="https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/375951/Education_and_Teaching.pdf" style="color:purple;text-decoration:underline" target="_blank">https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/375951/Education_and_Teaching.pdf</a></font><br><div style="font-family:Tahoma,Geneva,sans-serif;font-size:10pt"><br><div style="font-family:Tahoma;font-size:13px"> </div></div><div style="font-family:'Times New Roman';font-size:16px"><hr><div style="direction:ltr"><font face="Tahoma"><b>From:</b><span> </span><a href="mailto:corpora-bounces@uib.no" style="color:purple;text-decoration:underline" target="_blank">corpora-bounces@uib.no</a><span> </span>[<a href="mailto:corpora-bounces@uib.no" style="color:purple;text-decoration:underline" target="_blank">corpora-bounces@uib.no</a>] on behalf of Mark Davies [<a href="mailto:Mark_Davies@byu.edu" style="color:purple;text-decoration:underline" target="_blank">Mark_Davies@byu.edu</a>]<br><b>Sent:</b><span> </span>06 January 2015 13:36<br><b>To:</b><span> </span><a href="mailto:corpora@uib.no" style="color:purple;text-decoration:underline" target="_blank">corpora@uib.no</a><br><b>Subject:</b><span> </span>Re: [Corpora-List] Copyright question again<br></font><br></div><div></div><div><div style="margin-top:0px;margin-bottom:0px">Marc Brysbaert wrote:<br></div><div style="margin-top:0px;margin-bottom:0px"><br></div><div style="margin-top:0px;margin-bottom:0px">>> <span 
style="color:rgb(31,73,125);font-family:Calibri,sans-serif;font-size:11pt">For what it is worth, in my experience word frequency lists and N-gram lists are not a problem. </span></div><div><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)"><br></span></div><div><span style="font-size:11pt;font-family:Calibri,sans-serif;color:rgb(31,73,125)">I agree. I've distributed COCA/COHA word frequency (<a href="http://www.wordfrequency.info/" style="color:purple;text-decoration:underline" target="_blank">http://www.wordfrequency.info</a>) and n-grams (<a href="http://www.ngrams.info/" style="color:purple;text-decoration:underline" target="_blank">http://www.ngrams.info</a>) data for several years now, and I've never had any issues.</span></div><div style="margin-top:0px;margin-bottom:0px"><br></div><div style="margin-top:0px;margin-bottom:0px">>> <span style="color:rgb(31,73,125);font-family:Calibri,sans-serif;font-size:15px;background-color:rgb(255,255,255)">The big problem we are encountering is that currently there is no guidance about whether corpora can be shared. As a result, nearly all corpora assembled remain next to inaccessible, meaning that everyone has to collect their own corpus. This is a lot of needless work and also means that little cumulative work can be done.</span><br></div><div style="margin-top:0px;margin-bottom:0px"><br></div><div style="margin-top:0px;margin-bottom:0px">I've also been distributing "full-text" data from 450 million word COCA and the 1.9 billion word GloWbE (<a href="http://corpus.byu.edu/glowbe" style="color:purple;text-decoration:underline" target="_blank">http://corpus.byu.edu/glowbe</a>) for a while now, and again no problems to this point. 
There is a "twist", though, in terms of how the full-text data has been slightly altered to avoid copyright problems:<br></div><div style="margin-top:0px;margin-bottom:0px"><br></div><div style="margin-top:0px;margin-bottom:0px"><a href="http://corpus.byu.edu/full-text/limitations.asp" style="color:purple;text-decoration:underline" target="_blank">http://corpus.byu.edu/full-text/limitations.asp</a><br></div><div style="margin-top:0px;margin-bottom:0px"><br></div><div style="margin-top:0px;margin-bottom:0px">​Best,<br></div><div style="margin-top:0px;margin-bottom:0px"><br></div><div style="margin-top:0px;margin-bottom:0px">Mark D.<br></div><div style="margin-top:0px;margin-bottom:0px"><br></div><div><div style="font-family:Tahoma;font-size:13px"><div style="font-family:Tahoma;font-size:13px"><div style="margin-top:0px;margin-bottom:0px">============================================<br>Mark Davies<br>Professor of Linguistics / Brigham Young University<br><a href="http://davies-linguistics.byu.edu/" style="color:purple;text-decoration:underline" target="_blank">http://davies-linguistics.byu.edu/</a></div><div style="margin-top:0px;margin-bottom:0px">** Corpus design and use // Linguistic databases **<br>** Historical linguistics // Language variation **<br>** English, Spanish, and Portuguese **<br>============================================<br></div></div></div></div></div></div></div></div></div></div><span><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);float:none;display:inline!important">_______________________________________________</span><br 
style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)"><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);float:none;display:inline!important">UNSUBSCRIBE from this page:<span> </span></span><a href="http://mailman.uib.no/options/corpora" style="color:purple;text-decoration:underline;font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)" target="_blank">http://mailman.uib.no/options/corpora</a><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)"><span style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);float:none;display:inline!important">Corpora mailing list</span><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)"><a href="mailto:Corpora@uib.no" 
style="color:purple;text-decoration:underline;font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)" target="_blank">Corpora@uib.no</a><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)"><a href="http://mailman.uib.no/listinfo/corpora" style="color:purple;text-decoration:underline;font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255)"></span></div></blockquote></div><br></div></div><br>_______________________________________________<br>
<br></blockquote></div><br><br clear="all"><br></div></div><span class="HOEnZb"><font color="#888888">-- <br><div><div dir="ltr">Janne Bondi Johannessen<br>Professor<div><a href="http://www.hf.uio.no/iln/english/about/organization/text-laboratory/" target="_blank">The Text Laboratory, ILN,  </a>&<br><a href="http://www.hf.uio.no/multiling/english/" target="_blank">Center for Multilingualism in Society across the Lifespan </a><br>University of Oslo<br>Tel: <a href="tel:%2B47%2022%2085%2068%2014" value="+4722856814" target="_blank">+47 22 85 68 14</a>, mob.: +47 928 966 34<br></div></div></div>
</font></span></div>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">=============================================<br><a href="http://www.kilgarriff.co.uk/" target="_blank">Adam Kilgarriff</a>                  <a href="mailto:adam@sketchengine.co.uk" target="_blank">adam@sketchengine.co.uk</a>                                            <br>Director                                    <a href="http://www.sketchengine.co.uk/" target="_blank">Lexical Computing Ltd</a>                <br>Visiting Research Fellow                 <a href="http://leeds.ac.uk/" target="_blank">University of Leeds</a>     <div><i><font color="#006600">Corpora for all</font></i> with <a href="http://www.sketchengine.co.uk/" target="_blank">the Sketch Engine</a>   and      <a href="http://skell.sketchengine.co.uk/" target="_blank">SKELL</a>       <i>               </i></div><div>=============================================</div></div></div>
</div>