[Corpora-List] Copyright question again

Janne Bondi Johannessen jannebj at iln.uio.no
Tue Jan 6 18:46:23 UTC 2015


Every country has its own laws.
Janne

2015-01-06 16:56 GMT+01:00 Djamé Seddah <djame.seddah at free.fr>:

> Dear everyone,
> I’ve heard that shuffling a corpus, so that its original sentence order
> cannot be retrieved, is enough and counts as a transformation, thus
> alleviating the risk of potential copyright infringement.
> Can anyone confirm this?
>
> Best and happy new year,
>
> Djamé
>
>
> On 6 Jan 2015, at 16:04, Mcenery, Tony <a.mcenery at lancaster.ac.uk>
> wrote:
>
> Thanks to all who have contributed to this thread - I have really enjoyed
> it. Khalid made a passing reference to the UK position - this has recently
> become quite permissive for non-commercial text mining research, but we
> have been debating back and forth in Lancaster exactly what this means for
> corpus linguists. Due to the case-law nature of English Law we won't really
> know until some cases have been brought forward and we are able to see how
> the laws/regulations are to be interpreted, hence Khalid's comment about
> the situation being unclear, I assume. Anyway, for those of you interested
> in the new exceptions to copyright in the UK, you can read all about it
> here:
>
>
> https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/375951/Education_and_Teaching.pdf
>
>
> ------------------------------
> *From:* corpora-bounces at uib.no [corpora-bounces at uib.no] on behalf of Mark
> Davies [Mark_Davies at byu.edu]
> *Sent:* 06 January 2015 13:36
> *To:* corpora at uib.no
> *Subject:* Re: [Corpora-List] Copyright question again
>
> Marc Brysbaert wrote:
>
> >> For what it is worth, in my experience word frequency lists and N-gram
> >> lists are not a problem.
>
> I agree. I've distributed COCA/COHA word frequency (
> http://www.wordfrequency.info) and n-grams (http://www.ngrams.info) data
> for several years now, and I've never had any issues.
>
> >> The big problem we are encountering is that currently there is no
> >> guidance about whether corpora can be shared. As a result, nearly all
> >> corpora assembled remain next to inaccessible, meaning that everyone has to
> >> collect their own corpus. This is a lot of needless work and also means
> >> that little cumulative work can be done.
>
> I've also been distributing "full-text" data from the 450 million word COCA
> and the 1.9 billion word GloWbE (http://corpus.byu.edu/glowbe) for a
> while now, and again no problems to this point. There is a "twist", though,
> in terms of how the full-text data has been slightly altered to
> avoid copyright problems:
>
> http://corpus.byu.edu/full-text/limitations.asp
>
> Best,
>
> Mark D.
>
> ============================================
> Mark Davies
> Professor of Linguistics / Brigham Young University
> http://davies-linguistics.byu.edu/
> ** Corpus design and use // Linguistic databases **
> ** Historical linguistics // Language variation **
> ** English, Spanish, and Portuguese **
> ============================================
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora


-- 
Janne Bondi Johannessen
Professor
The Text Laboratory, ILN,
<http://www.hf.uio.no/iln/english/about/organization/text-laboratory/> &
Center for Multilingualism in Society across the Lifespan
<http://www.hf.uio.no/multiling/english/>
University of Oslo
Tel: +47 22 85 68 14, mob.: +47 928 966 34