[Corpora-List] Copyright question again

Michael Scott mike at lexically.net
Tue Jan 6 12:57:46 UTC 2015


Sad story. But surely that illustrates Adam's point. He took on some 
powerful people. Edward Snowden took on powerful agencies too. If you 
tweak a tiger's tail you might get pounced on.

Damir was on about frequency profiles only, not source texts. I think he 
is safe enough until frequency profiles become very valuable resources.

Mike


On 06/01/2015 08:03, Buabin, Emmanuel wrote:
> Hello Damir,
>
> Perhaps you may read this news article from the New York Times and take 
> the necessary precautions. But let me point out that copyright issues 
> are very delicate and must be handled with care, especially when you 
> are an individual. Please take a look.
>
> http://www.nytimes.com/2013/01/13/technology/aaron-swartz-internet-activist-dies-at-26.html?pagewanted=all&_r=0 
>
>
> Hope this helps
>
> Regards
> Emmanuel
>
>
> -- 
> Emmanuel Buabin
> Lecturer, Department of Information Technology
> Methodist University College Ghana
> Box DC 940
> Dansoman
> personal: www.ebuabin.net <http://www.ebuabin.net>
>
>
>
> On Tue, Jan 6, 2015 at 5:00 AM, Damir Cavar <dcavar at me.com> wrote:
>
>     Hi everybody,
>
>     I know, this question has been addressed a lot, but, just to get an
>     update on this issue and your expert opinion:
>
>     If I am accessing the internet from the US, as I am right now, and I
>     decide to generate N-gram-based language models by exploiting the
>     web as a corpus and publish the word-lists and frequency profiles
>     openly on my homepage, sell them even, change or manipulate them,
>     and reuse them in various ways, would this be
>
>     a. ok as fair-use for research only, excluding commercial use
>     b. legal in general, independent of my research interests
>     c. legal only in some countries (so, my models would be illegal in
>     some others)
>
>     What is the current status of the web as a corpus and extracted
>     language models from the legal perspective in the US and globally?
>
>     If I do the same now with open-access journals and extract frequency
>     profiles of tokens for a certain research domain, would it be the
>     same? If I use Google Books? Or even some news website?
>
>     Is the extraction of a language model, maybe a domain-specific
>     frequency profile, a copyright infringement per se? The text cannot
>     be reconstructed, the content is not visible, and neither is the
>     author's style, particularly if the corpus is large.
>
>     Thanks!
>
>     Damir
>
>
>
>     --
>     Damir Cavar
>     Department of Linguistics
>     Indiana University
>
>
>
>     _______________________________________________
>     UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>     Corpora mailing list
>     Corpora at uib.no <mailto:Corpora at uib.no>
>     http://mailman.uib.no/listinfo/corpora
>
>
>
>

-- 
/--
Mike Scott
***
If you publish research which uses WordSmith, do let me know so I can 
include it at 
http://www.lexically.net/wordsmith/corpus_linguistics_links/papers_using_wordsmith.htm 

***
Aston University
and
Lexical Analysis Software Ltd
www.lexically.net
/