[Corpora-List] Google Books, copyrights, and corpora

Thu Jun 15 15:48:04 UTC 2006

Doug Cooper wrote:
>
> The parallel to Napster is also hard to see.  Taking a work apart,
> then providing an automatic process to put it back together again,
> clearly tries to make an end run around the law.  But quite simple
> limitations on corpus sample-serving (e.g. not allowing samples to
> run over paragraph boundaries, and/or not identifying samples with
> their specific sources) would make it impossible for any number of
> 14-year-old Python scripters to reconstitute the original texts.

Yes. One of my points was exactly that. If limitations are designed into
the online service, then there might not be any exposure. If such
limitations are lacking, then not only the provider but even individual
users of the service might find themselves in deep kimchee.
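
To make the kind of designed-in limitation concrete, here is a minimal
sketch, not a description of any existing service. It assumes paragraphs
are separated by blank lines; the function name serve_samples and the
max_chars cap are hypothetical. Each sample is clipped to the paragraph it
occurs in and is returned with no source identification attached.

```python
import re

MAX_CHARS = 200  # hypothetical cap on the length of any served sample


def serve_samples(text, keyword, max_chars=MAX_CHARS):
    """Return keyword-in-context samples that never cross a paragraph
    boundary and carry no information about their source."""
    samples = []
    # Assumption: paragraphs are separated by one or more blank lines.
    for para in re.split(r"\n\s*\n", text):
        para = " ".join(para.split())  # collapse internal whitespace
        for m in re.finditer(re.escape(keyword), para, flags=re.IGNORECASE):
            # Split the remaining character budget evenly around the hit.
            half = max(0, (max_chars - len(m.group())) // 2)
            start = max(0, m.start() - half)
            end = min(len(para), m.end() + half)
            # Only the bare text is returned: no title, offset, or ID.
            samples.append(para[start:end])
    return samples
```

Whether limits like these are enough to avoid exposure is a legal question
the code cannot answer; it only shows that the restrictions are cheap to
build in.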

> Bottom line, establishing that research applications of text corpora
> are fair use is not a matter of 'snippet' defenses, and won't rise or fall
> with Google.  Rather, it's that our use and citation of text samples for
> analytical purposes has little or nothing to do with the protection they
> are given as creative literary works.

Another of my points was that this can change practically overnight with a
single appellate court decision. If you're Google, you can then go off and
pursue other business while working towards the establishment of
counter-precedents in other jurisdictions.

Finally, I guess I should mention that it would _not_ be considered fair
use to make copies of copyrighted works and give them away for free --
commercial use is not a necessary criterion. I was trying to make the
point that even this kind of philanthropic action can be considered
criminal infringement fairly easily.

IANAL. TINLA.

-- Mark

Mark P. Line
Polymathix
San Antonio, TX