[Corpora-List] Legal aspects of compiling corpora

Doug Cooper doug at th.net
Fri Jun 13 18:22:10 UTC 2003


At 14:40 13/6/03 +0100, Mark Sanderson wrote:
>  I think the honest answer is that it is a question with no clear answer.

Not so clear.  The original query was whether a 100-
character citation of a text would be a copyright violation.
Is there a copyright law anywhere that does not grant
"fair use" rights to this sort of minimal citation in all but
pathological cases (e.g. extremely short texts like song
lyrics, or perhaps many consecutive citations of a
single text)?

  In any case, this question comes up periodically, and the
response is almost invariably something along the lines of
'well, you'll probably get away with it.'

  I am rather surprised that the corpus-using community has
not come out with a position statement -- not everybody has
to sign on to it, of course --  that articulates the point of view
that:

   a) distributing minimal citations of copyrighted texts, and
   b) allowing public, indirect access to privately held collections
       of copyrighted texts for statistical purposes
are:
   a) a necessary part of corpus linguistics research, and
   b) believed by CL practitioners to be inherently protected
    as fair use, particularly in non-profit research contexts.

and perhaps also gives a few examples of what might _not_
be considered professional conduct; e.g. making full texts
available or easily reconstructible.

  It seems to me that such a statement would be useful in:

   a) helping to clarify that CL applications promote the
      'Progress of Science', i.e. are a genuine research use;
   b) helping individual researchers show that they are
      acting in good faith, in accordance with others in the
      profession.

  Obviously, a bunch of us getting together and saying that
black is white won't make it so.  But to the extent that there
_is_ a possible gray area in the balance between copyright
and fair use, I think it is important to start to establish our side's
position as well.

  Doug Cooper


