[Corpora-List] Request People Name Corpus (English)

maxwell maxwell at umiacs.umd.edu
Mon Jun 14 17:51:00 UTC 2010


Waleed Oransa <woransa at gmail.com> wrote:
> I am looking for a People Name Corpus in English, categorized by gender.
> Do you know if one exists? Some web sites have such data (e.g.
> baby names, etc.), so I thought I would check with you first, since it
> takes some effort to extract the names from the web, besides the possible
> copyright issue.

IANAL, but I doubt that the copyright issue is a real problem: proper
names aren't copyrighted or (AFAIK) copyrightable.  The exact *selection*
of names in some list might conceivably be copyrightable, but there are
work-arounds.  For instance, you could merge several lists, and you could
delete rare names.  Deletion might be done by taking only the names that
appear in all the lists, or by doing a web search and throwing out those
that show up rarely.
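
As a rough sketch of the merge-and-filter idea (not a tested recipe), something
like the following Python would do it. The function names, the toy lists, and
the case normalization are my own assumptions for illustration; the web-search
frequency check is left out, since it depends on whatever search service you
have access to.

```python
from collections import Counter


def normalize(name):
    """Trim whitespace and unify capitalization so 'MARY' and 'mary ' match."""
    return name.strip().title()


def names_in_all_lists(lists):
    """Keep only the names that appear in every source list."""
    sets = [{normalize(n) for n in lst if n.strip()} for lst in lists]
    if not sets:
        return set()
    return set.intersection(*sets)


def names_in_at_least(lists, min_lists):
    """Looser variant: keep names found in at least `min_lists` of the lists."""
    counts = Counter()
    for lst in lists:
        # Count each name once per list, however often it repeats there.
        counts.update({normalize(n) for n in lst if n.strip()})
    return {name for name, c in counts.items() if c >= min_lists}


# Hypothetical toy data standing in for lists pulled from different sites.
list_a = ["Mary", "john", "Zebulon"]
list_b = ["MARY", "John", "Alice"]
list_c = ["mary", "John ", "Alice"]

common = names_in_all_lists([list_a, list_b, list_c])
frequent = names_in_at_least([list_a, list_b, list_c], 2)
```

Taking the intersection is the strictest filter; the "at least N lists"
variant throws out fewer names while still dropping anything that only one
source happens to include.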

That doesn't solve the extraction problem, of course.

   Mike Maxwell

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


