[Corpora-List] Request People Name Corpus (English)

Adam Funk a.funk at dcs.shef.ac.uk
Mon Jun 14 19:59:46 UTC 2010


[14/06/10 20:35] Piotr Bański wrote:
> On 2010-06-14 19:18, Nathan Schneider wrote:
>> Mark Kantrowitz's Names Corpus distributed with NLTK sounds like what
>> you're looking for (at least for English):
>> http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml
> 
> Thanks, this is an interesting resource. I can't help feeling that if we
> start calling such lists corpora, the term will totally lose its
> meaning. Or has it already?
> 
> A while ago, I saw a request for a "corpus of acronyms", which I thought
> was a misuse of the term, perfectly highlighted by Rob Malouf's reply.
> Now this comes up.
> 
> Note that I'm not asking for a definition of a corpus (oh no...), but
> have its borders now become really so fuzzy as to allow for *lists* of
> single words to be counted as "non-prototypical" corpora?

I'd call a list like this a "gazetteer".


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

