[Corpora-List] Gender dataset

John D Burger john at mitre.org
Fri Apr 13 13:45:12 UTC 2012


kiran wrote:

> Is there any gender dataset available?
> It should ideally be a first name-gender mapping
> 
> Ex: Abraham-Male or Abraham_Lincoln-Male

There are the name lists from the 1990 US Census, which I believe have been used in a lot of language research:

  http://www.census.gov/genealogy/names/

These comprise three files: male given names, female given names, and surnames, each with frequency information.  From the first two files, you could construct a gender distribution for each given name.
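
As a rough sketch of that last step, something like the following might work. It assumes each line of a given-name file is whitespace-separated, with the name in the first column and a frequency in the second; check this against the actual files. The function names and the sample rows are made up for illustration and are not Census data. It also assumes the two frequency columns are directly comparable, which only holds approximately if each file's frequencies are relative to its own gender and the two populations are about the same size.

```python
def parse_name_file(lines):
    """Parse lines of the assumed form 'NAME FREQ ...' into {name: freq}."""
    freqs = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        freqs[fields[0].upper()] = float(fields[1])
    return freqs


def gender_distribution(male_freqs, female_freqs):
    """Combine per-gender frequencies into a per-name gender distribution."""
    dist = {}
    for name in set(male_freqs) | set(female_freqs):
        m = male_freqs.get(name, 0.0)
        f = female_freqs.get(name, 0.0)
        total = m + f
        if total > 0:
            dist[name] = {"male": m / total, "female": f / total}
    return dist


# Made-up sample rows in the assumed format (NOT real Census figures).
male_sample = ["JOHN 3.0 3.0 1", "LESLIE 0.25 3.25 2"]
female_sample = ["MARY 2.5 2.5 1", "LESLIE 0.75 3.25 2"]

dist = gender_distribution(parse_name_file(male_sample),
                           parse_name_file(female_sample))
print(dist["LESLIE"])  # {'male': 0.25, 'female': 0.75}
```

To use the real files, you would pass an open file object for each in place of the sample lists. Names that appear in only one file come out as 100% one gender, so you may want some smoothing or a minimum-frequency cutoff.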

- John Burger
  MITRE
