[Corpora-List] For students: "CL in Action"

Rada Mihalcea rada at cs.unt.edu
Thu Nov 4 17:17:21 UTC 2010


Several papers have looked at automatic gender identification;
see for instance:

M. Koppel, S. Argamon and A. Shimoni (2002), Automatically categorizing
written texts by author gender, Literary and Linguistic Computing 17(4),
November 2002, pp. 401-412

Hugo Liu and Rada Mihalcea, Of Men, Women, and Computers: Data-Driven
Gender Modeling for Improved User Interfaces, in Proceedings of the
International Conference on Weblogs and Social Media (ICWSM), Boulder,
Colorado, March 2007.

Arjun Mukherjee and Bing Liu. "Improving Gender Classification of Blog
Authors." Proceedings of the Conference on Empirical Methods in Natural
Language Processing (EMNLP-10). Oct. 9-11, 2010, Boston, Massachusetts,
USA.

A search for "gender classification" or "gender identification" will
most likely reveal quite a few more papers.
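
On the name-list point raised in the quoted message below: here is a
minimal sketch, in Python, of what a guesser built on pre-built name
lists plus simple frequency information might look like. The counts, the
0.9 threshold and the helper name guess_gender are all invented for
illustration; this is not the guesser discussed in this thread, and real
counts would have to come from a census or registry file.

```python
from collections import Counter

# Hypothetical toy counts (first name -> number of bearers of each sex).
# These figures are made up for illustration only.
MALE_COUNTS = Counter({"james": 900, "adam": 400, "leslie": 40, "robin": 60})
FEMALE_COUNTS = Counter({"diana": 500, "rada": 20, "leslie": 160, "robin": 140})


def guess_gender(first_name, min_ratio=0.9):
    """Return 'male', 'female' or 'unknown' from name frequencies alone."""
    key = first_name.strip().lower()
    male = MALE_COUNTS[key]      # Counter gives 0 for unseen names
    female = FEMALE_COUNTS[key]
    total = male + female
    if total == 0:
        return "unknown"         # name is on neither list
    if male / total >= min_ratio:
        return "male"
    if female / total >= min_ratio:
        return "female"
    return "unknown"             # ambiguous name: refuse to guess


if __name__ == "__main__":
    for name in ["Adam", "Diana", "Leslie", "Xiaoming"]:
        print(name, guess_gender(name))
```

Names that are ambiguous or missing from the lists come back as
"unknown", which is one place where such a guesser loses coverage on
non-American names.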

Rada

On Thu, 4 Nov 2010, Diana Maynard wrote:

>I wondered that as well.
>
>On another note, I guess the success of it depends critically on at
>least two things:
>(1) how good the gender guesser is (I didn't see any statistics on that,
>but I didn't search extensively).
>
>(2) (which is related) - the proportion of American names in the Twitter
>corpus (since I think the guesser used is based solely on American first
>names) - and this could have some impact. Even the differences between
>first name gender in the US and Britain are not insignificant.
>
>On a related note, has anyone done the reverse and used vocabulary
>selection to help identify the gender of the speaker, with any success?
>I'm sure people must have played with this idea.
>
>I'm interested in techniques to improve person gender recognition - in
>my experience, using pre-built lists of male and female names and simple
>frequency information is often not accurate enough. Again, I haven't
>searched extensively for this, but if anyone happens to know offhand
>about it I'd be interested.
>Diana
>
>On 04/11/2010 09:51, Adam Kilgarriff wrote:
>> Cool!
>>
>> So, what is it about 3?  (see
>> http://labs.buradayiz.webfactional.com/gender/query/query?words=1+2+3+4+5+6+7+8+9)
>>   You must have a theory
>>
>> adam
>
>
>_______________________________________________
>Corpora mailing list
>Corpora at uib.no
>http://mailman.uib.no/listinfo/corpora
>

