[Corpora-List] For students: "CL in Action"

Diana Maynard d.maynard at dcs.shef.ac.uk
Thu Nov 4 10:16:02 UTC 2010


I wondered that as well.

On another note, I guess the success of it depends critically on at 
least two things:
(1) how good the gender guesser is (I didn't see any statistics on that, 
but I didn't search extensively).

(2) (which is related) the proportion of American names in the Twitter 
corpus, since I think the guesser used is based solely on American first 
names. This could have some impact: even the differences in first-name 
gender between the US and Britain are not insignificant.

On a related note, has anyone done the reverse and used vocabulary 
selection to help identify the gender of the speaker, with any success?
I'm sure people must have played with this idea.

I'm interested in techniques to improve person gender recognition - in 
my experience, using pre-built lists of male and female names and simple 
frequency information is often not accurate enough. Again, I haven't 
searched extensively for this, but if anyone happens to know offhand 
about it I'd be interested.
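
For concreteness, the kind of list-plus-frequency baseline meant here might 
look roughly like the sketch below. The names, the counts, the guess_gender 
helper and the 0.9 threshold are all invented for illustration; none of them 
come from a real tool or data set.

```python
from collections import Counter

# Hypothetical counts of how often each first name was recorded as
# female ("F") or male ("M"). Real lists would come from census or
# registry data; these numbers are invented for illustration.
NAME_COUNTS = {
    "diana": Counter(F=980, M=2),
    "adam": Counter(F=1, M=950),
    "robin": Counter(F=400, M=600),
    "leslie": Counter(F=700, M=300),
}


def guess_gender(first_name, min_confidence=0.9):
    """Return "F", "M", or None if the name is unknown or too ambiguous."""
    counts = NAME_COUNTS.get(first_name.strip().lower())
    if not counts:
        return None  # name is not in the list, so make no guess
    # Pick the more frequent label and its share of all occurrences.
    label, freq = counts.most_common(1)[0]
    confidence = freq / sum(counts.values())
    return label if confidence >= min_confidence else None
```

With these made-up counts, a name like Robin falls below the threshold and 
gets no label at all. Both coverage and ambiguity depend on which country's 
name list the counts were drawn from, which is one reason such a baseline is 
often not accurate enough.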
Diana

On 04/11/2010 09:51, Adam Kilgarriff wrote:
> Cool!
>
> So, what is it about 3?  (see
> http://labs.buradayiz.webfactional.com/gender/query/query?words=1+2+3+4+5+6+7+8+9)
>   You must have a theory
>
> adam


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


