[Corpora-List] For students: "CL in Action"
Diana Maynard
d.maynard at dcs.shef.ac.uk
Thu Nov 4 10:16:02 UTC 2010
I wondered that as well.
On another note, I guess its success depends critically on at
least two things:
(1) how good the gender guesser is (I didn't see any statistics on that,
but I didn't search extensively).
(2) (relatedly) the proportion of American names in the Twitter
corpus, since I think the guesser used is based solely on American first
names. This could affect the results: even the differences in first-name
gender between the US and Britain are not insignificant.
On a related note, has anyone done the reverse and used vocabulary
selection to help identify the gender of the speaker, with any success?
I'm sure people must have played with this idea.
I'm interested in techniques to improve gender recognition for person
names - in my experience, using pre-built lists of male and female names
and simple frequency information is often not accurate enough. Again, I
haven't searched extensively for this, but if anyone happens to know of
relevant work offhand, I'd be interested.
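For concreteness, the name-list-plus-frequency baseline I have in mind
looks roughly like the minimal Python sketch below. The counts are
invented toy values, and the names MALE_COUNTS, FEMALE_COUNTS,
guess_gender and the 0.8 threshold are assumptions for illustration only,
not taken from any real tool or from the guesser discussed in this thread.

```python
from collections import Counter

# Hypothetical toy frequency lists: how often each first name was seen
# with each gender. Real lists would come from census or registry data.
MALE_COUNTS = Counter({"john": 900, "adam": 400, "robin": 120, "leslie": 30})
FEMALE_COUNTS = Counter({"diana": 500, "mary": 950, "robin": 100, "leslie": 170})


def guess_gender(first_name, threshold=0.8):
    """Guess gender from name-list frequencies alone.

    Returns "male", "female", "ambiguous" (the name is in the lists but
    neither gender reaches the threshold) or "unknown" (not in the lists).
    """
    key = first_name.strip().lower()
    male = MALE_COUNTS[key]      # Counter returns 0 for unseen names
    female = FEMALE_COUNTS[key]
    total = male + female
    if total == 0:
        return "unknown"
    p_male = male / total
    if p_male >= threshold:
        return "male"
    if 1 - p_male >= threshold:
        return "female"
    return "ambiguous"


if __name__ == "__main__":
    for name in ["Diana", "Adam", "Robin", "Leslie", "Zed"]:
        print(name, guess_gender(name))
```

The weak points are visible even in the toy data: unisex names land in
"ambiguous", names missing from the lists come back "unknown", and the
counts reflect only the country the lists were collected in.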
Diana
On 04/11/2010 09:51, Adam Kilgarriff wrote:
> Cool!
>
> So, what is it about 3? (see
> http://labs.buradayiz.webfactional.com/gender/query/query?words=1+2+3+4+5+6+7+8+9)
> You must have a theory
>
> adam
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora