[Corpora-List] For students: "CL in Action"

Vlado Keselj vlado at cs.dal.ca
Thu Nov 4 12:33:33 UTC 2010


Hi,

I have attached some of our findings from 2005.  We got ~80%
accuracy in classifying student essays by author gender.

Regards,
Vlado


On Thu, 4 Nov 2010, Koenraad De Smedt wrote:

> Which reminds me that the original Turing test was about guessing the sex of the writer, for which you may need a "discrete state machine" (http://www.abelard.org/turpap/turpap.php#index).

<3 (an ice cream cone with two scoops, fallen sideways)

K

Yorick Wilks wrote the following on 4/11/10 11:47:
> The security agencies have funded quite a bit of stuff on determining the sex of email writers automatically. I believe it works at a very high rate, but I don't think they are publishing it (!).
> YW
>
> On 4 Nov 2010, at 10:04, Matthew Purver wrote:
>
>> perhaps the fact that people use it to form the 'love' symbol <3
>>
>> try searching for 3 on twitter's search facility and you see a lot of things like:
>>
>>  belieber_smiles @LittleGirlJBieb Thankyou <3
>>
>> http://twitter.com/#search?q=3
>>
>> On 04/11/2010 9:51, Adam Kilgarriff wrote:
>>> Cool!
>>>
>>> So, what is it about 3?  (see
>>> http://labs.buradayiz.webfactional.com/gender/query/query?words=1+2+3+4+5+6+7+8+9)
>>>  You must have a theory
>>>
>>> adam
>>>
>>> On 4 November 2010 09:23, Amaç Herdağdelen <amac at herdagdelen.com> wrote:
>>>
>>>    Hello Erik,
>>>
>>>    This is a late reply but you might also be interested in a demo
>>>    application that we (Marco Baroni and I) put together to look at the
>>>    gender differences in Twitter messages: http://bit.ly/twittergender
>>>
>>>    We analyzed millions of tweets collected from the Twitter public
>>>    timeline [1] and separated them into male and female subsets by
>>>    using the first names of the Twitter users [2]. For example, if the
>>>    first name of a user is "John", all of this user's tweets are
>>>    categorized as male tweets. On the page, there are two simple tools
>>>    that allow us to compare the gendered frequencies of phrases or
>>>    compare the salient male and female collocates of a given phrase.
>>>
>>>    1. http://www.iccs.inf.ed.ac.uk/~osborne/papers/socmed10.pdf
>>>    2. https://github.com/amacinho/Name-Gender-Guesser
>>>
>>>    Best,
>>>
>>>    Amaç Herdağdelen
>>>
>>>
>>>    On Tue, 02 Nov 2010 14:02:51 +0100, Erik Fäßler
>>>    <erik.faessler at uni-jena.de> wrote:
>>>
>>>          Hey all,
>>>
>>>        thank you very much for all your rich contributions! I have a lot of
>>>        stuff now, I hope the students don't get blown away ;)
>>>
>>>        Best regards,
>>>
>>>             Erik
>>>
>>>        _______________________________________________
>>>        Corpora mailing list
>>>        Corpora at uib.no
>>>        http://mailman.uib.no/listinfo/corpora
>>>
>>>
>>> --
>>> ================================================
>>> Adam Kilgarriff http://www.kilgarriff.co.uk
>>> Lexical Computing Ltd http://www.sketchengine.co.uk
>>> Lexicography MasterClass Ltd http://www.lexmasterclass.com
>>> Universities of Leeds and Sussex adam at lexmasterclass.com
>>> ================================================
>>>
>> -- 
>> Matthew Purver - http://www.dcs.qmul.ac.uk/~mpurver/
>>
>> Lecturer in Human Interaction
>> Department of Computer Science
>> Queen Mary University of London, London E1 4NS, UK
>
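
The name-based split that Amaç Herdağdelen describes in the quoted message above (assign all of a user's tweets to a gender from the first name, then compare how often a phrase occurs in each subset) could be sketched roughly as follows. This is only an illustration, not the code behind the demo: the tiny NAME_GENDER table, the function names, and the whitespace tokenization are all assumptions made for the example, and the real application relied on a separate name-gender guesser [2].

```python
from collections import Counter

# Hypothetical toy lexicon (assumption); stands in for a real name-gender guesser.
NAME_GENDER = {"john": "male", "adam": "male", "mary": "female", "anna": "female"}


def guess_gender(display_name):
    """Return 'male', 'female', or None, using only the first token of the name."""
    tokens = display_name.strip().lower().split()
    if not tokens:
        return None
    return NAME_GENDER.get(tokens[0])


def gendered_counts(tweets):
    """Build one token Counter per gender from (display_name, text) pairs."""
    counts = {"male": Counter(), "female": Counter()}
    for name, text in tweets:
        gender = guess_gender(name)
        if gender is None:
            continue  # skip users whose first name is not in the lexicon
        # Naive whitespace tokenization (assumption), lowercased.
        counts[gender].update(text.lower().split())
    return counts


def relative_frequency(counts, gender, token):
    """Share of that gender's tokens equal to `token`; 0.0 for an empty subset."""
    total = sum(counts[gender].values())
    return counts[gender][token] / total if total else 0.0


# Made-up example data, not taken from the Twitter collection.
tweets = [
    ("John Smith", "watching the game tonight"),
    ("Mary Jones", "thank you <3"),
    ("Anna Lee", "love this song <3"),
    ("xX_anon_Xx", "<3 <3 <3"),
]
counts = gendered_counts(tweets)
for gender in ("male", "female"):
    print(gender, relative_frequency(counts, gender, "<3"))
```

Comparing relative rather than raw frequencies matters here because the two subsets will rarely be the same size. Users whose names are missing from the lexicon are simply dropped, which is one obvious source of bias in any approach of this kind.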

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2005-Automatic-Categorization-of-Author-Gender-via-N-Gram-Analysis--Doyle-Keselj.pdf
Type: application/pdf
Size: 110085 bytes
Desc: 
URL: <http://listserv.linguistlist.org/pipermail/corpora/attachments/20101104/ad82b7e0/attachment-0001.pdf>