[Corpora-List] Chinese Name Gender Recognition
Xiaofei Lu
xflu at ling.ohio-state.edu
Thu Dec 22 18:45:29 UTC 2005
Are you planning to look at context at all? The pronoun resolution idea
should definitely help. Plus, looking at the context in which a personal
name appears may help a bit, too, e.g., in cases where one or more names
appear after things like "member(s) of the women's team", etc.
Xiaofei
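
A rough sketch of the context idea, for illustration only: count gendered
pronouns and title words near each mention of the name. This is a crude
proximity heuristic, not real pronoun resolution, and the cue lists, the
function names, and the window size are all assumptions made up for this
example.

```python
# Hypothetical cue lists; a real system would need larger, curated ones.
FEMALE_CUES = ("她", "女士", "小姐", "太太", "女子")
MALE_CUES = ("他", "先生", "男子")


def context_gender_score(text, name, window=20):
    """Count (female_hits, male_hits) within `window` characters of each mention."""
    female = male = 0
    start = text.find(name)
    while start != -1:
        end = start + len(name)
        lo = max(0, start - window)
        hi = min(len(text), end + window)
        # Leave out the name itself so its own characters are never counted
        # as cues; the space keeps left and right context from fusing.
        context = text[lo:start] + " " + text[end:hi]
        female += sum(context.count(cue) for cue in FEMALE_CUES)
        male += sum(context.count(cue) for cue in MALE_CUES)
        # Windows of nearby mentions may overlap, so a cue can count twice.
        start = text.find(name, end)
    return female, male


def guess_gender(text, name, window=20):
    """Return "female", "male", or "unknown" when the cues are absent or tied."""
    female, male = context_gender_score(text, name, window)
    if female > male:
        return "female"
    if male > female:
        return "male"
    return "unknown"
```

Returning "unknown" on a tie leaves room to fall back on a name-only
classifier when the context says nothing.
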
On Thu, 22 Dec 2005, Heng Ji wrote:
>
> I believe your IR idea will boost the performance. Besides, you may want to
> try applying pronoun reference resolution before gender disambiguation,
> since Chinese personal pronouns are clearly distinguished by gender. If
> you could link a pronoun in the context with the name candidate, that
> might help. In addition, a few gender-specific title words in the context
> would be useful too.
>
> I would guess that lexical information alone can accurately recognize name
> genders for people born before 1980, but it might not be enough for
> names given later: many names have been made intentionally
> gender-neutral. :) So you may want to incorporate time frame
> information in your system.
>
> Heng
>
> On Thu, 22 Dec 2005, Jun Lang wrote:
>
>> Hi Mark Lewellen,
>> Thanks for your interest in this problem.
>> Yes. After doing some baseline research, I found that there are many
>> related problems in gender recognition based on Chinese names. Using
>> only the name may not be enough to achieve a better result. I am considering
>> combining it with other resources to disambiguate the gender. For example, I
>> could query a search engine for gender-designating words to improve the
>> final accuracy.
>> What do you think about it?
>>
>> Thanks!
>>
>> Have a nice Christmas Eve and Christmas Day!
>>
>> Best wishes,
>> Bill_Lang(Jun Lang): Ph.D Candidate
>> Information Retrieval Laboratory
>> Harbin Institute of Technology
>> Mail: bill_lang at gmail.com
>> Homepage: http://ir.hit.edu.cn/~bill_lang
>>
>>
>> -----Original Message-----
>> From: Mark Lewellen [mailto:lewellen at erols.com]
>> Sent: Wednesday, December 21, 2005 11:49 PM
>> To: 'Jun Lang'; 'Xiaofei Lu'
>> Cc: corpora at uib.no
>> Subject: RE: [Corpora-List] Chinese Name Gender Recognition
>>
>> Since Chinese given names are not limited to a set of
>> lexical items that are prototypically 'names' (i.e. they
>> can be just about any lexical item), Chinese given names,
>> as you probably know, often give no clue about gender.
>> There has been some discussion on 'traits' that are
>> more feminine or masculine and would be reflected in names,
>> but there remains a lot of ambiguity. I doubt there is any
>> statistical method, algorithm, or even native speaker that
>> can make up for that problem!
>>
>> Mark Lewellen
>>
>>> -----Original Message-----
>>> From: owner-corpora at lists.uib.no
>>> [mailto:owner-corpora at lists.uib.no] On Behalf Of Jun Lang
>>> Sent: Tuesday, December 13, 2005 7:31 AM
>>> To: 'Xiaofei Lu'
>>> Cc: corpora at uib.no
>>> Subject: [Corpora-List] Re: [Corpora-List] Chinese Name Gender Recognition
>>>
>>>
>>> Yeah! There are many names that could be used for both male and
>>> female. It is a difficult problem. So far I have done some simple
>>> research on this topic. Recently, I have been trying to get more data.
>>> Since the parameter space is huge, decision trees cannot produce the
>>> final result quickly. I want to try a Bayes model again.
>>>
>>> Can you give me some ideas about it? Thanks a lot!
>>>
>>> Best wishes,
>>> Jun Lang
>>>
>>> -----Original Message-----
>>> From: Xiaofei Lu [mailto:xflu at ling.ohio-state.edu]
>>> Sent: December 13, 2005 13:56
>>> To: Jun Lang
>>> Subject: Re: [Corpora-List] Chinese Name Gender Recognition
>>>
>>> Interesting. What is the baseline, and how do you establish it?
>>> Many names can be either male or female, can't they?
>>>
>>> On Tue, 13 Dec 2005, Jun Lang wrote:
>>>
>>>> Hi all Corpora Members,
>>>>
>>>> I am now studying Chinese name gender recognition. The input is a
>>>> Chinese name and the output is the corresponding gender. I used the
>>>> decision tree method, but the accuracy is only about 70%.
>>>>
>>>> Do you know of any other method that can achieve higher accuracy? And
>>>> has anybody done any similar research?
>>>>
>>>> Thanks a lot!
>>>>
>>>> Best wishes,
>>>> Bill_Lang(Jun Lang): Ph.D Candidate
>>>> Information Retrieval Laboratory
>>>> Harbin Institute of Technology
>>>> Mail: bill_lang at gmail.com
>>>> Homepage: http://ir.hit.edu.cn/~bill_lang
>>>>
>>>
>>
>>
>>
>>
>
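
Earlier in the quoted thread, Jun Lang mentions wanting to try a Bayes model
on the name alone. One way that might look, as a sketch only: a naive Bayes
classifier that treats each character of the given name as an independent
feature, with add-alpha smoothing for unseen characters. The function names
and the tiny training list are made up for illustration; they are not data or
code from the thread, and six names say nothing about real accuracy.

```python
import math
from collections import Counter, defaultdict

# Toy, hand-made samples of (given name, label); purely illustrative.
TOY_SAMPLES = [
    ("芳", "F"), ("丽娟", "F"), ("秀英", "F"),
    ("伟", "M"), ("强", "M"), ("建国", "M"),
]


def train(samples):
    """Collect per-class name counts and per-class character counts."""
    class_counts = Counter()
    char_counts = defaultdict(Counter)
    vocab = set()
    for given_name, gender in samples:
        class_counts[gender] += 1
        for ch in given_name:
            char_counts[gender][ch] += 1
            vocab.add(ch)
    return {"classes": class_counts, "chars": char_counts, "vocab": vocab}


def log_scores(model, given_name, alpha=1.0):
    """Log prior plus smoothed log likelihood of each character, per class."""
    total = sum(model["classes"].values())
    v = len(model["vocab"]) + 1  # one extra slot for unseen characters
    scores = {}
    for gender, n in model["classes"].items():
        counts = model["chars"][gender]
        denom = sum(counts.values()) + alpha * v
        score = math.log(n / total)
        for ch in given_name:
            score += math.log((counts[ch] + alpha) / denom)
        scores[gender] = score
    return scores


def predict(model, given_name):
    """Return the class label with the highest log score."""
    scores = log_scores(model, given_name)
    return max(scores, key=scores.get)
```

Training is a single counting pass over the data, so it stays cheap as the
character vocabulary grows. The independence assumption ignores character
order and combinations, which may matter for two-character given names.
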