27.2611, Support: French; Computational Linguistics / France

The LINGUIST List via LINGUIST linguist at listserv.linguistlist.org
Wed Jun 15 15:17:34 UTC 2016


LINGUIST List: Vol-27-2611. Wed Jun 15 2016. ISSN: 1069 - 4875.

Subject: 27.2611, Support: French; Computational Linguistics / France

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry, Robert Coté, Sara Couture)
Homepage: http://linguistlist.org

*****************    LINGUIST List Support    *****************
                       Fund Drive 2016
                   25 years of LINGUIST List!
Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Amanda Foster <amanda at linguistlist.org>
================================================================


Date: Wed, 15 Jun 2016 11:17:26
From: Isabelle Tellier [isabelle.tellier at univ-paris3.fr]
Subject: French; Computational Linguistics, PhD, Labex EFL, France

 Institution/Organization: Labex EFL 
Department:  
Web Address: http://www.labex-efl.org/?q=en 

Level: PhD 

Duties: Research
 
Specialty Areas: Computational Linguistics 
 
Required Language(s): French (fra)

Description:

Ph.D. thesis on automatic coreference chain detection for French, combining
machine learning approaches and linguistic resources

Supervisors: Isabelle Tellier, Marco Dinarelli (Lattice), and Eric de la
Clergerie (Alpage)
Funding: Labex EFL (http://www.labex-efl.org/?q=en) in Paris (3 years)
Start: October 2016

Subject

A coreference chain consists of the set of expressions in a text that refer
to the same discourse entity (or event). Coreference chains ensure the
continuity and coherence of discourse. Detecting them is important for
several tasks, such as Information Extraction and Machine Translation.

Automatic Coreference Chain Detection (ACCD henceforth) is a well-known task
in NLP. It has given rise to several competitive international challenges
(SemEval-2 in 2010, CoNLL in 2011 and 2012). There has, however, been no
such challenge for French, as no French corpus was available until 2014. The
French ANCOR corpus (for "Anaphore et Coréférence dans les corpus Oraux",
i.e. Anaphora and Coreference in Speech Corpora), developed at the University
of Tours (France), has partially solved this problem ([Lefèvre et al. 2014],
[Desoyer et al. 2015], in French). However, the corpus has some peculiarities
owing to the nature of speech transcriptions, and so far no complete
(end-to-end) system for ACCD exists for French.

The goal of the Ph.D. thesis is to build an end-to-end system for coreference
chain detection in French; that is, the system must be able to extract
coreference chains from raw French text. Moreover, the system must be able to
integrate with the ALPAGE NLP framework. The main difficulty of the thesis
will be the lack of French corpora annotated with coreference chains. To
overcome this problem, a wide range of approaches will have to be tested,
combining machine learning methods (text classification, CRFs, neural
networks) with linguistic information (surface features extracted from
multilingual data, features derived from lemmatization, distributional
analysis, morphosyntactic tagging, chunking, syntactic analysis, semantic
lexicons or discourse analysis…).

References

(Haghighi and Klein 2009) A. Haghighi, D. Klein, Simple Coreference Resolution
with Rich Syntactic and Semantic Features, EMNLP 2009.
(Desoyer et al. 2015) A. Desoyer, F. Landragin, I. Tellier, A. Lefeuvre, J-Y.
Antoine, Les coréférences à l'oral : une expérience d'apprentissage
automatique sur le corpus ANCOR, Revue TAL, issue 55.2 on spoken language
processing, p. 97-121, 2015.
(Lee et al. 2013) H. Lee, A. Chang, Y. Peirsman, N. Chambers, M. Surdeanu,
D. Jurafsky, Deterministic Coreference Resolution Based on Entity-Centric,
Precision-Ranked Rules, Computational Linguistics, vol. 39, no. 4, p. 885-916,
2013.
(Lefèvre et al. 2014) A. Lefeuvre, J-Y. Antoine, E. Schang, Le corpus
ANCOR_Centre et son outil de requêtage : application à l'étude de l'accord en
genre et en nombre dans les coréférences et anaphores en français parlé, Actes
du 4ème Congrès Mondial de Linguistique Française, 2014.

How to apply

The Ph.D. work will take place at Lattice (Montrouge) and Alpage (Paris),
and will be conducted in coordination with the French ANR project Democrat,
led by Frederic Landragin at Lattice, which focuses on the same subject.

Applicants must hold a Master's degree in mathematics or computer science
(demonstrating their knowledge and skills in NLP and machine learning).
Familiarity with French is highly desirable.

Candidates must apply by sending a CV, a letter of motivation and, if
possible, Master's exam scores to isabelle.tellier at univ-paris3.fr before
30 June 2016.
 

Application Deadline: 30-Jun-2016 

Mailing Address for Applications:
	Attn: Pr Isabelle Tellier 
	Lattice 
	1 rue Maurice Arnoux 
	Montrouge 92120 
	France 
	
Contact Information: 
	Pr Isabelle Tellier 
	isabelle.tellier at univ-paris3.fr  


------------------------------------------------------------------------------

