35.321, Support: Computational Linguistics: PhD, Labex Empirical Foundations of Linguistics

The LINGUIST List linguist at listserv.linguistlist.org
Sat Jan 27 01:05:02 UTC 2024


LINGUIST List: Vol-35-321. Sat Jan 27 2024. ISSN: 1069 - 4875.

Moderators: Malgorzata E. Cavar, Francis Tyers (linguist at linguistlist.org)
Managing Editor: Justin Fuller
Team: Helen Aristar-Dry, Steven Franks, Everett Green, Daniel Swanson, Maria Lucero Guillen Puon, Zackary Leech, Lynzie Coburn, Natasha Singh, Erin Steitz
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Erin Steitz <ensteitz at linguistlist.org>
================================================================


Date: 17-Jan-2024
From: Christel Préterre [christel.preterre at u-paris.fr]
Subject: Computational Linguistics: PhD, Labex Empirical Foundations of Linguistics


Institution/Organization: Labex Empirical Foundations of Linguistics
Web Address: https://www.labex-efl.fr/

Level: PhD

Duties: Research

Specialty Areas: Computational Linguistics

Description:

Open PhD position in Large Language Models for Information Extraction

The project focuses on information extraction (IE) in low-resource
scenarios, where computing power is limited or annotated data is
scarce, as is the case for under-resourced languages and dialects.
The current trend is to use dedicated encoder-decoder architectures
for IE graphs, but this has limitations, notably the difficulty of
generating well-formed structures and the need for specialized
datasets for fine-tuning. The PhD thesis will explore two approaches
that use large language models (LLMs) without fine-tuning:

 - Constrained autoregressive generation. This involves several types
of LLM prompting and guiding the generation of an IE graph with
constraints expressed using a grammar. See Geng et al. (2023) for an
example.

 - Structured inference for better compositional generalization. This
involves using an LLM to enumerate and score parts of the graph,
followed by a combinatorial algorithm for inference. This allows
better use of structural information and better anchoring in the
input text.

General information
Open position: three-year, fully funded PhD scholarship starting in
September 2024

Funding
By the Laboratory of Excellence Empirical Foundations of Linguistics
(www.labex-efl.fr).
Funds will be available for travel expenses, equipment and
experiments.
Access to the CNRS supercomputer Jean-Zay is provided.
PhD salary: about 1700 euros net/month.

Affiliation
Students will be affiliated with two laboratories:
 - LIPN - CNRS UMR 7030 and Université Sorbonne Paris Nord
 - LLF - CNRS UMR 7110 and Université Paris Cité
Students will be attached to the Doctoral school 146 "Sciences,
Technologies, Santé – Galilée"

Supervision
Nadi Tomeh (LIPN) and Guillaume Wisniewski (LLF)

Place of work
LIPN, Université Sorbonne Paris Nord (Villetaneuse) and
LLF, Université Paris Cité (Paris)

Requirements
 - Candidates for the PhD scholarship must have obtained their
Master's degree (or equivalent Bac+5 engineering degree) by
September 2024
 - A specialization in machine learning, deep learning, natural
language processing or computational linguistics is required
 - Excellent programming skills in Python
 - Proficiency in software development practices, version control, and
collaborative coding environments
 - Fluency in deep learning frameworks, particularly PyTorch. The
candidate should have hands-on experience in designing, training, and
evaluating deep neural networks.

How to apply?
Send your CV, transcripts, a short cover letter, and contact details
for at least two references to
nadi.tomeh at lipn.univ-paris13.fr and
guillaume.wisniewski at u-paris.fr

Email Address for Applications: nadi.tomeh at lipn.univ-paris13.fr

Contact Information:
Nadi Tomeh
nadi.tomeh at lipn.univ-paris13.fr


