Thesis: Hai-Son Le, Modèles neuronaux pour la modélisation statistique de la langue

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Sun Dec 16 21:34:39 UTC 2012


Date: Fri, 14 Dec 2012 11:40:08 +0100
From: Alexandre Allauzen <allauzen at limsi.fr>
Message-ID: <50CB0208.7050203 at limsi.fr>


Hello everyone,

It is with great pleasure that I invite you to my thesis defense,
on Thursday, 20 December 2012 at 2 pm, in the conference room of
Building S at LIMSI.

The thesis is entitled:

Modèles neuronaux pour la modélisation statistique de la langue
(Continuous Space Neural Network Models in Natural Language Processing)


The jury is composed of:

Reviewers:
Laurent BESACIER, Professor, Université Joseph Fourier
Holger SCHWENK, Professor, Université du Maine
Examiners:
Yoshua BENGIO, Professor, Université de Montréal
Hermann NEY, Professor, RWTH Aachen
Michèle SEBAG, Research Director, LRI & Université Paris Sud
Thesis advisors:
François YVON, Professor, LIMSI-CNRS & Université Paris Sud
Alexandre ALLAUZEN, Associate Professor (Maître de Conférences), LIMSI-CNRS & Université Paris Sud


I would also be delighted to see you at the reception that will follow.

Best regards,

Hai-Son LE

Abstract

The purpose of language models is, in general, to capture and model the
regularities of a language, thereby capturing the morphological,
syntactic, and distributional properties of word sequences in that
language. They play an important role in many successful applications
of Natural Language Processing, such as Automatic Speech Recognition,
Machine Translation and Information Extraction. The most successful
approaches to date rest on the n-gram assumption and adjust the
statistics estimated from the training data with smoothing and back-off
techniques, notably the Kneser-Ney technique introduced twenty years
ago. Under this assumption, a language model predicts a word from the
n-1 words that precede it. In spite of their prevalence, conventional
n-gram language models still suffer from several limitations that could
intuitively be overcome by consulting human expert knowledge. One
critical limitation is that, ignoring all linguistic properties, they
treat each word as a discrete symbol with no relation to the others.
Another is that, even with a huge amount of data, data sparsity remains
a serious issue, so the optimal value of n is often 4 or 5, which is
insufficient in practice. Since such models are built from the counts
of n-grams in the training data, their relevance depends entirely on
the characteristics of the training text (its quantity, and how well it
covers the content in terms of topic and date).
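
To make the back-off mechanism concrete, here is a minimal Python
sketch of a count-based trigram model. It uses a simple "stupid
backoff" penalty instead of the discounting and modified counts of the
actual Kneser-Ney estimator, and every name in it is illustrative:

    from collections import defaultdict

    # Minimal count-based trigram model with "stupid backoff".
    # Illustrative only: Kneser-Ney uses discounting and modified counts.
    class BackoffTrigramLM:
        def __init__(self, alpha=0.4):
            self.alpha = alpha               # fixed backoff penalty
            self.uni = defaultdict(int)      # unigram counts
            self.bi = defaultdict(int)       # bigram counts
            self.tri = defaultdict(int)      # trigram counts
            self.total = 0

        def train(self, sentences):
            for words in sentences:
                seq = ["<s>", "<s>"] + words + ["</s>"]
                for i, w in enumerate(seq):
                    self.uni[w] += 1
                    self.total += 1
                    if i >= 1:
                        self.bi[(seq[i - 1], w)] += 1
                    if i >= 2:
                        self.tri[(seq[i - 2], seq[i - 1], w)] += 1

        def prob(self, w, u, v):
            """P(w | u, v): use the trigram if seen, else back off."""
            if self.tri[(u, v, w)] > 0:
                return self.tri[(u, v, w)] / self.bi[(u, v)]
            if self.bi[(v, w)] > 0:
                return self.alpha * self.bi[(v, w)] / self.uni[v]
            return self.alpha ** 2 * (self.uni[w] + 1) / (self.total + 1)

For instance, after lm.train([["the", "cat", "sat"]]),
lm.prob("sat", "the", "cat") is the relative frequency of the observed
trigram, while an unseen history falls back to penalized lower-order
estimates.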

Recently, one of the most successful attempts to learn word
similarities directly has been the use of distributed word
representations in language modeling, where distributionally similar
words, that is, words with semantic and syntactic similarities, are
expected to be represented as neighbors in a continuous space. These
representations and the associated objective function (the likelihood
of the training data) are jointly learned using a multi-layer neural
network architecture. In this way, word similarities are learned
automatically. This approach has shown significant and consistent
improvements when applied to automatic speech recognition and
statistical machine translation tasks.
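
A minimal sketch of such a feed-forward neural language model, in the
spirit of this architecture (the dimensions and variable names are
assumptions made for illustration, not those of the thesis):

    import numpy as np

    # Toy feed-forward n-gram neural language model.
    rng = np.random.default_rng(0)
    V, d, H, n = 10000, 128, 256, 4       # vocabulary, embedding, hidden, order

    R  = rng.normal(0.0, 0.01, (V, d))    # shared embedding table, learned jointly
    W1 = rng.normal(0.0, 0.01, ((n - 1) * d, H))
    W2 = rng.normal(0.0, 0.01, (H, V))

    def forward(context_ids):
        """P(w | the n-1 previous word ids) for every word w."""
        x = R[context_ids].reshape(-1)    # look up and concatenate the embeddings
        a = np.tanh(x @ W1)               # hidden layer
        logits = a @ W2                   # one score per vocabulary word
        e = np.exp(logits - logits.max()) # softmax over all V words: this O(V)
        return e / e.sum()                # normalization is the main cost

    p = forward(np.array([5, 42, 7]))     # any n-1 = 3 in-vocabulary word ids
    print(p.shape, round(p.sum(), 6))     # (10000,) 1.0

As the comments note, the output softmax is normalized over the entire
vocabulary; this O(V) cost is the bottleneck addressed next.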

A major difficulty with the continuous space neural network approach
remains the computational burden, which does not scale well to the
massive corpora that are nowadays available. For this reason, the first
contribution of this dissertation is the definition of a neural
architecture based on a tree representation of the output vocabulary,
namely the Structured OUtput Layer (SOUL), which makes such models well
suited to large scale frameworks. The SOUL model combines the neural
network approach with the class-based approach, and achieves
significant improvements on both state-of-the-art large scale automatic
speech recognition and statistical machine translation tasks. The
second contribution consists of several insightful analyses of these
models: their performance, their pros and cons, and the word space
representations they induce. Finally, the third contribution is the
successful adoption of continuous space neural network models within a
machine translation framework. New translation models are proposed and
reported to achieve significant improvements over state-of-the-art
baseline systems.
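
The core idea of such a structured output layer can be sketched with a
two-level class factorization of the softmax, replacing one
normalization over V words by a softmax over K classes followed by a
small softmax within the predicted word's class. The actual SOUL model
uses a deeper clustering tree; all names below are hypothetical:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    # Two-level class-factored output layer (SOUL uses a deeper word tree).
    # word_to_class and class_words are assumed to come from a prior
    # clustering of the vocabulary.
    def factored_prob(a, word, word_to_class, class_words, Wc, Ww):
        """P(word | history) = P(class | h) * P(word | class, h),
        where a is the hidden activation computed from the history h."""
        c = word_to_class[word]
        p_class = softmax(a @ Wc)[c]      # softmax over the K classes only
        within = softmax(a @ Ww[c])       # small softmax inside class c
        return p_class * within[class_words[c].index(word)]

Each prediction then costs roughly O(K + |class|) instead of O(V),
which is what makes such models tractable on large vocabularies.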
