Job: Post-doctoral position on machine learning / data clustering, Lyon
Thierry Hamon
hamon at LIMSI.FR
Fri Jun 27 20:32:10 UTC 2014
Date: Thu, 26 Jun 2014 12:25:30 +0200
From: Julien Velcin <julien.velcin at univ-lyon2.fr>
Message-ID: <53ABF51A.7080100 at univ-lyon2.fr>
Post-doctoral position at ERIC, University of Lyon (France): Modeling
and analysis dynamics of web reputation.
ERIC is a research unit specialized in business intelligence and data
mining. In the context of the ImagiWeb project, a 10-month postdoctoral
position is available on machine learning / data clustering, with
applications to social media analysis.
See below for more details about the offer.
Julien Velcin
julien.velcin at univ-lyon2.fr
Post-doctoral Position
Title: Modeling and analysis dynamics of web reputation
Supervision: Julien Velcin, Stephane Bonnevay
Place: ERIC Lab (University of Lyon)
Duration: 10 months
Funding: 2250 € per month (ANR Project ImagiWeb, gross salary)
Context
The ImagiWeb project, funded by the French National Research Agency
(2012-2015), aims at studying the image (a.k.a. web reputation) of
entities of various kinds (companies, politicians, etc.) as it is
disseminated and perceived on the Internet. Studying these representations
and their dynamics is a real challenge today, one that touches on
several data mining issues: information/topic extraction, opinion
mining, web reputation, social network analysis, etc. The project
involves six partners (3 academic labs, 3 companies) and considers two
real case studies.
Description
Building and tracking images (web reputations) while taking into account
both temporal and spatial dimensions can be addressed as a machine
learning problem. In the context of the ImagiWeb project, this
post-doctoral position aims mainly at building new unsupervised machine
learning models and algorithms. Graphical models [1, 8], probabilistic
models or various dynamic models that take conditional dependencies
into account will be considered for this problem. For data
clustering, several models have been designed for attribute-value
data [5, 6] and relational data [9]. More recently, evolutionary
clustering models have been proposed to integrate temporal evolution
into the clustering process [2, 3, 4, 10].
Researchers of the ImagiWeb project have recently designed a new model
for representing and manipulating these “images” (paper under
review). This model handles entities (e.g., politicians, companies,
brands) described over time by opinionated labels. So far, it has been
tested on short descriptions extracted from a sample of Twitter
data. The recruited researcher will address theoretical issues and
perform experiments on real datasets provided by the ImagiWeb
project. More precisely:
- She/he will update the graphical model, both by addressing some of its
shortcomings and by integrating additional information available in the
data (in particular, the author of the message).
- She/he will propose an accurate way to deal with the timeline by going
beyond evenly distributed time windows, for instance by using the
notion of change points [7].
- She/he will participate in submitting the new model(s) to a high-level
international conference in machine learning and/or data mining.
- She/he will design the algorithm that implements the new model and
test it on the datasets of the ImagiWeb project. She/he will be
involved in the integration of the code into a full prototype.
Profile requirements
Applicants must hold a PhD in science with a clear research
orientation. Priority will be given to candidates who have already worked
on statistical machine learning, probabilistic graphical models, or
data clustering.
Application procedure
Applications must be sent by email to Julien Velcin
(julien.velcin at univ-lyon2.fr) and Stéphane Bonnevay
(stephane.bonnevay at univ-lyon1.fr). Candidates should send the following
elements:
- Cover letter
- CV (including recent publications)
- Marks and awards obtained during their Master's degree
- Recommendation letters
After a first selection step, interviews will be organized before the
final decision is made.
References
[1] C.M. Bishop. Pattern recognition and machine learning, volume
4. Springer, New York, 2006. Chapter 8.
[2] Fuyuan Cao, Jiye Liang, Liang Bai, Xingwang Zhao, and Chuangyin
Dang. A framework for clustering categorical time-evolving data. Fuzzy
Systems, IEEE Transactions on, 18(5):872–882, October 2010.
[3] Deepayan Chakrabarti, Ravi Kumar, and Andrew Tomkins. Evolutionary
clustering. In International conference on Knowledge discovery and
data mining, KDD ’06, pages 554–560. ACM, 2006.
[4] Yun Chi, Xiaodan Song, Dengyong Zhou, Koji Hino, and Belle
L. Tseng. Evolutionary spectral clustering by incorporating temporal
smoothness. In International conference on Knowledge discovery and data
mining, KDD ’07, pages 153–162. ACM, 2007.
[5] A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum likelihood from
incomplete data via the EM algorithm. Journal of the Royal Statistical
Society. Series B (Methodological), pages 1–38, 1977.
[6] M.A.T. Figueiredo and A.K. Jain. Unsupervised learning of finite
mixture models. Pattern Analysis and Machine Intelligence, IEEE
Transactions on, 24(3):381–396, 2002.
[7] Lajos Horváth and Marie Hušková. Change-point detection in panel
data. Journal of Time Series Analysis, 33(4):631–648, 2012.
[8] D. Magatti. Graphical models for text mining: knowledge extraction
and performance estimation. PhD thesis, Università degli Studi di
Milano-Bicocca, 2010.
[9] M. Shafiei and H. Chipman. Mixed-membership stochastic block-models
for transactional networks. In Data Mining (ICDM), 2010 IEEE 10th
International Conference on, pages 1019–1024. IEEE, 2010.
[10] Tianbing Xu, Zhongfei (Mark) Zhang, Philip S. Yu, and Bo
Long. Dirichlet process based evolutionary clustering. In
International Conference on Data Mining, ICDM ’08, pages 648–657. IEEE
Computer Society, 2008.
-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version:
Archives: http://listserv.linguistlist.org/archives/ln.html
http://liste.cines.fr/info/ln
The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
ATALA accepts no responsibility for the content of
messages distributed on the LN list
-------------------------------------------------------------------------