Job: Post-doctoral position on machine learning / data clustering, Lyon

Thierry Hamon hamon at LIMSI.FR
Fri Jun 27 20:32:10 UTC 2014

Date: Thu, 26 Jun 2014 12:25:30 +0200
From: Julien Velcin <julien.velcin at>
Message-ID: <53ABF51A.7080100 at>

Post-doctoral position at ERIC, University of Lyon (France): Modeling 
and analysis dynamics of web reputation.

ERIC is a research unit specialized in business intelligence and data
mining. In the context of the ImagiWeb project, a 10-month postdoctoral
position is available on machine learning / data clustering, with
applications to social media analysis.

See below for more details about the offer.

Julien Velcin
julien.velcin at

Post-doctoral Position
Title: Modeling and analysis dynamics of web reputation
Supervision: Julien Velcin, Stéphane Bonnevay

Place: ERIC Lab (University of Lyon)
Duration: 10 months
Funding: 2250 € per month (ANR Project ImagiWeb, gross salary)


The ImagiWeb project, funded by the French National Research Agency
(2012-2015), aims to study the image (a.k.a. web reputation) of
entities of various kinds (companies, politicians, etc.) as it is
disseminated and perceived on the Internet. Studying these
representations and their dynamics is a real challenge that touches on
several data mining issues: information/topic extraction, opinion
mining, web reputation, and social network analysis, among others. The
project involves six partners (3 academic labs, 3 companies) and
considers two real case studies.


Building and tracking images (web reputations) while taking into account
both temporal and spatial dimensions can be addressed as a machine
learning problem. In the context of the ImagiWeb project, this
post-doctoral position aims mainly at building new unsupervised machine
learning models and algorithms. Graphical models [1, 8], probabilistic
models, or various dynamic models that take conditional dependencies
into account will be considered for this problem. For data clustering,
several models have been designed for attribute-value data [5, 6] and
relational data [9]. Recent evolutionary clustering models integrate
temporal evolution into the clustering process [2, 3, 4, 10].

Researchers of the ImagiWeb project have recently designed a new model
for representing and manipulating these “images” (paper under
review). This model can handle entities (e.g., politicians, companies,
brands) described over time by opinionated labels. So far, it has been
tested on short descriptions extracted from a sample of Twitter
messages. The recruited researcher will address theoretical issues and
perform experiments on real datasets provided by the ImagiWeb
project. More precisely:

- She/he will update the graphical model by both addressing some of its
  shortcomings and integrating additional information available in the
  data (in particular, the author of the message).

- She/he will propose an accurate way to handle the timeline that goes
  beyond evenly spaced time windows, for instance by using the
  notion of change points [7].

- She/he will participate in submitting the new model(s) to a high-level
  international conference in machine learning and/or data mining.

- She/he will design the algorithm that implements the new model and
  test it on the datasets of the ImagiWeb project. She/he will be
  involved in the integration of the code into a full prototype.

Profile requirements

Applicants must hold a PhD in science with a clear research
orientation. Priority will be given to candidates who have already worked
in statistical machine learning, probabilistic graphical models, or
data clustering.

Application procedure

Applications must be sent by email to Julien Velcin
(julien.velcin at) and Stéphane Bonnevay
(stephane.bonnevay at). Candidates should send the following:

- Cover letter
- CV (including recent publications)
- Marks and awards obtained during their Master's degree
- Recommendation letters

After a first selection step, interviews will be organized before the
final decision is made.


[1] C.M. Bishop. Pattern recognition and machine learning, volume
4. Springer, New York, 2006. Chapter 8.

[2] Fuyuan Cao, Jiye Liang, Liang Bai, Xingwang Zhao, and Chuangyin
Dang. A framework for clustering categorical time-evolving data. Fuzzy
Systems, IEEE Transactions on, 18(5):872–882, October 2010.

[3] Deepayan Chakrabarti, Ravi Kumar, and Andrew Tomkins. Evolutionary
clustering. In International conference on Knowledge discovery and
data mining, KDD ’06, pages 554–560. ACM, 2006.

[4] Yun Chi, Xiaodan Song, Dengyong Zhou, Koji Hino, and Belle
L. Tseng. Evolutionary spectral clustering by incorporating temporal
smoothness. In International conference on Knowledge discovery and data
mining, KDD ’07, pages 153–162. ACM, 2007.

[5] A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum likelihood from
incomplete data via the EM algorithm. Journal of the Royal Statistical
Society. Series B (Methodological), pages 1–38, 1977.

[6] M.A.F. Figueiredo and A.K. Jain. Unsupervised learning of finite
mixture models. Pattern Analysis and Machine Intelligence, IEEE
Transactions on, 24(3):381–396, 2002.

[7] Lajos Horváth and Marie Hušková. Change-point detection in panel
data. Journal of Time Series Analysis, 33(4):631–648, 2012.

[8] D. Magatti. Graphical models for text mining: knowledge extraction
and performance estimation. PhD thesis, Università degli Studi di
Milano-Bicocca, 2010.

[9] M. Shafiei and H. Chipman. Mixed-membership stochastic block-models
for transactional networks. In Data Mining (ICDM), 2010 IEEE 10th
International Conference on, pages 1019–1024. IEEE, 2010.

[10] Tianbing Xu, Zhongfei (Mark) Zhang, Philip S. Yu, and Bo
Long. Dirichlet process based evolutionary clustering. In
International Conference on Data Mining, ICDM ’08, pages 648–657. IEEE
Computer Society, 2008.
