[Corpora-List] RepLab 2013: Call for Participation

Edgar Meij edgar.meij at gmail.com
Tue Apr 9 21:51:45 UTC 2013


===============================================================
RepLab 2013: Call for Participation
An Evaluation Campaign for Online Reputation Management Systems
http://www.limosine-project.eu/events/replab2013
===============================================================

RepLab is a competitive evaluation exercise for Online Reputation Management systems. After a successful first year at CLEF 2012, the second RepLab campaign is an activity of CLEF 2013, and the results of the exercise will be discussed at the CLEF conference in Valencia (23–26 September) -- see http://clef2013.org for details.

RepLab 2013 focuses on the task of monitoring the reputation of entities (companies, organizations, celebrities…) on Twitter. For analysts, the monitoring task consists of searching the stream of tweets for potential mentions of the entity, filtering those that do refer to the entity, detecting topics (clustering tweets by subject) and prioritizing them so as to identify reputation alerts (issues that may have a substantial impact on the reputation of the entity).

Accordingly, the RepLab 2013 task is defined as (multilingual) topic detection combined with priority ranking of clusters, as input for reputation monitoring experts. The detection of polarity for reputation (does the tweet have negative/positive implications for the reputation of the entity?) is an essential step in detecting reputation alerts, and will be evaluated as a standalone subtask.

Task and subtasks

Participants are welcome to present systems that attempt the full monitoring task (filtering + topic detection + topic ranking) or modules that contribute only partially to solving the problem. Possible modules are related to the following components of the whole reputation management task:

1) Filtering: Systems will be asked to determine which tweets are related to the entity and which are not; for instance, keeping tweets in which the word "Stanford" refers to Stanford University and filtering out tweets about Stanford as a place. Manual annotations will be provided with two possible values: related/unrelated.

2) Polarity for Reputation: The goal will be to decide whether the content of a (related) tweet has positive or negative implications for the entity's reputation. Manual annotations will be: positive/negative/neutral.

3) Topic Detection: Systems will be asked to cluster related tweets about the entity by topic with the objective of grouping together tweets referring to the same subject. 

4) Assigning priority: The full task involves detecting the relative priority of topics. To evaluate priority independently of the clustering task, we will evaluate the subtask of predicting the priority of the cluster a tweet belongs to.


It will be possible to present systems that address only filtering, only polarity identification, only topic detection or only priority assignment. The organizers will provide baseline components for all four subtasks. This way, any participant will be able to take part in the full task regardless of where their particular contribution lies. Evaluation results will be provided for the full task and for each of the four subtasks listed above.
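To illustrate how a single contributed module could be combined with baseline components to run the full task, here is a minimal sketch in Python. All function names, signatures and the trivial "baselines" below are assumptions made for illustration only; they are not the baseline components provided by the organizers, and the substring filter is deliberately naive (it would not tell Stanford University from Stanford as a place).

```python
from collections import defaultdict

# Placeholder baselines (hypothetical, NOT the official RepLab baselines).

def baseline_filter(tweet, entity):
    # Naive filtering: a tweet is "related" if it contains the entity name.
    return entity.lower() in tweet.lower()

def baseline_polarity(tweet):
    # Every related tweet is labelled neutral.
    return "NEUTRAL"

def baseline_clusterer(tweets):
    # Every tweet forms its own topic: one cluster id per tweet.
    return list(range(len(tweets)))

def baseline_ranker(cluster_tweets):
    # Every cluster gets the lowest priority level.
    return "UNIMPORTANT"

def monitor(tweets, entity,
            is_related=baseline_filter,
            polarity=baseline_polarity,
            cluster=baseline_clusterer,
            rank=baseline_ranker):
    """Run filtering, polarity, topic detection and priority assignment.

    Any of the four components can be replaced by a participant's module.
    """
    related = [t for t in tweets if is_related(t, entity)]
    cluster_ids = cluster(related)

    # Group related tweets by cluster so that priority is assigned per topic.
    members = defaultdict(list)
    for tweet, cid in zip(related, cluster_ids):
        members[cid].append(tweet)
    priority = {cid: rank(group) for cid, group in members.items()}

    return [
        {"tweet": t, "polarity": polarity(t),
         "cluster": cid, "priority": priority[cid]}
        for t, cid in zip(related, cluster_ids)
    ]
```

Under these assumptions, a participant working only on polarity would call `monitor(tweets, entity, polarity=my_polarity)` with their own (hypothetical) `my_polarity` function and leave the other three components at their defaults.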

Data

RepLab 2013 uses Twitter data in English and Spanish. The balance between the two languages depends on the availability of data for each of the entities included in the dataset.

The corpus consists of a collection of tweets referring to a selected set of entities from four domains: automotive, banking, universities and music/artists. The training and test data sets are manually labelled by annotators who are trained and guided by experts in online reputation management. Each tweet is annotated with the following labels:

- RELATED/UNRELATED: the tweet is/is not about the entity.

- POSITIVE/NEUTRAL/NEGATIVE: the information contained in the tweet has positive/neutral/negative implications for the entity's reputation (only for related tweets).

- The cluster identifier the tweet belongs to (only for related tweets).

- The priority (in three levels: alert/mildly important/unimportant) of the cluster the tweet belongs to (only for related tweets).
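As an illustration of how these labels fit together, the sketch below models one annotated tweet in Python and checks the "only for related tweets" constraint. The class name, field names and label spellings are assumptions made for this example; they do not describe the actual file format of the RepLab 2013 data.

```python
from dataclasses import dataclass
from typing import Optional

POLARITIES = {"POSITIVE", "NEUTRAL", "NEGATIVE"}
PRIORITIES = {"ALERT", "MILDLY_IMPORTANT", "UNIMPORTANT"}


@dataclass(frozen=True)
class TweetAnnotation:
    """One annotated tweet; all field names are illustrative assumptions."""
    tweet_id: str
    entity: str
    related: bool
    polarity: Optional[str] = None
    cluster_id: Optional[str] = None
    priority: Optional[str] = None

    def __post_init__(self):
        extra = (self.polarity, self.cluster_id, self.priority)
        if not self.related:
            # Unrelated tweets carry only the RELATED/UNRELATED label.
            if any(value is not None for value in extra):
                raise ValueError("unrelated tweets carry no further labels")
            return
        # Related tweets need polarity, a cluster identifier and a priority.
        if self.polarity not in POLARITIES:
            raise ValueError(f"bad polarity: {self.polarity!r}")
        if self.cluster_id is None:
            raise ValueError("related tweets need a cluster identifier")
        if self.priority not in PRIORITIES:
            raise ValueError(f"bad priority: {self.priority!r}")
```

For instance, `TweetAnnotation("1", "example_entity", True, "NEGATIVE", "c1", "ALERT")` is accepted, while an unrelated tweet that also carries a polarity label raises a `ValueError`.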

How to participate?

Register for RepLab at the CLEF 2013 web page (http://clef2013.org) and contact the lab organizers (julio at lsi.uned.es) for further instructions (data download, etc.).

Important dates

April 15: Training & Test data released 
May 27: System results due
June 5: Official results released
June 15: Deadline for paper submission
September 23–26: CLEF 2013 Conference in Valencia

Organizers

RepLab is an activity sponsored by EU project LiMoSINe (http://limosine-project.eu). Lab organizers are:

Adolfo Corujo, Llorente & Cuenca (acorujo at llorenteycuenca.com)
Julio Gonzalo, UNED (julio at lsi.uned.es)
Edgar Meij, Yahoo! Research (emeij at yahoo-inc.com)
Maarten de Rijke, University of Amsterdam (mdr at science.uva.nl)

Steering Committee

Eugene Agichtein, Emory University, USA
Alexandra Balahur, JRC, Italy
Krisztian Balog, U. Stavanger, Norway
Donna Harman, NIST, USA
Eduard Hovy, ISI/USC, USA
Radu Jurca, Google, Switzerland
Jussi Karlgren, Gavagai/SICS, Sweden
Mounia Lalmas, Yahoo! Research, Spain
Jochen Leidner, Thomson Reuters, Switzerland
Bing Liu, U. Illinois at Chicago, USA
Alessandro Moschitti, U. Trento, Italy
Miles Osborne, U. Edinburgh, UK
Hans Uszkoreit, U. Saarbrucken, Germany
James Shanahan, Boston U., USA
Belle Tseng, Yahoo!, USA
Julio Villena, Daedalus/U. Carlos III, Spain