Call: Challenge on "Large Scale Hierarchical Text Classification"

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Fri Jul 3 20:29:32 UTC 2009


Date: Fri, 03 Jul 2009 12:24:58 +0200
From: Eric Gaussier <Eric.Gaussier at imag.fr>
Message-ID: <4A4DDC7A.8030708 at imag.fr>
X-url: http://lshtc.iit.demokritos.gr/
X-url: http://www-clips.imag.fr/mrim/User/eric.gaussier/


Hello,

We would like to bring the following challenge to your attention.

                         Pascal Challenge on
             Large Scale Hierarchical Text Classification

              Web site: http://lshtc.iit.demokritos.gr/
                 Email: lshtc_info at iit.demokritos.gr
                                   
We are pleased to announce the launch of the Large Scale Hierarchical
Text Classification (LSHTC) Pascal Challenge. The LSHTC Challenge is a
hierarchical text classification competition using large datasets
based on the ODP Web directory data (www.dmoz.org).

Hierarchies are becoming ever more popular for the organization of
text documents, particularly on the Web. Web directories are an
example. Along with their widespread use comes the need for
automated classification of new documents into the categories of the
hierarchy. As the size of the hierarchy grows and the number of
documents to be classified increases, a number of interesting machine
learning problems arise. In particular, it is one of the rare
situations where data sparsity remains an issue despite the vastness
of available data. The reasons for this are the simultaneous increase
in the number of classes and their hierarchical organization. The
latter leads to a very high imbalance between the classes at different
levels of the hierarchy.  Additionally, the statistical dependence of
the classes poses challenges and opportunities for the learning
methods.

The challenge will consist of four tasks with partially overlapping
data. Information regarding the tasks and the challenge rules can be
found on the challenge Web site, under the "Tasks, Rules and Guidelines"
link.

We plan a two-stage evaluation of the participating methods: one
stage measuring classification performance and one measuring
computational performance. It is important to measure both, as they
are interdependent. The results will be included in a final report
about the challenge, and we also aim to organize a special NIPS'09
workshop.

To register for the challenge and gain access to the datasets, please
create a new account on the challenge Web site.

Key dates:
Start of testing: July 10, 2009.
End of testing, submission of executables and short papers: September 29, 2009.
End of scalability test and announcement of results: October 25, 2009.
NIPS'09 workshop (subject to approval): December 11-12, 2009

Organisers:
Eric Gaussier, LIG, Grenoble, France
George Paliouras, NCSR "Demokritos", Athens, Greece
Aris Kosmopoulos, NCSR "Demokritos", Athens, Greece
Sujeevan Aseervatham, LIG, Grenoble & Yakaz, Paris, France

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------


