Seminar: Alpage Seminar, M. Strube, 6 May 2011

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Wed May 4 17:13:34 UTC 2011

Date: Mon, 2 May 2011 13:24:07 +0200
From: Benoit Crabbé <bcrabbe at>
Message-Id: <5FF3F631-56CC-45B9-9E5C-14C7EB0410FE at>

*************** Alpage Seminar *******************

  Seminar of the Paris 7 doctoral school

This is the research seminar in computational linguistics organized
by the Alpage team. Alpage is a joint Inria -- Paris 7 team whose
scientific interests center on automatic syntactic parsing and
discourse processing for the language

The next seminar will be held on Friday, 6 May, from 11:00 to 13:00 in
room 3E91 at the UFRL, 175, rue du Chevaleret, 75013 Paris (3rd floor).

Anyone interested is welcome.


Michael Strube (Heidelberg HITS)

will speak about:

Transforming Wikipedia into a Very Large Conceptual Network


Wikipedia provides a repository for world knowledge with more
structure than the web and more coverage than manually created
knowledge bases. Although its system of categories can be used
straightforwardly as a semantic network, the Wikipedia categorization
cannot be considered a proper taxonomy, as the relations between
categories are not semantically typed.

In this presentation we will show how to induce an isa hierarchy on
top of the Wikipedia categorization. We start by taking the category
system in Wikipedia as a conceptual network. We then label the
semantic relations between categories using methods based on
connectivity in the network and lexico-syntactic matching. As a result,
we are able to derive a large-scale taxonomy with isa relations
between the concepts.
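
As an illustration of the lexico-syntactic matching step only, here is a
minimal Python sketch. It assumes a naive head finder (the last token
before a preposition, otherwise the last token) and two simple rules:
matching heads give isa, and a parent head found among the child's
modifiers gives notisa. The function names, the rules as written, and
the example category pairs are hypothetical and are not taken from the
speaker's implementation; the connectivity-based methods are not covered.

```python
PREPOSITIONS = {"of", "in", "by", "from", "for", "at", "on"}


def lexical_head(category: str) -> str:
    """Naive head finder (assumption): token before the first preposition,
    otherwise the last token of the category name."""
    tokens = category.lower().split()
    for i, token in enumerate(tokens):
        if token in PREPOSITIONS and i > 0:
            return tokens[i - 1]
    return tokens[-1]


def label_link(sub: str, sup: str) -> str:
    """Label the link from subcategory `sub` to supercategory `sup`."""
    head_sub = lexical_head(sub)
    head_sup = lexical_head(sup)
    # Head matching: both names share the same lexical head.
    if head_sub == head_sup:
        return "isa"
    # Modifier matching: the parent's head only modifies the child's head.
    modifiers = [t for t in sub.lower().split() if t != head_sub]
    if head_sup in modifiers:
        return "notisa"
    # Left for other methods, e.g. the connectivity-based ones.
    return "unknown"


if __name__ == "__main__":
    pairs = [
        ("British computer scientists", "Computer scientists"),
        ("Crime comics", "Crime"),
        ("Cities in France", "Cities"),
    ]
    for sub, sup in pairs:
        print(f"{sub} -> {sup}: {label_link(sub, sup)}")
```

A real system would need proper parsing and lemmatization; this sketch
compares lowercased tokens only, so singular and plural forms do not match.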

We evaluate the quality of the taxonomy by comparing it with
ResearchCyc, one of the largest manually created ontologies, and show
that the Wikipedia-derived taxonomy compares favorably with it. We
also discuss experiments on using Wikipedia for computing the semantic
similarity of words. The Wikipedia-derived taxonomy performs as well
as measures using WordNet, a commonly used lexical database in Natural
Language Processing. We conclude with an overview of current work,
which includes labeling additional relations such as part-of, location,
and temporal relations, and creating a multilingual conceptual network.
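
To make the idea of computing similarity over a taxonomy concrete, here
is a minimal Python sketch of one simple path-based measure: the inverse
of one plus the shortest path length between two concepts along isa
links. The toy taxonomy, the function names, and the choice of this
particular measure are assumptions made for illustration; the abstract
does not say which measures were used in the experiments.

```python
from collections import deque

# Toy isa taxonomy (hypothetical data): child -> list of parents.
ISA = {
    "dog": ["mammal"],
    "cat": ["mammal"],
    "mammal": ["animal"],
    "sparrow": ["bird"],
    "bird": ["animal"],
    "animal": [],
}


def undirected_neighbours(graph):
    """Build an undirected adjacency map from child -> parents links."""
    adj = {node: set() for node in graph}
    for child, parents in graph.items():
        for parent in parents:
            adj[child].add(parent)
            adj.setdefault(parent, set()).add(child)
    return adj


def path_length(graph, a, b):
    """Shortest number of isa edges between a and b (BFS), or None."""
    adj = undirected_neighbours(graph)
    if a not in adj or b not in adj:
        return None
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Similarity in (0, 1]: 1 / (1 + shortest path length); 0.0 if unconnected."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1 + dist)


if __name__ == "__main__":
    print(path_similarity(ISA, "dog", "cat"))
    print(path_similarity(ISA, "dog", "sparrow"))
```

In this toy taxonomy, "dog" and "cat" are two edges apart (via "mammal")
and "dog" and "sparrow" are four edges apart (via "animal"), so the first
pair scores higher.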

Publications relevant for this presentation:

Ponzetto, Simone Paolo; Strube, Michael (2011). Taxonomy induction
based on a collaboratively built knowledge repository. In: Artificial
Intelligence, to appear.  (Short and slightly outdated version:
Ponzetto, Simone Paolo; Strube, Michael (2007). Deriving a large scale
taxonomy from Wikipedia. In: AAAI '07, pp.1440-1445.)

Ponzetto, Simone Paolo; Strube, Michael (2007). Knowledge derived from
Wikipedia for computing semantic relatedness. In: Journal of
Artificial Intelligence Research 30, pp.181-212.  (Short and outdated
version: Strube, Michael; Ponzetto, Simone Paolo (2006). WikiRelate!
Computing semantic relatedness using Wikipedia. In: AAAI '06,

Nastase, Vivi et al. (2010). WikiNet: A very large scale multi-lingual
concept network. In: LREC 2010.

Nastase, Vivi; Strube, Michael (2008). Decoding Wikipedia Categories
for Knowledge Acquisition. In: AAAI '08, pp.1219-1224.

Web page

Message distributed by the Langage Naturel list <LN at>
Information, subscription:
English version       : 
Archives                 :

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership:
