Conf: Information Extraction for Balto-Slavonic languages, ACL-Workshop
Thierry Hamon
thierry.hamon at LIPN.UNIV-PARIS13.FR
Fri Jun 8 15:35:45 UTC 2007
Date: Wed, 06 Jun 2007 08:42:52 +0200
From: Ralf Steinberger <ralf.steinberger at jrc.it>
1st Call for Participation
We cordially invite you to participate in the forthcoming
ACL Workshop
Prague, 29 June 2007
Balto-Slavonic Natural Language Processing 2007
Special Theme: Information Extraction and Enabling Technologies
http://langtech.jrc.it/BSNLP2007/
There are over 400 million speakers of Balto-Slavonic (BS) languages
world-wide. As of 2007, almost a third of the 23 official European
Union languages belong to this group. For some BS-languages, there is
a rich linguistic heritage and Language Technology is rather advanced,
but many others lag behind. This is partly due to a lack of basic
linguistic resources, which unfortunately often leads to a linguistic
brain-drain: instead of working on their own BS languages, scientists
develop methods and tools for English or other widely spoken languages
because resources for these are freely available.
The objective of this ACL workshop, organised by the European
Commission's Joint Research Centre (JRC), is to promote the work
on Balto-Slavonic languages, and especially work on Information
Extraction, by helping scientists to describe and share their
resources and to describe their efforts, hoping that the experiences
of a few will be useful for many others.
The presentation subjects at the workshop (see the program at
http://langtech.jrc.it/BSNLP2007/m/program.html for details) will
include: Information Extraction (scenario template filling, Named
Entity Recognition, definition extraction), name lemmatisation,
word-sense discrimination, topical text segmentation, WordNet-related
developments, morphological corpus annotation, term extraction, and
hybrid POS-tagging. Most of the talks will address, at some point, the
features specific to analysing BS languages.
The invited speaker, Adam Przepiórkowski from the Polish Academy of
Sciences, will give an overview of specific linguistic phenomena of
Slavonic languages. He will show how these specific features can make
Information Extraction sometimes harder and sometimes easier than in
Germanic and Romance languages.
Organizing Committee:
European Commission, Joint Research Centre, Language Technology Group
Jakub Piskorski
Bruno Pouliquen
Ralf Steinberger
Hristo Tanev
--------------------------------------------------------------------------------
Ralf Steinberger (Ralf.Steinberger at jrc.it)
European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology (http://langtech.jrc.it)
JRC-Acquis Multilingual Parallel Corpus (Version 3)
· Freely available for research purposes.
· 22 languages: Bulgarian, Czech, Danish, German, Greek, English,
Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian,
Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak,
Slovene and Swedish.
· Over 1 billion words altogether.
· Sentence alignment for 210 language pairs (currently available for
version 2.2 only).
· For more information and download, see
http://langtech.jrc.it/JRC-Acquis.html.
The JRC's Language Technology group specialises in the development of
highly multilingual text analysis tools and in cross-lingual
applications. Many applications are accessible online, e.g.:
· NewsExplorer: multilingual news aggregation and analysis (19
languages); lets users navigate the news over time and across
languages; trend analysis; collects information about people from
the news; social network detection.
· NewsBrief: breaking news detection and display of the very latest
thematic news from around the world; email alerting (22+ languages).
· MedISys Medical Information System: latest health-related news from
around the world according to themes and diseases (22+ languages).
-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version:
Archives: http://listserv.linguistlist.org/archives/ln.html
http://liste.cines.fr/info/ln
The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------