From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:30:24 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:30:24 +0200
Subject: Ecole: FSFLA 2012, Tarragona, Spain, October 29 – November 2, 2012
Message-ID:

Date: Sun, 30 Sep 2012 18:10:51 +0200
From: "GRLMC"
Message-ID: <7EC8E25B5FE543A09C44FEB1A87FA0D6 at Carlos1>
X-url: http://grammars.grlmc.com/fsfla2012/

*********************************************************************

2012 INTERNATIONAL FALL SCHOOL IN FORMAL LANGUAGES AND APPLICATIONS

FSFLA 2012
(formerly International PhD School in Formal Languages and Applications)

Tarragona, Spain
October 29 – November 2, 2012

Organized by:
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University

http://grammars.grlmc.com/fsfla2012/

*********************************************************************

AIM:

FSFLA 2012 offers a broad and intensive series of lectures at different levels on selected topics in language and automata theory and their applications. The students choose their preferred courses according to their interests and background. Instructors are top names in their respective fields. The School intends to help students initiate and foster their research career.

The previous event in this series was FSFLA 2011 (http://grammars.grlmc.com/fsfla2011/).

ADDRESSED TO:

Graduate (and advanced undergraduate) students from around the world. Most appropriate degrees include: Computer Science and Mathematics. Other students (for instance, from Linguistics, Electrical Engineering, Molecular Biology or Logic) are welcome too provided they have a good background in discrete mathematics.

The School is appropriate also for people more advanced in their career who want to keep themselves updated on developments in the field.

There is no overlap in the class schedule.
COURSES AND PROFESSORS:

- Eric Allender (Rutgers), Circuit Complexity: Recent Progress in Lower Bounds [introductory/advanced, 8 hours]
- Amihood Amir (Bar-Ilan), Periodicity and Approximate Periodicity in Pattern Matching [introductory, 6 hours]
- Ahmed Bouajjani (Paris 7), Automated Verification of Concurrent Boolean Programs [introductory/advanced, 8 hours]
- Bruno Courcelle (Bordeaux), Automata for Monadic Second-order Model Checking [intermediate, 8 hours]
- Jörg Flum (Freiburg), The Halting Problem for Turing Machines [introductory/advanced, 6 hours]
- Aart Middeldorp (Innsbruck), Termination of Rewrite Systems [introductory/intermediate, 8 hours]

REGISTRATION:

Registration must be completed online at

http://grammars.grlmc.com/fsfla2012/Registration.php

FEES:

Fees vary with the number of courses each student takes. The rule is:

1 hour =
- 10 euros (for payments until June 2, 2012),
- 12.50 euros (for payments between June 3 and August 15, 2012),
- 15 euros (for payments after August 15, 2012).

PAYMENT PROCEDURE:

The fees must be paid to the School's bank account:

Uno-e Bank
bank’s address: Julian Camarillo 4 C, 28037 Madrid, Spain
IBAN: ES3902270001820201823142
SWIFT/BIC code: UNOEESM1
account holder: Carlos Martin-Vide GRLMC
account holder’s address: Av. Catalunya 35, 43002 Tarragona, Spain

Please mention FSFLA 2012 and your name in the subject. A receipt will be provided on site.

Remarks:
- Bank transfers should not involve any expense for the School.
- People claiming early registration will be asked to prove that the bank transfer order was carried out by the deadline.
- Students may be refunded only when a course is cancelled because the instructor is unavailable.

People registering on site at the beginning of the School must pay in cash. For the sake of local organization, however, registering earlier is strongly recommended.

ACCOMMODATION:

Information about accommodation is available on the website of the School.
CERTIFICATE:

Students will receive a certificate stating the courses attended, their contents, and their duration.

IMPORTANT DATES:

Announcement of the programme: March 24, 2012
Start of registration: March 24, 2012
Very early registration deadline: June 2, 2012
Early registration deadline: August 15, 2012
Start of the School: October 29, 2012
End of the School: November 2, 2012

QUESTIONS AND FURTHER INFORMATION:

Lilica Voicu: florentinalilica.voicu at urv.cat

WEBSITE:

http://grammars.grlmc.com/fsfla2012/

POSTAL ADDRESS:

FSFLA 2012
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University
Av. Catalunya, 35
43002 Tarragona, Spain

Phone: +34-977-559543
Fax: +34-977-558386

ACKNOWLEDGEMENTS:

Diputació de Tarragona
Universitat Rovira i Virgili

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version:
Archives: http://listserv.linguistlist.org/archives/ln.html
http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:32:33 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:32:33 +0200
Subject: These: Beatrice Arnulphy, Designations nominales d'evenements - Etude et extraction automatique dans les textes
Message-ID:

Date: Sun, 30 Sep 2012 22:04:48 +0200
From: Béatrice Arnulphy
Message-ID: <5068A5E0.1020108 at limsi.fr>

Hello,

I am pleased to announce my thesis defense, entitled "Désignations nominales d'événements - Étude et extraction automatique dans les textes" (Nominal designations of events: study and automatic extraction from texts).
The defense will take place on *Tuesday, October 2, 2012 at 10:30 a.m.* in the conference room of LIMSI-CNRS (Bât 508 http://www.limsi.fr/Pratique/acces/, Université Paris Sud, Orsay; http://www.limsi.fr).

You are cordially invited to the reception that will follow the defense.

*The thesis committee* consists of:

* Thesis advisors
Anne Vilnat -- Professor - LIMSI-CNRS, Université Paris-Sud
Xavier Tannier -- MCF - LIMSI-CNRS, Université Paris-Sud
* Reviewers
Laurence Danlos -- Alpage - Université Paris 7
Patrice Bellot -- LSIS - Polytechnique, Université d'Aix-Marseille
* Examiners
Sophie Rosset -- LIMSI-CNRS, Orsay
Laura Calabrese -- MCF - Université Libre de Bruxelles
Philippe Muller -- MCF in computer science - Université Paul Sabatier, Toulouse

*Thesis abstract:*

The aim of my thesis is to study nominal designations of events for the purpose of automatic extraction. My work falls within natural language processing, that is, within a multidisciplinary approach involving linguistics and computer science.

Information extraction aims to analyze natural language documents and to extract from them the information useful to a particular application. With this general aim, many information extraction campaigns have been conducted: for each event considered, the task of the campaign is to extract certain related information (participants, dates, numbers, etc.). From the outset, these challenges have been closely tied to named entities ("notable" elements of texts, such as names of people or places). All this information forms a whole around the event, and this work is not concerned with the words used to describe the event (particularly when it is a noun). The event is seen as an all-encompassing whole, as the quantity and quality of the information that makes it up.
Unlike work in general information extraction, our main interest lies solely in the way events that occur are named, and particularly in the nominal designation used. For us, an event is what happens, what is worth talking about. The most important events are covered in news articles or appear in history textbooks. An event can be evoked by a verbal or a nominal description.

In this thesis, we reflected on the notion of event. We observed and compared the different aspects presented in the state of the art in order to build a definition of the event and a typology of events in general that suit our work and nominal designations of events. From our corpus studies we also identified different types of formation of these event nouns, and we show that each can be ambiguous in various ways.

For all these studies, building an annotated corpus is an indispensable step, so we took the opportunity to draw up an annotation guide dedicated to nominal designations of events. We studied the extent and quality of existing lexicons for use in our automatic extraction task. We also used extraction rules to examine the context in which nouns appear in order to determine their eventness. Following these studies, we extracted a lexicon weighted by eventness (whose distinctive feature is that it is dedicated to the extraction of nominal events), which reflects the fact that some nouns are more likely than others to denote events. Used as a cue for extracting event nouns, this weighting makes it possible to extract nouns that are not present in existing standard lexicons.
Finally, using machine learning, we worked on contextual learning features, partly based on syntax, to extract event nouns.

Béatrice Arnulphy

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:33:58 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:33:58 +0200
Subject: Conf: Colloque Traitement de corpus linguistiques, 4-5 octobre 2012, Sorbonne, Paris
Message-ID:

Date: Mon, 1 Oct 2012 15:25:54 +0200
From: marine damiani
Message-ID:

Dear colleagues,

We remind you that the conference of doctoral students and young researchers of the MoDyCo laboratory, devoted this year to tools and methods for processing linguistic corpora, will take place on *October 4 and 5, 2012* in the *Amphithéâtre Durkheim* at the *Sorbonne*.

Attendance at the conference is open to all, and we hope that many of you will be interested in the programme, which you will find attached or on the conference page:
https://sites.google.com/site/coldoc2012/programme

For any further information, you can contact us by email: coldoc2012 at gmail.com

Kind regards,
The organizing committee.
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:39:02 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:39:02 +0200
Subject: Appel: Coria 2013 et RJCRI 2013
Message-ID:

Date: Mon, 1 Oct 2012 17:36:42 +0200
From: Catherine Berrut
Message-Id: <08FF8C1D-99DB-4C14-AD25-8596FA7D60AF at imag.fr>
X-url: http://coria.unine.ch
X-url: http://coria.unine.ch/rjcri.htm

CORIA 2013, Neuchâtel (Switzerland), April 3-5, 2013

CORIA 2013 (http://coria.unine.ch) is the tenth edition of the COnférence en Recherche d'Information et Applications. Organized with the support of ARIA (Association francophone de Recherche d'Information et Applications, http://www.asso-aria.org), it is the main French-language event in this field.

CORIA aims to bring together teams and researchers carrying out scientific work in the field of information retrieval: web information retrieval, information extraction from multimedia documents, opinion or social network analysis, monolingual or multilingual settings, retrieval of digital documents and images, machine learning and automatic classification, human-computer interfaces for information access, etc. CORIA aims to be widely open to the entire scientific community concerned with information retrieval.
After being held in Toulouse, Grenoble, Lyon, Saint-Étienne, Lannion, Toulon, Sousse (Tunisia, in partnership with CIFED), Avignon and Bordeaux (in partnership with CIFED), CORIA will take place this year from April 3 to April 5, 2013 in Neuchâtel (Switzerland).

Scientific activity in information retrieval has been changing rapidly since the web became widespread and, more recently, with the development of mobile computing. The boundaries of the field are themselves shifting, which fosters synergies with work in machine learning, natural language processing, image processing, speech processing, written communication and documents, information systems and databases, knowledge representation and management... The application areas are vast and can cover the web as a whole or be restricted, for example, to digital libraries or social networks.

CORIA 2013 is aimed at academics and researchers, whether established or not, at industry practitioners and specialists in the field, and at Master's students heading towards research careers. Submissions may be written in English or French. Contributions may concern academic work or industrial applications.

The programme includes two invited talks, one by Jamie Callan (CMU) and the other by Donna Harman (NIST). Both speakers will also give a course for doctoral students at a CUSO seminar on Tuesday, April 2, 2013 (the day before the conference).

The 8th Rencontres Jeunes Chercheurs en Recherche d'Information (RJCRI) will also be held during CORIA 2013. Their aim is to allow all doctoral students to present their research problem, to establish contacts with teams working on similar or related areas, and to give the whole community an overview of current lines of research.
Work selected for RJCRI will be presented both orally and as a poster. This year, joint submissions to RJCRI and CORIA are allowed (see the RJCRI guidelines at http://coria.unine.ch/rjcri.htm).

Topics (non-exhaustive list)

- Theory and formal models for IR: logical models, language models
- Multilingualism: multilingual information retrieval, machine translation
- Multimedia (images, audio, video, sound, music): indexing, browsing, access, interactions with text
- Scalability: indexing, performance, architectures
- Automatic classification, clustering, ranking, machine learning
- Filtering, routing, novelty detection
- Context modelling, personalization
- Natural language processing for information retrieval
- Question answering systems
- Information extraction: ontologies, resources and information retrieval, named entity detection
- Web: large graphs, use of web topology, power laws, citations, link analysis
- IR and structured documents: IR and XML, focused IR and passage retrieval
- Social networks: analysis of blogs and community sites, conversation tracking, rumour analysis, sentiment analysis, opinion detection
- Collaborative search: filtering, recommender systems
- User interaction: flexible querying, interfaces, visualization, user modelling, accessibility, collaborative indexing
- Knowledge processing and representation: fuzzy logic, metadata, ontologies, semantic web, knowledge engineering
- Digital libraries: IR on digitized books, robustness, OCR and indexability
- Dedicated information retrieval systems: genomic and geographic information retrieval
- Distributed IR: mobile, situated and P2P information retrieval
- Tools for information retrieval: evaluation, test beds, metrics, qualitative experiments on systems

Important dates

Paper submission takes place in two steps: first the submission of an abstract, then the submission of the paper. The submission schedule is the same for CORIA and RJCRI:

- Abstract submission deadline: December 1, 2012
- Paper submission deadline: December 7, 2012
- Notification to authors: February 1, 2013
- Final version submission deadline: February 22, 2013

Paper submission site for CORIA: https://www.easychair.org/conferences/?conf=coria2013
Paper submission site for RJCRI: https://www.easychair.org/conferences/?conf=rjcri2013

Paper format

- Submissions may be written in English or French.
- Contributions may concern academic work or industrial applications.
- Papers must be at most 16 pages long in the Hermes journal format. They must be preceded by a cover page giving the title, the authors' names and full contact details, a list of keywords in French and English, and an abstract of about twenty lines at most. Where applicable, the statement « article soumis à CORIA et RJCRI » (paper submitted to CORIA and RJCRI) must appear on the cover page.
- Papers may be written in Word or LaTeX.
- The Word and LaTeX paper templates can be downloaded from the Hermes website.
- Submitted papers must be in PDF format only.
RJCRI guidelines: see http://coria.unine.ch/rjcri.htm

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:41:47 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:41:47 +0200
Subject: Appel: La composition neoclassique, revue VERBUM
Message-ID:

Date: Mon, 01 Oct 2012 17:57:11 +0200
From: Stéphanie Lignon
Message-ID: <5069BD57.3020108 at univ-nancy2.fr>
X-url: http://www.atilf.fr/spip.php?rubrique214&idfirst=922

/La composition néoclassique/ (Neoclassical compounding), an issue of /Verbum/ coordinated by Stéphanie Lignon and Fiammetta Namer

Among the word-formation processes available in the language, neoclassical compounding brings particular patterns into play. Compounding is a word-formation process that combines two base lexemes to build a new lexeme (/timbre-poste/, /porte-bagage/). Two types of compounding are distinguished: standard (or popular, or ordinary) compounding, which involves lexemes from contemporary vocabulary (/porte-bagage/), and neoclassical compounding, which involves lexemes borrowed from the heritage stock of the language, such as /coléoptère/ or /anthropophage/.

Neoclassical compounding (also called "learned" or "erudite" compounding, or "infixation", "confixation", etc.) was initially reserved for forming terms in the specialized vocabularies of medicine, chemistry, zoology, botany, etc.
Today, however, it serves as a model for formations that belong no longer to specialized vocabularies but to the "general" language; see for example /contraintophobe/, /ferrovipathe, capillo-tracté/, /chronophage, théâtrolâtre, publivore, bobophile/. It is nevertheless harder to identify than affixed words:

- the only segment shared by all neoclassical compounds is the linking vowel -o- (/politic*o*médiatique/, /prim*o*accédant/) or -i- (/libert*i*cide/) between the components;
- the segment that follows the linking vowel is either a French word or a Greek or Latin constituent found in other French words;
- the latter, unlike words, does not appear in French lexicons.

Its success in the general language is what prompts us to propose a thematic issue dedicated to this process. Particular attention will be paid to submissions that make a link with corpora and formalization (models), in a monolingual and multilingual context, in specialized domains or in the general language.

The call is addressed to specialists in several fields, whatever theoretical framework they adopt:

- linguistics: lexicon, terminology, morphology;
- NLP;
- psycholinguistics (perception, language learning, language disorders, etc.).

*Schedule*:

- *January 15, 2013*: Authors wishing to propose an article on this theme are asked to send a two-page statement of intent (not including bibliography) describing their project by January 15, 2013. This abstract must not be programmatic. It must clearly state the issue addressed and report the main results that will be presented in the article.
- *February 15, 2013*: Selection of contributions by the review committee and notification of authors.
- *June 15, 2013*: Receipt of full articles, which must be between 15 and 20 pages long.
The style sheet will be sent to authors when they are notified of acceptance.

Review committee: Dany AMIOT (STL, Université Lille 3), Frédérique BRIN-HENRY (ATILF, Université de Lorraine), Georgette DAL (STL, Université Lille 3), Natalia GRABAR (STL, Université Lille 3), Nabil HATHOUT (CLLE, Université Toulouse-Le Mirail), Stéphanie LIGNON (ATILF, Université de Lorraine), Fiammetta NAMER (ATILF, Université de Lorraine), Séverine CASALIS (URECA, Université Lille 3), Thierry HAMON (LIM&BIO, Université Paris 13), Thi Mai TRAN (STL, Université Lille 2).

http://www.atilf.fr/spip.php?rubrique214&idfirst=922

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:51:03 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:51:03 +0200
Subject: Seminaire: BLRI, Jonathan Harrington, Aix-en-Provence, 29 octobre 2012
Message-ID:

Date: Tue, 2 Oct 2012 15:47:49 +0200
From: Nadéra Bureau
Message-ID: <006a01cda0a4$838fc310$8aaf4930$@bureau at lpl-aix.fr>

Monday, October 29, 2012, 10:00 a.m.
Conference room B011, building B
5 avenue Pasteur, Aix-en-Provence
(Labex BLRI)

Jonathan Harrington (Institute of Phonetics and Speech Processing, Ludwig-Maximilians University of Munich, Germany)

Sound change and its relationship to variation in production and categorization in perception.
Abstract

In some models (Lindblom et al., 1995; Bybee, 2002), sound change is associated with the type of synchronic reduction that occurs in prosodically weak and semantically predictable contexts. In other models (Ohala, 1993), sound change can be brought about through listeners’ misperception of coarticulation in speech production. The talk will draw upon both models in order to explore whether coarticulatory misperception is more likely in prosodically weak contexts.

In order to do so, the magnitude of trans-consonantal vowel coarticulation was investigated in /pV1pV2l/ non-words with the pitch-accent falling either on the first or second syllable and in which V1 = /ʊ, ʏ/ and V2 = /e, o/. The analysis of these words produced by 20 L1-German speakers showed that prosodic weakening caused vowel undershoot in /ʊ/ but had little effect on V2-on-V1 coarticulation. In a perception experiment, a V1 = /ʊ-ʏ/ continuum was synthesised and the same speakers made forced choice judgements to the same non-words with the prosody manipulated such that stress was perceived on V1 or on V2. Listeners compensated for V2-on-V1 coarticulation; however, the magnitude of compensation was less in the prosodically weak than in the strong context.

The general conclusion is that segmental context influences both the dynamics of speech production and perceptual categorization, but not always in the same way: it is this divergence between the two which may be especially likely in prosodically weak contexts and which may, in turn, facilitate sound change.

References

Bybee, J. (2002). Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change, 14, 261–290.
Lindblom, B., Guion, S., Hura, S., Moon, S. J., and Willerman, R. (1995). Is sound change adaptive? Rivista di Linguistica, 7, 5–36.
Ohala, J. J. (1993). Sound change as nature’s speech perception experiment. Speech Communication, 13, 155–161.
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 20:30:32 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 22:30:32 +0200
Subject: Appel: Tralogy II - Human and Machine Translation - 2013 - Submission deadline extended to October 15, 2012
Message-ID:

Date: Tue, 02 Oct 2012 22:20:09 +0200
From: Joseph Mariani
Message-ID: <506B4C79.6040903 at limsi.fr>
X-url: http://www.tralogy.eu

************ Apologies for Multiple Posting ************

Tralogy II: Human and Machine Translation. The quest for meaning: where are our weak points and what do we need?

Dates and venue of the Conference: January 17-18, 2013 - CNRS Headquarters Auditorium, Paris (France)

****** Submission Deadline extended to October 15, 2012 ******

http://www.tralogy.eu

The conclusions of the first Tralogy Conference (3-4 March 2011 at the CNRS in Paris) were clear: none of the specialist branches of the language industry can individually hope to offer all the intellectual and professional tools needed to function effectively in the sector. They all need each other: translation has always been interdisciplinary and the translation profession even more so.
Accordingly, on the occasion of the second Tralogy Conference, we would like to ask each of our prospective participants not only to present specific contributions from their specialist fields and research into the question of meaning, but also, and in particular, to highlight the limits they face in their specialist fields and research within the wider context of the potential applications of their work. What we would like to find out by the end of Tralogy II is what each of us does not know how to do. We are therefore hoping that, as we map out our respective weak points, these will coincide with the points of contact made at the Conference and with the areas in which there is room for improvement. We will therefore give priority to concise presentations (the published articles will of course be longer) in order to leave time for discussions. And the key question that emerged from Tralogy I will remain at the heart of this analysis: how to measure the quality of a translation with regard to its use. Canada was the country invited to participate in Tralogy I. This time we would like to honour languages that are very much alive but with lower numbers of users. We have therefore decided to organise this conference under the joint patronage of the Baltic States, Member States of the European Union: Estonia, Latvia and Lithuania. Call for papers: http://www.tralogy.eu/spip.php?article55&lang=en To submit a paper: http://www.tralogy.eu/spip.php?article10&lang=en --------- Tralogy revient : http://www.tralogy.eu Tralogy II : Trouver le sens : où sont nos manques et nos besoins respectifs ? 
Dates et lieu de la Conférence : 17 et 18 janvier 2013, Salle de conférence du siège du CNRS, Paris (France) La première édition du colloque Tralogy (les 3 et 4 mars 2011 dans le Grand amphithéâtre du CNRS, à Paris) s’était conclue sur une évidence : aucune des spécialités impliquées dans les professions langagières ne peut à elle seule donner les clefs intellectuelles et professionnelles qui permettraient d’y opérer efficacement. Chacune a besoin des autres : la traduction est interdisciplinaire depuis toujours, et les métiers de la traduction le sont bien davantage encore. C’est la raison pour laquelle nous souhaitons cette fois demander à chacun de nos intervenants potentiels, non seulement de présenter les apports spécifiques de sa spécialité et de sa recherche à la problématique du sens, mais aussi et surtout de mettre en lumière les limites auxquelles se heurtent cette spécialité et cette recherche dans le cadre plus général des applications envisagées. Ce que nous ambitionnons de savoir, à l’issue de Tralogy II, c’est ce que, les uns et les autres, nous ne savons pas faire. Nous faisons ainsi le pari que nos points de contact et nos marges de progression se superposent avec la cartographie de nos points faibles respectifs. Nous comptons, pour cela, privilégier les présentations concises (les publications seront bien sûr plus étendues) afin de laisser du temps au débat. Et nous conservons au coeur de cette analyse la question qui, lors de Tralogy I, est apparue essentielle : celle de la mesure de la qualité d’une traduction au regard de son usage. Le Canada était le pays invité pour Tralogy I. Nous souhaitons cette fois mettre à l’honneur les langues très vivantes mais à faible nombre d’utilisateurs. C’est la raison pour laquelle, nous avons décidé d’organiser ce colloque sous le patronage commun des pays baltes, membres de l’Union européenne : Estonie, Lettonie et Lituanie. 
Appel à contributions : http://www.tralogy.eu/spip.php?article56&lang=fr
Pour proposer une contribution : http://www.tralogy.eu/spip.php?article10&lang=fr

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:01:22 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 5 Oct 2012 21:01:22 +0200
Subject: Job: Post-doc at LIMSI-CNRS, Orsay, France
Message-ID:

Date: Wed, 03 Oct 2012 09:20:40 +0200
From: Xavier Tannier
Message-ID: <506BE748.9060207 at limsi.fr>
X-url: http://perso.limsi.fr/Individu/xtannier/fr/Stages/post_doc_2012_chronolines.html
X-url: http://www.chronolines.fr

Post-doctoral position: Event-based multi-document summarization for building timelines

http://perso.limsi.fr/Individu/xtannier/fr/Stages/post_doc_2012_chronolines.html

Keywords
/information extraction, natural language processing, temporal analysis, events, timelines/

Location
LIMSI-CNRS, Orsay (Paris), France.

Duration
1 year

Context
Among other objectives, the nationally funded project Chronolines (http://www.chronolines.fr) aims to create timelines semi-automatically from a query, based on a collection of newswire articles. Given a user-defined topic and a set of texts, the task consists in *extracting the most important events* concerning the topic and presenting them to the user for validation. The ideal output would then be a set of brief descriptions of events, together with the dates of these events.
Work on this project has already resulted in a few publications, including a paper at ACL 2012 on /salient dates extraction/, which the candidate can consult for more details [1] http://aclweb.org/anthology-new/P/P12/P12-1077.pdf. The candidate would join this project, working with the project team on some of the following issues:
* *Aggregation/Summarization*: how to choose/generate a brief description of each event from a set of relevant sentences.
* *Evaluation*: which metrics and which methodology to use for objective evaluation.
* *Granularity*: as the time unit for our salient date algorithm is the day, how to decide that several topic-related important events occurred on the same day or, conversely, that an important event lasted more than one day.
* *Relationship*: how to use the large collection of articles to extract relationships between events.
Required skills The candidate should hold a PhD in Natural Language Processing and/or Information Retrieval, and be able to:
* Work with texts (interest in linguistic issues and how to deal with them)
* Work with /a lot/ of texts (good programming skills, big corpora management, information aggregation, ability to forget about linguistic issues when we need to)
* Learn from (imperfect) references (ability to observe and generalize, machine learning skills)
* Work with tools used and built by the team (Linux, Java, Perl...)
Contacts: Xavier.Tannier[at]limsi.fr Veronique.Moriceau[at]limsi.fr
Reference: [1] Rémy Kessler, Xavier Tannier, Caroline Hagège, Véronique Moriceau, André Bittar. *Finding Salient Dates for Building Thematic Timelines.* http://aclweb.org/anthology-new/P/P12/P12-1077.pdf In /Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012)/. Jeju Island, Republic of Korea, July 2012. © Association for Computational Linguistics.
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:08:04 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:08:04 +0200 Subject: Seminaire: Alban Lemasson, 12 octobre 2012, BLRI, Marseille Message-ID: Date: Wed, 3 Oct 2012 16:33:56 +0200 From: Nadéra Bureau Message-ID: <006101cda174$1e8df330$5ba9d990$@bureau at lpl-aix.fr>
Brain & Language Research Institute Friday, October 12, 2012, 11:00, Amphi Fabry, Bât 5, 3 place Victor Hugo, Marseille (Labex BLRI) Alban LEMASSON (Université de Rennes 1, Institut universitaire de France) Rudiments of language in non-human primates? Abstract The vocal communication of non-human primates was long considered to be determined solely by genetics and emotion, which encouraged theorists of the origin of human language to look for its precursors elsewhere, notably in the gestures of great apes. Yet studies conducted over the past ten years, particularly on the calls of forest guenons, demonstrate parallels with several fundamental characteristics of language (e.g., semantics, affixation, syntax, prosody, conversation, vocal accommodation and convergence). The differences between human language and the vocal communication of monkeys, which are comparable social acts, would therefore be more quantitative than qualitative.
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:10:58 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:10:58 +0200 Subject: Seminaire: Evelina Fedorenko et Ted Gibson, 19 octobre 2012, BLRI, Marseille Message-ID: Date: Wed, 3 Oct 2012 17:01:09 +0200 From: Nadéra Bureau Message-ID: <007f01cda177$ec1e47c0$c45ad740$@bureau at lpl-aix.fr>
Brain & Language Research Institute Friday, October 19, 2012, 11:00, Salle des Voûtes, Fédération de Recherche 3 C (Comportement, Cerveau, Cognition), 3 place Victor Hugo, Marseille (Labex BLRI) Evelina FEDORENKO (MIT) Abstract What cognitive and neural mechanisms do we use to understand language? Since Broca's and Wernicke's seminal discoveries in the 19th century, a broad array of brain regions have been implicated in linguistic processing, spanning frontal, temporal and parietal lobes, both hemispheres, and subcortical and cerebellar structures. However, characterizing the precise contribution of these different structures to linguistic processing has proven challenging. In this talk I will argue that high-level linguistic processing - including understanding individual word meanings and combining them into more complex structures/meanings - is accomplished by the joint engagement of two functionally and computationally distinct brain systems.
The first comprises the classic “language regions” on the lateral surfaces of left frontal and temporal lobes that appear to be functionally specialized for linguistic processing (e.g., Fedorenko et al., 2011; Monti et al., 2009, 2012). The second is the fronto-parietal "multiple demand" network, a set of regions that are engaged across a wide range of cognitive demands (e.g., Duncan, 2001, 2010). Most past neuroimaging work on language processing has not explicitly distinguished between these two systems, especially in the frontal lobes, where subsets of each system reside side by side within the region referred to as “Broca’s area” (Fedorenko et al., in press). Using methods which surpass traditional neuroimaging methods in sensitivity and functional resolution (Fedorenko et al., 2010; Nieto-Castañon & Fedorenko, in press; Saxe et al., 2006), we are beginning to characterize the important roles played by both domain-specific and domain-general brain regions in linguistic processing.
------------------------------------------------------------------------
Friday, October 19, 2012, 16:00, Salle des Voûtes, Fédération de Recherche 3 C (Comportement, Cerveau, Cognition), 3 place Victor Hugo, Marseille (Labex BLRI) Ted GIBSON (MIT) The communicative basis of word order Abstract Some recent evidence suggests that subject-object-verb (SOV) may be the default word order for human language. For example, SOV is the preferred word order in a task where participants gesture event meanings (Goldin-Meadow et al. 2008). Critically, SOV gesture production occurs not only for speakers of SOV languages, but also for speakers of SVO languages, such as English, Chinese, Spanish (Goldin-Meadow et al. 2008) and Italian (Langus & Nespor, 2010). The gesture-production task therefore plausibly reflects default word order independent of native language. However, this leaves open the question of why there are so many SVO languages (41.2% of languages; Dryer, 2005).
We propose that the high percentage of SVO languages cross-linguistically is due to communication pressures over a noisy channel (Jelinek, 1975; Brill & Moore, 2000; Levy et al. 2009). In particular, we propose that people understand that the subject will tend to be produced before the object (a near universal cross-linguistically; Greenberg, 1963). Given this bias, people will produce SOV word order – the word order that Goldin-Meadow et al. show is the default – when there are cues in the input that tell the comprehender who the subject and the object are. But when the roles of the event participants are not disambiguated by the verb, then the noisy channel model predicts either (i) a shift to the SVO word order, in order to minimize the confusion between SOV and OSV, which are minimally different; or (ii) the invention of case marking, which can also disambiguate the roles of the event participants. We test the predictions of this hypothesis and provide support for it using gesture experiments in English, Japanese and Korean. We also provide evidence for the noisy channel model in language understanding in English. 
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:13:41 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:13:41 +0200 Subject: Ecole: Stage intensif NooJ, 21 – 25 janvier 2013, INALCO Message-ID: Date: Thu, 4 Oct 2012 11:13:10 +0200 From: Max Silberztein Message-Id: <8B86BED8-CD68-4148-A6DC-7B7587EFEF6F at gmail.com>
NooJ intensive workshop at INALCO, January 21 – 25, 2013, 65 rue des Grands Moulins, 75013 Paris. NooJ is a development environment used to formalize eight levels of linguistic phenomena: spelling and typography, inflectional and derivational morphology, local and structural syntax, transformational grammar and semantic analysis. NooJ relies on formalisms suited to each type of phenomenon (regular, context-free, context-sensitive and unrestricted grammars) and on a sophisticated annotation structure that lets the parsers for the various linguistic levels communicate with one another, and it offers many tools to support the development of wide-coverage resources from a descriptive-linguistics perspective. Today, linguistic resource modules are available for about twenty languages.
NooJ is used by linguists to describe languages, by social science researchers to carry out corpus analyses from a historical, literary, sociological or psychological perspective, and also by companies to extract and annotate information in technical texts. NooJ is free, runs on Windows, Mac OS X, Linux and Unix, and will soon be available as open source (see www.nooj4nlp.net) thanks to the support of the European project Metanet-Cesar. The workshop is aimed particularly at Master's students, PhD students and researchers interested in descriptive linguistics, corpus linguistics and automatic text analysis from a humanities perspective. The workshop lasts one week: mornings are devoted to lectures and tutorials; in the afternoons, researchers will present various applications of NooJ. Registration is free but mandatory. Please note: places are limited to 50 participants. Master's students who are able and wish to have the workshop validated by their department must hand in an assignment at the end of the workshop. Each participant must bring a laptop with NooJ already installed.
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:16:57 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:16:57 +0200 Subject: Appel: ACL 2013 Message-ID: Date: Tue, 2 Oct 2012 22:20:25 +0100 From: Anna Korhonen Message-ID: X-url: http://acl2013.org/
ACL 2013 CALL FOR PAPERS The 51st Annual Meeting of the Association for Computational Linguistics Sofia, Bulgaria, August 4-9 http://acl2013.org/ The Association for Computational Linguistics is pleased to announce that its 2013 Annual Meeting will take place in Sofia, Bulgaria, on August 4th to 9th. The conference invites the submission of long and short papers on substantial, original, and unpublished research in all aspects of automated language processing, as discussed below. As was done last year, ACL 2013 will accept papers accompanied by the resource (software or data) described in the paper. In addition to the regular review of the research quality of the paper, these papers will also be reviewed for the quality of the resource that is being made available. Papers that are submitted with accompanying software/data will receive additional credit toward the overall evaluation score, and the acceptance or rejection decision will be made based on the quality of both the research and the software/data component.
In addition, this year there will be an important novelty: some of the presentations at the conference will be of papers accepted for the new Transactions of the ACL journal (http://www.transacl.org/). Topics Relevant topics for the conference include, but are not limited to, the following areas (in alphabetical order): Cognitive modelling of language processing and psycholinguistics Dialogue and interactive systems Discourse, coreference and pragmatics Evaluation methods Information retrieval Language resources Lexical semantics and ontologies Low resource language processing Machine translation: methods, applications and evaluation Multilinguality in NLP NLP applications NLP and creativity NLP for the languages of Central and Eastern Europe and the Balkans NLP for the Web and social media Question answering Semantics Sentiment analysis, opinion mining and text classification Spoken language processing Statistical and Machine Learning methods in NLP Summarization and generation Syntax and parsing Tagging and chunking Text mining and information extraction Word segmentation Submissions Long papers: ACL 2013 submissions must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Submissions will be judged on appropriateness, clarity, originality/innovativeness, correctness/ soundness, meaningful comparison, thoroughness, significance, contributions to research resources, and replicability. Each submission will be reviewed by at least three program committee members. Long papers may consist of up to eight (8) pages of content, plus two extra pages for references; final versions should take into account reviewers' comments. Papers will be presented orally or as posters as determined by the program committee. Decisions on presentation format will be based on the nature rather than the quality of the work. 
There will be no distinction in the proceedings between long papers presented orally and as posters. The long paper deadline is: Wednesday February 20th, 2013 Short papers: ACL 2013 also solicits short papers. Short paper submissions must describe original and unpublished work. Characteristics of short papers include:
- A small, focused contribution
- Work in progress
- A negative result
- An opinion piece
- An interesting application nugget
Short papers will be presented in one or more oral or poster sessions, and will be given four (4) pages including references in the proceedings. While short papers will be distinguished from long papers in the proceedings, there will be no distinction in the proceedings between short papers presented orally and as posters. Each short paper submission will be reviewed by at least two program committee members. The deadline for short papers is Sunday April 14th, 2013 Electronic Submission: Submission is electronic, using the Softconf submission software (URL to be announced in subsequent versions of this call) Format: Long paper submissions should follow the two-column format of ACL 2013 proceedings without exceeding eight (8) pages of content plus two extra pages for references. Short paper submissions should also follow the two-column format of ACL 2013 proceedings, and should not exceed four (4) pages including references. We strongly recommend the use of ACL LaTeX style files or Microsoft Word style files tailored for this year's conference. Submissions must conform to the official style guidelines, which are contained in the style files, and they must be in PDF. As the reviewing will be blind, papers must not include authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ..." must be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ..."
Papers that do not conform to these requirements will be rejected without review. In addition, please do not post your submissions on the web until after the review process is complete. Multiple-submission policy: Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of papers accepted for presentation at ACL 2013 must notify the program chairs by April 21st as to whether the paper will be presented. All accepted papers must be presented at the conference to appear in the proceedings. We will not accept for publication or presentation papers that overlap significantly in content or results with papers that will be (or have been) published elsewhere. Authors submitting more than one paper to ACL must ensure that submissions do not overlap significantly (> 50%) with each other in content or results. Important Dates Long paper submission deadline: Wednesday, February 20th Long paper author responses: Friday March 29th Long paper acceptance notification: Sunday April 7th Short paper submission deadline: Sunday, April 14th Long paper camera ready: Monday May 6th Short paper acceptance notification: Sunday May 12th Short paper camera ready: Wednesday May 22nd Conference: August 4th-9th Program Co-Chairs Pascale Fung, The Hong Kong University of Science and Technology Massimo Poesio, University of Essex From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:19:45 2012 From: thierry.hamon at
UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:19:45 +0200 Subject: Appel: Workshop proposals, NAACL-HLT 2013, ACL 2013, EMNLP 2013 Message-ID: Date: Thu, 4 Oct 2012 20:45:43 +0100 From: Anna Korhonen Message-ID: X-url: http://naacl.org/ X-url: http://www.acl2013.org/ CALL FOR WORKSHOP PROPOSALS NAACL-HLT 2013 & ACL 2013 & EMNLP 2013 The North American Chapter of the Association for Computational Linguistics (NAACL), The Association for Computational Linguistics (ACL), and ACL SIGDAT invite proposals for workshops to be held in conjunction with the NAACL-HLT, ACL, or EMNLP conferences in 2013. We solicit proposals on any topic of interest to the ACL communities. Workshops will be held at one of the following conference venues: NAACL-HLT 2013 is the 14th Annual Meeting of the North American Chapter of the Association for Computational Linguistics. It will be held in Atlanta, GA, USA, June 9 - 14, 2013. The dates for the NAACL-HLT workshops will be June 13 - 14. The webpage for NAACL-HLT 2013 is: http://naacl.org/. ACL 2013 is the 51st Annual Meeting of the Association for Computational Linguistics (ACL). It will be held in Sofia, Bulgaria, August 4 - 9, 2013. The ACL workshops will be held August 8 - 9. The webpage for ACL 2013 is: http://www.acl2013.org/. EMNLP 2013 is SIGDAT's annual Conference on Empirical Methods in Natural Language Processing. It will be held in Seattle, WA, USA, in October 2013. The exact dates and venue are to be determined. One day of workshops are planned after a 3-day main conference, although proposals for a longer associated event will be considered. Proposals will be reviewed jointly by the workshop organizers for the conferences. ------------------------------------------------------------------------ SUBMISSION INFORMATION Proposals for workshops should contain: 1. A title and brief (2-page max) description of the workshop topic and content. 2. 
The desired workshop length (one or two days) and an estimate of the number of attendees. 3. The names, postal addresses, phone numbers, and email addresses of the organizers, with one-paragraph statements of their research interests and areas of expertise. 4. A list of potential members of the program committee, with an indication of which members have already agreed. 5. A description of any shared tasks associated with the workshop. 6. A description of special requirements for technical needs. 7. A note specifying which venue(s) (NAACL-HLT/ACL/EMNLP) would be acceptable and/or preferable. There will be a single workshop committee, coordinated by the workshop chairs. This single committee will review the quality of the workshop proposals. Once the reviews are complete, the workshop chairs will work together to assign workshops to all three of the conferences, taking into account the location preferences given by the proposers. The ACL has a set of policies on workshops. You can find the ACL's general policies on workshops at http://www.cis.udel.edu/~carberry/ACL/Workshops/workshop-support-general-policy.html, the financial policy for workshops at http://www.cis.udel.edu/~carberry/ACL/Workshops/workshop-conf-financial-policy.html, and the financial policy for SIG workshops at http://www.cis.udel.edu/~carberry/ACL/Workshops/workshops-Sig-financial-policy.html. Please submit proposals in plain text in the body of an email to the workshop organizers (naacl-acl-workshops-2013 at googlegroups.com) no later than November 30, 2012, 23:59:59 UTC/GMT. Notification of acceptance of workshop proposals will occur no later than December 14, 2012. Organizers of accepted proposals will be responsible for publicizing and running the workshop, including reviewing submissions, producing the camera ready workshop proceedings, and organizing the meeting days. It is crucial that organizers commit to all deadlines. 
In particular, failure to produce the camera ready proceedings on time will lead to the exclusion of the workshop from the CD-ROM/USB & unified author indexes. Workshop organizers cannot accept for publication papers that will be (or have been) published elsewhere, although they are free to set their own policies on simultaneous submission and review. Since the conferences will occur at different times, the timescales for the submission and reviewing of workshop papers, and the preparation of camera-ready copies, will be different for each conference. Suggested timescales for each of the conferences are given below. Workshop organizers should not deviate from this schedule unless absolutely necessary.
------------------------------------------------------------------------
TIMELINES FOR 2013 WORKSHOPS

SHARED DATES
Nov 30, 2012     Workshop proposal deadline
Dec 14, 2012     Notification of acceptance

NAACL-HLT 2013
Dec 21, 2012     Proposed 1st workshop CFP
Mar 01, 2013     Proposed paper due date
Mar 29, 2013     Proposed notification of acceptance
Apr 12, 2013     Camera-ready deadline
Jun 13-14, 2013  Workshops

ACL 2013
Jan 24, 2013     Proposed 1st workshop CFP
Apr 26, 2013     Proposed paper due date
May 24, 2013     Proposed notification of acceptance
Jun 7, 2013      Camera-ready deadline
Aug 8-9, 2013    Workshops

EMNLP 2013
Mar 1, 2013      Proposed 1st workshop CFP
Jul 1, 2013      Proposed paper due date
Aug 1, 2013      Proposed notification of acceptance
Sep 1, 2013      Camera-ready deadline
Oct TBD, 2013    Workshops
------------------------------------------------------------------------
WORKSHOP CO-CHAIRS
Sujith Ravi, NAACL; Google Inc.
Luke Zettlemoyer, NAACL; University of Washington Aoife Cahill, ACL; Educational Testing Service Qun Liu, ACL; Dublin City University & Chinese Academy of Sciences For inquiries, send email to the workshop organizers at naacl-acl-workshops-2013 at googlegroups.com From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:20:37 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:20:37 +0200 Subject: Job: Information Extraction, IRISA/INRIA, Rennes, France Message-ID: Date: Fri, 05 Oct 2012 10:45:57 +0200 From: Vincent Claveau Message-ID: <506E9E45.1040704 at irisa.fr> POSTDOCTORAL/ENGINEERING POSITION at IRISA / INRIA Rennes, France Topic: Text-mining and information extraction in multimedia documents Information extraction and text-mining are well known domains of Natural Language Processing. Yet, dealing with low-quality texts, like automatic speech transcription or OCRized overlays, raises new challenges in terms of portability and robustness. In that context, the proposed project aims at developing new text-mining and information extraction approaches to overcome these difficulties. The goal is to rely on simple but robust description of the text and new machine learning techniques and paradigms (CRF, boosting, unsupervised and semi-supervised approaches...). The typical tasks concerned are term and named entity recognition and discovery, (ontological or semantic) relation recognition and discovery...
The candidate is expected to implement these new approaches and to participate in evaluations and challenges in this field, on both well-formed texts and degraded texts (such as speech transcripts), and may also help develop new evaluation datasets. This work takes place in the context of the Quaero project, funded by the French National Innovation Agency (www.quaero.org). The work will be performed at IRISA/INRIA Rennes, France (http://www.irisa.fr , http://www.inria.fr/centre/rennes ). The candidate will join the TexMex team, whose main research topics include large-scale multimedia indexing, speech processing and information retrieval. QUALIFICATIONS AND POSITION The successful candidate will have an engineering degree or PhD with a track record in Information Extraction, Text-Mining or Machine Learning for Natural Language Processing research. Fluency in English is mandatory. This position is for 12 months and may begin as early as Nov 1st, 2012, and no later than mid-December. Salary follows INRIA scales and depends on the candidate's experience (the minimum monthly net salary is about 2000 €).
To apply, please send a cover letter, describing how the applicant's knowledge and research background will contribute to the project, a CV, and the names and contact information of two referees to: Vincent Claveau (vincent.claveau at irisa.fr) From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:23:43 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:23:43 +0200 Subject: Appel: JeTou 2013, colloque international jeunes chercheurs en Sciences du Langage Message-ID: Date: Fri, 5 Oct 2012 11:46:09 +0200 From: JéTou 2013 Message-ID: X-url: http://jetou2013.free.fr/ Bonjour, Nous vous contactons à propos de la quatrième édition des Journées d'études Toulousaines (JéTou) qui se tiendront les 16 et 17 mai 2013 à l’Université Toulouse II – Le Mirail (Toulouse, France). Ce colloque international s’adresse aux jeunes chercheurs en Sciences du Langage et aura pour thématique : « Variation et variabilité dans les Sciences du Langage : analyser, mesurer, contextualiser ». Vous trouverez l’appel à communications ainsi que l'affiche de la manifestation (versions francophone et anglophone) en pièces jointes. La date limite de soumission, initialement fixée au vendredi 5 octobre 2012, a été repoussée au *vendredi 12 octobre 2012* (dernier délai). Toutes les informations nécessaires sont disponibles sur le site internet du colloque à l'adresse suivante : http://jetou2013.free.fr/.
Nous vous remercions par avance de bien vouloir diffuser ces informations à vos contacts. Bien à vous, Le comité d'organisation des JéTou 2013 : Caroline Atallah (CLLE-ERSS) Guillaume Carbou (LARA-CPST) Marie-Mandarine Colle-Quesada (Octogone-Lordat) Claire Del Olmo (Octogone-Lordat) Marie Lacabanne (Octogone-Lordat) Marine Lasserre (CLLE-ERSS) Simon Leva (CLLE-ERSS) Emilie Massa (Octogone-Lordat) Cécile Viollain (CLLE-ERSS) --------------------------------------------------------------------- Dear Sir, Dear Madam, we are contacting you about the 4th edition of the JéTou (Journées d'Etudes Toulousaines) which will take place on May 16th and May 17th 2013 at Université Toulouse II - Le Mirail (Toulouse, France). This international conference aims at gathering doctoral students and young researchers in the Language Sciences together in order to discuss a specific theme. This year, the theme of the conference is the following: "Variation and Variability in the Language Sciences: analyzing, measuring, contextualizing". Attached to this email you will find the call for papers as well as the poster for the conference (in both the French and English versions). The submission deadline, originally set to Friday October 5th 2012, has been extended to *Friday October 12th 2012*. All necessary information regarding the event can be found on the conference website: http://jetou2013.free.fr/index-en. We thank you in advance for spreading the information to your contacts. 
Respectfully yours, The JéTou 2013 organizing committee: Caroline Atallah (CLLE-ERSS) Guillaume Carbou (LARA-CPST) Marie-Mandarine Colle-Quesada (Octogone-Lordat) Claire Del Olmo (Octogone-Lordat) Marie Lacabanne (Octogone-Lordat) Marine Lasserre (CLLE-ERSS) Simon Leva (CLLE-ERSS) Emilie Massa (Octogone-Lordat) Cécile Viollain (CLLE-ERSS)
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:24:12 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct 2012 22:24:12 +0200 Subject: Appel: Revue TAL, Gestion des erreurs en traitement automatique des langues (extension de la date limite) Message-ID: Date: Fri, 05 Oct 2012 22:58:37 +0200 From: Francois Yvon Message-ID: <506F49FD.3060703 at limsi.fr> X-url: http://tal-53-3.sciencesconf.org/
FINAL CALL FOR CONTRIBUTIONS: NOISE IN THE SIGNAL: ERROR HANDLING IN NATURAL LANGUAGE PROCESSING. A SPECIAL ISSUE OF THE JOURNAL "TRAITEMENT AUTOMATIQUE DES LANGUES" (TAL) ** Deadline for submitting an abstract: 15 October 2012 ** Deadline for the full paper: 29 October 2012 See [http://tal-53-3.sciencesconf.org/] The language that natural language processing applications have to handle bears little resemblance to the perfectly grammatical examples found in grammar books.
In everyday use, the utterances to be processed come in imperfect form: typed texts contain keying errors as well as spelling and grammar mistakes; spoken utterances often correspond to incomplete sentences and contain disfluencies; the output of OCR systems contains many character confusions, and the output of speech recognition systems contains inaccurate transcriptions of what was actually said. Noise is therefore inherent in language data, and ignoring this reality can only harm the quality of our processing systems. For some applications, the challenge is to develop mechanisms that are robust to these errors. For example, a dialogue system may use confidence measures on speech recognition hypotheses to decide whether it should ask the user to repeat. For other applications, automatic error correction techniques will be needed; for example, an OCR system may post-process texts with contextual correction models to validate the spelling of words. This special issue aims to bring together contributions on error handling in language processing. Many subfields of NLP need to take into account noise and errors in the linguistic signals they consider, but researchers from these various communities rarely have the opportunity to compare their methods and results. Our ambition is to put work from these different fields into perspective so as to encourage the cross-fertilization of ideas. For this special issue, we therefore consider relevant any work dealing with the automatic processing of noisy data.
The most developed subfields are probably spelling correction and, to a lesser extent, grammar correction; neither of these problems is completely solved, however, and the situation is even less satisfactory for deeper errors, for instance those affecting style or discourse organization. Robust processing, which aims to extract as much useful information as possible from potentially erroneous input, will also be favorably considered, whether the input is written or spoken; more generally, studies of error repair strategies, for example in dialogue systems or other similar systems, are also relevant to this issue. We therefore invite contributions on any aspect of error handling in NLP, and in particular (non-exclusive list): * automatic spelling and grammar correction * semantic and logical errors * correction of errors in style or discourse organization * correction of "artificial" errors (OCR, speech recognition, etc.) * automatic correction of search engine queries * acquisition, annotation and analysis of errors in real texts * error corpora * error handling in controlled languages * errors in language learning * performance errors * normalization of non-standard writing * robust NLP * processing of disfluent speech * error handling in speech recognition * learning from noisy data * measures of error severity * confidence measures * error mining and analysis * self-assessment and error diagnosis GUEST EDITORS - Robert Dale (Macquarie University, Australia) - François Yvon (LIMSI/CNRS and Univ.
Paris Sud, France) SCIENTIFIC COMMITTEE Martine Adda (LPL/CNRS, Paris) Delphine Bernhard (LiLPa, Université de Strasbourg) Simon Charest (Druide informatique, Montréal) Anne Dister (Facultés Universitaires Saint-Louis, Bruxelles) Yannick Estève (LIUM, Université du Maine, Le Mans) Thierry Fontenelle (Centre de Traduction des organes de l'Union Européenne, Luxembourg) Alegria Inaki (University of the Basque Country) Diana Inkpen (Université d'Ottawa) Marie-José Hamel (Université d'Ottawa) David Langlois (LORIA, Université de Lorraine, Nancy) Alessandro Lenci (Università di Pisa) Ryo Nagata (Konan University, Kobe) Pierre Nugues (University of Lund) Joel Tetreault (Educational Testing Service, Princeton) Martin Reynaert (Tilburg University) Christoph Ringlstetter (CIS, University of Munich) Alla Rozovskaya (University of Illinois at Urbana-Champaign) Benoit Sagot (ALPAGE/INRIA, Paris) Michel Simard (NRC, Ottawa) Khaled Shaalan (The British University in Dubai) Serge Sharoff (University of Leeds) Eric Wehrli (LATL, Université de Genève) IMPORTANT DATES - submission of contributions (abstracts): October 15, 2012 - submission of contributions (full paper): October 29, 2012 - first notification to authors: December 20, 2012 - deadline for revised versions: February 1, 2013 - final decisions: April 15, 2013 - final versions: June 15, 2013 - publication: summer 2013 THE JOURNAL For 40 years, TAL (Traitement Automatique des Langues) has been an international journal published by ATALA (Association pour le Traitement Automatique des Langues) with the support of the CNRS. For the past few years it has been an online journal, with paper copies available on request. This in no way affects the reviewing and selection process. PRACTICAL INFORMATION Papers (about 25 pages, PDF format) must be uploaded to the platform http://tal-53-3.sciencesconf.org/.
Style sheets are available on the journal's website (http://www.atala.org/-Revue-TAL). The journal publishes only original contributions, in French or in English. ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:28:41 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct 2012 22:28:41 +0200 Subject: Appel: PAKDD 2013, Deadline: Oct. 08, 2012 Message-ID: Date: Sun, 7 Oct 2012 22:51:54 +0100 From: CFP PAKDD2013 Message-ID: X-url: http://pakdd2013.pakdd.org/ [Apologies for multiple copies] --------------------------------------------------- Call For Papers PAKDD 2013 The 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining Gold Coast, Australia Conference Website http://pakdd2013.pakdd.org/ Submission System https://cmt.research.microsoft.com/PAKDD2013/ Important Dates Paper submission due: Oct. 8 (Mon), 2012 Notification to author: Dec. 19 (Wed), 2012 Camera ready due: Jan. 6 (Sun), 2013 *[23:59:59 Pacific Time] ============================================================== Conference Scope The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference in the areas of data mining and knowledge discovery (KDD).
It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, visualization, and decision-making systems. The conference calls for research papers reporting original investigation results and industrial papers reporting real data mining applications and system development experience. ============================================================== Topics The topics of relevance for the conference papers include but are not limited to the following: * Novel models and algorithms * Clustering * Classification * Ranking * Association analysis * Anomaly detection * Data pre-processing * Feature extraction and selection * Mining heterogeneous data * Mining multi-source data * Mining sequential data * Mining spatial and temporal data * Mining unstructured and semi-structured data * Mining graph and network data * Parallel, distributed, and high performance data mining on the cloud platform * Privacy preserving data mining * Mining high dimensional data * Mining uncertain data * Mining imbalanced data * Mining dynamic/streaming data * Statistical methods for data mining * Visual data mining * Interactive and online mining * Mining behavioral data * Mining multimedia data * Mining scientific databases * Ubiquitous knowledge discovery * Agent-based data mining * Mining social networks * Financial data mining * Fraud and risk analysis * Security and intrusion detection * Opinion mining and sentiment analysis * Post-processing including quality assessment and validation * Integration of data warehousing, OLAP and data mining * Human, domain, organizational and social factors in data mining * Applications to healthcare, bioinformatics, computational chemistry, eco-informatics, marketing, online gaming, etc. All paper
submissions will be handled electronically. Detailed instructions are provided on the conference home page. ============================================================== Paper Submission Each submitted paper should include an abstract of up to 200 words. It should also adhere to the double-blind review policy and be no longer than 12 single-spaced pages with 10pt font size. Authors are strongly encouraged to use Springer LNCS/LNAI manuscript submission guidelines (available at http://www.springer.de/comp/lncs/authors.html) for their initial submissions. All papers must be submitted electronically through Microsoft's Conference Management Service (CMT) in PDF format only. The submitted papers must not have been previously published anywhere, and must not be under consideration by any other conference or journal during the PAKDD review process. Submitting a paper to the conference means that if the paper is accepted, at least one author will attend the conference to present the paper. For no-show authors, their affiliations will receive a notification. To ensure a fair review process, the program committee chairs are not allowed to submit papers to the conference. All papers will be double-blind reviewed by the Program Committee on the basis of technical quality, relevance to data mining, originality, significance, and clarity. Papers that do not comply with the Submission Guidelines will be rejected without review. The best papers will be selected for inclusion in the special issues of Knowledge and Information Systems (KAIS) and International Journal of Data Mining and Bioinformatics (IJDMB). Before submitting your paper, please carefully read and agree with the PAKDD submission policy and no-show policy: http://pakdd.togaware.com/policy.html ============================================================== Conference Officers Honorary Co-chairs * Jiawei Han,
University of Illinois at Urbana-Champaign, USA * Ramamohanarao Kotagiri, University of Melbourne, Australia * Graham Williams, Australia Taxation Office, Australia Conference Co-chairs * Hiroshi Motoda, AFOSR/AOARD and Osaka University, Japan * Longbing Cao, University of Technology, Sydney, Australia Program Committee Co-chairs * Jian Pei, Simon Fraser University, Canada * Vincent S. Tseng, National Cheng Kung University, Taiwan Local Arrangement Co-chairs * Vladimir Estivill-Castro, Griffith University (Gold Coast), Australia * Xue Li, University of Queensland, Australia * Richi Nayak, Queensland University of Technology, Australia * Xinhua Zhu, University of Technology, Sydney, Australia Workshop Co-chairs * Jiuyong Li, University of South Australia, Australia * Kay Chen Tan, National University of Singapore, Singapore * Bo Liu, Guangdong University of Technology, China Tutorial Co-chairs * Tu Bao Ho, Japan Advanced Institute of Science and Technology, Japan * Mengjie Zhang, Victoria University of Wellington, New Zealand Award Chair * Chengqi Zhang, University of Technology, Sydney, Australia Sponsorship Co-chair * Yue Xu, Queensland University of Technology, Australia Publicity Co-chairs * P. Krishna Reddy, The International Institute of Information Technology, Hyderabad, India * Yifeng Zeng, Aalborg University, Denmark * Xin Wang, University of Calgary, Canada * Zhihong Deng, Peking University, China ============================================================== Further Information For further information, please contact the Program Committee Chairs at pakdd13-program at pakdd.org .
General inquiries * Longbing Cao University of Technology Sydney, Australia Email: pakdd13 at pakdd.org Phone: (61)2-9514-4477 Fax: (61)2-9514-1807 ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:30:12 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct 2012 22:30:12 +0200 Subject: Appel: RECITAL 2013 Message-ID: Date: Mon, 08 Oct 2012 09:41:03 +0200 From: Florian Boudin Message-ID: <5072838F.302 at univ-nantes.fr> RECITAL 2013: First call for papers --------------------------------------------- RECITAL 2013 15th Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues. Centre des congrès Les Atlantes, Les Sables d'Olonne (France), June 17 to 21, 2013. Important dates ----------------- - Submission deadline: Friday, March 15, 2013 - Notification to authors: Friday, April 19, 2013 - Final version: Friday, May 10, 2013 Overview ------------ RECITAL 2013, the annual young researchers' conference associated with TALN, will take place in Les Sables d'Olonne (France) from June 17 to 21, 2013. RECITAL gives young researchers in natural language processing (NLP) the opportunity to present their work and compare their approaches. It is open only to students (master's and doctoral) and to young researchers who obtained their doctorate less than one year ago.
Building on the success of the previous year, we encourage the submission of work, even preliminary, of thesis projects, and of work from the first months of research (state of the art, first leads, etc.). The primary goal of RECITAL is to support the work of young NLP researchers and to ease their integration into our community. To that end, we aim for: - instructive reviews: authors must be able to understand any errors they may have made so that they can correct them and improve the quality of their work; - positive reviews: it is never necessary to discourage a young researcher; the watchwords should be encourage and guide; - direct exchange: reviews will be sent to authors signed, and therefore not anonymous. Authors (or better, reviewers) are free to meet for informal discussion during the conference. A RECITAL best paper prize worth 500 euros will be awarded at the closing ceremony. Main topics ----------------- Papers may address the usual NLP topics: - Analysis and generation in the following areas: + Phonetics + Phonology + Morphology + Syntax + Semantics + Discourse - Development of linguistic resources for NLP: + Databases containing morphological, syntactic, semantic and/or phonological information + Grammars + Lexicons + Ontologies + Corpus linguistics - NLP applications: + Sentiment or opinion analysis + Automatic categorization or classification + Word sense disambiguation + Natural-language human-machine dialogue + Computer-assisted teaching + Automatic indexing + Information retrieval and extraction + Automatic summarization + Anaphora resolution + Question answering systems + Machine translation + Semantic Web - Approaches: +
Formal linguistics intended to support automatic processing + Symbolic + Logical + Statistical + Based on machine learning This list is not exhaustive, and the suitability of a proposed paper for the conference will be judged by the program committee. Selection criteria --------------------- Authors must be students or recent PhD holders who defended their thesis less than one year ago. Papers co-authored with established researchers (which includes thesis supervisors) must be submitted to TALN, not to RECITAL. Authors are invited to submit original research that has not been previously published. Submissions will be reviewed by at least two specialists in the field. The following will be considered in particular: - The correctness of the scientific and technical content - The positioning of the work within international research - The organization and clarity of the presentation - The fit with the conference topics Selected papers will be published in the conference proceedings. Following the program committee's recommendation, presentations will be either oral or in poster form. Submission procedure ----------------------- Papers are to be written in French by French speakers, and in English by those who are not proficient in French. Papers must be 8 to 14 pages long. A LaTeX style file, a Word template and a LibreOffice template will be available on the conference website (forthcoming).
Contact: florian.boudin at univ-nantes.fr and loic.barrault at lium.univ-lemans.fr ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:34:49 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct 2012 22:34:49 +0200 Subject: Job: Developpeur d'applications Web, PROLIPSIA, Besancon, France Message-ID: Date: Mon, 08 Oct 2012 16:40:45 +0200 From: blandine.alecu at prolipsia.com Message-ID: <20121008164045.84869g1ir4ubzx7x at webmail.prolipsia.com> X-url: http://prolipsia.com/wp-content/uploads/2012/10/AnnonceDeveloppeurProlipsia2012.pdf WEB APPLICATION DEVELOPER (M/F) http://prolipsia.com/wp-content/uploads/2012/10/AnnonceDeveloppeurProlipsia2012.pdf Contract: permanent (CDI) Status: executive (cadre) Salary: depending on profile Prolipsia is a young company publishing natural language processing (NLP) software specialized in controlled languages. Founded in 2011 and based in Besançon (Temis Innovation), Prolipsia designs, publishes and markets software solutions that assist in the design and writing of technical texts, for sectors such as health care, private security and industry. As part of our growth, we are looking for a developer (master's level, bac+5), passionate about development and motivated by innovation, to join our R&D team from November 2012.
RESPONSIBILITIES: Within the R&D team and in close collaboration with our linguistic engineers, you will work on the design and development of language processing applications. Highly autonomous, you are familiar with the constraints of production and keep them in mind in order to guarantee the best quality of service. Responsive, organized and dynamic, you will draw on your skills and interpersonal qualities to cooperate within a multidisciplinary R&D team and to carry out the following tasks: - Management of the technical phases of projects: functional and technical specifications, planning, writing of technical reports, deployment - Participation in internal R&D projects: design and development of applications (Web applications in particular), evolution of the applications already developed by Prolipsia - Interaction with our clients: product follow-up, study and analysis of client feedback - Technology watch To carry out your tasks, you may be asked to take professional training courses. WHY JOIN US? - Take part in the development of an ambitious and innovative project - Career prospects: your experience at Prolipsia could quickly lead you to head the development team.
- Start-up atmosphere, professionalism, open-mindedness and a friendly environment - Flexible hours and schedule, alternating remote work possible - Recognition of skills and motivation PROFILE SOUGHT: - Significant experience in industry, ideally in project management - Aptitude and taste for teamwork - Excellent interpersonal and communication skills - Creativity, initiative - Taste for challenge and innovation SKILLS: You have a command of: - the PHP 5, Javascript, XML and HTML languages, and AJAX technologies - OOP (object-oriented programming) - one or more PHP frameworks - one or more Javascript frameworks (such as jQuery or ExtJS) - one or more RDBMSs (relational database management systems), including MySQL - the integration of tests into the development process You are familiar with: - version control software, such as Subversion - agile methods The following will be appreciated: - knowledge of Perl, NoSQL - knowledge of and experience in using open source tools/libraries - knowledge of natural language processing - a liking for the French language HOW TO APPLY: Please send a cover letter and CV, before October 26, 2012, to Julie RENAHY: julie.renahy at prolipsia.com ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:37:25 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct 2012 22:37:25 +0200 Subject: Appel:
CogALEX-III Message-ID: Date: Mon, 08 Oct 2012 23:19:07 +0200 From: Michael Zock Message-ID: <5073434B.5090508 at lif.univ-mrs.fr> X-url: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html Apologies for multiple postings : ============================================================== Time flies : !!! only one MORE week !!! ============================================================== 3rd and last Call for Papers : CogALex-3 (Cognitive Aspects of the Lexicon), a post-COLING workshop deadline for paper submission : October 15, 2012 more details: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html ============================================================== 3rd Workshop on "Cognitive Aspects of the Lexicon" (CogALex) Post-conference workshop at COLING 2012 (December 15, Mumbai, India) Invited speaker: Alain Polguère (Université de Lorraine & ATILF CNRS, France) Submission deadline: October 15, 2012 AIMS and TARGET AUDIENCE The aim of this workshop is to bring together researchers involved in the construction and application of electronic dictionaries to discuss modifications of existing resources in line with the users' needs, thereby fully exploiting the advantages of the digital form. Given the breadth of the questions, we welcome reports on work from many perspectives, including but not limited to: computational lexicography, psycholinguistics, cognitive psychology, language learning and ergonomics. MOTIVATION The way we look at dictionaries, their creation and use, has changed dramatically over the past 30 years. (1°) While considered an appendix to grammar in the past, they have in the meantime moved to centre stage. Indeed, there is hardly any task in NLP which can be conducted without them. (2°) Also, many lexicographers work nowadays with huge digital corpora, using language technology to build and to maintain the lexicon.
(3°) Last, but not least, rather than being static entities (data-base view), dictionaries are now viewed as graphs, whose nodes and links (connection strengths) may change over time. Interestingly, properties concerning topology, clustering and evolution known from other disciplines (society, economy, human brain) also apply to dictionaries: everything is linked, hence accessible, and everything is evolving. Given these similarities, one may wonder what we can learn from these disciplines. In this 3rd edition of the CogALex workshop we therefore intend to also invite scientists working in these fields, our goals being to broaden the picture, i.e. to gain a better understanding concerning the mental lexicon and to integrate these findings into our dictionaries in order to support navigation. Given recent advances in neurosciences, it appears timely to seek inspiration from neuroscientists studying the human brain. There is also a lot to be learned from other fields studying graphs and networks, even if their object of study is something else than language, for example biology, economy or society. TOPICS OF INTEREST This workshop is about possible enhancements of existing electronic dictionaries. To perform the groundwork for the next generation of electronic dictionaries we invite researchers involved in the building of such dictionaries. The idea is to discuss modifications of existing resources by taking the users' needs and knowledge states into account, and to capitalize on the advantages of the digital media. For this workshop we invite papers including but not limited to the following topics which can be considered from various points of view: linguistics, neuro- or psycholinguistics (associations, tip-of-the-tongue problem), network-related sciences (complex graphs, network topology, small-world problem), etc. 1) Analysis of the conceptual input of a dictionary user - What does a language producer start from (bag of words)? 
- What is in the authors' minds when they are generating a message and looking for a word? - What does it take to bridge the gap between this input and the desired output (target word)? 2) The meaning of words - Lexical representation (holistic, decomposed) - Meaning representation (concept based, primitives) - Revelation of hidden information (vector-based approaches: LSA/HAL) - Neural models, neurosemantics, neurocomputational theories of content representation. 3) Structure of the lexicon - Discovering structures in the lexicon: formal and semantic point of view (clustering, topical structure) - Creative ways of getting access to and using word associations - Evolution, i.e. dynamic aspects of the lexicon (changes of weights) - Neural models of the mental lexicon (distribution of information concerning words, organisation of the mental lexicon) 4) Methods for crafting dictionaries or indexes - Manual, automatic or collaborative building of dictionaries and indexes (distributional semantics, crowd-sourcing, serious games, etc.) - Impact and use of social networks (Facebook, Twitter) for building dictionaries, for organizing and indexing the data (clustering of words), and for allowing to track navigational strategies, etc. - (Semi-) automatic induction of the link type (e.g. synonym, hypernym, meronym, association, collocation, ...) 
- Use of corpora and patterns (data-mining) for getting access to words, their uses, and combinations (associations) 5) Dictionary access (navigation and search strategies), interface issues - Semantic-based search - Search (simple query vs multiple words) - Context-dependent search (modification of users' goals during search) - Recovery - Navigation (frequent navigational patterns or search strategies used by people) - Interface problems, data-visualisation IMPORTANT DATES - Deadline for paper submissions: October 15, 2012 - Notification of acceptance: November 5, 2012 - Camera-ready papers due: November 15, 2012 - Workshop date: December 15, 2012 SUBMISSION INSTRUCTIONS see: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html INVITED SPEAKER: Alain Polguère (Université de Lorraine & ATILF CNRS, France) PROGRAM COMMITTEE * Barbu, Eduard (Universidad de Jaén, Spain) * Barrat, Alain (Centre de physique théorique, CNRS & Aix-Marseille University) * Bilac, Slaven (Google Tokyo, Japan) * Bel Enguix, Gemma (LIF, Aix-Marseille University, France) * Bouillon, Pierrette (TIM, Faculty of Translation and Interpreting, Geneva, Switzerland) * Cook, Paul (The University of Melbourne, Australia) * Cristea, Dan (University of Iasi, Romania) * Fairon, Cedrick (CENTAL, Université catholique de Louvain, Belgium) * Fazly, Afsaneh (University of Toronto, Canada) * Fellbaum, Christiane (University of Princeton, USA) * Ferret, Olivier (CEA LIST, Palaiseau, France) * Fontenelle, Thierry (Translation Centre for the Bodies of the European Union, Luxemburg) * Granger, Sylviane (Université Catholique de Louvain, Belgium) * Grefenstette, Gregory (3DS Exalead, Paris, France) * Hansen-Schirra, Silvia (University of Mainz, FTSK, Germany) * Heid, Ulrich (University of Hildesheim, Germany) * Hirst, Graeme (University of Toronto, Canada) * Hovy, Ed (ISI, Los Angeles, USA) * Joyce, Terry (Tama University, Kanagawa-ken, Japan) * Kwong, Olivia (City University of Hong Kong, China) *
L'Homme, Marie Claude (OLST, University of Montreal, Canada) * Lapalme, Guy (RALI, University of Montreal, Canada) * Mititelu, Verginica (RACAI, Bucharest, Romania) * Pirrelli, Vito (ILC, Pisa, Italy) * Polguère, Alain (Université de Lorraine & ATILF CNRS, France) * Rapp, Reinhard (University of Leeds, UK) * Ruette, Tom (KU Leuven, Belgium) * Schwab, Didier (LIG, Grenoble, France) * Serasset, Gilles (IMAG, Grenoble, France) * Sharoff, Serge (University of Leeds, UK) * Sinopalnikova, Anna (FIT, BUT, Brno, Czech Republic) * Sowa, John (VivoMind Research, LLC, USA) * Tiberius, Carole (Institute for Dutch Lexicology, The Netherlands) * Tokunaga, Takenobu (TITECH, Tokyo, Japan) * Tufis, Dan (RACAI, Bucharest, Romania) * Valitutti, Alessandro (University of Helsinki and HIIT, Finland) * Vossen, Piek (Vrije Universiteit, Amsterdam, The Netherlands) * Wehrli, Eric (LATL, University of Geneva, Switzerland) * Zock, Michael (LIF, CNRS, Aix-Marseille University, France) * Zweigenbaum, Pierre (LIMSI - CNRS, Orsay & ERTIM - INALCO, Paris, France) WORKSHOP ORGANIZERS and CONTACT PERSONS Michael Zock (LIF-CNRS, Marseille, France), michael.zock AT lif.univ-mrs.fr Reinhard Rapp (University of Leeds, UK), reinhardrapp AT gmx.de For more details see: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:39:24 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct 2012 22:39:24 +0200 Subject:
Ecole: EARIA 2012, Ecole d'Automne en Recherche d'Information et Applications Message-ID: Date: Tue, 09 Oct 2012 13:15:54 +0200 From: Brigitte Grau Message-ID: <5074076A.8080005 at limsi.fr> X-url: http://www.asso-aria.org/earia2012
Places are still available for registration.
CALL FOR PARTICIPATION
=======================================================================
École d'Automne en Recherche d'Information et Applications (Autumn School in Information Retrieval and Applications) EARIA 2012
Organized by: ARIA (Association Francophone de Recherche d'Information et Applications), École des Mines de Saint-Etienne and Université Jean Monnet
October 24, 25 and 26, 2012 at the Couvent de La Tourette (Éveux), near Lyon
http://www.asso-aria.org/earia2012
=======================================================================
Objectives
The main objective of EARIA (École d'Automne en Recherche d'Information et Application) is to train doctoral students in the field of Information Retrieval (IR). The courses are spread over 4 half-days (from late morning on Wednesday, October 24 to noon on Friday, October 26) and offer a friendly setting for exchanges on both the foundations and the emerging topics of IR, presented by leading European researchers in the field. EARIA complements ESSIR (European Summer School on Information Retrieval), which has been held since 1990, roughly every three years before 2003 and every two years since 2003. EARIA is likewise intended to be held every two years, alternating with ESSIR, and offers a prime opportunity for meetings and discussions between senior and young researchers, helping the latter to better position their research projects. Previous editions of EARIA took place in 2006 in Grenoble, in 2008 in Toulouse and in 2010 in Saint-Germain-au-Mont-d'Or in Rhône-Alpes, and were very successful.
The school is aimed at young researchers from different disciplines and therefore has two components. The first is a review of the fundamentals of the disciplines related to Information Retrieval, such as formal IR models, their implementation, evaluation methods, natural language processing methods and machine learning models for IR. The second covers rapidly growing topics such as social IR through folksonomies, community-based IR and distributed IR. In addition, participants are invited to present their research as a poster during dedicated sessions, in order to encourage exchanges between participants and speakers.
=======================================================================
Lecture programme
1. Introduction to the field (Mohand Boughanem, IRIT, Université de Toulouse)
2. IR models (Eric Gaussier, LIG, Université de Grenoble)
3. Software for IR (Michel Beigbeder, École des Mines de Saint-Étienne)
4. Evaluation methods (Jacques Savoy, Université de Neuchâtel)
5. IR and Machine Learning (Massih Amini, LIG, Université de Grenoble)
6. Basic NLP techniques and their use in question answering and information extraction (Patrice Bellot, LSIS, Université Aix-Marseille)
7. Sentiment detection (Vincent Guigue, LIP6, Univ. Pierre&Marie Curie)
8. Contextual and mobile IR (Lynda Tamine-Lechani, IRIT, Université de Toulouse)
9. Social IR (Maarten de Rijke, Univ.
Amsterdam)
=======================================================================
Registration fees are set at:
- Doctoral students: 250 euros, academic staff: 350 euros, for registration before September 21, 2012,
- Doctoral students: 300 euros, academic staff: 400 euros, for registration after September 21, 2012,
- ARIA membership fees are: 50 euros for an individual membership, 100 euros for an institutional membership.
Registration fees include accommodation, five meals and the breaks. To register, use the form available on the website.
=======================================================================
Scientific committee:
Chair: Brigitte Grau, LIMSI-CNRS and ENSIIE
Michel Beigbeder, École des Mines de Saint-Étienne
Mohand Boughanem, IRIT, Université de Toulouse
Sylvie Calabretto, LIRIS, INSA de Lyon
Éric Gaussier, LIG, Université de Grenoble
Organizing committee:
Chair: Michel Beigbeder, École des Mines de Saint-Étienne
Bissan Audeh, École des Mines de Saint-Étienne
Mathias Géry, LaHC, Université de Saint-Étienne
Philippe Jaillon, École des Mines de Saint-Étienne
Christine Largeron, LaHC, Université de Saint-Étienne
Mihaela Mathieu, École des Mines de Saint-Étienne
Yulian YANG, INSA de Lyon
=======================================================================
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:42:56 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct
2012 22:42:56 +0200 Subject: Job: 2 post-doc positions in multilingual text mining and media monitoring at the JRC (Reminder) Message-ID: Date: Tue, 09 Oct 2012 17:20:12 +0200 From: Ralf Steinberger Message-id: <05cc01cda631$94044ad0$bc0ce070$@jrc.ec.europa.eu> X-url: http://recruitment.jrc.ec.europa.eu/ REMINDER: Deadline for application is 17 October 2012 Readers on this list may be interested in the following three-year post-doc positions to work at the European Commission’s Joint Research Centre (JRC) in Ispra, at the Lago Maggiore in Italy. Code: 2012-IPR-G-30-000-00741 - CAT 30 - ISPRA Multi-lingual and multi-functional information extraction methods and tools Code: 2012-IPR-G-30-000-00743 - CAT 30 - ISPRA Engineering Media Monitoring Software Solutions Applicants need to hold a Ph.D. or have at least five years of relevant post-graduate experience. URL with job details: http://recruitment.jrc.ec.europa.eu/ (select IPSC institute) URL with conditions: http://ec.europa.eu/dgs/jrc/index.cfm?id=4790 Application deadline: 17.10.2012 Duration: 36 months Type of contract: category 30 grant holder Action: Open Source Text Information Mining and Analysis (OPTIMA) Scientific website: http://langtech.jrc.ec.europa.eu EMM online applications: http://emm.newsbrief.eu/overview.html Information on the team and its work: The JRC’s Global Security and Crisis Management Unit (GlobeSec) supports the Union's policies to strengthen the EU's resilience to crises and disasters as well as the EU's aim to promote stability and peace through its research in crisis management technologies and in information mining and analysis. The Unit's OPTIMA (Open Source Text Information Mining and Analysis) Action develops innovative solutions for retrieving and extracting information from the Internet, and especially from online news and social media. It serves many Commission Services, EU agencies and some EU Member State authorities. The core of this action is the Europe Media Monitor (EMM). 
EMM gathers and analyses about 150,000 online news articles per day in 50 languages. The technologies that have been developed so far in the OPTIMA Action include multilingual tools for the following tasks: event extraction; automatic entity recognition, classification and disambiguation; name variant mapping; co-reference resolution; quotation recognition; opinion mining; multi-document summarisation; document clustering and classification; machine translation; information aggregation, including across languages; and more. Rule-based, as well as Machine Learning and hybrid methods are being used to achieve these goals. These techniques are already to some extent being deployed in several operational applications (see http://emm.newsbrief.eu/overview.html) and part of the work would be in support of these applications. The on-going research has a strong focus on applicability in a highly multi-lingual environment. The work is very practical and goal-oriented. Hands-on experience with developing tools is thus essential. Research results are expected to be used operationally. The candidate is expected to contribute to scientific publications of the research results. 
Ralf Steinberger European Commission - Joint Research Centre (JRC) IPSC - GlobeSec - OPTIMA (OPensource Text Information Mining and Analysis) 21027 Ispra (VA), Italy
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:48:12 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 9 Oct 2012 22:48:12 +0200 Subject: Appel: ACL 2013 Student Research Workshop Message-ID: Date: Tue, 9 Oct 2012 19:53:41 +0200 From: "Vecchi, Eva Maria" Message-ID: <799A1AC9-5CFC-4269-A8B0-976BC123FA2D at unitn.it> X-url: http://sites.google.com/site/aclsrw2013/ Call for Papers ACL 2013 Student Research Workshop 5-7 August, 2013, Sofia, Bulgaria ** Submission deadline: Sunday, March 3, 2013 ** http://sites.google.com/site/aclsrw2013/ General Invitation for Submissions The ACL Student Session provides a venue for student researchers investigating topics in Computational Linguistics and Natural Language Processing to present their research, to meet potential advisors, and to receive feedback from the international research community. The Student Session's goal is to aid students at multiple stages of their education: from those in the final stages of undergraduate training to those who are preparing their graduate thesis proposal. Towards this goal, we invite papers in two separate categories. 1.
Thesis/Research Proposals: This category is appropriate for experienced students who wish to get feedback on their proposal and broader ideas for the field in order to strengthen their final research. 2. Research Papers: Most appropriate for students who are new to academic conferences. Papers in this category can describe completed work or work in progress with preliminary results. Subject to the availability of established researcher volunteers, each accepted paper will be assigned a mentor, who will provide feedback on the work to the student at the conference. Separately, the committee will do its best to assign pre-submission mentors to students who wish to get feedback before the paper deadline. This service will be available on a first-come, first-served basis and does not guarantee acceptance into the workshop. Students who wish to take advantage of this opportunity should let the co-chairs know via email no later than Saturday, December 29, 2012 and should submit a paper draft no later than Friday, January 18, 2013.
Topics Relevant topics for the workshop include, but are not limited to, the following areas (in alphabetical order): - Cognitive modeling of language processing and psycholinguistics - Dialogue and interactive systems - Discourse, coreference and pragmatics - Evaluation methods - Information retrieval - Language resources - Lexical semantics and ontologies - Low resource language processing - Machine translation: methods, applications and evaluation - Multilinguality in NLP - NLP applications - NLP and creativity - NLP for the languages of Central and Eastern Europe and the Balkans - NLP for the Web and social media - Question answering - Semantics - Sentiment analysis, opinion mining and text classification - Spoken language processing - Statistical and Machine Learning methods in NLP - Summarization and generation - Syntax and parsing - Tagging and chunking - Text mining and information extraction - Word segmentation Submission Requirements Thesis/Research Proposals may contain previously published work and must include specific research directions. They may also be in the style of a position paper that surveys and critiques existing literature, but must suggest future research directions. Proposals may only have one author, who must be a student. Research Papers must describe original completed work or work in progress and should clearly indicate directions for future research wherever appropriate. The first author of multi-author papers MUST be a student, though it is not required that additional co-authors be students. Research Papers must not have been presented at any other meeting with publicly available published proceedings. Students who have already presented at a past ACL/EACL/NAACL Student Research Workshop may not be the first author on a Research Paper (though they may still be the first author of a Thesis/Research Proposal). They should instead submit their papers either to the main conference or to the Thesis/Research Proposal track. 
Students must indicate whether a paper has been submitted to another conference or workshop. Electronic Submission Submission is electronic, using the Softconf submission software (URL to be announced in subsequent versions of this call) Submission Format Both paper and proposal submissions to the Student Session should follow the standard two-column format of the ACL 2013 proceedings. Submissions should have no more than six (6) pages excluding references (LaTeX and Microsoft Word style files will be available on the main conference website http://acl2013.org/). Submissions must conform to the official ACL 2013 style guidelines and they must be submitted as a PDF file. The reviewing process will be double-blind; therefore, please ensure that the paper does not include the authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...". Further guidelines are provided in the template style files. Multiple-submission policy Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of papers accepted for presentation at ACL 2013 must notify the program chairs by April 21, 2013 as to whether the paper will be presented. All accepted papers must be presented at the conference in order for them to appear in the proceedings. We will not accept for publication or presentation papers that overlap significantly in content or results with papers that will be (or have been) published elsewhere. Authors submitting more than one paper to ACL must ensure that submissions do not overlap significantly (> 50%) with each other in content or results. 
Important Dates - Pre-submission mentoring service application: December 29, 2012 - Pre-submission mentoring paper deadline: January 18, 2013 - Submission deadline: March 3, 2013 - Notification of acceptance: April 24, 2013 - Camera-ready submission deadline: May 24, 2013 - Conference dates: August 5-7, 2013 (The session will be held during the main conference) Organising Committee Student Chairs - Anik Dey, The Hong Kong University of Science & Technology - Sebastian Krause, German Research Center for Artificial Intelligence - Ivelina Nikolova, Bulgarian Academy of Sciences - Eva Vecchi, Università di Trento Faculty Advisors - Steven Bethard, University of Colorado Boulder & KU Leuven - Preslav I. Nakov, Qatar Computing Research Institute - Feiyu Xu, German Research Center for Artificial Intelligence Program Committee (To be announced) Contact acl-srw-2013 at googlegroups.com
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:19:07 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 12 Oct 2012 22:19:07 +0200 Subject: Job: CDD au TGE Adonis Message-ID: Date: Wed, 10 Oct 2012 08:53:06 +0200 From: Jean-Luc Minel Message-ID: <50751B52.9060108 at u-paris10.fr> X-url: http://dariah.eu/ X-url: http://www.tge-adonis.
X-url: http://www.rechercheisidore.fr/ X-url: http://www.narcis.nl/
As part of the implementation of DARIAH (http://dariah.eu/), TGE ADONIS (http://www.tge-adonis.fr/) is recruiting, for a 3- or 4-month assignment, a contract CNRS research engineer to act as project manager or expert in application development and deployment.
Mission: The main objective of the assignment is to study the possibilities for document and query interoperability between the two platforms ISIDORE (http://www.rechercheisidore.fr/) and Narcis (http://www.narcis.nl/), with the aim of defining one or more ways of querying both platforms simultaneously and jointly (interconnection). The first step will be to produce a functional description of the two systems and of their respective processing pipelines, together with a description of the APIs available in each system. The second step will be to produce the technical and organizational proposals needed for an interconnection (queries, navigation, interaction between metadata). The third step will be to develop several prototypes illustrating the different interconnection solutions and/or the problems raised by this project.
Skills:
- Good knowledge of Linked Data and Semantic Web concepts (RDF, RDFa, SPARQL);
- Good knowledge of software engineering and command of API concepts;
- Ability to prototype (programming in Java and/or Python, and/or JavaScript, and/or Perl, and/or PHP) in order to build tests and demonstrators;
- Ability to analyze detailed specifications and to carry out functional analyses (command of UML desirable);
- Knowledge of scientific and technical information: Dublin Core, OAI-PMH, microformats, etc.;
- Command of spoken and written English. The work requires taking part in working meetings in English and writing documents in English.
Workplace: TGE Adonis, Paris (5th arrondissement). The assignment will require travel and short stays (8 to 10 days) in The Hague (NL) and Lyon. Send a CV to Sophie David (sophie.david at tge-adonis.fr) and Jean-Luc Minel (jean-luc.minel at u-paris10.fr)
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:22:31 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 12 Oct 2012 22:22:31 +0200 Subject: Appel: FGCS, Special Issue on Intelligent Big Data Processing Message-ID: Date: Wed, 10 Oct 2012 15:31:59 +0800 From: cfp at grid.chu.edu.tw Message-Id: <201210100731.q9A7Vxr4004958 at grid.chu.edu.tw> X-url: http://www.journals.elsevier.com/future-generation-computer-systems/calls-for-papers/special-issue-on-intelligent-big-data-processing/
Future Generation Computer Systems (http://ees.elsevier.com/fgcs/) Special Issue on Intelligent Big Data Processing http://www.journals.elsevier.com/future-generation-computer-systems/calls-for-papers/special-issue-on-intelligent-big-data-processing/
== Overview == Nowadays, data comes from sensors, lab experiments, simulations, individual archives, enterprises and the Internet, at all scales and in all formats. This data flood has outpaced our capability to process, analyze, store and understand these datasets. Such rapid expansion is also accelerated by the dramatic increase in acceptance of social media and networking applications.
Furthermore, it can be foreseen that Internet of Things (IoT) applications will raise the scale of data to an unprecedented level. People and devices (from home coffee machines to cars, to buses, railway stations and airports) are all loosely connected. Trillions of such connected components will generate a huge data ocean, and valuable information must be discovered from the data to help improve quality of life and make our world a better place. This special issue intends to tackle such data deluge issues intelligently, efficiently and effectively. Areas of interest for this special issue include the following topics: - Intelligent data mining techniques - Dynamic data redistribution - Scalable and distributed algorithms - New programming models for large data - Locality aware data processing - NoSQL - Data filtering techniques for Internet of Things - DaaS, Data as a Service - MapReduce in hybrid clouds - Asynchronous data processing - Opportunistic data processing in hybrid clouds - Intelligent storage and load balancing - Data migration and synchronization (between private and public clouds) - Multi-tier MapReduce programming model - Dynamic Mapper/Reducer join/leave - Data decomposition based on GPU / CPU availability - Dynamic provisioning for big data processing - System issues related to large datasets == Schedule == Manuscript due date: December 20, 2012 First round notification: March 1, 2013 Submission due date of revised paper: April 15, 2013 Notification of acceptance: May 15, 2013 Submission of final revised paper: June 10, 2013 Publication: September 2013 (tentative) == Submission & Review Instruction == Submitted articles must not have been previously published or currently submitted for journal publication elsewhere. Submissions must be directly sent via the FGCS submission web site at http://ees.elsevier.com/fgcs/login.asp (please select the track - SS: Intelligent Big Data - Hsu).
Paper submissions must conform to the layout and format guidelines in Future Generation Computer Systems. For a full and complete Guide for Authors, please refer to: http://www.elsevier.com/fgcs Each submitted paper will be reviewed by at least three Editorial reviewers. Criteria and Evaluation for acceptance of paper: - Significance to the journal's audience - Relevance to this special issue - Overall recommendation on the paper - Optional confidential comments to the Editorial Committee Quality of the paper including originality, technical depth, significance of results, adequacy of prior work referenced, overall organization, clarity and readability, satisfactory English writing, sufficient support for assertions and conclusion, appropriate title, abstract adequately summarizes the paper, introduction provides proper orientation, clear tables and figures. Guest Editors Ching-Hsien (Robert) Hsu Department of Computer Science and Information Engineering, Chung Hua University, Taiwan Email: chh at chu.edu.tw http://www.chu.edu.tw/~chh
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:24:36 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 12 Oct 2012 22:24:36 +0200 Subject: Appel: PAKDD 2013, Deadline Further Extended to 15 Oct.
2012 Message-ID: Date: Wed, 10 Oct 2012 09:38:05 +0100 From: CFP PAKDD2013 Message-ID: X-url: http://pakdd2013.pakdd.org/ ------------------------------------------------------------------------ Due to many requests, PAKDD2013 organizers have seriously considered and decided to extend the submission deadline to 23:59 pm, October 15, 2012 (PDT). ------------------------------------------------------------------------ Call For Papers PAKDD 2013 The 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining Gold Coast, Australia Conference Website http://pakdd2013.pakdd.org/ Submission System https://cmt.research.microsoft.com/PAKDD2013/ Important Dates Paper submission due: Oct. 15 (Mon). 2012 Notification to author: Dec. 19 (Wed). 2012 Camera ready due: Jan. 6 (Sun). 2013 *[23:59:59 Pacific Time] ============================================================== Conference Scope The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference in the areas of data mining and knowledge discovery (KDD). It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, visualization, and decision-making systems. The conference calls for research papers reporting original investigation results and industrial papers reporting real data mining applications and system development experience. 
============================================================== Topics The topics of relevance for the conference papers include but are not limited to the following: * Novel models and algorithms * Clustering * Classification * Ranking * Association analysis * Anomaly detection * Data pre-processing * Feature extraction and selection * Mining heterogeneous data * Mining multi-source data * Mining sequential data * Mining spatial and temporal data * Mining unstructured and semi-structured data * Mining graph and network data * Parallel, distributed, and high performance data mining on the cloud platform * Privacy preserving data mining * Mining high dimensional data * Mining uncertain data * Mining imbalanced data * Mining dynamic/streaming data * Statistical methods for data mining * Visual data mining * Interactive and online mining * Mining behavioral data * Mining multimedia data * Mining scientific databases * Ubiquitous knowledge discovery * Agent-based data mining * Mining social networks * Financial data mining * Fraud and risk analysis * Security and intrusion detection * Opinion mining and sentiment analysis * Post-processing including quality assessment and validation * Integration of data warehousing, OLAP and data mining * Human, domain, organizational and social factors in data mining * Applications to healthcare, bioinformatics, computational chemistry, eco-informatics, marketing, online gaming, etc. All paper submissions will be handled electronically. Detailed instructions are provided on the conference home page. ============================================================== Paper Submission Each submitted paper should include an abstract of up to 200 words. It should also adhere to the double-blind review policy and be no longer than 12 single-spaced pages with 10pt font size.
Authors are strongly encouraged to use Springer LNCS/LNAI manuscript submission guidelines (available at http://www.springer.de/comp/lncs/authors.html) for their initial submissions. All papers must be submitted electronically through Microsoft's Conference Management Service (CMT) in PDF format only. The submitted papers must not be previously published anywhere, and must not be under consideration by any other conferences or journal during the PAKDD review process. Submitting a paper to the conference means that if the paper were accepted, at least one author will attend the conference to present the paper. For no-show authors, their affiliations will receive a notification. The program committee chairs are not allowed to submit papers to the conference for a fair review process. All papers will be double-blind reviewed by the Program Committee on the basis of technical quality, relevance to data mining, originality, significance, and clarity. Papers that do not comply with the Submission Guidelines will be rejected without review. The best papers will be selected to be included in the special issues of Knowledge and Information Systems (KAIS) and International Journal of Data Mining and Bioinformatics (IJDMB). Before submitting your paper, please carefully read and agree with the PAKDD submission policy and no-show policy: http://pakdd.togaware.com/policy.html ============================================================== Conference Officers Honorary Co-chairs * Jiawei Han. University of Illinois at Urbana-Champaign,USA * Ramamohanarao Kotagiri, University of Melbourne, Australia * Graham Williams. Australia Taxation Office, Australia Conference Co-chairs * Hiroshi Motoda, AFOSR/AOARD and Osaka University, Japan * Longbing Cao. University of Technology, Sydney, Australia Program Committee Co-chairs * Jian Pei. Simon Fraser University, Canada * Vincent S. Tseng. National Cheng Kung University, Taiwan Local Arrangement Co-chairs * Vladimir Estivill-Castro. 
Griffith University (Gold Coast), Australia * Xue Li, University of Queensland, Australia * Richi Nayak, Queensland University of Technology, Australia * Xinhua Zhu, University of Technology, Sydney, Australia Workshop Co-chairs * Jiuyong Li, University of South Australia, Australia * Kay Chen Tan, National University of Singapore, Singapore * Bo Liu, Guangdong University of Technology, China Tutorial Co-chairs * Tu Bao Ho, Japan Advanced Institute of Science and Technology, Japan * Mengjie Zhang, Victoria University of Wellington, New Zealand Award Chair * Chengqi Zhang, University of Technology, Sydney, Australia Sponsorship Co-chair * Yue Xu, Queensland University of Technology, Australia Publicity Co-chairs * P. Krishna Reddy, The International Institute of Information Technology, Hyderabad, India * Yifeng Zeng, Aalborg University, Denmark * Xin Wang, University of Calgary, Canada * Zhihong Deng, Peking University, China ============================================================== Further Information For further information, please contact the Program Committee Chairs by email: pakdd13-program at pakdd.org.
General inquiries * Longbing Cao University of Technology Sydney, Australia Email: pakdd13 at pakdd.org Phone: (61)2-9514-4477 Fax: (61)2-9514-1807
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:29:03 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 12 Oct 2012 22:29:03 +0200 Subject: Livre: Recherche d'information contextuelle, assistee et personnalisee Message-ID: Date: Wed, 10 Oct 2012 14:47:10 +0200 From: Patrice Bellot Message-Id: <3E109703-5A05-4DBB-93B2-E86D7B77647F at univ-amu.fr> X-url: http://www.eyrolles.com/Informatique/Livre/recherche-d-information-contextuelle-assistee-et-personnalisee-9782746225831
Hello,
Below you will find the table of contents and part of the introduction to each chapter of the book "Recherche d'information contextuelle, assistée et personnalisée" (Contextual, Assisted and Personalized Information Retrieval), published in the "Recherche d'information et web" series by Hermès-Lavoisier (302 pages - ISBN-13: 978-2-7462-2583-1) http://www.eyrolles.com/Informatique/Livre/recherche-d-information-contextuelle-assistee-et-personnalisee-9782746225831
- Context and robustness
- Contextual information retrieval: the case of queries
- Robustness and syntactic parsing
- Information retrieval with noisy corpora and queries
- Question answering over audio documents
- Personalization and collaboration
- Information retrieval and user modelling
- Collaborative information retrieval
- Reading difficulties, dyslexia and information retrieval
- Assistance and navigation support
- Navigating audio documents through automatic summarization
- Interaction
- Word prediction and query entry on limited interfaces: mobile devices and disability assistance
Best regards, Patrice Bellot Aix-Marseille Université (AMU) - LSIS / CNRS
========================================================================
Chapter 1: Contextual information retrieval: the case of queries
Josiane MOTHE (IRIT, Toulouse)
========================================================================
Current information retrieval (IR) systems are often general-purpose: they apply the same mechanisms and the same information processing methods regardless of the search context, the user, the type of information need and the intended use of the retrieved information. Contextual IR aims to model the different aspects of context and their variety in order to integrate them into the search process. The contextual aspect refers to implicit or explicit knowledge about the user's intentions, the user's environment and the system itself. The hypothesis is that making certain elements of the IR context explicit could improve the performance of IR systems. In this chapter, we do not claim to cover all the elements associated with context; rather, we focus on one aspect of it, namely queries. Queries are the means by which users explicitly express their information need to the system. This aspect of the search context alone has many facets, which we address in what follows.
========================================================================
Chapter 2: Robustness and syntactic parsing
Philippe BLACHE and Stephane RAUZY (LPL, Aix-en-Provence)
========================================================================

In natural language processing, the robustness of an application is measured by its ability to withstand errors. These may stem either from a failure of the system or from a linguistic difficulty inherent in the text or utterance being processed. In either case, a robust system must be able to continue processing despite the error.

The question of robustness arises in a particular way in information retrieval [LEW 96, STR 94]. Many IR techniques ultimately exploit little linguistic information and do not really require detailed linguistic analysis. However, advances in semantic processing are beginning to rely on analyses that go beyond the lexical level, which calls for more sophisticated techniques capable of handling syntactic units and the relations between them. IR is therefore also affected by this development. Moreover, the field raises specific problems that may require finer-grained analyses (question understanding, multimodal queries, text comparison, etc.). In IR, as in other areas of language processing, we thus now face this question of robustness, which requires handling heterogeneous, non-canonical and partial data.

In this chapter we address the question by first describing more precisely the situations that lead systems into error.
Studying the needs specific to IR will allow us to identify more clearly the points that must be addressed to provide robust processing that supports fine-grained linguistic analysis. We concentrate on syntactic parsing, which is an essential step in deep processing. Parsing has long been left out of systems, partly because of its cost but also because of its lack of robustness. We present here some techniques that meet these needs. In particular, we describe a constraint-based approach that has the advantage of being robust, formally coherent, and able to accommodate future developments, notably the processing of multimodality.

========================================================================
Chapter 3: Information retrieval with noisy corpora and queries
Laurianne SITBON (QUT - Brisbane, Australia)
========================================================================

This chapter addresses both the approaches used to evaluate information retrieval systems that handle noisy corpora or queries and the techniques proposed in the literature for integrating noise into information access models. In particular, the transition between transcription systems (from audio to text, from handwriting to text, from erroneous text to text) and the core of information retrieval systems must rely on a probabilistic interpretation of the two interconnected systems. The chapter presents approaches suited to the evaluation and robust modeling of complex information retrieval systems such as question answering systems.

The volume and variety of accessible information are constantly increasing.
The quantity of available information encourages the development of increasingly complex and targeted approaches to information retrieval, such as question answering systems (chapter 4), recommender systems (chapter 6) and classification-based systems. The variety of information types, by moving away from standardized textual formats, reduces the certainty with which systems can interpret the available data. Nevertheless, most systems reduce their input to normalized text before analyzing or indexing data or queries. When the performance of systems evaluated under standardized conditions drops under real conditions, the share of that drop attributable to noise is not always clearly established. In particular, a major question is what effect noise in corpora or queries has on information retrieval systems.

In this chapter, we examine the evaluations that have been conducted and the solutions that have been proposed for ad hoc information retrieval systems, and then propose evaluation and modeling methodologies suited to complex information systems. Question answering systems serve as an example for the processing of non-standard queries.

After an introduction describing the nature of the noise encountered by modern information retrieval systems, the second section presents various analyses of the impact of noise on system effectiveness. The third section proposes a modular approach for analyzing the impact of noisy queries on a question answering system. The fourth section presents the approaches proposed in the literature for taking probabilistic corrections of noise into account.
The final section introduces a correction system for noisy queries, together with a probabilistic approach for complex information retrieval systems such as question answering systems. Finally, a new approach is proposed that sets out the conditions for evaluating transcription systems under uncertain interpretation.

========================================================================
Chapter 4: Question answering over audio documents
Olivier GALIBERT, Sophie ROSSET and Lori LAMEL (LIMSI, Paris Orsay)
========================================================================

The aim of this chapter is to survey the state of the art in retrieving precise information from audio documents. More and more spoken documents and data are becoming available. Whether they are radio and television news broadcasts, recordings of seminars or meetings, or podcasts, they are an important source of information. Enabling information retrieval over this type of data appears increasingly necessary. Question answering systems belong to the family of tools that support access to information. In this setting, work has been under way for some years (since 2007) to enable effective search over this type of data.

Question answering systems can be seen as an extension of information retrieval systems, which let a user search for information using keywords. In return, the user obtains a list of documents, or pointers to documents, which must then be consulted to find the precise information sought.
Question answering systems, by contrast, aim to let a user ask a precise question in natural language, in writing or orally, and obtain a precise answer in return, possibly accompanied by a document or document excerpt that justifies or supplements the answer. This presupposes that question answering systems analyze the question, understand its meaning, analyze the documents and extract the appropriate answer from them.

========================================================================
Chapter 5: Information retrieval and user modeling
Guillaume CABANAC, Max CHEVALIER, Christine JULIEN, Gilles HUBERT, Chantal SOULE-DUPUY (IRIT, Toulouse) & Céline CLAVEL (LIMSI, Paris Orsay) & Alexandra CIACCIA (PPCC, Paris Nanterre) & André TRICOT (CLLE, Toulouse)
========================================================================

This chapter grew out of a reflection on the place of the user in the development of computerized information systems, conducted jointly by members of two communities that can offer specific and complementary perspectives (computer science and cognitive ergonomics). Fundamentally, for everyone, a user is a person who, in a given context (professional, personal, etc.), needs (or has to use) a computerized system (any piece of software, or in this case an information retrieval system) to carry out a task with a specific goal. Designing such a system means answering at least the following basic questions: Who is the user? Where is the user? What does the user want to do or find? How and why? However, in answering these questions and characterizing the user, each of the two communities approaches the user differently.

This chapter summarizes the current state of this reflection on user modeling in the context of information retrieval (IR).
This reflection was conducted jointly by members of the two communities. The chapter offers general recommendations on taking the users of information retrieval systems (IRS) into account. At the same time, it aims to provide general knowledge useful to ergonomics, that is, knowledge useful for evaluating IRSs from a cognitive point of view in order to improve them, or even to improve the design process for these tools.

To illustrate how users are taken into account in IRSs, section 2 covers the classical approaches to user modeling developed in computer science (and in IRS design), as well as applications of these models. Section 3 presents the results of studies in cognitive ergonomics on the influence of the characteristics of the task, the tool and the user on the use of an IRS. Drawing sections 2 and 3 together, section 4 discusses the complementarity of the two approaches and their differences in point of view. It assesses the limits and the stakes of taking the user into account in IR processes, based on observations drawn from the different viewpoints (computer science and the cognitive sciences).

========================================================================
Chapter 6: Collaborative information retrieval
Nathalie DENOS (LIG, Grenoble)
========================================================================

Information retrieval has a very strong social dimension. We send a colleague an interesting reference; we choose to watch the most frequently downloaded video first; faced with an information need in an unfamiliar domain, we call on a knowledgeable person for help in formulating the query; we research a topic together to prepare a presentation; we turn to the recommendations of an online retailer for ideas of books to buy.
These are all manifestations of the social nature of information retrieval. This chapter surveys advances in collaborative information retrieval in all its forms.

========================================================================
Chapter 7: Reading difficulties, dyslexias and information retrieval
Patrice BELLOT (LSIS, Marseille)
========================================================================

While there is a large body of work on taking context into account in information retrieval (see chapter 1) and on personalization (see chapter 5), there are major gaps when it comes to adapting to users with limited reading abilities. These may be people with language disorders (for example dyslexia, which makes reading slow and difficult), but also people who have insufficient command of the language of the document they are consulting, or who face content that requires more expertise than they have. Personalizing information retrieval while taking individual reading performance into account is one of the major challenges for a society in which access to information increasingly takes place via the Internet, without human mediation that could attenuate the differences between individuals.

In this chapter, we first look at cognitive models of reading in order to identify the full set of criteria that could help best estimate readability. We then refer to the main work that has addressed the automatic estimation of the readability of a text, and we propose a concrete way of exploiting readability within an information retrieval system. Next, we define dyslexia, or rather the dyslexias, as a subject of study.
Although there is an obvious continuum from people who have never learned to read or are functionally illiterate to expert readers, which the many available reading tests can reflect, we have chosen in this chapter to concentrate on the dyslexias. They significantly affect all segments of the population and constitute a disability that does not require the design of remediation devices that are too heavy or invasive. The proposals in the first sections of the chapter serve as the basis for defining a specific readability measure, which opens up interesting prospects for adapting information retrieval.

========================================================================
Chapter 8: Navigating audio documents through automatic summarization
Benoit FAVRE (LIF, Marseille)
========================================================================

As recording and storing audio data has become easy, it is becoming urgent to be able to manipulate such data as easily as textual data. The advent of portable digital players, for example, has given rise to listening to amateur radio shows (podcasts) and audiobooks, available on demand on the Internet. Although these documents are often consumed like radio broadcasts, they are widely archived, and there is no solution for retrieving them by content. Only metadata created by their authors provides access to them.

In many domains, conversations are recorded and archived. Telephone customer services, for example, study the content of conversations between agents and customers after the fact in order to improve their service. In the legal and financial domains, many conversations are recorded to ensure the traceability of decisions.
Any work meeting can potentially be recorded so that participants can retrieve information given orally, or so that others can keep up with progress on the topics discussed. Although the recording and archiving of audio documents is well developed, there are few ways to structure, index and retrieve the information they contain.

Navigating audio documents is a pervasive problem owing to the ephemeral nature of sound. Sound plays back continuously over time: whereas an object can be identified at a glance, a sound must be listened to in its entirety to be identified. It seems harder to locate events in time than to use the continuous feedback of vision to locate objects in space. As a result, it is difficult to develop effective interfaces for accessing the content of audio documents.

In this chapter, we first review the state of the art in navigation and summarization for audio documents, and then describe in detail an experiment demonstrating the usefulness of speech summarization. Two applications are then presented to illustrate better capture of the user's need through keywords, and navigation in documents that span long durations.

========================================================================
Chapter 9: Interaction
Mountaz HASCOET (LIRMM, Montpellier)
========================================================================

The rapid exploration of unfamiliar sets of information, bringing out relations, structures, similarities, repetitions or differences within that information, can be approached through different interaction models.
Interaction makes it possible to really exploit precomputed overviews, because human beings are particularly adept at extracting information from an environment they can act upon, as opposed to one they can only observe passively. According to the ecological approach to perception due to the psychologist Gibson [GIB 79], perception is inseparable from action: one must act in order to perceive, and perceive in order to act. This is known as action-perception coupling (or the action-perception loop). Moreover, perceiving our environment consists in extracting invariants from perceived flows (such as the visual flow). For example, when we move, the direction of movement is given by the only stationary point in the visual flow. By interacting with the data, users can act on what they perceive and, by extracting invariants, better understand the nature of the data or of the process that represents them.

We begin with a brief overview of the analysis of interaction in the field of searching for and exploiting information, and then present the interaction styles in use, from the most classical approaches to the most innovative: faceted interaction, dynamic filtering, brushing, zoomable interfaces, deformable interfaces and, finally, distributed interaction.

========================================================================
Chapter 10: Word prediction and query entry on limited interfaces: mobile devices and disability assistance
Jean-Yves ANTOINE (LI, Tours)
========================================================================

The Internet revolution is barely behind us, and already a new era is emerging just as swiftly: that of mobile and ubiquitous computing.
In contrast to desktop or home computing, ubiquitous (or ambient) computing involves multiple systems at any time and in any place in daily life. Information retrieval is directly affected by this development. One of the most widespread uses of smartphones (first and foremost the iPhone) is searching for information or a service on the Web. When such a search is initiated by a keyword query or a natural-language utterance, it falls within a broader problem: text entry on a limited interface. By this we mean that the user does not have a standard keyboard because of the small size of the device: it may be, for example, a telephone keypad with a reduced number of keys, or a virtual keyboard displayed on a touchscreen. In all cases, the limited nature of the input device slows down message composition. An increase in typing errors is also frequently observed.

Language engineering can offer tools capable of compensating for these shortcomings. This is particularly true of linguistic prediction, the subject of this chapter: if the system can correctly predict the next letters or words the user wishes to enter, selecting the corresponding hypotheses speeds up message composition and avoids certain errors.

We first set out the problem of assisted message entry by describing the various input devices that can be used in these mobile settings. This study will bring out the importance of linguistic prediction in assisting message composition.
We then present various prediction models in detail, with particular emphasis on the most advanced techniques for contextual adaptation of prediction. Our discussion draws on the results of experimental evaluations in order to assess the value of each technique studied.

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:32:46 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:32:46 +0200
Subject: Journee: Journee d'etude S'caladis et TAL, Structures enumeratives dans le discours, 8 novembre 2012, Toulouse
Message-ID:

Date: Thu, 11 Oct 2012 16:50:59 +0200
From: Josette Rebeyrolle
Message-ID: <5076DCD3.1090202 at univ-tlse2.fr>

On Thursday, November 8, 2012, a study day open to all will be held at the Université de Toulouse-Le Mirail:

Structures énumératives dans le discours (Enumerative structures in discourse)
(organized by the S'caladis and TAL research strands of CLLE-ERSS, UMR 5263)

Venue: Université de Toulouse-Le Mirail, Maison de la Recherche, room D155

Program:

9:30-10:00: M.-P. Péry-Woodley (CLLE-ERSS, CNRS & UTM)
Why take an interest in enumerative structures?

10:00-10:30: L. Tanguy and L.-M. Ho-Dac (CLLE-ERSS, CNRS & UTM)
Identifying complex markers of multi-level structures

10:30-11:00: J. Rebeyrolle (CLLE-ERSS, CNRS & UTM)
Exploiting the ANNODIS resource: the case of closures of enumerative structures

11:20-11:50: M. Vergez-Couret and M. Bras (CLLE-ERSS, CNRS & UTM)
Enumerative structures in SDRT

11:50-12:20: L. A. Johnsen (Universities of Neuchâtel and Fribourg)
The phrase 'tout ça' in spoken French at the end of a list: between referential marker and discourse marker

14:00-15:00: C. Schnedecker (LiLPa, Université de Strasbourg)
Ordinal-based markers as a strong cue for enumerative structures

15:30-15:50: M. Bras (CLLE-ERSS, CNRS & UTM) and C. Schnedecker (LiLPa, Université de Strasbourg)
'Dans un premier temps' / 'en premier lieu': markers of enumerative structures?

15:50-17:30: L.-M. Ho-Dac, M.-P. Péry-Woodley and J. Rebeyrolle (CLLE-ERSS, CNRS & UTM)
Workshop: presentation of the ANNODIS resource

Myriam Bras, Marie-Paule Péry-Woodley, Josette Rebeyrolle

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:34:25 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:34:25 +0200
Subject: Appel: WIMS'13, Call for Papers & Proposals
Message-ID:

Date: Thu, 11 Oct 2012 17:04:02 +0200
From: Plantié Michel
Message-ID: <5076DFE2.4040000 at mines-ales.fr>
X-url: http://aida.ii.uam.es/wims13/

(Apologies for cross-posting!)

CALL FOR PAPERS

International Conference on Web Intelligence, Mining and Semantics (WIMS'13)
June 12-14, 2013
Madrid, Spain
http://aida.ii.uam.es/wims13/

About WIMS'13:

The 3rd International Conference on Web Intelligence, Mining and Semantics (WIMS'13) will be organised under the auspices of the Autonomous University of Madrid, Spain.
The WIMS series of conferences is concerned with intelligent approaches to transform the World Wide Web into a global reasoning and semantics-driven computing machine. The conference will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Web intelligence, Web mining and Web semantics.

The purpose of WIMS'13 is:
- To provide a forum for established researchers and practitioners to present past and current research contributing to the state of the art of Web technology research and applications.
- To give doctoral students an opportunity to present their research to a friendly and knowledgeable audience and receive valuable feedback.
- To provide an informal social event where Web technology researchers and practitioners can meet.

Conference Venue:

The conference will be hosted by the Autonomous University of Madrid.

Call for Papers/Tutorials/Posters/Workshop:

Authors are invited to submit full papers, tutorial proposals and posters on all related areas. Papers exploring new directions or areas will receive a thorough and encouraging review. Areas of interest include, but are not limited to:

- Semantics-driven information retrieval
- Semantic agent systems
- Semantic data search
- Collective Intelligence
- Social Networking and Semantic Technologies
- Interaction paradigms for semantic search
- Evaluation of semantic search
- User interfaces
- Web mining
- Ubiquitous computing
- Bio-inspired Models & the Web
- Large Scale Data Mining
- Semantic deep Web and intelligent e-Technology
- Representation techniques for Web-based knowledge
- Quality of Life Technology for Web Document Access
- Rule markup languages and systems
- Semantic 3D media and content
- Scalability vs. expressivity of reasoning on the Web

The detailed call for contributed papers, tutorial/workshop proposals, and posters can be found at: http://aida.ii.uam.es/wims13/cfp.php

How to submit:

The maximum length of submissions is:
- research papers: 12 pages in ACM format
- tutorial/demonstration papers: 3 to 12 pages in ACM format
- posters: 2 pages in ACM format

Please note that the submission format is MS Word or PDF. The papers must be written in English and formatted according to the ACM guidelines. Author instructions and style files can be downloaded at http://www.acm.org/sigs/publications/proceedings-templates

Authors of accepted papers are expected to attend the conference and present their work.

Tutorial/demonstration proposals, poster papers and full research paper submissions must be made electronically in MS Word or PDF format through the EasyChair submission system at https://www.easychair.org/conferences/?conf=wims13

Publication:

Accepted papers/tutorials/posters will be published by ACM and disseminated through the ACM Digital Library. Selected extended papers will be invited to appear in special issues of reputed journals in the field and also in a book published by Elsevier.
Important Dates

Electronic submission of research papers: December 23, 2012
Electronic submission of poster papers: December 23, 2012
Tutorial and Workshop proposals due: December 23, 2012
Notification of workshop acceptance: January 10, 2013
Notification of tutorial acceptance: January 10, 2013
Notification of paper/poster acceptance: February 17, 2013
Registration opens: February 18, 2013
Camera-ready of accepted papers/tutorials: March 4, 2013
Deadline for paper submissions for workshops: February 7, 2013
Acceptance of papers for workshops: March 10, 2013
Camera-ready workshop papers: March 30, 2013
Author registration deadline: March 30, 2013
Conference: June 12-14, 2013

Contact:

David Camacho
Escuela Politécnica Superior
Universidad Autónoma de Madrid
Francisco Tomás y Valiente, 11, 28049 Madrid, Spain
Tel/Fax: +34 91 497 51 00
E-mail: wims13 at uam.es

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:35:29 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:35:29 +0200
Subject: Ressource: ANNODIS, un corpus enrichi d'annotations discursives
Message-ID:

Date: Fri, 12 Oct 2012 09:47:20 +0200
From: Marie-Paule PERY-WOODLEY
Message-ID: <5077CB08.4000600 at univ-tlse2.fr>
X-url: http://redac.univ-tlse2.fr/corpus/annodis

ANNODIS, a freely available discourse-level annotated corpus

We are pleased to announce that the ANNODIS resource, a discourse-level annotated corpus for French, is now available online: http://redac.univ-tlse2.fr/corpus/annodis.

The corpus of written French (687,000 words) is diverse in genre, length and type of discourse organisation. The annotated objects, which reflect two distinct approaches to discourse, are rhetorical relations and two types of multi-level structures: topical chains and enumerative structures.

The corpus can be downloaded freely and, in the case of multi-level structures, explored online via a browser.

The ANNODIS resource team (CLLE-ERSS and IRIT, Université de Toulouse):
Lydia-Mai Ho-Dac (contact), Stergos Afantenos, Nicholas Asher, Farah Benamara, Myriam Bras, Cécile Fabre, Anne Le Draoulec, Philippe Muller, Marie-Paule Péry-Woodley, Laurent Prévot, Josette Rebeyrolle, Ludovic Tanguy, Marianne Vergez-Couret, Laure Vieu.
From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:12:53 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 13:12:53 +0200 Subject: Appel: LATA 2013 Message-ID: Date: Sun, 14 Oct 2012 18:13:39 +0200 From: "GRLMC" Message-ID: <17CA16231E834D33BB719B63722D95C4 at Carlos1> X-url: http://grammars.grlmc.com/LATA2013/

------------------------------------------------------------------------
7th INTERNATIONAL CONFERENCE ON LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS
LATA 2013
Bilbao, Spain
April 2-5, 2013
Organized by: Research Group on Mathematical Linguistics (GRLMC), Rovira i Virgili University
http://grammars.grlmc.com/LATA2013/

AIMS: LATA is a yearly conference in theoretical computer science and its applications. Following the tradition of the International Schools in Formal Languages and Applications developed at Rovira i Virgili University in Tarragona since 2002, LATA 2013 will reserve significant room for young scholars at the beginning of their career. It will aim at attracting contributions from both classical theory fields and application areas (bioinformatics, systems biology, language technology, artificial intelligence, etc.).

VENUE: LATA 2013 will take place in Bilbao, in the Basque Country in northern Spain. The venue will be the Basque Center for Applied Mathematics (BCAM).
SCOPE: Topics of either theoretical or applied interest include, but are not limited to: ‐ algebraic language theory ‐ algorithms for semi‐structured data mining ‐ algorithms on automata and words ‐ automata and logic ‐ automata for system analysis and programme verification ‐ automata, concurrency and Petri nets ‐ automatic structures ‐ cellular automata ‐ combinatorics on words ‐ computability ‐ computational complexity ‐ computational linguistics ‐ data and image compression ‐ decidability questions on words and languages ‐ descriptional complexity ‐ DNA and other models of bio‐inspired computing ‐ document engineering ‐ foundations of finite state technology ‐ foundations of XML ‐ fuzzy and rough languages ‐ grammars (Chomsky hierarchy, contextual, multidimensional, unification, categorial, etc.) ‐ grammars and automata architectures ‐ grammatical inference and algorithmic learning ‐ graphs and graph transformation ‐ language varieties and semigroups ‐ language‐based cryptography ‐ language‐theoretic foundations of artificial intelligence and artificial life ‐ parallel and regulated rewriting ‐ parsing ‐ pattern recognition ‐ patterns and codes ‐ power series ‐ quantum, chemical and optical computing ‐ semantics ‐ string and combinatorial issues in computational biology and bioinformatics ‐ string processing algorithms ‐ symbolic dynamics ‐ symbolic neural networks ‐ term rewriting ‐ transducers ‐ trees, tree languages and tree automata ‐ weighted automata STRUCTURE: LATA 2013 will consist of: ‐ invited talks ‐ invited tutorials ‐ peer‐reviewed contributions INVITED SPEAKERS: Jin-Yi Cai (Madison), Complexity Dichotomy for Counting Problems Kousha Etessami (Edinburgh), Algorithms for Analyzing Infinite-state Recursive Probabilistic Systems Luke Ong (Oxford), tutorial Languages and Automata for Higher-order Model Checking Joël Ouaknine (Oxford), tutorial Discrete Linear Dynamical Systems Thomas Schwentick (Dortmund), Applications of Automata in Database Theory -- 
Challenges to Automata Theory from Databases Andrei Voronkov (Manchester), The Lazy Reviewer Assignment Problem in EasyChair PROGRAMME COMMITTEE: Parosh Aziz Abdulla (Uppsala) Franz Baader (Dresden) Jos Baeten (CWI, Amsterdam) Christel Baier (Dresden) Gerth Stølting Brodal (Aarhus) John Case (Delaware) Marek Chrobak (Riverside) Mariangiola Dezani (Torino) Rod Downey (Wellington) Ding-Zhu Du (Dallas) Ivo Düntsch (Brock) E. Allen Emerson (Austin) Javier Esparza (Technical University Munich) Michael R. Fellows (Darwin) Alain Finkel (ENS Cachan) Dov M. Gabbay (King’s, London) Jürgen Giesl (Aachen) Rob van Glabbeek (NICTA, Sydney) Georg Gottlob (Oxford) Annegret Habel (Oldenburg) Reiko Heckel (Leicester) Sanjay Jain (Singapore) Charanjit S. Jutla (IBM Thomas J. Watson) Ming-Yang Kao (Northwestern) Deepak Kapur (Albuquerque) Joost-Pieter Katoen (Aachen) S. Rao Kosaraju (Johns Hopkins) Evangelos Kranakis (Carleton) Hans-Jörg Kreowski (Bremen) Tak-Wah Lam (Hong Kong) Gad M. Landau (Haifa) Kim G. Larsen (Aalborg) Richard Lipton (Georgia Tech) Jack Lutz (Iowa State) Ian Mackie (École Polytechnique, Palaiseau) Rupak Majumdar (Max Planck, Kaiserslautern) Carlos Martín-Vide (Tarragona, chair) Paliath Narendran (Albany) Tobias Nipkow (Technical University Munich) David A. Plaisted (Chapel Hill) Jean-François Raskin (Brussels) Wolfgang Reisig (Humboldt Berlin) Michaël Rusinowitch (LORIA, Nancy) Davide Sangiorgi (Bologna) Bernhard Steffen (Dortmund) Colin Stirling (Edinburgh) Alfonso Valencia (CNIO, Madrid) Helmut Veith (Vienna Tech) Heribert Vollmer (Hannover) Osamu Watanabe (Tokyo Tech) Pierre Wolper (Liège) Louxin Zhang (Singapore) ORGANIZING COMMITTEE: Adrian Horia Dediu (Tarragona) Peter Leupold (Tarragona) Carlos Martín‐Vide (Tarragona, co-chair) Magaly Roldán (Bilbao) Bianca Truthe (Magdeburg) Florentina Lilica Voicu (Tarragona) Enrique Zuazua (Bilbao, co-chair) SUBMISSIONS: Authors are invited to submit papers presenting original and unpublished research. 
Papers should not exceed 12 single-spaced pages (including any appendices) and should be formatted according to the standard format for Springer Verlag's LNCS series (see http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0). Submissions have to be uploaded to: https://www.easychair.org/conferences/?conf=lata2013

PUBLICATIONS: A volume of proceedings published by Springer in the LNCS series will be available by the time of the conference. A special issue of a major journal will later be published containing peer-reviewed extended versions of some of the papers contributed to the conference. Submissions to it will be by invitation.

REGISTRATION: The registration period runs from August 6, 2012 to April 2, 2013. The registration form can be found at the website of the conference: http://grammars.grlmc.com/LATA2013/

FEES:
Early registration fees: 500 Euro
Early registration fees (PhD students): 400 Euro
Late registration fees: 540 Euro
Late registration fees (PhD students): 440 Euro
On-site registration fees: 580 Euro
On-site registration fees (PhD students): 480 Euro

At least one author per paper should register. Papers that do not have a registered author who paid the fees by January 2, 2013 will be excluded from the proceedings. One registration gives the right to present only one paper. Fees comprise access to all sessions, one copy of the proceedings volume, coffee breaks and lunches.

PAYMENT: Early (resp. late) registration fees must be paid by bank transfer before January 2, 2013 (resp. March 23, 2013) to the conference bank account:
Uno-e Bank
bank’s address: Julian Camarillo 4 C, 28037 Madrid, Spain
IBAN: ES3902270001820201823142
BIC/SWIFT: UNOEESM1
account holder: C. Martin – GRLMC
account holder’s address: Av. Catalunya 35, 43002 Tarragona, Spain

Please mention LATA 2013 and your name in the subject. A receipt will be provided on site.

Remarks:
- Bank transfers should not involve any expense for the conference.
- People claiming early registration will be requested to prove that the bank transfer order was carried out by the deadline.
- PhD students will need to provide evidence of their status on site.

People registering on site must pay in cash. For the sake of local organization, however, registering earlier is strongly recommended.

DEADLINES:
Paper submission: November 9, 2012 (23:59h, CET)
Notification of paper acceptance or rejection: December 16, 2012
Final version of the paper for the LNCS proceedings: December 25, 2012
Early registration: January 2, 2013
Late registration: March 23, 2013
Start of the conference: April 2, 2013
End of the conference: April 5, 2013
Submission to the post-conference journal special issue: July 5, 2013

QUESTIONS AND FURTHER INFORMATION: florentinalilica.voicu at urv.cat

POSTAL ADDRESS: LATA 2013, Research Group on Mathematical Linguistics (GRLMC), Rovira i Virgili University, Av. Catalunya, 35, 43002 Tarragona, Spain
Phone: +34-977-559543
Fax: +34-977-558386

ACKNOWLEDGEMENTS: Basque Center for Applied Mathematics, Diputació de Tarragona, Universitat Rovira i Virgili

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:30:21 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 13:30:21 +0200 Subject: Appel: New Extended Deadline CFP - ML4HMT-12 Workshop at COLING 2012 Message-ID: Date: Mon, 15 Oct 2012 16:20:39 +0200 From: Maite Melero Message-ID: <16B44A00C6287E46A454C5F9A5914F6FE16646A7CE at
FBMEC01.corp.barcelonamedia.org> X-url: http://www.dfki.de/ml4hmt/ -----Apologies for duplicate postings----- ***CALL FOR PAPERS --- EXTENDED DEADLINE --- NEW DEADLINE: OCT 22nd*** “Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12 WS and Shared Task)” at COLING 2012 Mumbai (India), 9th December, 2012 URL: http://www.dfki.de/ml4hmt/ The workshop and associated shared task are an effort to trigger a systematic investigation on improving state-of-the-art hybrid machine translation, making use of advanced machine-learning (ML) methodologies. It follows the ML4HMT-11 workshop which took place last November in Barcelona. The first workshop also road-tested a shared task (and associated data set) and laid the basis for a broader reach in 2012. Regular Papers ML4HMT-12 ======================== We are soliciting original papers on hybrid MT, including (but not limited to): * use of machine learning methods in hybrid MT; * system combination: parallel in multi-engine MT (MEMT) or sequential in statistical post-editing (SPMT); * combining phrases and translation units from different types of MT; * syntactic pre-/re-ordering; * using richer linguistic information in phrase-based or in hierarchical SMT; * learning resources (e.g., transfer rules, transduction grammars) for probabilistic rule-based MT. Full papers should be anonymous and follow the COLING full paper format (http://www.coling2012-iitb.org/call_for_papers.php). To submit contributions, please follow the instructions at the Workshop management system submission website: https://www.softconf.com/coling2012/ML4HMT12/. The contributions will undergo a double-blind review by members of the programme committee. 
Shared Task ML4HMT-12
=====================
The main focus of the Shared Task is to address the question: "Can Hybrid MT and System Combination techniques benefit from extra information (linguistically motivated, decoding, runtime, confidence scores, or other meta-data) from the systems involved?" Participants are invited to build hybrid MT systems and/or system combinations by using the output of several MT systems of different types, as provided by the organisers. While participants are encouraged to use machine learning techniques to explore the additional meta-data information sources, other general improvements in hybrid and combination-based MT are also welcome in the challenge. For systems that exploit additional meta-data, the challenge is that this meta-data is highly heterogeneous and specific to each individual system.

Data: The ML4HMT-12 Shared Task involves (ES-EN) and (ZH-EN) data sets, in each case translating into EN.
* (ES-EN): Participants are given a bilingual development set aligned at the sentence level. Each "bilingual sentence" contains: 1) the source sentence, 2) the target (reference) sentence and 3) the corresponding multiple output translations from four systems, based on different MT approaches (Apertium, Ramirez-Sanchez, 2006; Lucy, Alonso and Thurmair, 2003; Moses, Koehn et al., 2007). The output has been annotated with system-internal meta-data information derived from the translation process of each of the systems.
* (ZH-EN): A corresponding data set for ZH-EN with output translations from three systems (Moses, ICT_Chiero, Mi et al., 2009; and Huajian RBMT) will be provided. (Participants are required to fill out a shared task evaluation agreement form and obtain the ZH-EN data from LDC.)

Participants are challenged to build an MT mechanism that, where possible, makes effective use of the system-specific MT meta-data output. They can provide solutions based on open-source systems, or develop their own mechanisms.
The development set can be used for tuning the systems during the development phase. Final submissions have to include translation output on a test set, which will be made available one week after the training data release. Data will be provided to build language/reordering models, possibly re-using existing resources from MT research. Participants can also make use of additional tools (linguistic analysis, confidence estimation, etc.) if their systems require them, but they have to declare this explicitly upon submission so that they are judged as "unconstrained" systems. This will allow for a better comparison between participating systems.

Shared task results should be submitted as an email attachment. Please compress your results as a .zip or .gz archive and send them to cfedermann at dfki.de. Use "ML4HMT-12 Shared Task Submission" as the mail subject. Shared task results are due by October 28th.

System output will be judged via peer-based human evaluation as well as automatic evaluation. During the evaluation phase, participants will be requested to rank system outputs of other participants through a web-based interface (Appraise, Federmann 2010). Automatic metrics include BLEU (Papineni et al., 2002), TER (Snover et al., 2006) and METEOR (Lavie, 2005). Results from the automatic evaluation of submitted shared task results will be made available to participants on November 1st so that they can be referred to in system description papers. As the manual evaluation will take longer, its results will be presented and published at the workshop.

Workshop Participation
======================
If you are interested in our workshop and intend to participate, we would much appreciate it if you could inform us of your intent beforehand so that we can better plan the workshop; to do so, send an email to cfedermann at dfki.de.
Important Dates 2012
====================
15th August: Shared task training data release (updated ML4HMT corpus)
23rd August: Shared task test data release
22nd October: Workshop full paper submission deadline
28th October: Shared task translation results submission deadline
31st October: Workshop paper accept/reject notification
1st November: Shared task evaluation results release
4th November: Shared task system description paper submission
11th November: Shared task system description paper accept/reject notification
18th November: Workshop and shared task camera-ready papers due
9th December: ML4HMT-12 Workshop

Organizers
==========
- Prof. Josef van Genabith, Dublin City University (DCU) and Centre for Next Generation Localisation (CNGL)
- Prof. Toni Badia, Universitat Pompeu Fabra and Barcelona Media (BM)
- Christian Federmann, German Research Center for Artificial Intelligence (DFKI), contact person: cfedermann at dfki.de
- Dr. Maite Melero, Barcelona Media (BM)
- Dr. Marta R. Costa-jussà, Barcelona Media (BM)
- Dr. Tsuyoshi Okita, Dublin City University (DCU)

Program committee
=================
- Eleftherios Avramidis (German Research Center for Artificial Intelligence, Germany)
- Prof. Sivaji Bandyopadhyay (Jadavpur University, India)
- Dr. Rafael Banchs (Institute for Infocomm Research - I2R, Singapore)
- Prof. Loïc Barrault (LIUM - University of Le Mans, France)
- Prof. Antal van den Bosch (Centre for Language Studies, Radboud University Nijmegen, Netherlands)
- Dr. Grzegorz Chrupala (Saarland University, Saarbrücken, Germany)
- Prof. Jinhua Du (Xi'an University of Technology (XAUT), China)
- Dr. Andreas Eisele (Directorate-General for Translation (DGT), Luxembourg)
- Dr. Cristina España-Bonet (Technical University of Catalonia, TALP, Barcelona)
- Dr. Declan Groves (Center for Next Generation Localisation, Dublin City University, Ireland)
- Prof. Jan Hajic (Institute of Formal and Applied Linguistics, Charles University in Prague)
- Prof.
Timo Honkela (Aalto University, Finland)
- Dr. Patrick Lambert (LIUM - University of Le Mans, France)
- Prof. Qun Liu (Institute of Computing Technology, Chinese Academy of Sciences, China)
- Dr. Maite Melero (Barcelona Media Innovation Center, Spain)
- Dr. Tsuyoshi Okita (Dublin City University, Ireland)
- Prof. Pavel Pecina (Institute of Formal and Applied Linguistics, Charles University in Prague)
- Dr. Marta R. Costa-jussà (Barcelona Media Innovation Center, Spain)
- Dr. Felipe Sanchez Martinez (Escuela Politecnica Superior, Universidad de Alicante, Spain)
- Dr. Nicolas Stroppa (Google, Zurich, Switzerland)
- Prof. Hans Uszkoreit (German Research Center for Artificial Intelligence, Germany)
- Dr. David Vilar (German Research Center for Artificial Intelligence, Germany)

The ML4HMT workshop is supported by the META-NET T4ME project (http://www.meta-net.eu/), funded by the DG INFSO of the European Commission through the Seventh Framework Programme, grant agreement no.: 249119.

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:34:06 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 13:34:06 +0200 Subject: Appel: 4e Colloque Res per nomen, Reims Message-ID: Date: Mon, 15 Oct 2012 22:12:28 +0200 From: EMILIA HILGERT Message-ID: <16925_1350332248_507C6F57_16925_9217_1_20121015221228.83s7096nggc00cs4 at wmp.univ-reims.fr> X-url: http://www.res-per-nomen.org/respernomen/colloque-2013/Accueil-2013.html

REMINDER

Dear colleagues,

We would like to remind you that the deadline for submitting paper proposals for the fourth Res per nomen conference, « Les théories du sens et de la référence. Hommage à Georges Kleiber » (Theories of meaning and reference: a tribute to Georges Kleiber), is October 30, 2012. See the Res per nomen website: http://www.res-per-nomen.org/respernomen/colloque-2013/Accueil-2013.html

Kind regards,
The organizers,
Emilia Hilgert, Silvia Palma, Pierre Frath, René Daval

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:39:01 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 13:39:01 +0200 Subject: Appel: Cogalex - Deadline extension: October 21, 2012 (no further extensions possible) Message-ID: Date: Tue, 16 Oct 2012 14:07:25 +0200 From: Michael Zock Message-ID: <507D4DFD.8010100 at lif.univ-mrs.fr> X-url: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html

==============================================================
Deadline extension: October 21, 2012 (no further extensions possible)
All other dates ('notification of acceptance' and 'camera-ready paper due') remain unchanged
==============================================================
CogALex-3 (Cognitive Aspects of the Lexicon), a post-COLING workshop
New deadline for paper submission: October 21, 2012
More details: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html
==============================================================
3rd Workshop on "Cognitive Aspects of the Lexicon" (CogALex) Post-conference
workshop at COLING 2012 (December 15, Mumbai, India) Invited speaker: Alain Polguère (Université de Lorraine & ATILF CNRS, France) Submission deadline: October 21, 2012 AIMS and TARGET AUDIENCE The aim of this workshop is to bring together researchers involved in the construction and application of electronic dictionaries to discuss modifications of existing resources in line with the users' needs, thereby fully exploiting the advantages of the digital form. Given the breadth of the questions, we welcome reports on work from many perspectives, including but not limited to: computational lexicography, psycholinguistics, cognitive psychology, language learning and ergonomics. MOTIVATION The way we look at dictionaries, their creation and use, has changed dramatically over the past 30 years. (1) While being considered as an appendix to grammar in the past, they have in the meantime moved to centre stage. Indeed, there is hardly any task in NLP which can be conducted without them. (2) Also, many lexicographers work nowadays with huge digital corpora, using language technology to build and to maintain the lexicon. (3) Last, but not least, rather than being static entities (data-base view), dictionaries are now viewed as graphs, whose nodes and links (connection strengths) may change over time. Interestingly, properties concerning topology, clustering and evolution known from other disciplines (society, economy, human brain) also apply to dictionaries: everything is linked, hence accessible, and everything is evolving. Given these similarities, one may wonder what we can learn from these disciplines. In this 3rd edition of the CogALex workshop we therefore intend to also invite scientists working in these fields, our goals being to broaden the picture, i.e. to gain a better understanding concerning the mental lexicon and to integrate these findings into our dictionaries in order to support navigation. 
Given recent advances in the neurosciences, it appears timely to seek inspiration from neuroscientists studying the human brain. There is also a lot to be learned from other fields studying graphs and networks, even if their object of study is something other than language, for example biology, economy or society.

TOPICS OF INTEREST
This workshop is about possible enhancements of existing electronic dictionaries. To perform the groundwork for the next generation of electronic dictionaries we invite researchers involved in the building of such dictionaries. The idea is to discuss modifications of existing resources by taking the users' needs and knowledge states into account, and to capitalize on the advantages of digital media. For this workshop we invite papers including but not limited to the following topics, which can be considered from various points of view: linguistics, neuro- or psycholinguistics (associations, tip-of-the-tongue problem), network-related sciences (complex graphs, network topology, small-world problem), etc.

1) Analysis of the conceptual input of a dictionary user
- What does a language producer start from (bag of words)?
- What is in the authors' minds when they are generating a message and looking for a word?
- What does it take to bridge the gap between this input and the desired output (target word)?

2) The meaning of words
- Lexical representation (holistic, decomposed)
- Meaning representation (concept based, primitives)
- Revelation of hidden information (vector-based approaches: LSA/HAL)
- Neural models, neurosemantics, neurocomputational theories of content representation.

3) Structure of the lexicon
- Discovering structures in the lexicon: formal and semantic point of view (clustering, topical structure)
- Creative ways of getting access to and using word associations
- Evolution, i.e.
dynamic aspects of the lexicon (changes of weights)
- Neural models of the mental lexicon (distribution of information concerning words, organisation of the mental lexicon)

4) Methods for crafting dictionaries or indexes
- Manual, automatic or collaborative building of dictionaries and indexes (distributional semantics, crowd-sourcing, serious games, etc.)
- Impact and use of social networks (Facebook, Twitter) for building dictionaries, for organizing and indexing the data (clustering of words), for tracking navigational strategies, etc.
- (Semi-) automatic induction of the link type (e.g. synonym, hypernym, meronym, association, collocation, ...)
- Use of corpora and patterns (data-mining) for getting access to words, their uses, and combinations (associations)

5) Dictionary access (navigation and search strategies), interface issues
- Semantic-based search
- Search (simple query vs multiple words)
- Context-dependent search (modification of users' goals during search)
- Recovery
- Navigation (frequent navigational patterns or search strategies used by people)
- Interface problems, data-visualisation

IMPORTANT DATES
- Deadline for paper submissions: October 21, 2012 (extended from October 15, 2012)
- Notification of acceptance: November 5, 2012
- Camera-ready papers due: November 15, 2012
- Workshop date: December 15, 2012

SUBMISSION INSTRUCTIONS
see: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html

INVITED SPEAKER: Alain Polguère (Université de Lorraine & ATILF CNRS, France)

PROGRAMME COMMITTEE
* Barbu, Eduard (Universidad de Jaén, Spain)
* Barrat, Alain (Centre de physique théorique, CNRS & Aix-Marseille University)
* Bilac, Slaven (Google Tokyo, Japan)
* Bel Enguix, Gemma (LIF, Aix-Marseille University, France)
* Bouillon, Pierrette (TIM, Faculty of Translation and Interpreting, Geneva, Switzerland)
* Cook, Paul (The University of Melbourne, Australia)
* Cristea, Dan (University of Iasi, Romania)
* Fairon, Cedrick (CENTAL, Université catholique de Louvain, Belgium)
* Fazly, Afsaneh (University of Toronto, Canada) * Fellbaum, Christiane (University of Princeton, USA) * Ferret, Olivier (CEA LIST, Palaiseau, France) * Fontenelle, Thierry (Translation Centre for the Bodies of the European Union, Luxemburg) * Granger, Sylviane (Université Catholique de Louvain, Belgium) * Grefenstette, Gregory (3DS Exalead, Paris, France) * Hansen-Schirra, Silvia (University of Mainz, FTSK, Germany) * Heid, Ulrich (University of Hildesheim, Germany) * Hirst, Graeme (University of Toronto, Canada) * Hovy, Ed (ISI, Los Angeles, USA) * Joyce, Terry (Tama University, Kanagawa-ken, Japan) * Kwong, Olivia (City University of Hong Kong, China) * L'Homme, Marie Claude (OLST, University of Montreal, Canada) * Lapalme, Guy (RALI, University of Montreal, Canada) * Mititelu, Verginica (RACAI, Bucharest, Romania) * Pirrelli, Vito (ILC, Pisa, Italy) * Polguère, Alain (Université de Lorraine & ATILF CNRS, France) * Rapp, Reinhard (University of Leeds, UK) * Ruette, Tom (KU Leuven, Belgium) * Schwab, Didier (LIG, Grenoble, France) * Serasset, Gilles (IMAG, Grenoble, France) * Sharoff, Serge (University of Leeds, UK) * Sinopalnikova, Anna (FIT, BUT, Brno, Czech Republic) * Sowa, John (VivoMind Research, LLC, USA) * Tiberius, Carole (Institute for Dutch Lexicology, The Netherlands) * Tokunaga, Takenobu (TITECH, Tokyo, Japan) * Tufis, Dan (RACAI, Bucharest, Romania) * Valitutti, Alessandro (University of Helsinki and HIIT, Finland) * Vossen, Piek (Vrije Universiteit, Amsterdam, The Netherlands) * Wehrli, Eric (LATL, University of Geneva, Switzerland) * Zock, Michael (LIF, CNRS, Aix-Marseille University, France) * Zweigenbaum, Pierre (LIMSI - CNRS, Orsay & ERTIM - INALCO, Paris, France) WORKSHOP ORGANIZERS and CONTACT PERSONS Michael Zock (LIF-CNRS, Marseille, France), michael.zock AT lif.univ-mrs.fr Reinhard Rapp (University of Leeds, UK), reinhardrapp AT gmx.de For more details see: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html 
From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:00:03 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 21:00:03 +0200 Subject: Job: Stage de recherche, Technicolor (Rennes), Analysis of Web forums Message-ID: Date: Tue, 16 Oct 2012 16:56:58 +0200 From: Guegan Marie Message-ID: X-url: http://www.technicolor.com/ Internship position available at Technicolor R&D in Rennes. Title ------ “Which scene are you talking about?” Recognizing scenes discussed on cinema or TV forums. Context ------- For more info on Technicolor Research & Innovation, Rennes : https://research.technicolor.com/rennes/ The internship will be hosted at Technicolor R&I in Rennes, France (500 employees, of which 130 researchers), within the Media Computing Lab. Our lab aims at bringing modern trends in computing to the service of novel media engines in content creation (visual effects, animation) as well as content discovery and retrieval. More specifically, our team focuses on Web user comments posted on forums and social networks. We use various approaches such as data mining, social network analysis and natural language processing. Objective --------- This internship aims at designing, developing and evaluating an information extraction system for user comments. The domain is dedicated to cinema and television. Each comment is already attached to a particular audiovisual content (movie, TV series, TV program).
One of our goals is to detect within the comments the text segments which refer to a particular moment in the video. For instance, users may talk about their favorite scene or quote a famous dialogue.

Task description
----------------
We will not analyze the audiovisual signal (image or audio), but solely the text of comments. Comments have already been collected and saved into a database. Hence the internship will focus on the analysis of the dataset rather than its collection. The intern will be responsible for choosing the best techniques, based on a survey he or she will conduct of state-of-the-art approaches. The developed system will be evaluated by the intern, both quantitatively and qualitatively. Depending on the results obtained and on innovative ideas, this work may lead to a research publication at a conference.

Keywords
--------
Natural language processing (NLP), machine learning, data mining, text mining

Profile of the candidate
------------------------
* Student in the final year of a master's degree or engineering school
* Computer science (Python, Java)
* Skills in machine learning, data mining and natural language processing
* Strong interest in research (the intern will conduct a literature survey)
* Interest in social networks
* English mandatory
* Enjoys working in a team

Internship period & duration
----------------------------
6 months, starting preferably around February or March 2013, depending on the candidate’s constraints.

Please email your CV and cover letter to stage.rennes at technicolor.com with reference [TRDF-DM-029] in the subject.

Marie Guégan
Media Computing Lab, Research & Innovation
Technicolor R&D France
975 avenue des Champs Blancs, CS 17616
35576 Cesson-Sévigné Cedex, France
www.technicolor.com

Important: Technicolor R&D France is moving. From October 24, 2012, our new address is: 975 avenue des champs blancs, CS 17616, 35576 Cesson Sévigné, France - tel. (switchboard): +33 (0)2 99 27 30 00 (unchanged).

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:11:25 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 17 Oct 2012 21:11:25 +0200
Subject: Appel: ICICS 2013
Message-ID:

Date: Tue, 16 Oct 2012 21:00:34 +0300
From: ICICS2013
Message-ID: <1168219207628 at CIT-SamerSuleiman-M2L-1.just.edu.jo>
X-url: http://www.icics.info/icics2013

[Apologies if you receive multiple copies of this Call For Papers.]

CALL FOR PAPERS
=================
The 4th International Conference on Information and Communication Systems, ICICS 2013
Organized by Jordan University of Science and Technology
April 23-25 2013, Irbid, Jordan
http://www.icics.info/icics2013

IMPORTANT DATES
=================
Full Paper Submission: Dec. 1st, 2012
Notification of Decision: Jan. 20th, 2013
Registration and Camera-Ready: Feb. 15th, 2013
Poster Presentation Submission: Feb. 15th, 2013

GENERAL INFORMATION
=====================
The International Conference on Information and Communication Systems (ICICS 2013) is a forum for scientists, engineers, and practitioners to present their latest research results, ideas, developments, and applications in all areas of Computer and Information Sciences. The topics covered by ICICS 2013 include, but are not limited to: Artificial Intelligence, Mobile Computing, Networking, Information Security and Cryptography, Intrusion Detection and Computer Forensics, Web Content Mining, Bioinformatics and IT Applications, Database Technology, Systems Integration, Information Systems Analysis and Specification, Telecommunications, and Human-Computer Interaction.

TOPICS
=======
Researchers are encouraged to submit original research contributions in all major areas, which include, but are not limited to:

Databases and Information Systems Integration
Artificial Intelligence
Machine Learning
Bioinformatics
Data Mining
Robotics and Autonomous Systems
Knowledge Management & Natural Language Processing
E-Business
E-Learning
Health Information Systems
Applications of Fuzzy Logic
Applications of Neural Network
Data Warehouses & Human Computer Interaction
Systems Engineering Methodologies
Embedded Systems
Software Engineering
Software Measurement
Algorithms and Applications
Computer Architecture
Computer Graphics
VLSI and its applications
Computer Networks
Wireless and Mobile Computing & Computer Simulation
Information Security
Information Systems
Semantic Web Technologies
Multimedia and Image Processing
Parallel and Distributed Systems
Cloud Computing
Pervasive and Adaptive Systems
Reliability and Fault-Tolerance
Internet and Collaborative Computing
Nano Technology

INSTRUCTIONS FOR AUTHORS
===========================
Prospective authors are invited to submit full papers following the guidelines posted on the conference website http://www.icics.info/icics2013. Submitted papers will be peer-reviewed and prospective authors are expected to present their papers at the conference. The papers that are accepted and presented at the conference will appear in CD proceedings, in the ACM Digital Library (pending approval) and in DBLP.
The best paper and best student paper will be selected by peer review and announced during the social event at the conference.

Please submit your paper in PDF format via the electronic submission system available in EasyChair: https://www.easychair.org/conferences/?conf=icics13.

Extended versions of selected papers will be published in journals such as:
- Journal of emerging technologies in web intelligence (JETWI, ISSN 1798-0461)
- Network Protocols and Algorithms (ISSN 1943-3581)

Please send any enquiry on ICICS 2013 to icics at just.edu.jo

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:14:02 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 17 Oct 2012 21:14:02 +0200
Subject: Appel: Journee d'etude annotation corpus oraux, Paris
Message-ID:

Date: Wed, 17 Oct 2012 11:48:44 +0200
From: Christophe Benzitoun
Message-ID: <507E7EFC.8030505 at univ-nancy2.fr>

*Syntactic annotation of spoken corpora*
*Recent projects and perspectives*

Call for papers
Conscila study day (ENS Paris)
*Friday 7 December 2012*

More and more corpora of spoken French are now freely available to the scientific community (the PFC corpus, Corpus du Français Parlé Parisien, Valibel, CRDO, TCOF, etc.). However, these data have particularities that most corpus processing tools do not take into account, which makes it difficult to apply these tools directly to spoken French. Likewise, spoken data are hard to integrate into traditional frameworks. Software and linguistic approaches alike were developed mainly from written texts (or from invented examples) and with written language in mind. To adapt current systems or, quite simply, to deepen our knowledge of French, it is therefore essential to produce annotations on spoken resources.

Initiatives in this area are nevertheless still at an embryonic stage for French, even though there are a fair number of them. Examples include the work of Eshkol et al. (2010) and the PERCEO project (http://cnrtl.fr/corpus/perceo/), both on morphosyntactic annotation; the recent ATALA day /Annoter les corpus oraux/ (Paris, April 2011); the CID project in Aix-en-Provence (http://sldr.org/sldr000027); part of the ANR project /Colaje/ (on young children; http://colaje.risc.cnrs.fr/); the SYFRAP project (http://talc.loria.fr/HOME,288.html); and the CNRS thematic school on the annotation of language data (Sept. 2011). For syntax more specifically, one can mention, among others, the FNRS project of L. Degand and A.-C. Simon (2011-2013) on the /Périphérie gauche des unités de discours/ and the ANR project Rhapsodie (2008-2012) led by A. Lacheret. A new ANR project, ORFEO (Outils et Recherches sur le Français Ecrit et Oral), devoted to corpus building and annotation, will also start in early 2013 under the direction of J.-M. Debaisieux. Despite this work, to our knowledge no syntactically annotated corpus of spoken French is currently available.

One of the aims of this thematic day is to take stock of recent, ongoing and future initiatives in the syntactic annotation of spoken French corpora, showing in particular how systematic annotation raises fundamental questions for the description of French in general. It will also consider to what extent new models and tools can or must be developed to account for the phenomena found in speech.

Papers may address annotation protocols, tools, targeted studies, problems encountered, etc., and will raise a series of questions: which annotation standard for speech? Which tools are available for exploiting the annotations? And so on. Demonstrations of software for annotation or exploitation are also welcome.

The day will end with a round table, to which all participants are invited, intended both to summarize the presentations and to list some good practices and suggest avenues to explore in future projects.

*/Organization/*

Christophe Benzitoun -- ATILF CNRS & Université de Lorraine
Noalig Tanguy -- Lattice UMR 8094 ENS/Paris 3 & Valibel / Université Catholique de Louvain

*/Scientific committee/*

Frédéric Béchet (Aix-Marseille Université / LIF UMR 7279)
Marie-José Béguelin (Université de Neuchâtel)
Alain Berrendonner (Université de Fribourg)
Mireille Bilger (Université de Perpignan)
Sandrine Caddéo (Aix-Marseille Université / Laboratoire Parole et Langage UMR 7309)
Paul Cappeau (Université de Poitiers)
Christophe Cerisara (Loria UMR 7503)
Jeanne-Marie Debaisieux (Université Paris 3 Sorbonne Nouvelle / Lattice UMR 8094)
Liesbeth Degand (Université catholique de Louvain / Valibel)
José Deulofeu (Aix-Marseille Université / LIF UMR 7279)
Anne Dister (Facultés universitaires Saint-Louis, Bruxelles)
Iris Eshkol (Université d'Orléans / Laboratoire Ligérien Linguistique UMR 7270)
Françoise Gadet (Université Paris Ouest Nanterre La Défense / Modyco UMR 7114)
Kim Gerdes (Université Paris 3 Sorbonne Nouvelle / LPP / Institut d'Automation / Académie de Sciences Chinoise)
Eva Havu (Université de Helsinki)
Sylvain Kahane (Université Paris Ouest Nanterre La Défense / Modyco UMR 7114)
Anne Lacheret (Université Paris Ouest Nanterre La Défense / Modyco UMR 7114)
Florence Lefeuvre (Université Paris 3 Sorbonne Nouvelle / Clesthia)
Michel Pierrard (Université Libre de Bruxelles)
Paola Pietrandrea (Université Roma Tre / Lattice UMR 8094)
Thierry Poibeau (Lattice UMR 8094 ENS/Paris 3)
Sophie Prévost (Lattice UMR 8094 ENS/Paris 3)
Nathalie Rossi-Gensane (Université Toulouse 2 / CLLE ERSS UMR 5263)
Frédéric Sabio (Aix-Marseille Université / Laboratoire Parole et Langage UMR 7309)
Catherine Schnedecker (Université de Strasbourg / Lilpa)
Anne-Catherine Simon (Université catholique de Louvain / Valibel)
Sandra Teston-Bonnard (Université de Lyon 2 / ICAR UMR 5191)
Véronique Traverso (ICAR UMR 5191)
Dan Van Raemdonck (Université Libre de Bruxelles)
Dominique Willems (Université de Gand)

Proposals (two pages maximum, including references), in French or English, should be sent *before 20 October* to the following addresses:
Christophe.Benzitoun at univ-lorraine.fr / noalig.tanguy at uclouvain.be

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:15:29 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 17 Oct 2012 21:15:29 +0200
Subject: Appel: revue TAL - Note de lecture (MOOT)
Message-ID:

Date: Wed, 17 Oct 2012 13:20:07 +0200 (CEST)
From: Denis Maurel
Message-ID: <1033857014.5185527.1350472807249.JavaMail.root at mail10>

Call: TAL journal - Book review (MOOT)

The TAL journal regularly publishes book reviews. We are looking for a colleague who would like to read the book:

"Richard MOOT, Christian RETORÉ. The logic of categorial grammars: a deductive account of natural language syntax and semantics. LNCS 6850. Springer. 2012. 302 pages."

and is willing to write a review of it for the TAL journal (the book will be sent free of charge in return).

The review must be written in French (three pages maximum, in the journal's format) and sent by the end of December 2012.

Other reviews are possible if you have recently read a book that interested you and are willing to share your reading with the community...
Denis Maurel

From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:43:37 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Sat, 20 Oct 2012 15:43:37 +0200
Subject: Appel: CICLing 2013
Message-ID:

Date: Thu, 18 Oct 2012 10:50:48 -0500
From: "Alexander Gelbukh $CICLing-2013$"
Message-ID:

CICLing 2013
14th International Conference on Intelligent Text Processing and Computational Linguistics
Samos, Greece
March 24-30, 2013
Springer LNCS
www.CICLing.org/2013

TOPICS: All topics related to computational linguistics, natural language processing, human language technologies, information retrieval, etc.

PUBLICATION: LNCS - Springer Lecture Notes in Computer Science; poster session: special issue of a journal

KEYNOTE SPEAKERS: Sophia Ananiadou, Walter Daelemans, Roberto Navigli, Michael Thelwall

CULTURAL PROGRAM: Three days of cultural activities: tours to Ephesus, Samos, and nearby islands

AWARDS: Best paper, best student paper, best presentation, best poster, best software

SUBMISSION DEADLINES:
November 30: registration of tentative abstract,
December 7: full text of registered papers

See complete CFP and contact on www.CICLing.org/2013

This message is sent in good faith of its usefulness for you as an NLP researcher. If this is an error, kindly let me know.
Alexander Gelbukh
www.Gelbukh.com

From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:46:28 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Sat, 20 Oct 2012 15:46:28 +0200
Subject: Soft: ECDC-TM - A freely available translation memory in 25 languages
Message-ID:

Date: Fri, 19 Oct 2012 12:03:01 +0200
From: Ralf Steinberger
Message-id: <015501cdade0$eca55470$c5effd50$@jrc.ec.europa.eu>
X-url: http://langtech.jrc.ec.europa.eu/ECDC-TM.html

ECDC-TM is a translation memory (sentences and their manually produced translations) in 25 languages. It is a multilingual parallel corpus covering 300 language pairs.

Size: Up to 2500 translation units per language; 32,000 in total.

Languages: All 300 language pairs involving the following 25 languages: Bulgarian, Czech, Danish, Dutch, English, Estonian, German, Greek, Finnish, French, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish, Swedish and Turkish.

URL: http://langtech.jrc.ec.europa.eu/ECDC-TM.html

Creator: European Centre for Disease Prevention and Control (ECDC http://www.ecdc.europa.eu/ ) and JRC

WHAT IS ECDC-TM

ECDC-TM was produced by professionally translating the English language web pages of the European Centre for Disease Prevention and Control (ECDC), an EU agency based in Stockholm. The results of the translation were stored in 24 bilingual translation memories. The JRC post-processed these by cleaning the data and by producing one alignment for all 25 languages, resulting in parallel data for 300 language pairs.

Most of the documents cover health-related topics (anthrax, botulism, cholera, dengue fever, hepatitis, etc.), but some of the web pages also describe the organisation ECDC (e.g. its organisation, job opportunities) and its activities (e.g. epidemic intelligence, surveillance).

The ECDC Translation Memory (http://langtech.jrc.ec.europa.eu/ECDC-TM.html) is much smaller than the other multilingual resources distributed in the past by the European Commission's Joint Research Centre (JRC). Its main advantages are that (a) it covers even more languages and (b) it is based on texts from a very different domain (Public Health).

MOTIVATION FOR THIS RELEASE

The public data release is in line with the general effort of the European Commission to support multilingualism, language diversity and the re-use of Commission information. It follows the release of the JRC-Acquis (http://langtech.jrc.ec.europa.eu/JRC-Acquis.html) parallel corpus in 2006 (over 1 billion words in 22 languages), of the DGT-TM Translation Memory (http://langtech.jrc.ec.europa.eu/DGT-TM.html) in 2007 and 2011, the multilingual named entity resource JRC-Names (http://langtech.jrc.ec.europa.eu/JRC-Names.html) in 2011, the multi-label classification software JRC EuroVoc Indexer JEX (http://langtech.jrc.ec.europa.eu/Eurovoc.html) in 22 languages and further smaller multilingual resources. See http://langtech.jrc.ec.europa.eu/JRC_Resources.html for more information on these resources.

WHAT ECDC-TM CAN BE USED FOR

ECDC-TM can be fed into translation memory software to support human translators in their work.
As it is a large parallel corpus in electronic form, it can furthermore be used by specialists in computational linguistics to train statistical machine translation software, to generate multilingual dictionaries, to train and test multilingual information extraction software, and more.

WHAT NEXT?

The JRC and collaborating services of the European Commission plan to release further large-scale linguistic resources in the near future. These include another 25-language translation memory and a paragraph-aligned full-text parallel corpus in 23 languages.

Ralf Steinberger & Mohamed Ebrahim
European Commission - Joint Research Centre (JRC)
21027 Ispra (VA), Italy
URL - Applications: http://emm.newsbrief.eu/overview.html
URL - The science behind them: http://langtech.jrc.ec.europa.eu/

From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:48:24 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Sat, 20 Oct 2012 15:48:24 +0200
Subject: Appel: Journee d'etudes jeunes chercheurs en lexicologie, terminologie, traduction, Bruxelles, 31 janvier 2013
Message-ID:

Date: Fri, 19 Oct 2012 17:01:30 +0200
From: Mathieu Mangeot
Message-Id: <2709D583-74D1-48F8-87A8-6B209EF94099 at imag.fr>
X-url: http://www.ltt.auf.org/article.php3?id_article=728

FIRST STUDY DAY FOR YOUNG RESEARCHERS OF THE LEXICOLOGIE, TERMINOLOGIE, TRADUCTION NETWORK
BRUSSELS, 31 JANUARY 2013

CALL FOR PAPERS

http://www.ltt.auf.org/article.php3?id_article=728

On Thursday 31 January 2013, the Lexicologie, terminologie, traduction (LTT) network will hold its first study day for young researchers at the Institut supérieur de traducteurs et interprètes (ISTI, Haute École de Bruxelles), under the title "Lexicologie, terminologie, traduction : nouvelles recherches au cœur d'un système" (Lexicology, terminology, translation: new research at the heart of a system).

Intended primarily to bring together doctoral students and postdoctoral researchers, the meeting aims to sustain the spirit of exchange and transmission that has driven the network since its foundation. Following on from the 9th Journées scientifiques in Villetaneuse (2011) and the regional Journées d'animation scientifique in Tunis (2012), it will allow speakers to present research devoted to the linguistic system and central to the network's concerns, in particular:
- equivalence and synonymy;
- the evolution of specialized language;
- the modelling of dictionaries and corpora;
- translation support tools;
- linguistic description in the service of intercomprehension.

For this meeting, the scientific committee of the LTT network has opted for a call for papers aimed primarily at young researchers, doctoral or postdoctoral, working in teams affiliated with the network or interested in joining it.

Proposals will be selected on the basis of an abstract of 600 to 1000 words, which must be entered in the dedicated form and sent by 10 November 2012 to ltt2013 at imag.fr.

The final version of the papers must be submitted to the scientific committee by 28 February 2013 at the latest and must follow the presentation guidelines. Only texts approved by the committee will be published in the proceedings.

Calendar
10 November 2012: deadline for submitting proposals
1 December 2012: decision of the scientific committee
31 January 2013: study day
28 February 2013: deadline for sending the final versions of the texts (proceedings)

Scientific committee
Coordinator: Mathieu Mangeot-Nagata (GETALP-LIG, Grenoble)
Ibrahim Ben Mrad (VALD, Université de La Manouba, Tunis)
Xavier Blanco Escoda (Laboratoire fLexSem, Université Autonome de Barcelone)
Aïcha Bouhjar (Centre de l'aménagement linguistique de l'Institut royal de la culture amazighe, Rabat)
Mame Thierno Cissé (Laboratoire SOLDILAF, Université Cheikh Anta Diop de Dakar)
Anne Condamines (ERSS, CNRS-Université de Toulouse-le-Mirail)
Marie-Claude L'Homme (OLST, Université de Montréal)
Teresa Lino (Centro de Linguística, Universidade Nova de Lisboa)
François Maniez (CRTT, Université Lumière Lyon 2)
Salah Mejri (Laboratoire LDI, Université de Paris 13)
Franck Neveu (Université de Paris IV et ILF)
Patrice Pognan (Lalic, Institut national des langues et civilisations orientales)
Philippe Thoiron (CRTT, Université Lumière Lyon 2)
Amalia Todirascu (UR LILPA, Université de Strasbourg)
Marc Van Campenhoudt (Termisti, Institut supérieur de traducteurs et interprètes, Bruxelles)

Mathieu MANGEOT
GETALP, LIG-campus, BP 53 - 41 rue des mathématiques
F-38041 Grenoble Cedex 9 - France
Tel: +33 4 76 63 56 54 / +33 4 79 75 81 89

From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:49:30 2012
From: thierry.hamon at
UNIV-PARIS13.FR (Thierry Hamon)
Date: Sat, 20 Oct 2012 15:49:30 +0200
Subject: Job: Ingenieur au LIPN, Universite Paris-Nord
Message-ID:

Date: Fri, 19 Oct 2012 21:20:47 +0200
From: Sylvie Salotti
Message-ID: <0d314954f8237eab0405bb87021239b5 at lipn.univ-paris13.fr>

Fixed-term contract: development engineer at LIPN, Université Paris 13

Subject: Semantic annotation of legal documents

Context:
You will work at LIPN (Laboratoire d’Informatique de Paris-Nord), in the RCLN team (Représentation des Connaissances et Langage Naturel), whose research covers natural language processing and knowledge engineering. More precisely, you will work on the Legilocal project, in which the RCLN team is a partner and whose goal is to make the legal documents of local authorities more accessible to citizens.
One task of this project is the semantic enrichment of documents with annotations that facilitate information retrieval and navigation within the document collection. The planned annotations are of various kinds, but the focus will be on two types:
- terms corresponding to instances or concepts of semantic resources such as ontologies and thesauri,
- relations that can be identified between these concepts and instances, or between documents.

Mission:
In collaboration with the members of the team and the project, you will help draw up the specifications for these annotation needs, and you will then take charge of the successive stages of developing the tools for annotating a document collection (development, testing, documentation). You will be able to build on annotation modules that already exist in the team, which can be modified and/or extended to meet the specifications. Integration into a UIMA processing pipeline currently under development in the team may be considered.

Profile: Engineering degree or second-year master's (master 2) in computer science
Required skill: Good level in Java programming
Desirable skills: Eclipse/RCP environment, Semantic Web, NLP

Location: Université Paris 13, Villetaneuse

Fixed-term contract of 9 to 12 months, starting early January 2013.

Send CV and cover letter to: sylvie.salotti at lipn.univ-paris13.fr

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:43:11 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 23 Oct 2012 21:43:11 +0200
Subject: Soft: ECDC-TM - A freely available translation memory in 25 languages - CORRECTION
Message-ID:

Date: Mon, 22 Oct 2012 11:11:41 +0200
From: Ralf Steinberger
Message-id: <04f301cdb035$403fca10$c0bf5e30$@jrc.ec.europa.eu>
X-url: http://www.ecdc.europa.eu/

This is a correction to the announcement sent on Friday 19 October. The email and the related ECDC-TM webpage contained wrong information regarding the languages covered and regarding the statistics on the corpus. Thanks to Raivis Skadiņš for pointing this out. I had mixed up the information of two different corpora. 25 languages seems to be more than my little brain can handle. ;-) Please accept my apologies.

You will find the new summary below. The web page has now also been corrected.
Ralf

==========

ECDC-TM is a translation memory (sentences and their manually produced translations) in 25 languages. It is a multilingual parallel corpus covering 300 language pairs.

Size: Up to 3900 translation units per language; 64,000 in total.

Languages: All 300 language pairs involving the following 25 languages: Bulgarian, Czech, Danish, Dutch, English, Estonian, German, Greek, Finnish, French, Irish, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish and Swedish.

URL: http://langtech.jrc.ec.europa.eu/ECDC-TM.html

Creator: European Centre for Disease Prevention and Control (ECDC http://www.ecdc.europa.eu/) and JRC

WHAT IS ECDC-TM

ECDC-TM was produced by professionally translating the English language web pages of the European Centre for Disease Prevention and Control (ECDC), an EU agency based in Stockholm. The results of the translation were stored in 24 bilingual translation memories. The JRC post-processed these by cleaning the data and by producing one alignment for all 25 languages, resulting in parallel data for 300 language pairs.

Most of the documents cover health-related topics (anthrax, botulism, cholera, dengue fever, hepatitis, etc.), but some of the web pages also describe the organisation ECDC (e.g. its organisation, job opportunities) and its activities (e.g. epidemic intelligence, surveillance).

The ECDC Translation Memory (http://langtech.jrc.ec.europa.eu/ECDC-TM.html) is much smaller than the other multilingual resources distributed in the past by the European Commission’s Joint Research Centre (JRC). Its main advantages are that (a) it covers even more languages and (b) it is based on texts from a very different domain (Public Health).

MOTIVATION FOR THIS RELEASE

The public data release is in line with the general effort of the European Commission to support multilingualism, language diversity and the re-use of Commission information. It follows the release of the JRC-Acquis (http://langtech.jrc.ec.europa.eu/JRC-Acquis.html) parallel corpus in 2006 (over 1 billion words in 22 languages), of the DGT-TM Translation Memory (http://langtech.jrc.ec.europa.eu/DGT-TM.html) in 2007 and 2011, the multilingual named entity resource JRC-Names (http://langtech.jrc.ec.europa.eu/JRC-Names.html) in 2011, the multi-label classification software JRC EuroVoc Indexer JEX (http://langtech.jrc.ec.europa.eu/Eurovoc.html) in 22 languages and further smaller multilingual resources. See http://langtech.jrc.ec.europa.eu/JRC_Resources.html for more information on these resources.

WHAT ECDC-TM CAN BE USED FOR

ECDC-TM can be fed into translation memory software to support human translators in their work. As it is a large parallel corpus in electronic form, it can furthermore be used by specialists in computational linguistics to train statistical machine translation software, to generate multilingual dictionaries, to train and test multilingual information extraction software, and more.

WHAT NEXT?

The JRC and collaborating services of the European Commission plan to release further large-scale linguistic resources in the near future. These include another 25-language translation memory and a paragraph-aligned full-text parallel corpus in 23 languages.
Ralf Steinberger & Mohamed Ebrahim European Commission - Joint Research Centre (JRC) 21027 Ispra (VA), Italy URL – Applications: http://emm.newsbrief.eu/overview.html URL – The science behind them: http://langtech.jrc.ec.europa.eu/ ------------------------------------------------------------------------- Message diffuse par la liste Langage Naturel Informations, abonnement : http://www.atala.org/article.php3?id_article=48 English version : Archives : http://listserv.linguistlist.org/archives/ln.html http://liste.cines.fr/info/ln La liste LN est parrainee par l'ATALA (Association pour le Traitement Automatique des Langues) Information et adhesion : http://www.atala.org/ ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:49:06 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 23 Oct 2012 21:49:06 +0200 Subject: Appel: WSLST 2013 Message-ID: Date: Sat, 20 Oct 2012 21:47:55 +0200 From: "GRLMC" Message-ID: X-url: http://grammars.grlmc.com/wslst2013/ ********************************************************************* 2013 INTERNATIONAL WINTER SCHOOL IN LANGUAGE AND SPEECH TECHNOLOGIES WSLST 2013 (formerly International PhD School in Language and Speech Technologies) Tarragona, Spain January 7-11, 2013 Organized by: Research Group on Mathematical Linguistics (GRLMC) Rovira i Virgili University http://grammars.grlmc.com/wslst2013/ ********************************************************************* AIM: WSLST 2013 offers a broad and intensive series of lectures at different levels on selected topics in language and speech technologies. The students choose their preferred courses according to their interests and background. Instructors are top names in their respective fields. The School intends to help students initiate and foster their research career. 
The previous event in this series was SSLST 2012: http://grammars.grlmc.com/sslst2012/ ADDRESSED TO: Graduate (and advanced undergraduate) students from around the world. Most appropriate degrees include: Computer Science and Linguistics. Other students (for instance, from Mathematics, Electrical Engineering, Logic, or Cognitive Science) are welcome too. The School is appropriate also for people more advanced in their career who want to keep themselves updated on developments in the field. There will be no overlap in the class schedule. COURSES AND PROFESSORS: - Simon King (U Edinburgh), Speech Synthesis [introductory/intermediate, 8 hours] - Constantine Kotropoulos (U Thessaloniki), Pattern Recognition Problems Related to Speech [intermediate, 6 hours] - Lori Levin (Carnegie Mellon U), The Theory behind the Resources [introductory/intermediate, 8 hours] - Rainer Martin (U Bochum), Signal Processing for Voice Communication Devices [intermediate, 8 hours] - German Rigau (U Basque Country, Donostia), Knowledge Resources for Semantic Processing [introductory/intermediate, 8 hours] - Marc Swerts (Tilburg U), Facial Expressions in Human-Human and Human-Machine Interactions [introductory/intermediate, 6 hours] - Tomoki Toda (Nara Institute of Science and Technology), Statistical Voice Conversion [introductory/advanced, 8 hours] REGISTRATION: It has to be done on line at http://grammars.grlmc.com/wslst2013/Registration.php FEES: They are variable, depending on the number of courses each student takes. The rule is: 1 hour = - 10 euros (for payments until November 15, 2012), - 12.50 euros (for payments between November 16 and December 11, 2012), - 15 euros (for payments after December 11, 2012). PAYMENT PROCEDURE: The fees must be paid to the School's bank account: Uno-e Bank bank’s address: Julian Camarillo 4 C, 28037 Madrid, Spain IBAN: ES3902270001820201823142 SWIFT/BIC: UNOEESM1 account holder: C. Martin – GRLMC account holder’s address: Av. 
Catalunya 35, 43002 Tarragona, Spain Please mention WSLST 2013 and your name in the subject. A receipt will be provided on site. Remarks: - Bank transfers should not involve any expense for the School. - People claiming early registration will be asked to prove that the bank transfer order was carried out by the deadline. - The organizers reserve the right to cancel a course if fewer than 10 students have signed up for it. - Students will be refunded only if a course is cancelled, either because the instructor is unavailable or because too few students registered. People registering on site at the beginning of the School must pay in cash. For the sake of local organization, however, it is strongly recommended to register earlier. ACCOMMODATION: Information about accommodation will be available on the website of the School. CERTIFICATE: Students will receive a certificate stating the courses attended, their contents, and their duration. IMPORTANT DATES: Announcement of the programme: October 19, 2012 Very early registration deadline: November 15, 2012 Early registration deadline: December 11, 2012 Start of the School: January 7, 2013 End of the School: January 11, 2013 QUESTIONS AND FURTHER INFORMATION: Lilica Voicu: florentinalilica.voicu at urv.cat WEBSITE: http://grammars.grlmc.com/wslst2013/ POSTAL ADDRESS: WSLST 2013 Research Group on Mathematical Linguistics (GRLMC) Rovira i Virgili University Av.
Catalunya, 35 43002 Tarragona, Spain Phone: +34-977-559543 Fax: +34-977-558386 ACKNOWLEDGEMENTS: Diputació de Tarragona Universitat Rovira i Virgili ------------------------------------------------------------------------- Message diffuse par la liste Langage Naturel Informations, abonnement : http://www.atala.org/article.php3?id_article=48 English version : Archives : http://listserv.linguistlist.org/archives/ln.html http://liste.cines.fr/info/ln La liste LN est parrainee par l'ATALA (Association pour le Traitement Automatique des Langues) Information et adhesion : http://www.atala.org/ ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:54:11 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 23 Oct 2012 21:54:11 +0200 Subject: Revue: Langages, numero 187 Message-ID: Date: Mon, 22 Oct 2012 10:05:33 +0200 From: Catherine SCHNEDECKER Message-ID: X-url: http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&ida * Langages, Nº187 (3/2012) * L’analyse de corpus face à l’hétérogénéité des données * Septembre 2012 * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9824 Sommaire du n°187 * GARRIC Nathalie, LONGHI Julien * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9823 L’analyse de corpus face à l’hétérogénéité des données : d’une difficulté méthodologique à une nécessité épistémologique * PINCEMIN Bénédicte * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9822 Hétérogénéité des corpus et textométrie * BLASCO-DULBECCO Mylène, CAPPEAU Paul * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9821 Identifier et caractériser un genre : l’exemple des interviews politiques * LONGHI Julien * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9820 Types de discours, formes textuelles et normes sémantiques : expression et 
doxa dans un corpus de données hétérogènes * CISLARU Georgeta, SITRI Frédérique * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9819 De l’émergence à l’impact social des discours : hétérogénéités d’un corpus * GARRIC Nathalie * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9818 Construire et maîtriser l’hétérogénéité par la variation des données, des corpus et des méthodes * * RATINAUD Pierre, MARCHAND Pascal * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9817 Recherche improbable d’une homogène diversité : le débat sur l’identité nationale * ANTOINE Jean-Yves, VILLANEAU Jeanne, GOULIAN Jérôme * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9816 Influence du genre applicatif sur la réalisation des extractions en dialogue oral : constantes et variations * LEFEUVRE Anaïs, VINOGRADOVA Natalia * http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9815 Hétérogénéité et extraction d’information factuelle dans un corpus de récits de voyage ------------------------------------------------------------------------- Message diffuse par la liste Langage Naturel Informations, abonnement : http://www.atala.org/article.php3?id_article=48 English version : Archives : http://listserv.linguistlist.org/archives/ln.html http://liste.cines.fr/info/ln La liste LN est parrainee par l'ATALA (Association pour le Traitement Automatique des Langues) Information et adhesion : http://www.atala.org/ ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:58:54 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 23 Oct 2012 21:58:54 +0200 Subject: Seminaire: Karin Harbusch, Expose sur les ellipses, 27 octobre 2012, Paris Message-ID: Date: Tue, 23 Oct 2012 15:00:20 +0200 From: Anne Abeillé Message-Id: X-url: http://ellipse.linguist.univ-paris-diderot.fr/ 
As part of the project Approches typologiques des constructions elliptiques (Fédération TUL, CNRS) http://ellipse.linguist.univ-paris-diderot.fr/ we are pleased to host the following talk on Friday 27 October, 10:00-12:00, at 175 rue du Chevaleret, 75013 Paris, 4th floor, "aquarium" room: ELLEIPO: Generating Clausal Coordinative Ellipsis in Dutch, Estonian, German, and Hungarian Karin Harbusch Computer Science Dept., University of Koblenz-Landau, GERMANY harbusch at uni-koblenz.de Abstract In our talk, we present target-language independent syntactic rules to generate Clausal Coordinate Ellipsis (CCE), i.e. Gapping (including Long-Distance Gapping, Subgapping and Stripping), Forward and Backward Conjunction Reduction (FCR and BCR) and Subject Gap with Finite/Fronted Verb (SGF). The CCE rules, which are inspired by the psycholinguistic theory of Kempen (2009), have been implemented in Java (cf. the ELLEIPO system), so that testing a new target language requires only setting up syntactic trees for the system to read in. All CCE paraphrases that the ELLEIPO system outputs for an input sentence have to be inspected by a native speaker for overgeneration (does the list contain any ungrammatical sentence?) and undergeneration (does the list lack any CCE paraphrase that is licensed in the target language under investigation?). We show the implementation for Dutch and German, two Indo-European languages, and for Estonian and Hungarian, two Finno-Ugric languages. With respect to incremental production of ellipsis, we present results from four different corpus studies. After an account of our data extraction method, we will present a detailed overview of the incidence of four types of clausal coordinate ellipsis in spoken and written treebanks for Dutch (ALPINO and CGN 2.0) and German (TIGER and VERBMOBIL).
Given the diverging counts for the individual CCE types, we propose a theoretical explanation of the data pattern, resting on the assumption that during spontaneous speech the scope (“window”) of online grammatical planning is basically restricted to one (finite) clause. In producing clausal coordinations, checking the possibility of “forward” ellipsis (Gapping, Forward Conjunction Reduction) requires comparing the form and meaning of two adjacent clauses. As this overtaxes the online planning scope of the sentence production system, speakers prefer to plan the form of second or later conjoined clauses in isolation, that is, without taking the shape of preceding clauses into account, thereby eliminating elliptical options. RNR (Right Node Raising, i.e. BCR), the “backward” version of coordinate ellipsis, is more severely affected in spoken language because it requires the simultaneous presence within the planning window of (nearly) two complete clauses. Indeed, whilst RNR is readily observable in written texts, in spoken language it is a rare phenomenon manifesting itself only in very short clauses.
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 20:06:01 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Tue, 23 Oct 2012 22:06:01 +0200 Subject: Conf: WACAI, 15 et 16 novembre 2012, Grenoble Message-ID: Date: Tue, 23 Oct 2012 18:28:25 +0200 From: Alexandre Pauchet Message-ID: <5086C5A9.606 at insa-rouen.fr> X-url: http://wacai2012.imag.fr/ ********************************************************************** Call for participation Reduced-rate registration until 2 November Workshop Affects, Compagnons Artificiels, Interaction (WACAI'12) Grenoble, 15 and 16 November 2012 http://wacai2012.imag.fr/ ********************************************************************** The WACAI 2012 workshop aims to bring together ongoing research and development on Artificial Companions (embodied conversational agents, ACA in French, and interactive robots) and Affective Computing, so that researchers from the scientific communities concerned can present their models, tools, technologies and research results. The workshop will consist of: - survey talks: * Michel Dubois: "Robotique et acceptabilité sociale : limites des modèles et perspectives." * Anna Tcherkassof: "Quel modèle psychologique pour l'interaction affective ? Une théorie perceptive des émotions" * Rachid Alami: "Processus Décisionnels pour robots interactifs" - more than 20 papers selected by the programme committee - poster presentations and demonstrations **********************************************************************
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:16:04 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 20:16:04 +0200 Subject: Job: TAL pour le japonais, vacations remunerees, LATTICE Message-ID: Date: Wed, 24 Oct 2012 08:32:50 +0200 From: Thierry Poibeau Message-Id: Fixed-term contract (paid hourly work) For a one-off project, Lattice (http://www.lattice.cnrs.fr/) is looking for a Master's-level student with training in natural language processing (NLP) and a good command of Japanese. The task is to annotate Japanese texts automatically using existing software (e.g. Cabocha, http://code.google.com/p/cabocha/) in order to extract, in particular, named entities and terms. The student must be able to work independently to handle the texts, clean the data (with a scripting language), run them through the chosen parser and then clean the output (keeping only the text, with tags marking the elements of interest). It may be necessary to write a few additional programs to annotate elements that matter for the task but are not recognized in the texts (which will be provided). The work can start very soon and must in any case be carried out during November (the workload is estimated at between 50 and 100 hours, but this is only a rough indication). Working from home is possible once the task and objectives have been agreed. The contract nevertheless involves a few meetings in Paris. To apply: contact Thierry Poibeau by email as soon as possible, and in any case before 6 November (firstname.lastname at ens.fr). Send a CV and give (informally in the email if you wish) the details relevant to the task (e.g. use of a Japanese annotation tool in a project).
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:18:36 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 20:18:36 +0200 Subject: Soft: TextCoop pour l'analyse du discours Message-ID: date: Wed, 24 Oct 2012 14:07:23 +0200 from: "Patrick Saint-Dizier" message-id: <4174-5087da00-5-256ba140 at 228118700> Discourse analysis in logic TextCoop is a platform for analysing various discourse structures (generic or specific to a genre or domain). TextCoop stems from logic grammars and makes it possible to bring knowledge and reasoning into the analysis. Dislog, the language for describing discourse structures, can encode manually written rules as well as rules produced by a machine learning mechanism. An archive and user documentation are now available free of charge on request. The archive contains some lexical resources and examples to get started. contact: stdizier at irit.fr
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:19:25 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 20:19:25 +0200 Subject: Appel: BioNLP Shared Task 2013, 1st announcement, Sample data release Message-ID: Date: Wed, 24 Oct 2012 14:10:31 +0200 From: Claire Nedellec Message-ID: <5087DAB7.3020403 at jouy.inra.fr> X-url: http://2013.bionlp-st.org/ (apologies for duplicate posting) ================================== BioNLP Shared Task 2013 -- First announcement ==================================== Release of sample data sets ------------------------------- We are pleased to announce the upcoming BioNLP Shared Task, an information extraction task open to any interested participants. The task will be held in early 2013, and a workshop on the task is planned to be co-hosted with the ACL 2013 BioNLP workshop. Sample data are now available at the BioNLP Shared Task 2013 website: http://2013.bionlp-st.org/ Participation in the task will be open to all interested parties. The BioNLP Shared Task series represents a community-wide trend in text-mining for biology toward fine-grained information extraction (IE).
The two previous events, the BioNLP 2009 and 2011 shared tasks, attracted wide attention, with numerous teams submitting final results. The task setup and data have since served as the basis of numerous studies and published event extraction systems and datasets. The BioNLP Shared Task 2013 (BioNLP-ST'13) follows the general outline and goals of the previous tasks. It identifies biologically relevant extraction targets and proposes a linguistically motivated approach to event representation. The BioNLP-ST'13 tasks also cover many new hot topics in the biology domain that are close to biologists' needs. As in previous editions, manually annotated data will be provided for training, development and evaluation of the participating extraction methods. The six BioNLP-ST 2013 event extraction tasks are - [GE] Genia Event Extraction for NFkB knowledge base construction - [CG] Cancer Genetics - [PC] Pathway Curation - [GRO] Corpus Annotation with Gene Regulation Ontology - [RNB] Gene Regulation Network in Bacteria - [BB] Microorganism biotope (semantic annotation by an ontology) Tentative schedule is as follows: * Sample Data Release 23 October 2012 * Training Data Release 8 January 2013 * Test Data Release March 2013 (Tentative) * Result Submission March 2013 (Tentative) * Results Notification March 2013 (Tentative) * Manuscript Submission April 2013 (Tentative) * BioNLP Shared Task 2013 workshop Summer 2013 (subject to extension) Scientific Advisory Committee * Jun'ichi Tsujii (Microsoft) - Chair * Sophia Ananiadou (NaCTeM) * Kevin Cohen (Univ. Colorado) * Sung-Pil Choi (KISTI) * Tapio Salakoski (Univ. Turku) * Pierre Zweigenbaum (Univ. Paris-Sud, CNRS) Organizing Committee * Claire Nédellec (INRA) - Organizing Chair * Robert Bossy (INRA) - Task BB and RNB chair * Jin-Dong Kim (DBCLS) - Task GE chair * Jung-jae Kim (NTU, Singapore) - Task GRO chair * Tomoko Ohta (NaCTeM) - Task PC chair * Sampo Pyysalo (NaCTeM) - Task CG chair * Julien Jourde (INRA) * Pontus Stenetorp (Univ.
Tokyo) * Yue Wang (DBCLS) Program Committee * TBD Contact * Web: http://2013.bionlp-st.org/ * Mailing List: bionlp-st at bionlp-st.org
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:24:07 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 20:24:07 +0200 Subject: Appel: revue TAL - Note de lecture (PIOTROWSKI) Message-ID: Date: Wed, 24 Oct 2012 16:09:49 +0200 (CEST) From: Denis Maurel Message-ID: <1275077246.6835049.1351087789143.JavaMail.root at mail10> Call: TAL journal - Book review (PIOTROWSKI) The journal TAL regularly publishes book reviews. We are looking for a colleague who would like to read the book "Michael PIOTROWSKI. Natural Language Processing for Historical Texts. Morgan & Claypool publishers. 2012. 145 pages" and is willing to review it for TAL (the book will be sent free of charge in return). The review must be written in French (three pages maximum, in the journal's format) and sent by the end of January 2013. Other reviews are welcome if you have recently read a book that interested you and are willing to share your reading with the community... Denis Maurel
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:32:43 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 20:32:43 +0200 Subject: Job: 2 ingenieur TALN, LELIE Message-ID: date: Wed, 24 Oct 2012 16:28:16 +0200 from: "Patrick Saint-Dizier" message-id: <4e1c-5087fb00-9-9cef710 at 196554100> X-url: http://www.irit.fr/recherches/ILPL/lelie/accueil.html LELIE - 2 NLP engineer positions As part of the ANR project LELIE: http://www.irit.fr/recherches/ILPL/lelie/accueil.html A start-up is being created. It is headed by an engineer who graduated from a grande école. Several large corporate customers are interested in the product that has been developed. Besides work on risk analysis, a fast-growing field, our work also covers analysing the quality of requirements specifications. In this context, we are looking for one or two NLP engineers: - preferably holding a PhD in NLP, applied linguistics, or possibly artificial intelligence, - good knowledge of syntax and discourse and of analysis technologies, - good knowledge of English and, if possible, another language, - very good rapport with customers and users, - location: preferably the Toulouse or Paris area, but teleworking is also possible. The work will be fairly varied: extending the current system and helping to develop a product line, analysing industrial needs, promoting the product, representing domain knowledge, and analysing and setting up industrial deployments. Collaborations with French and foreign companies are being set up. Hiring: early 2013. Send a CV and cover letter before 10 November to: stdizier at irit.fr, who will forward them.
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:49:33 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 20:49:33 +0200 Subject: Appel: Journees de Rochebrune 2013, La preuve et ses moyens Message-ID: Date: Wed, 24 Oct 2012 16:45:58 +0200 From: Thomas Louail Message-Id: <8A60D591-6309-4BD3-B1E1-1B8AAD870C96 at irit.fr> X-url: http://s4.csregistry.org/rochebrune [Apologies for any duplicate postings.]
*Last call for papers* *Journées de Rochebrune* 2013 «La preuve et ses moyens» (Proof and its means) *13-19 January 2013* *Schedule* - Deadline for submitting the title and abstract of the proposed paper: *4 November* 2012 - Deadline for submitting the proposed paper (4 to 12 pages): *11 November* 2012 - Notification: *25 November* 2012 - Registration deadline for Rochebrune: *15 December* 2012 - Journées de Rochebrune: *13 to 19 January* 2013 - URL of the event web page: http://s4.csregistry.org/tiki-index.php?page=rochebrune Dear colleagues, We are pleased to send you the call for papers for the next Journées de Rochebrune, which will take place from 13 to 19 *January* 2013. "The notion of proof has changed considerably over the history of science and has always been understood differently in the formal and axiomatic disciplines, the experimental sciences, and the humanities and social sciences. In the disciplines now grouped under the label «complexity sciences», which converse on the basis of a systemic analysis of phenomena, it refers to a vast corpus of means of proof. These means are competing forms of argument, heterogeneous methods of investigation and reasoning, and scientific expertise, all of which share the ambition of producing «solid statements» through successive gradations of discourse (speculation, plausibility, truth)..." The full call, the schedule and a description of the principle and organisation of the event are attached to this email. They are also available online at: http://s4.csregistry.org/rochebrune Best regards, For the organising committee, ------------------------------------------------------------------------ Thomas Louail Postdoctoral associate, STAE Foundation UMR 5505 IRIT, S.M.A.C. team Manufacture des Tabacs, Bâtiment E, 21 allée de Brienne - 31000 Toulouse Phone : +33 (0)682 291 952 Web : http://irit.academia.edu/ThomasLouail
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 19:03:04 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 21:03:04 +0200 Subject: Job: Poste de professeur en analyse semantique des media sociaux, Universite de Montreal Message-ID: Date: Wed, 24 Oct 2012 15:32:38 -0400 From: Philippe Langlais Message-Id: X-url: http://www.nserc-crsng.gc.ca/Professors-Professeurs/CFS-PCP/IRC-PCI_fra.asp The Département d’informatique et de recherche opérationnelle of the Université de Montréal invites applications for a full-time position as assistant or associate professor in semantic analysis of social media. The appointment is conditional on the candidate obtaining a chair under the Industrial Research Chairs programme of NSERC (CRSNG). The application must be accompanied by forms 100, 101 and 183A required by the funding agency, at the following address: http://www.nserc-crsng.gc.ca/Professors-Professeurs/CFS-PCP/IRC-PCI_fra.asp Duties The successful candidate will be expected to teach at all three levels (undergraduate, Master's and doctoral), supervise graduate students, carry out research, publication and outreach activities, and contribute to the activities of the institution. Requirements - PhD in computer science or a related field. - The candidate must obtain an industrial research chair. - Industrial experience. - The candidate will be expected to work in the field of natural language processing and, more specifically, on semantic analysis of social media. - Teaching experience desirable. - Publication record. - Proficiency in French (http://secretariatgeneral.umontreal.ca/fileadmin/user_upload/secretariat/doc_officiels/reglements/administration/adm10-34_politique-linguistique.pdf) Salary The Université de Montréal offers a competitive salary together with a full range of benefits. Starting date From 1 June 2013. Application deadline The application, consisting of a cover letter, a curriculum vitae and a copy of recent publications or research work, must reach the address below by 31 December 2012. Candidates must also ask three people to send a letter of recommendation to the department chair at the following address: Patrice Marcotte, directeur Département d’informatique et de recherche opérationnelle Université de Montréal C. P. 6128, succursale Centre-ville Montréal (Québec) H3C 3J7 CANADA Those interested will find information about the Département d’informatique et de recherche opérationnelle on its website: www.iro.umontreal.ca.
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 19:05:50 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 21:05:50 +0200 Subject: Seminaire: Francois Rastier, INALCO, 15, 22, 29 novembre et 6 decembre 2012 Message-ID: Date: Thu, 25 Oct 2012 09:32:42 +0200 From: Mathieu Valette Message-Id: The research team Textes, Informatique, Multilinguisme (ERTIM) at INALCO is pleased to invite you to the sessions of its research seminar led by François Rastier: Description sémantique et contexte culturel (Semantic description and cultural context). The seminar will take place on Thursdays 15, 22 and 29 November and 6 December 2012, 17:00-19:00, at INALCO-Recherche, 2 rue de Lille, 75007 Paris (Salle des marbres, door 124, first floor, staircase B in the courtyard). Suggested reading: - F. Rastier, La mesure et le grain. Sémantique de corpus, Champion, Paris, 2011. - The journal Texto ! Textes & Cultures (http://www.revue-texto.net)
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 19:08:25 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 26 Oct 2012 21:08:25 +0200 Subject: Job: Postdoctoral position in German Computational Linguistics at Voxygen (Rennes, France) Message-ID: Date: Thu, 25 Oct 2012 11:55:17 +0200 From: Chiara Mazza Message-ID: <50890C85.5090208 at voxygen.fr> X-url: http://www.voxygen.fr/ Postdoctoral position in German Computational Linguistics Context ------- Voxygen is a young and innovative company, created in September 2011, made up of experts in speech synthesis and linguistics, and located in the Lannion and Rennes areas, France. Voxygen offers speech synthesis products and services, mainly for European, Arabic, and African markets, and is particularly experienced in the creation of expressive voices for industrial and entertainment purposes. For more information on Voxygen: http://www.voxygen.fr/ The speech synthesis solution is widely deployed in voice servers and mobile applications and operates in a large range of environments: PCs and servers (Windows, Linux, MacOS) and mobile devices (Android, Windows Mobile, iPhone OS, Symbian). Voxygen is currently working on adding the German language to its catalog of offers. Task description ----------------- Speech synthesis is a cross-disciplinary area requiring computer science, linguistics and speech processing skills.
Linguistic processing is a central component that analyzes text to determine its part-of-speech tags, pronunciation and intonation. The candidate therefore needs knowledge of phonetics, phonology, morphology, syntax and prosody.

The candidate will be in charge of implementing the linguistic processing for German by using or adapting existing language processing tools, and by programming new ones in C/C++ or in scripting languages such as Python and Perl. He or she should bring innovative ideas to the technical team and may interact with the marketing and sales teams on needs concerning the German language.

Keywords
--------
German Language, Computational Linguistics, Natural Language Processing (NLP), Speech Synthesis

Profile of the candidate
------------------------
* Native German speaker
* PhD in Computational Linguistics or a related field
* Skills in computer programming (C/C++, Perl, Python)
* Experience in natural language processing
* The working languages are English and French
* Enjoys working as part of a team

Duration & location
-------------------
12 months with a possible extension of 6 months, starting as soon as possible.
The position will be located in the Rennes area (France).

Please email your CV and cover letter to Paul BAGSHAW (jobs at voxygen.fr)

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:05:53 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:05:53 +0100
Subject: Job: Annotateur d'entites nommees, ELDA
Message-ID:

Date: Mon, 29 Oct 2012 11:55:31 +0100
From: Leixa jérémy
Message-ID: <508E60A3.5040704 at elda.org>
X-url: http://www.quaero.org

Job title
---------
Annotator

Context
-------
As part of the QUAERO research project (http://www.quaero.org), ELDA (Evaluations and Language resources Distribution Agency) is recruiting several people to help create a reference corpus for evaluating the performance of Named Entity recognition software on text corpora.

Named Entities (NE) are textual objects (that is, a word or a group of words) that can be categorized into predefined classes (persons, organization names, place names, quantities, distances, dates, etc.). Named Entity recognition is an essential subtask of systems that extract information from document corpora.

ELDA (www.elda.org)
-------------------
Our main activity is the distribution and production of language resources (terminology databases, speech recordings, electronic dictionaries, ...) and the evaluation of language technologies.

Location
--------
The work will take place at ELDA's offices in Paris (13th arrondissement).

Mission
-------
Named Entity annotation of text documents. The task consists of detecting the Named Entities in the text and labeling them according to a list of predefined categories. Training in the annotation task and in the use of the software is provided by ELDA.

Profile sought
--------------
Native language: French. A licence or maîtrise degree in language sciences or in computer science (specializing in natural language processing or in information and document sciences) is a plus. Experience with the LINUX environment is a plus.

Duration
--------
Half-time for 2 months. Start: as soon as possible.

Contact
-------
Jérémy Leixa
email: leixa at elda.org
ELDA
55-57, rue Brillat Savarin
75013 Paris
http://www.elda.org

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:14:18 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:14:18 +0100
Subject: Journee: Consortium Corpus Ecrits, 23 et 24 novembre 2012, Paris
Message-ID:

[Registration form available on request from secretariat.ilf at ling.cnrs.fr or secretariat-general at ling.cnrs.fr - TH]

Date: Mon, 29 Oct 2012 17:19:04 +0100
From: Secretariat General
Message-ID: <508EAC78.3060306 at ling.cnrs.fr>
X-url: http://www.typologie.cnrs.fr
X-url: http://www.ilf.cnrs.fr

Dear Colleagues,

The "Corpus écrits" consortium (Corpus-IR) is holding its annual plenary meeting on *Friday, November 23, 2012*, from 9:30 am to 6 pm, at the Campus des Cordeliers (15, rue de l'Ecole de Médecine, 75006 Paris).

This meeting will be devoted to the *presentation of the activities of the consortium's working groups*:

1. Use of corpora and authors' or publishers' rights (legal aspects)
2. Corpora of older states of the language (digitization, encoding)
3. Digitization (OCR, keyboarding), correction
4. Plurality of writing systems
5. Multilingual corpora (parallel, comparable...)
6. Collaborative corpus description - metadata
7. Corpora of modern writing, taking into account new modes of communication (SMS, email, blogs, etc.)
8. Higher-level annotation: syntax, semantics, reference (collaborative annotation)
9. Surface annotation: lexical segmentation, morphosyntactic description, chunking, lemmatization, named entities, etc.
10. Corpus exploration (methods, tools)
11. Scientific quality and accessibility of corpora (place of corpora in the evaluation of the scientific output of research units)

Each presentation will be followed by a discussion of about twenty minutes.

This day, which is very important for advancing shared thinking on written corpora, will be followed the next day, *Saturday, November 24*, by a *day of information and discussion on the legal aspects of corpus ownership and archiving*, the program of which will be announced very soon.

*The steering committee strongly encourages everyone interested in these two days to take part, whether or not they are registered in a working group.*

Although participation in these days is open to all, *registration is mandatory*.
You will find attached the *registration form*, to be returned *by November 12, 2012 at the latest* to the secretariat of the Institut de Linguistique Française, the institution managing the Consortium, which will contact you, where applicable, to arrange your travel. The consortium will contribute to funding the travel of active participants in the working groups.

We look forward to welcoming many of you on *November 23 and 24*.

On behalf of the steering committee of the "Corpus écrits" Consortium,
Franck Neveu, Director of the ILF

The steering committee of the "Corpus écrits" Consortium:

- Franck Neveu for ILF, FR 2393 -- Consortium lead
- Sylvie Archaimbault (alternate: Bernard Colombat) for HTL - UMR 7597 - Université Denis Diderot - Paris 7
- Benoit Sagot for ALPAGE -- INRIA - Université Denis Diderot - Paris 7
- Serge Heiden for ICAR - UMR 5191 - Université Lumière Lyon 2
- Damon Mayaffre (alternate: Mahé Ben Hamed) for BCL - UMR 6039 - Université Nice Sophia Antipolis
- Jean-Marie Pierrel for ATILF - UMR 7118 -- Nancy - Université
- Clément Plancq (alternate: Olivier Bonami) for LLF - UMR 7110 - Université Denis Diderot - Paris 7
- Céline Poudat for LDI - UMR 7187 -- Université de Paris 13
- Catherine Schnedecker (alternate: Amalia Todirascu) for LILPA -- EA 1339 -- Université de Strasbourg
- Agnès Tutin (alternate: Marie-Paule Jacques) for LIDILEM -- EA 609 -- Université Grenoble 3

Véronique BRISSET-FONTANA
General Secretary
Fédérations de Linguistique
FR 2559 - www.typologie.cnrs.fr
FR 2393 - www.ilf.cnrs.fr
Tel. 01 43 13 56 45

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:18:19 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:18:19 +0100
Subject: Job: Stage, Developper de performance d'un synthese de messages courts
Message-ID:

Date: Tue, 30 Oct 2012 09:29:38 +0100
From: Patrick LLORET
Message-ID: <508F8FF2.30400 at succeed-together.eu>
X-url: http://www.succeed-together.eu

R&D intern recruitment

Company:
An innovative SME specializing in improving the performance of meetings by speeding up exchanges between participants is looking for an intern to join its R&D unit, a team of 2 people working with fundamental computer science laboratories affiliated with the CNRS. We develop digital information-processing tools for the real-time semantic grouping of short messages.

Position and mission:
We are developing software that summarizes the ideas expressed in short messages using NLP techniques. The intern's main mission will be to improve the performance of the existing system (clustering, use of dictionaries, tuning of the machine learning ...).
The intern will also operate the software in seminars in front of our clients, which include many large groups (SNCF, Sodexo, Safran, BNP Paribas, BPCE, Auchan, Vinci, Accor, Foncia, Crédit Agricole...), in order to test the results of this work in real conditions.

Profile:
- Master's level
- Good Python skills
- Knowledge of GNU/Linux systems
- Knowledge of web technologies appreciated (html/javascript/jquery/css/django)
- Good interpersonal skills and presentation
- Sound conceptual approach to problem solving, autonomy in decision making
- Good level of English
- Internship of at least 6 months, starting as soon as possible
- Position based in Paris

Please send your application to plloret at succeed-together.eu

Patrick LLORET
Mob: +33 (0)6 89 74 80 69
Email: plloret at succeed-together.eu
http://www.succeed-together.eu

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:20:53 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:20:53 +0100
Subject: Seminaire: Information Theoretic Approaches to ad hoc Information Retrieval, VISEO R&D, Grenoble
Message-ID:

Date: Tue, 30 Oct 2012 15:03:45 +0000
From: Cédric Lopez
Message-ID:
X-url: http://www.viseo.net/en/news/information-theoretic-approaches-ad-hoc-information-retrieval
X-url: http://www.viseo.net/en/viseo-research-and-development
X-url: http://mrim.imag.fr/eric.gaussier/)
X-url: http://www.liglab.fr/?lang=en
X-url: http://www.liglab.fr/spip.php?article917&lang=en
X-url: http://www.viseo.net/

Research Seminar

Information theoretic approaches to ad hoc information retrieval
by Eric Gaussier and Parantapa Goswami

28 November 2012, 12:00 - 13:00
VISEO R&D - Le Pulsar - 4 av du Doyen Louis Weil, 38000 GRENOBLE
(http://www.viseo.net/en/viseo-research-and-development)

Abstract

Information retrieval (IR) has become highly popular through the daily use of web search engines. Nowadays, the best performing models in ad hoc IR are based on probabilistic approaches, following different principles (ranking principle, language modeling or divergence from randomness). In our presentation we give an introduction to a recent family of probabilistic models based solely on information theoretic principles, and show how such models relate to well-known properties of text collections. We will also show how to use such models in different settings, such as cross-language information retrieval and query expansion.

Bio

ERIC GAUSSIER (http://mrim.imag.fr/eric.gaussier/) is Professor of Computer Science at the Université Joseph Fourier (Grenoble I). He is Deputy Director of the Laboratory of Informatics of Grenoble (LIG - http://www.liglab.fr/?lang=en) and Head of the AMA team (http://www.liglab.fr/spip.php?article917&lang=en), working on Machine Learning and Information Modeling. His research focuses on probabilistic modeling of large document collections for information access. He is particularly interested in multilingual and multimedia collections, and in applications such as categorization, clustering and information retrieval.

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:22:46 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:22:46 +0100
Subject: Appel: Atelier SOS'2013 "Sources Ouvertes et Services", EGC'2013
Message-ID:

Date: Tue, 30 Oct 2012 16:49:11 +0100
From: Laurie Serrano
Message-ID:
X-url: http://weblab-project.org/workshop/sos2013

*Please circulate this call for papers as widely as possible. Apologies for any multiple postings.*

*Fourth edition of the workshop "Sources Ouvertes et Services"*
*SOS'2013 (in conjunction with EGC'2013)*
*January 29, 2013 - Toulouse, France*
http://weblab-project.org/workshop/sos2013

*Workshop overview:*

For its fourth edition, the SOS'2013 workshop proposes to continue a shared discussion of the various issues involved in processing data available from open sources (sources ouvertes, SO). Open sources are all media that can be accessed freely, whether free of charge or for a fee, such as the Internet, public databases, newspapers, CD-ROMs, television and radio channels, etc., as opposed to closed sources, which require specific authorization to consult. These open sources provide large volumes of heterogeneous multimedia data (image, text, audio, video, etc.) that need appropriate processing before they can be exploited.
This workshop is dedicated to all of these stages, from the discovery of information sources, through the collection and analysis of the collected data, to the capitalization and exploitation of knowledge. Particular attention will be paid to the architectural choices made when building applications that exploit open sources. Such applications generally try to make several software components coexist (COTS, open source software, ad hoc developments, etc.) in order to carry out a particular task. This is a scientific and technical challenge to which research on chaining the algorithmic processes that exploit these data can contribute. The emphasis will be on service-oriented architectures (SOA) and on the use of Semantic Web technologies.

For this new edition, the SOS workshop also wishes to address the processing of very large volumes of data ("Big data"). The recent explosion of data available on the Web has raised new issues about how to adapt and optimize the whole information-processing chain for the new volumes to be handled.

The workshop aims to bring together researchers from both academia and industry in order to obtain a representative panel of actors and of their work on the themes listed below.
*Workshop themes:*

Authors are invited to send proposals of theoretical, methodological or practical scope on one of the following themes (non-exhaustive list):

* Identification and automatic discovery of information sources,
* Access to and collection of information from open sources (Web, social networks, RSS feeds and others),
* Classification and filtering of information of interest,
* Information extraction from unstructured texts and/or texts using specific vocabularies (blogs, SMS language, forums),
* Information extraction from large volumes of multimedia data (text, image, video, audio),
* Sentiment/opinion analysis in social media (social networks, blogs, forums),
* Modeling and capitalization of knowledge extracted from open sources (ontologies, semantic annotations),
* Exploitation of knowledge extracted from open sources: reasoning, decision support, visualization,
* Weak-signal detection,
* Evaluation and qualification of information sources,
* Evaluation and qualification of information extracted from open sources,
* Inference, mining and validation of links between data,
* Platforms for integrating heterogeneous processing services: service interoperability, semantic orchestration,
* Strategic or business intelligence applications based on open sources,
* Open-source intelligence applications (renseignement d'origine sources ouvertes, ROSO),
* "Big data"-oriented information-processing applications.
*Program committee:*

Florence Amardeilh (Mondeca)
Gaël de Chalendar (CEA LIST)
Olivier Corby (INRIA, Sophia Antipolis)
Valentina Dragos (ONERA)
Adil El Ghali (IBM)
Christian Fluhr (GEOL Semantics)
Bruno Grilheres (Cassidian)
Dafni Stampouli (Cassidian)
Nicolas Hernandez (Université de Nantes)
Alexandre Pauchet (LITIS, Rouen)
Haïfa Zargayouna (LIPN, Université de Paris 13)
Maroua Bouzid (GREYC, Université de Caen)
Cassia Trojahn (IRIT, Toulouse)
Klaus Atzenbeck (IISYS, Hof University, Germany)
Sinan Yurtsever (Atos, Turkey)

*Important dates:*

* Submission deadline: *November 30, 2012*
* Notification to authors: December 21, 2012
* Final version: January 2, 2013
* Workshop date: January 29, 2013

*Submission:*

Authors are invited to submit their papers in French or in English (an abstract in English is desirable in both cases). Submitted papers must not exceed 12 pages (appendices included). Please follow the instructions of the EGC'2013 organizers and use the latest version of the RNTI format, available at: http://www.antsearch.univ-tours.fr/rnti

Each proposal will be reviewed by at least two members of the program committee. Accepted papers will be presented orally by one of the authors (in French or in English, as they prefer). The day will open with a talk by an invited speaker on one of the workshop themes and will close with a round table on the issues involved in exploiting open sources.
*Papers must be sent to the following address: sos2013 at weblab-project.org*

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:24:16 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:24:16 +0100
Subject: Appel: Atelier CIDN - EGC 2013
Message-ID:

Date: Tue, 30 Oct 2012 17:18:04 +0100
From: LEMAIRE Vincent
Message-ID: <508FFDBC.2080108 at orange.com>
X-url: http://perso.rd.francetelecom.fr/lemaire/CIDN/

Apologies for any multiple postings
--------------------------------------------------------

CALL FOR PAPERS: CIDN WORKSHOP
(Incremental learning and novelty detection methods)
1st EGCIA meeting (EGC and AFIA associations)
http://perso.rd.francetelecom.fr/lemaire/CIDN/

The development of methods for the dynamic analysis of information, such as incremental clustering and novelty detection methods, is becoming a central concern in a large number of applications whose main goal is to process large volumes of information that vary over time. These applications relate to very diverse and highly strategic fields, such as Web exploration and information retrieval, the analysis of user behavior and recommender systems, technology and scientific watch, or the analysis of genomic information in bioinformatics...
To take just one type of application on textual data as an example, publications on methods for detecting technological breakthroughs and innovative themes are clearly very common in conferences and journals. This interest is underlined by the European Commission's creation of the NEST (New and Emerging Science and Technology) programme under FP6 and of the FET (Future and Emerging Technologies) programme under FP7.

The first CIDN workshops were held at the two previous EGC conferences. Given the audience's interest in this topic, we propose to organize the third edition of the workshop at the EGC'13 conference in Toulouse.

The aim of this joint EGC-AFIA workshop is to bring together established researchers and young researchers around the problems and applications of "incremental classification" and "novelty detection" on varied types of data, in order to exchange thoughts on work in progress and on current sticking points.

First meeting between the EGC and AFIA associations: this workshop is jointly supported and organized by EGC and AFIA. Within both communities, the workshop themes have been identified as promising, in particular for building links with industry. With this in mind, companies and industry practitioners are also invited to present their work and take part in the discussions.
Main themes (non-exhaustive list):
- Novelty detection algorithms and techniques,
- Incremental classification methods,
- Adaptive clustering methods,
- Adaptive hierarchical classification methods,
- Adaptive neural methods,
- Approaches based on swarm intelligence and genetic algorithms,
- Supervised classification methods that handle concept drift,
- Methods for visualizing the results of analyses of evolving data,
- Concept drift detection methods,
- Analysis of structural changes in large semantic graphs.

Application domains (non-exhaustive list):
. Computer vision and image understanding,
. Robotics,
. Privacy, security and biometrics,
. Human-computer interaction,
. Ambient intelligence,
. Analysis of evolving textual information,
. Genomics and DNA microarrays,
. Anomaly detection,
. Industrial process control,
. Adaptive recommendation, filtering systems,
. Telecom network monitoring,
. Energy management and forecasting,
. Weak-signal detection in intelligence and criminal investigation.

Organizing committee:
Pascal Cuxac, INIST-CNRS, Vandoeuvre les Nancy (pascal.cuxac at inist.fr)
Jean-Charles Lamirel, TALARIS-LORIA, Vandoeuvre-lès-Nancy (jean-charles.lamirel at loria.fr)
Vincent Lemaire, ORANGE-LABS, Lannion (vincent.lemaire at orange-ftgroup.com)
Thomas Guyet, IRISA, Rennes (Thomas.guyet at irisa.fr)
Jean Rohmer, ESILV, Paris La Défense (jean.rohmer at devinci.fr)

Program committee (alphabetical order, being finalized):
. Alexis Bondu (EDF R&D),
. Laurent Candillier (Wikio Group / Nomao),
. Fabrice Clérot (Orange Labs),
. Pascal Cuxac (INIST-CNRS, Vandoeuvre les Nancy),
. Bernard Dousset (IRIT, Toulouse),
. Claire François (INIST-CNRS, Vandoeuvre les Nancy),
. Hatem Hamza (Orange, Sophia Antipolis),
. Pascale Kuntz-Cosperec (Polytech'Nantes),
. Jean-Charles Lamirel (Talaris Loria),
. Vincent Lemaire (Orange Labs),
. Gaelle Losli (Polytech Clermont-Ferrand),
. Christophe Salperwyck (Orange Labs),
. Fabien Torre (Université Lille 3),
. Thomas Guyet (Agrocampus-Ouest Rennes),
. Jean Rohmer (Ecole Supérieure d'Ingénieurs Léonard de Vinci)

Important dates (provisional):
. Paper submission deadline: December 3, 2012
. Notification to authors: December 21, 2012
. Final version: January 10, 2013
. Workshop: January 29, 2013

Submission format
We expect submissions of six (6) to eighteen (18) pages in PDF format, produced with the RNTI LaTeX style, available from the EGC 2013 website.

Submission address
Papers must be submitted online on EasyChair at: http://www.easychair.org/conferences/?conf=cidn13

Proceedings
Accepted papers will be published online on the websites of the workshop and of the EGC conference. The authors of the best papers received in 2011, 2012 and this 2013 edition will be invited to extend them for a special issue, "Classifications incrémentales et méthodes de détection de nouveauté", of the journal RIA.

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:25:38 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:25:38 +0100
Subject: Appel: TIPA 29
Message-ID:

Date: Tue, 30 Oct 2012 18:37:01 +0100
From: Joëlle Lavaud
Message-ID: <5090103D.3040500 at lpl-aix.fr>
X-url: http://www.revues.org
X-url: http://www.lpl.univ-aix.fr/index.php?id=27

Call for contributions - TIPA no. 29
Travaux interdisciplinaires sur la parole et le langage

*Issue devoted to the study of spoken French*

The journal TIPA (Travaux Interdisciplinaires sur la parole et le langage), which will from now on be published exclusively in electronic form on the electronic publishing platform Revues.org (http://www.revues.org), is now preparing its 29th issue. The general theme chosen for this issue is studies devoted to *spoken French*.

TIPA's thematic orientation is meant to remain broadly interdisciplinary, which will make it possible to welcome contributions that address spoken French from a range of disciplines (syntax, phonetics, discourse...) and points of view (corpus-oriented descriptive linguistics, theoretical linguistics, psycholinguistics, neurolinguistics, natural language processing, sociolinguistics, language teaching, ...).

The language of publication will be French or English, with the particular requirement of a long two-page abstract in the other language.
This will make French-language papers more accessible to the English-speaking community and vice versa, and so reach a wider audience.

-------------------------------------------------------

*Information on proposals for contributions:*

*Important dates:*

January 7, 2013: deadline for receipt of proposals.
End of February 2013: notification of acceptance to authors, after review by the scientific committee.
End of June 2013: receipt of papers.
Publication: end of 2013.

*Proposal format:*

1 file containing only the title of the paper and the name and affiliation of the author(s).

1 one-page file (Times 12), containing: the title but not the author; about half a page presenting the research topic and the theoretical and methodological framework; about half a page presenting the main results. This page may be accompanied by a short bibliography of at most 5 or 6 titles (in which works by the author or authors of the submitted paper appear no more than twice).
*Where to send proposals:* tipa at lpl-aix.fr

You can also consult the instructions for authors: http://www.lpl.univ-aix.fr/index.php?id=27

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:29:15 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:29:15 +0100
Subject: Appel: Atelier "Fouille de Donnees Complexes : complexite liee aux donnees multiples et massives", FDC@EGC2013
Message-ID:

Date: Wed, 31 Oct 2012 10:19:27 +0100
From: Guillaume Cleuziou
Message-ID: <5090ED1F.6090805 at univ-orleans.fr>
X-url: http://eric.univ-lyon2.fr/~gt-fdc/

*Call for papers*
----------------------------------------------------
10th edition of the workshop
----------------------------------------------------
*Fouille de données complexes (Complex data mining)*
- *complexity arising from multiple and massive data* -

January 29, 2013, Toulouse, France

*Held as part of the 13th edition of the international French-language conference EGC 2013 (http://www.irit.fr/EGC2013/).*

https://sites.google.com/site/afdc2013/
http://eric.univ-lyon2.fr/~gt-fdc/

/////////////////////////////////// *Important dates* //////////////////////////////////

*Submission deadline: November 30, 2012*
Notification to authors: December 17, 2012
Final versions due: January 11, 2013
Preliminary program: January 15, 2013

////////////////////////////////////////////////////////////////////////

Over the years, this workshop has become a favored meeting place where researchers and industry practitioners come to share their experience and expertise in complex data mining. The workshop is open as to the kind of proposals. We particularly wish to stimulate discussion from experimental as well as theoretical, academic and industrial points of view. Presentations may cover completed work, reflections, or preliminary studies (setting out original problems rather than solutions) in complex data mining. Finally, discussions of inter- or multidisciplinary issues are also welcome.

/////////////////////////////////// *Instructions for authors* ///////////////////////////////////

Authors are invited to submit their proposals electronically via EasyChair: https://www.easychair.org/account/signin.cgi?conf=egcfdc10

*Submissions are limited to 12 pages.* They may be much shorter, in particular for papers presenting work that is just starting or a research project. The LaTeX format to use is that of the journal Revue des Nouvelles Technologies de l'Information (RNTI), available for workshops at: http://www.antsearch.univ-tours.fr/rnti/default.asp?FCT=DP&ID_PAGE=7

/////////////////////////////////// *Description* ///////////////////////////////////

In all fields such as multimedia, remote sensing, medical imaging, databases, the semantic web, bioinformatics, geomatics and many others, the data from which knowledge is to be extracted are increasingly complex, voluminous and massive (Big Data), covering terabytes, petabytes and zettabytes.
We are thus led to handle data that are often weakly structured or unstructured:
* coming from various origins, such as sensors or varied physical sources of information;
* representing the same information at different dates;
* combining different views or types of information (images, text) or information of different natures (logs, document content, ontologies, etc.);
* having different and imbalanced distributions (which is now the norm rather than the exception);
* ...
Complex data mining should therefore no longer be regarded as an isolated process but rather as one step in the more general process of knowledge discovery in databases (ECDB, extraction de connaissances dans les bases de données). Indeed, before data mining techniques can be applied, complex and massive data need to be shaped and structured.
For the second consecutive year we have chosen to focus the workshop on two more specific and complementary issues:
* Complexity related to multiple data (multi-source, multi-view, multiple tables, sequential, etc.)
* Complexity related to massive data, in light of emerging solutions for decentralized or massively parallel data processing (cloud computing, MapReduce-style paradigms)
For nearly four years, the EGC working group « Fouille de Données Complexes » has sought to federate and drive current research on mining multiple data. Issues of storage and indexing on the one hand, and of processing by probabilistic, possibilistic or fuzzy approaches on the other, have been presented. This year the aim is to continue work on this theme and to place it in the new technical environment offered by MapReduce-style paradigms (massively parallel processing) and cloud computing.
Indeed, improvements in network performance now make it possible to consider decentralized data storage and processing solutions, which come into their own when the data originate from multiple sources. Scientific contributions to this workshop may address either of these directions, and contributions at the intersection of these areas are strongly encouraged. A list of topics is given below; it is open and non-exhaustive:
* Preprocessing, structuring and organization of complex and massive data (Big Data)
* Processes and methods for complex data mining (clustering, biclustering, etc.)
* Intensive computing for massive data mining (Big Data)
* Classification and fusion of multi-source and distributed data
* Collaborative and/or cooperative classification and processing
* Experience reports on knowledge extraction from complex data
* Role of knowledge in complex data mining
* Mining imprecise and/or uncertain data
* Decentralized and dematerialized data mining (cloud mining)
* Data mining and massively distributed algorithms
*Organizers:*
* Guillaume Cleuziou (LIFO, Université d'Orléans)
* Cyril de Runz (CReSTIC, Université de Reims Champagne-Ardenne)
* Germain Forestier (MIPS, Université de Haute-Alsace)
* Mustapha Lebbah (LIPN, Université Paris 13)
-------------------------------------------------------------------------
From thierry.hamon
at UNIV-PARIS13.FR Tue Oct 2 18:30:24 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:30:24 +0200
Subject: Ecole: FSFLA 2012, Tarragona, Spain, October 29 – November 2, 2012
Message-ID:
Date: Sun, 30 Sep 2012 18:10:51 +0200
From: "GRLMC"
Message-ID: <7EC8E25B5FE543A09C44FEB1A87FA0D6 at Carlos1>
X-url: http://grammars.grlmc.com/fsfla2012/
*********************************************************************
2012 INTERNATIONAL FALL SCHOOL IN FORMAL LANGUAGES AND APPLICATIONS
FSFLA 2012
(formerly International PhD School in Formal Languages and Applications)
Tarragona, Spain
October 29 – November 2, 2012
Organized by: Research Group on Mathematical Linguistics (GRLMC), Rovira i Virgili University
http://grammars.grlmc.com/fsfla2012/
*********************************************************************
AIM: FSFLA 2012 offers a broad and intensive series of lectures at different levels on selected topics in language and automata theory and their applications. The students choose their preferred courses according to their interests and background. Instructors are top names in their respective fields. The School intends to help students initiate and foster their research career. The previous event in this series was FSFLA 2011 (http://grammars.grlmc.com/fsfla2011/).
ADDRESSED TO: Graduate (and advanced undergraduate) students from around the world. Most appropriate degrees include: Computer Science and Mathematics. Other students (for instance, from Linguistics, Electrical Engineering, Molecular Biology or Logic) are welcome too provided they have a good background in discrete mathematics. The School is appropriate also for people more advanced in their career who want to keep themselves updated on developments in the field. There is no overlap in the class schedule.
COURSES AND PROFESSORS:
- Eric Allender (Rutgers), Circuit Complexity: Recent Progress in Lower Bounds [introductory/advanced, 8 hours]
- Amihood Amir (Bar-Ilan), Periodicity and Approximate Periodicity in Pattern Matching [introductory, 6 hours]
- Ahmed Bouajjani (Paris 7), Automated Verification of Concurrent Boolean Programs [introductory/advanced, 8 hours]
- Bruno Courcelle (Bordeaux), Automata for Monadic Second-order Model Checking [intermediate, 8 hours]
- Jörg Flum (Freiburg), The Halting Problem for Turing Machines [introductory/advanced, 6 hours]
- Aart Middeldorp (Innsbruck), Termination of Rewrite Systems [introductory/intermediate, 8 hours]
REGISTRATION: It has to be done on line at http://grammars.grlmc.com/fsfla2012/Registration.php
FEES: They are variable, depending on the number of courses each student takes. The rule is: 1 hour =
- 10 euros (for payments until June 2, 2012),
- 12.50 euros (for payments between June 3 and August 15, 2012),
- 15 euros (for payments after August 15, 2012).
PAYMENT PROCEDURE: The fees must be paid to the School's bank account:
Uno-e Bank
bank's address: Julian Camarillo 4 C, 28037 Madrid, Spain
IBAN: ES3902270001820201823142
SWIFT/BIC code: UNOEESM1
account holder: Carlos Martin-Vide GRLMC
account holder's address: Av. Catalunya 35, 43002 Tarragona, Spain
Please mention FSFLA 2012 and your name in the subject. A receipt will be provided on site.
Remarks:
- Bank transfers should not involve any expense for the School.
- People claiming early registration will be requested to prove that the bank transfer order was carried out by the deadline.
- Students may be refunded only in the case when a course gets cancelled due to the unavailability of the instructor.
People registering on site at the beginning of the School must pay in cash. For the sake of local organization, however, it is much recommended to do it earlier.
ACCOMMODATION: Information about accommodation is available on the website of the School.
CERTIFICATE: Students will receive a certificate stating the courses attended, their contents, and their duration.
IMPORTANT DATES:
Announcement of the programme: March 24, 2012
Starting of the registration: March 24, 2012
Very early registration deadline: June 2, 2012
Early registration deadline: August 15, 2012
Starting of the School: October 29, 2012
End of the School: November 2, 2012
QUESTIONS AND FURTHER INFORMATION: Lilica Voicu: florentinalilica.voicu at urv.cat
WEBSITE: http://grammars.grlmc.com/fsfla2012/
POSTAL ADDRESS:
FSFLA 2012
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University
Av. Catalunya, 35
43002 Tarragona, Spain
Phone: +34-977-559543
Fax: +34-977-558386
ACKNOWLEDGEMENTS:
Diputació de Tarragona
Universitat Rovira i Virgili
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:32:33 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:32:33 +0200
Subject: These: Beatrice Arnulphy, Designations nominales d'evenements - Etude et extraction automatique dans les textes
Message-ID:
Date: Sun, 30 Sep 2012 22:04:48 +0200
From: Béatrice Arnulphy
Message-ID: <5068A5E0.1020108 at limsi.fr>
Hello,
I am pleased to announce my thesis defence, entitled "Désignations nominales d'événements - Étude et extraction automatique dans les textes" (Nominal designations of events: study and automatic extraction from texts).
The defence will take place on *Tuesday 2 October 2012 at 10:30* in the conference room of LIMSI-CNRS (Bât 508 http://www.limsi.fr/Pratique/acces/, Université Paris Sud, Orsay; http://www.limsi.fr).
You are cordially invited to the reception that will follow the defence.
*The thesis committee* is composed of:
* Thesis supervisors
Anne Vilnat -- Professor - LIMSI-CNRS, Université Paris-Sud
Xavier Tannier -- MCF - LIMSI-CNRS, Université Paris-Sud
* Reviewers (rapporteurs)
Laurence Danlos -- Alpage - Université Paris 7
Patrice Bellot -- LSIS - Polytechnique, Université d'Aix-Marseille
* Examiners
Sophie Rosset -- LIMSI-CNRS, Orsay
Laura Calabrese -- MCF - Université Libre de Bruxelles
Philippe Muller -- MCF in computer science - Université Paul Sabatier, Toulouse
*Thesis abstract:*
The aim of my thesis is to study nominal designations of events for automatic extraction. My work falls within natural language processing, that is, within a multidisciplinary approach involving linguistics and computer science.
Information extraction aims to analyse natural-language documents and extract from them the information useful to a particular application. With this general aim, many information extraction campaigns have been run: for each event considered, the task of the campaign is to extract certain related information (participants, dates, numbers, etc.). From the outset these challenges have been closely tied to named entities ("notable" elements of texts, such as names of people or places). All this information forms a set around the event, and this work takes no interest in the words used to describe the event (particularly when it is a noun). The event is seen as an all-encompassing whole, as the quantity and quality of the information that makes it up.
Unlike work in general information extraction, our main interest lies solely in the way events that occur are named, and particularly in the nominal designation used. For us, an event is what happens, what is worth talking about. The most important events are the subject of press articles or appear in history textbooks. An event can be evoked by a verbal or a nominal description.
In this thesis, we reflected on the notion of event. We observed and compared the different aspects presented in the state of the art, arriving at a definition of the event and a typology of events in general that suit the framework of our work and the nominal designations of events. From our corpus studies we also identified different types of formation of these event nouns, each of which we show can be ambiguous in various ways.
For all these studies, building an annotated corpus is an indispensable step, so we took the opportunity to draw up an annotation guide dedicated to nominal designations of events. We studied the importance and quality of existing lexicons for use in our automatic extraction task. We also used extraction rules to examine the context in which nouns appear in order to determine their eventness. Following these studies, we extracted a lexicon weighted by eventness (whose particularity is that it is dedicated to the extraction of nominal events), which reflects the fact that some nouns are more likely than others to represent events. Used as a cue for extracting event nouns, this weighting makes it possible to extract nouns that are not present in existing standard lexicons. Finally, using machine learning, we worked on contextual learning features, partly based on syntax, to extract event nouns.
Béatrice Arnulphy
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:33:58 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:33:58 +0200
Subject: Conf: Colloque Traitement de corpus linguistiques, 4-5 octobre 2012, Sorbonne, Paris
Message-ID:
Date: Mon, 1 Oct 2012 15:25:54 +0200
From: marine damiani
Message-ID:
Dear colleagues,
We remind you that the conference of doctoral students and young researchers of the MoDyCo laboratory, devoted this year to tools and methods for processing linguistic corpora, will take place on *4 and 5 October 2012* in the *amphithéâtre Durkheim* at the *Sorbonne*.
Attendance is open to all, and we hope that many of you will be interested in the programme, which you will find attached or on the conference page: https://sites.google.com/site/coldoc2012/programme
For any further information, you can contact us by email: coldoc2012 at gmail.com
Best regards,
The organizing committee.
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:39:02 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:39:02 +0200
Subject: Appel: Coria 2013 et RJCRI 2013
Message-ID:
Date: Mon, 1 Oct 2012 17:36:42 +0200
From: Catherine Berrut
Message-Id: <08FF8C1D-99DB-4C14-AD25-8596FA7D60AF at imag.fr>
X-url: http://coria.unine.ch
X-url: http://coria.unine.ch/rjcri.htm
CORIA 2013, Neuchâtel (Switzerland), 3 to 5 April 2013
CORIA 2013 (http://coria.unine.ch) is the tenth edition of the COnférence en Recherche d'Information et Applications. Organized with the support of ARIA (Association francophone de Recherche d'Information et Applications, http://www.asso-aria.org), it is the main French-language event in this field. CORIA aims to bring together teams and researchers carrying out scientific work in information retrieval: web information retrieval, information extraction from multimedia documents, opinion or social network analysis, monolingual or multilingual settings, retrieval of digital documents and images, machine learning and automatic classification, human-computer interfaces for information access, etc. CORIA is intended to be widely open to the whole scientific community concerned with information retrieval. After being held in Toulouse, Grenoble, Lyon, Saint-Étienne, Lannion, Toulon, Sousse (Tunisia, in partnership with CIFED), Avignon, and Bordeaux (in partnership with CIFED), CORIA will take place this year from 3 to 5 April 2013 in Neuchâtel (Switzerland).
Scientific activity in information retrieval has changed markedly since the spread of the web and, more recently, the development of mobile computing. The boundaries of the field are themselves shifting and encourage synergies with work in machine learning, natural language processing, image processing, speech processing, written communication and documents, information systems and databases, knowledge representation and management... The application domains are vast and may cover the web as a whole or be restricted, for example, to digital libraries or social networks.
CORIA 2013 is aimed at academics and researchers, whether established or not, at industry practitioners and specialists in the field, and at Master's students heading towards research careers. Submissions may be made in English or French. Contributions may concern academic work or industrial applications.
The programme includes two invited talks, one by Jamie Callan (CMU), the other by Donna Harman (NIST). These two speakers will also give a course for doctoral students at a CUSO seminar on Tuesday 2 April 2013 (the day before the conference).
The 8th Rencontres Jeunes Chercheurs en Recherche d'Information (RJCRI) will also be held during CORIA 2013. Their aim is to allow all doctoral students to present their research problem, to establish contacts with teams working in similar or related areas, and to offer the whole community an overview of current research directions.
Work selected for the RJCRI will be presented both orally and as a poster. This year, joint RJCRI and CORIA submissions are allowed (see the RJCRI arrangements at http://coria.unine.ch/rjcri.htm).
Topics (non-exhaustive list)
- Theory and formal models for IR: logical models, language models
- Multilingualism: multilingual information retrieval, machine translation
- Multimedia (images, audio, video, sound, music): indexing, browsing, access, interactions with text
- Scalability: indexing, performance, architectures
- Automatic classification, clustering, ranking, machine learning
- Filtering, routing, novelty detection
- Context modelling, personalization
- Natural language processing for information retrieval
- Question answering systems
- Information extraction: ontologies, resources and information retrieval, named entity detection
- Web: large graphs, use of web topology, power laws, citations, link analysis
- IR and structured documents: IR and XML, focused IR and passage retrieval
- Social networks: analysis of blogs and community sites, conversation tracking, rumour analysis, sentiment analysis, opinion detection
- Collaborative search: filtering, recommender systems
- User interaction: flexible querying, interfaces, visualization, user modelling, accessibility, collaborative indexing
- Knowledge processing and representation: fuzzy logic, metadata, ontologies, semantic web, knowledge engineering
- Digital libraries: IR on digitized books, robustness, OCR and indexability
- Dedicated information retrieval systems: genomic and geographic information retrieval
- Distributed IR: mobile, situated and P2P information retrieval
- Tools for information retrieval: evaluation, test beds, metrics, qualitative experiments on systems
Important dates
Papers are submitted in two stages: first an abstract, then the paper. The submission schedule is the same for CORIA and the RJCRI:
- Abstract submission deadline: 1 December 2012
- Paper submission deadline: 7 December 2012
- Notification to authors: 1 February 2013
- Final version submission deadline: 22 February 2013
CORIA paper submission site: https://www.easychair.org/conferences/?conf=coria2013
RJCRI paper submission site: https://www.easychair.org/conferences/?conf=rjcri2013
Paper format
- Submissions may be made in English or French.
- Contributions may concern academic work or industrial applications.
- Papers must not exceed 16 pages in the format of the Hermes journals. They must be preceded by a cover page giving the title, the authors' names and full contact details, a list of keywords in French and English, and an abstract of about twenty lines at most. The statement « article soumis à CORIA et RJCRI » must appear on the cover page where applicable.
- Papers may be written in Word or LaTeX.
- The Word and LaTeX paper templates can be downloaded from the Hermes site.
- Submitted papers must be in PDF format only.
RJCRI arrangements: see http://coria.unine.ch/rjcri.htm
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:41:47 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:41:47 +0200
Subject: Appel: La composition neoclassique, revue VERBUM
Message-ID:
Date: Mon, 01 Oct 2012 17:57:11 +0200
From: Stéphanie Lignon
Message-ID: <5069BD57.3020108 at univ-nancy2.fr>
X-url: http://www.atilf.fr/spip.php?rubrique214&idfirst=922
/La composition néoclassique/ (neoclassical compounding), issue of /Verbum/ edited by Stéphanie Lignon and Fiammetta Namer
Among the word-formation processes available in the language, neoclassical compounding brings particular patterns into play. Compounding is a constructional process that involves two base lexemes in order to build a new lexeme (/timbre-poste/, /porte-bagage/). Two types of compounding are distinguished: standard (or popular, or ordinary) compounding, which involves lexemes of the contemporary vocabulary (/porte-bagage/), and neoclassical compounding, which involves lexemes borrowed from the inherited stock (fonds patrimonial), such as /coléoptère/ or /anthropophage/.
Neoclassical compounding (also called « savante », « érudite », or « infixation », « confixation », etc.) was initially reserved for forming terms in the specialized vocabularies of medicine, chemistry, zoology, botany, etc. Today, however, it serves as a model for formations that belong no longer to specialized vocabularies but to the « general » language; cf. for example /contraintophobe/, /ferrovipathe/, /capillo-tracté/, /chronophage/, /théâtrolâtre/, /publivore/, /bobophile/.
However, it is harder to identify than affixed words:
- the only segment shared by all neoclassical compounds is the linking vowel -o- (/politic*o*médiatique/, /prim*o*accédant/) or -i- (/libert*i*cide/) between the components;
- the segment that follows the linking vowel is either a French word or a Greek or Latin constituent present in other French words;
- the latter, unlike words, does not appear in French lexicons.
Its success in the general language is what prompts us to propose a thematic issue devoted to this process. Particular attention will be paid to submissions that make a link with corpora and formalization (models), in a monolingual or multilingual context, in specialized domains or in the general language.
The call is addressed to specialists in several fields, whatever theoretical framework is adopted:
- linguistics: lexicon, terminology, morphology;
- NLP;
- psycholinguistics (perception, language learning, language disorders, etc.).
*Schedule*:
- *15 January 2013*: authors wishing to propose an article on this theme are asked to send a two-page statement of intent (not counting the bibliography) describing their project by 15 January 2013. This abstract must not be programmatic. It must state clearly the question addressed and report the main results that will be presented in the article.
- *15 February 2013*: selection of contributions by the review committee and notification to authors
- *15 June 2013*: receipt of full articles, which should be between 15 and 20 pages long. The style sheet will be sent to authors with the notification of acceptance.
Review committee: Dany AMIOT (STL, Université Lille 3), Frédérique BRIN-HENRY (ATILF, Université de Lorraine), Georgette DAL (STL, Université Lille 3), Natalia GRABAR (STL, Université Lille 3), Nabil HATHOUT (CLLE, Université Toulouse-Le Mirail), Stéphanie LIGNON (ATILF, Université de Lorraine), Fiammetta NAMER (ATILF, Université de Lorraine), Séverine CASALIS (URECA, Université Lille 3), Thierry HAMON (LIM&BIO, Université Paris 13), Thi Mai TRAN (STL, Université Lille 2).
http://www.atilf.fr/spip.php?rubrique214&idfirst=922
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 18:51:03 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 20:51:03 +0200
Subject: Seminaire: BLRI, Jonathan Harrington, Aix-en-Provence, 29 octobre 2012
Message-ID:
Date: Tue, 2 Oct 2012 15:47:49 +0200
From: Nadéra Bureau
Message-ID: <006a01cda0a4$838fc310$8aaf4930$@bureau at lpl-aix.fr>
Monday 29 October 2012, 10:00
Conference room B011, building B, 5 avenue Pasteur, Aix-en-Provence (Labex BLRI)
Jonathan Harrington (Institute of Phonetics and Speech Processing, Ludwig-Maximilians University of Munich, Germany)
Sound change and its relationship to variation in production and categorization in perception.
Abstract
In some models (Lindblom et al, 1995; Bybee, 2002), sound change is associated with the type of synchronic reduction that occurs in prosodically weak and semantically predictable contexts.
In other models (Ohala, 1993), sound change can be brought about through listeners' misperception of coarticulation in speech production. The talk will draw upon both models in order to explore whether coarticulatory misperception is more likely in prosodically weak contexts. In order to do so, the magnitude of trans-consonantal vowel coarticulation was investigated in /pV1pV2l/ non-words with the pitch-accent falling either on the first or second syllable and in which V1 = /?, ?/ and V2 = /e, o/. The analysis of these words produced by 20 L1-German speakers showed that prosodic weakening caused vowel undershoot in /?/ but had little effect on V2-on-V1 coarticulation. In a perception experiment, a V1 = /?-?/ continuum was synthesised and the same speakers made forced choice judgements to the same non-words with the prosody manipulated such that stress was perceived on V1 or on V2. Listeners compensated for V2-on-V1 coarticulation; however, the magnitude of compensation was less in the prosodically weak than in the strong context. The general conclusion is that segmental context influences both the dynamics of speech production and perceptual categorization, but not always in the same way: it is this divergence between the two which may be especially likely in prosodically weak contexts and which may, in turn, facilitate sound change.
References
Bybee, J. (2002). Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change, 14, 261–290.
Lindblom, B., Guion, S., Hura, S., Moon, S. J., and Willerman, R. (1995). Is sound change adaptive? Rivista di Linguistica, 7, 5–36.
Ohala, J. J. (1993). Sound change as nature's speech perception experiment. Speech Communication, 13, 155–161.
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 2 20:30:32 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 2 Oct 2012 22:30:32 +0200
Subject: Appel: Tralogy II - Human and Machine Translation - 2013 - Submission deadline extended to October 15, 2012
Message-ID:
Date: Tue, 02 Oct 2012 22:20:09 +0200
From: Joseph Mariani
Message-ID: <506B4C79.6040903 at limsi.fr>
X-url: http://www.tralogy.eu
************ Apologies for Multiple Posting ************
Tralogy II: Human and Machine Translation. The quest for meaning: where are our weak points and what do we need?
Dates and venue of the Conference: January 17-18, 2013 - CNRS Headquarters Auditorium, Paris (France)
****** Submission Deadline extended to October 15, 2012 ******
http://www.tralogy.eu
The conclusions of the first Tralogy Conference (3-4 March 2011 at the CNRS in Paris) were clear: none of the specialist branches of the language industry can individually hope to offer all the intellectual and professional tools needed to function effectively in the sector. They all need each other: translation has always been interdisciplinary and the translation profession even more so.
Accordingly, on the occasion of the second Tralogy Conference, we would like to ask each of our prospective participants not only to present specific contributions from their specialist fields and research into the question of meaning, but also, and in particular, to highlight the limits they face in their specialist fields and research within the wider context of the potential applications of their work. What we would like to find out by the end of Tralogy II is what each of us does not know how to do. We are therefore hoping that, as we map out our respective weak points, these will coincide with the points of contact made at the Conference and with the areas in which there is room for improvement. We will therefore give priority to concise presentations (the published articles will of course be longer) in order to leave time for discussions. And the key question that emerged from Tralogy I will remain at the heart of this analysis: how to measure the quality of a translation with regard to its use. Canada was the country invited to participate in Tralogy I. This time we would like to honour languages that are very much alive but with lower numbers of users. We have therefore decided to organise this conference under the joint patronage of the Baltic States, Member States of the European Union: Estonia, Latvia and Lithuania. Call for papers: http://www.tralogy.eu/spip.php?article55&lang=en To submit a paper: http://www.tralogy.eu/spip.php?article10&lang=en
--------- Tralogy returns: http://www.tralogy.eu Tralogy II: Finding meaning: where are our respective gaps and needs? Dates and venue of the Conference: January 17-18, 2013, conference room of the CNRS headquarters, Paris (France) The first edition of the Tralogy conference (3-4 March 2011 in the main auditorium of the CNRS in Paris) ended on an obvious conclusion: none of the specialties involved in the language professions can, on its own, provide the intellectual and professional keys needed to work effectively in them. Each needs the others: translation has always been interdisciplinary, and the translation professions even more so. This is why, this time, we wish to ask each of our prospective speakers not only to present the specific contributions of their specialty and research to the question of meaning, but also, and above all, to highlight the limits that this specialty and this research run up against in the broader context of the intended applications. What we aim to know by the end of Tralogy II is what each of us does not know how to do. We are thus betting that our points of contact and our room for improvement coincide with the map of our respective weak points. To that end, we intend to favour concise presentations (the published papers will of course be longer) in order to leave time for discussion. And we keep at the heart of this analysis the question that emerged as essential during Tralogy I: how to measure the quality of a translation with regard to its use. Canada was the guest country for Tralogy I. This time we wish to honour languages that are very much alive but have few users. This is why we have decided to organise this conference under the joint patronage of the Baltic States, members of the European Union: Estonia, Latvia and Lithuania. Call for contributions: http://www.tralogy.eu/spip.php?article56&lang=fr To propose a contribution: http://www.tralogy.eu/spip.php?article10&lang=fr
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:01:22 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:01:22 +0200 Subject: Job: Post-doc at LIMSI-CNRS, Orsay, France Message-ID: Date: Wed, 03 Oct 2012 09:20:40 +0200 From: Xavier Tannier Message-ID: <506BE748.9060207 at limsi.fr> X-url: http://perso.limsi.fr/Individu/xtannier/fr/Stages/post_doc_2012_chronolines.html X-url: http://www.chronolines.fr Post-doctoral position: Event-based multi-document summarization for building timelines http://perso.limsi.fr/Individu/xtannier/fr/Stages/post_doc_2012_chronolines.html Keywords /information extraction, natural language processing, temporal analysis, events, timelines/ Location LIMSI-CNRS, Orsay (Paris), France. Duration 1 year Context Among other objectives, the nationally funded project Chronolines http://www.chronolines.fr aims at creating semi-automatic timelines from a query, based on a collection of newswire papers. Given a user-defined topic and a set of texts, the task consists in *extracting the most important events* concerning the topic and presenting them to the user for validation. The ideal output would then be a set of brief descriptions of events, together with the dates of these events. 
Work on this project has already resulted in a few publications, including a paper at ACL 2012 on /salient date extraction/, which the candidate can refer to for more details [1] http://aclweb.org/anthology-new/P/P12/P12-1077.pdf. The candidate would be integrated into this project, working in the project team on some of the following issues: * *Aggregation/Summarization*: how to choose/generate a brief description of each event, from a set of relevant sentences. * *Evaluation*: what metrics, what methodology for objective evaluation. * *Granularity*: as the time unit for our salient date algorithm is the day, how to decide that several topic-related important events occurred on the same day or, conversely, that an important event lasted more than one day. * *Relationship*: how to use the big collection of articles to extract some relationship between events. Required skills The candidate should hold a PhD in Natural Language Processing and/or Information Retrieval, and be able to: * Work with texts (interest in linguistic issues and how to deal with them) * Work with /a lot/ of texts (good programming skills, big corpora management, information aggregation, ability to forget about linguistic issues when we need to) * Learn from (imperfect) references (ability to observe and generalize, machine learning skills) * Work with tools used and built by the team (in Linux, Java, Perl...) Contacts: Xavier.Tannier[at]limsi.fr Veronique.Moriceau[at]limsi.fr Reference: [1] Rémy Kessler, Xavier Tannier, Caroline Hagège, Véronique Moriceau, André Bittar. *Finding Salient Dates for Building Thematic Timelines. http://aclweb.org/anthology-new/P/P12/P12-1077.pdf* In /Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012)/. Jeju Island, Republic of Korea, July 2012. © Association for Computational Linguistics. 
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:08:04 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:08:04 +0200 Subject: Seminaire: Alban Lemasson, 12 octobre 2012, BLRI, Marseille Message-ID: Date: Wed, 3 Oct 2012 16:33:56 +0200 From: Nadéra Bureau Message-ID: <006101cda174$1e8df330$5ba9d990$@bureau at lpl-aix.fr> Brain & Language Research Institute Friday, October 12, 2012, 11:00 Amphi Fabry, Building 5, 3 place Victor Hugo, Marseille (Labex BLRI) Alban LEMASSON (Université de Rennes 1, Institut universitaire de France) Rudiments of language in non-human primates? Abstract The vocal communication of non-human primates was long considered to be determined solely by genetics and emotion, which encouraged theorists of the origin of human language to look for its precursors elsewhere, notably in the gestures of great apes. However, studies carried out over the past ten years, particularly on the calls of forest guenons, demonstrate parallels with several fundamental characteristics of language (e.g. semantics, affixation, syntax, prosody, conversation, vocal accommodation and convergence). The differences between human language and the vocal communication of monkeys, which are comparable social acts, would therefore be quantitative rather than qualitative. 
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:10:58 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:10:58 +0200 Subject: Seminaire: Evelina Fedorenko et Ted Gibson, 19 octobre 2012, BLRI, Marseille Message-ID: Date: Wed, 3 Oct 2012 17:01:09 +0200 From: Nadéra Bureau Message-ID: <007f01cda177$ec1e47c0$c45ad740$@bureau at lpl-aix.fr> Brain & Language Research Institute Friday, October 19, 2012, 11:00 Salle des Voûtes, Fédération de Recherche 3C (Comportement, Cerveau, Cognition), 3 place Victor Hugo, Marseille (Labex BLRI) Evelina FEDORENKO (MIT) Abstract What cognitive and neural mechanisms do we use to understand language? Since Broca's and Wernicke's seminal discoveries in the 19th century, a broad array of brain regions have been implicated in linguistic processing spanning frontal, temporal and parietal lobes, both hemispheres, and subcortical and cerebellar structures. However, characterizing the precise contribution of these different structures to linguistic processing has proven challenging. In this talk I will argue that high-level linguistic processing - including understanding individual word meanings and combining them into more complex structures/meanings - is accomplished by the joint engagement of two functionally and computationally distinct brain systems. The first comprises the classic "language regions" 
on the lateral surfaces of left frontal and temporal lobes that appear to be functionally specialized for linguistic processing (e.g., Fedorenko et al., 2011; Monti et al., 2009, 2012). And the second is the fronto-parietal "multiple demand" network, a set of regions that are engaged across a wide range of cognitive demands (e.g., Duncan, 2001, 2010). Most past neuroimaging work on language processing has not explicitly distinguished between these two systems, especially in the frontal lobes, where subsets of each system reside side by side within the region referred to as "Broca's area" (Fedorenko et al., in press). Using methods which surpass traditional neuroimaging methods in sensitivity and functional resolution (Fedorenko et al., 2010; Nieto-Castañón & Fedorenko, in press; Saxe et al., 2006), we are beginning to characterize the important roles played by both domain-specific and domain-general brain regions in linguistic processing. ------------------------------------------------------------------------ Friday, October 19, 2012, 16:00 Salle des Voûtes, Fédération de Recherche 3C (Comportement, Cerveau, Cognition), 3 place Victor Hugo, Marseille (Labex BLRI) Ted GIBSON (MIT) The communicative basis of word order Abstract Some recent evidence suggests that subject-object-verb (SOV) may be the default word order for human language. For example, SOV is the preferred word order in a task where participants gesture event meanings (Goldin-Meadow et al. 2008). Critically, SOV gesture production occurs not only for speakers of SOV languages, but also for speakers of SVO languages, such as English, Chinese, Spanish (Goldin-Meadow et al. 2008) and Italian (Langus & Nespor, 2010). The gesture-production task therefore plausibly reflects default word order independent of native language. However, this leaves open the question of why there are so many SVO languages (41.2% of languages; Dryer, 2005). 
We propose that the high percentage of SVO languages cross-linguistically is due to communication pressures over a noisy channel (Jelinek, 1975; Brill & Moore, 2000; Levy et al. 2009). In particular, we propose that people understand that the subject will tend to be produced before the object (a near universal cross-linguistically; Greenberg, 1963). Given this bias, people will produce SOV word order (the word order that Goldin-Meadow et al. show is the default) when there are cues in the input that tell the comprehender who the subject and the object are. But when the roles of the event participants are not disambiguated by the verb, then the noisy channel model predicts either (i) a shift to the SVO word order, in order to minimize the confusion between SOV and OSV, which are minimally different; or (ii) the invention of case marking, which can also disambiguate the roles of the event participants. We test the predictions of this hypothesis and provide support for it using gesture experiments in English, Japanese and Korean. We also provide evidence for the noisy channel model in language understanding in English. 
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:13:41 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:13:41 +0200 Subject: Ecole: Stage intensif NooJ, 21 – 25 janvier 2013, INALCO Message-ID: Date: Thu, 4 Oct 2012 11:13:10 +0200 From: Max Silberztein Message-Id: <8B86BED8-CD68-4148-A6DC-7B7587EFEF6F at gmail.com> NooJ intensive workshop at INALCO, January 21-25, 2013, 65 rue des Grands Moulins, 75013 Paris NooJ is a development environment used to formalize eight levels of linguistic phenomena: spelling and typography, inflectional and derivational morphology, local and structural syntax, transformational grammar and semantic analysis. NooJ relies on formalisms suited to each type of phenomenon (regular, context-free, context-sensitive and unrestricted grammars) and on a sophisticated annotation structure that allows the parsers for the various linguistic levels to communicate with one another, and it offers numerous tools to support the development of wide-coverage resources from a descriptive-linguistics perspective. Today, linguistic resource modules are available for about twenty languages. NooJ is used by linguists to describe languages, by social science researchers to carry out corpus analyses from a historical, literary, sociological or psychological perspective, and also by companies to extract and annotate information in technical texts. NooJ is free, runs on Windows, Mac OS X, Linux and Unix, and will soon be available as open source (see www.nooj4nlp.net) thanks to the support of the European project Metanet-Cesar. The workshop is aimed particularly at Master's students, doctoral students and researchers interested in descriptive linguistics, corpus linguistics and automatic text analysis from a humanities perspective. The workshop lasts one week: mornings are devoted to lectures and tutorials; in the afternoons, researchers will present various applications of NooJ. Registration is free but mandatory. Please note: places are limited to a maximum of 50 participants. Master's students who are able and wish to have the workshop validated by their department must submit an assignment at the end of the workshop. Each participant must bring a laptop with NooJ already installed. 
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:16:57 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:16:57 +0200 Subject: Appel: ACL 2013 Message-ID: Date: Tue, 2 Oct 2012 22:20:25 +0100 From: Anna Korhonen Message-ID: X-url: http://acl2013.org/ ACL 2013 CALL FOR PAPERS The 51st Annual Meeting of the Association for Computational Linguistics Sofia, Bulgaria, August 4-9 http://acl2013.org/ The Association for Computational Linguistics is pleased to announce that its 2013 Annual Meeting will take place in Sofia, Bulgaria, on August 4th to 9th. The conference invites the submission of long and short papers on substantial, original, and unpublished research in all aspects of automated language processing, as discussed below. As already done last year, ACL 2013 will accept papers accompanied by the resource (software or data) described in the paper. In addition to the regular review of the research quality of the paper, these papers will also be reviewed for the quality of the resource that is being made available. Papers that are submitted with accompanying software/data will receive additional credit toward the overall evaluation score, and acceptance or rejection decision will be made based on the quality of both the research and the software/data component. 
In addition, this year there will be an important novelty: some of the presentations at the conference will be of papers accepted for the new Transactions of the ACL journal (http://www.transacl.org/). Topics Relevant topics for the conference include, but are not limited to, the following areas (in alphabetical order): Cognitive modelling of language processing and psycholinguistics Dialogue and interactive systems Discourse, coreference and pragmatics Evaluation methods Information retrieval Language resources Lexical semantics and ontologies Low resource language processing Machine translation: methods, applications and evaluation Multilinguality in NLP NLP applications NLP and creativity NLP for the languages of Central and Eastern Europe and the Balkans NLP for the Web and social media Question answering Semantics Sentiment analysis, opinion mining and text classification Spoken language processing Statistical and Machine Learning methods in NLP Summarization and generation Syntax and parsing Tagging and chunking Text mining and information extraction Word segmentation Submissions Long papers: ACL 2013 submissions must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Submissions will be judged on appropriateness, clarity, originality/innovativeness, correctness/ soundness, meaningful comparison, thoroughness, significance, contributions to research resources, and replicability. Each submission will be reviewed by at least three program committee members. Long papers may consist of up to eight (8) pages of content, plus two extra pages for references; final versions should take into account reviewers' comments. Papers will be presented orally or as posters as determined by the program committee. Decisions on presentation format will be based on the nature rather than the quality of the work. 
There will be no distinction in the proceedings between long papers presented orally and as posters. The long paper deadline is: Wednesday February 20th, 2013 Short papers: ACL 2013 also solicits short papers. Short paper submissions must describe original and unpublished work. Characteristics of short papers include: - A small, focused contribution - Work in progress - A negative result - An opinion piece - An interesting application nugget Short papers will be presented in one or more oral or poster sessions, and will be given four (4) pages including references in the proceedings. While short papers will be distinguished from long papers in the proceedings, there will be no distinction in the proceedings between short papers presented orally and posters. Each short paper submission will be reviewed by at least two program committee members. The deadline for short papers is Sunday April 14th, 2013 Electronic Submission: Submission is electronic, using the Softconf submission software (URL to be announced in subsequent versions of this call) Format: Long paper submissions should follow the two-column format of ACL 2013 proceedings without exceeding eight (8) pages of content plus two extra pages for references. Short paper submissions should also follow the two- column format of ACL 2013 proceedings, and should not exceed four (4) pages including references. We strongly recommend the use of ACL LaTeX style files or Microsoft Word style files tailored for this year's conference. Submissions must conform to the official style guidelines, which are contained in the style files, and they must be in PDF. As the reviewing will be blind, papers must not include authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ..." must be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ..." 
Papers that do not conform to these requirements will be rejected without review. In addition, please do not post your submissions on the web until after the review process is complete. Multiple-submission policy: Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of papers accepted for presentation at ACL 2013 must notify the program chairs by April 21st as to whether the paper will be presented. All accepted papers must be presented at the conference to appear in the proceedings. We will not accept for publication or presentation papers that overlap significantly in content or results with papers that will be (or have been) published elsewhere. Authors submitting more than one paper to ACL must ensure that submissions do not overlap significantly (> 50%) with each other in content or results. Important Dates Long paper submission deadline: Wednesday, February 20th Long paper author responses: Friday March 29th Long paper acceptance notification: Sunday April 7th Short paper submission deadline: Sunday, April 14th Long paper camera ready: Monday May 6th Short paper acceptance notification: Sunday May 12th Short paper camera ready: Wednesday May 22nd Conference: August 4th-9th Program Co-Chairs Pascale Fung, The Hong Kong University of Science and Technology Massimo Poesio, University of Essex From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:19:45 2012 From: thierry.hamon at 
UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:19:45 +0200 Subject: Appel: Workshop proposals, NAACL-HLT 2013, ACL 2013, EMNLP 2013 Message-ID: Date: Thu, 4 Oct 2012 20:45:43 +0100 From: Anna Korhonen Message-ID: X-url: http://naacl.org/ X-url: http://www.acl2013.org/ CALL FOR WORKSHOP PROPOSALS NAACL-HLT 2013 & ACL 2013 & EMNLP 2013 The North American Chapter of the Association for Computational Linguistics (NAACL), The Association for Computational Linguistics (ACL), and ACL SIGDAT invite proposals for workshops to be held in conjunction with the NAACL-HLT, ACL, or EMNLP conferences in 2013. We solicit proposals on any topic of interest to the ACL communities. Workshops will be held at one of the following conference venues: NAACL-HLT 2013 is the 14th Annual Meeting of the North American Chapter of the Association for Computational Linguistics. It will be held in Atlanta, GA, USA, June 9 - 14, 2013. The dates for the NAACL-HLT workshops will be June 13 - 14. The webpage for NAACL-HLT 2013 is: http://naacl.org/. ACL 2013 is the 51st Annual Meeting of the Association for Computational Linguistics (ACL). It will be held in Sofia, Bulgaria, August 4 - 9, 2013. The ACL workshops will be held August 8 - 9. The webpage for ACL 2013 is: http://www.acl2013.org/. EMNLP 2013 is SIGDAT's annual Conference on Empirical Methods in Natural Language Processing. It will be held in Seattle, WA, USA, in October 2013. The exact dates and venue are to be determined. One day of workshops are planned after a 3-day main conference, although proposals for a longer associated event will be considered. Proposals will be reviewed jointly by the workshop organizers for the conferences. ------------------------------------------------------------------------ SUBMISSION INFORMATION Proposals for workshops should contain: 1. A title and brief (2-page max) description of the workshop topic and content. 2. 
The desired workshop length (one or two days) and an estimate of the number of attendees. 3. The names, postal addresses, phone numbers, and email addresses of the organizers, with one-paragraph statements of their research interests and areas of expertise. 4. A list of potential members of the program committee, with an indication of which members have already agreed. 5. A description of any shared tasks associated with the workshop. 6. A description of special requirements for technical needs. 7. A note specifying which venue(s) (NAACL-HLT/ACL/EMNLP) would be acceptable and/or preferable. There will be a single workshop committee, coordinated by the workshop chairs. This single committee will review the quality of the workshop proposals. Once the reviews are complete, the workshop chairs will work together to assign workshops to all three of the conferences, taking into account the location preferences given by the proposers. The ACL has a set of policies on workshops. You can find the ACL's general policies on workshops at http://www.cis.udel.edu/~carberry/ACL/Workshops/workshop-support-general-policy.html, the financial policy for workshops at http://www.cis.udel.edu/~carberry/ACL/Workshops/workshop-conf-financial-policy.html, and the financial policy for SIG workshops at http://www.cis.udel.edu/~carberry/ACL/Workshops/workshops-Sig-financial-policy.html. Please submit proposals in plain text in the body of an email to the workshop organizers (naacl-acl-workshops-2013 at googlegroups.com) no later than November 30, 2012, 23:59:59 UTC/GMT. Notification of acceptance of workshop proposals will occur no later than December 14, 2012. Organizers of accepted proposals will be responsible for publicizing and running the workshop, including reviewing submissions, producing the camera ready workshop proceedings, and organizing the meeting days. It is crucial that organizers commit to all deadlines. 
In particular, failure to produce the camera ready proceedings on time will lead to the exclusion of the workshop from the CD-ROM/USB & unified author indexes. Workshop organizers cannot accept for publication papers that will be (or have been) published elsewhere, although they are free to set their own policies on simultaneous submission and review. Since the conferences will occur at different times, the timescales for the submission and reviewing of workshop papers, and the preparation of camera-ready copies, will be different for each conference. Suggested timescales for each of the conferences are given below. Workshop organizers should not deviate from this schedule unless absolutely necessary.
------------------------------------------------------------------------
TIMELINES FOR 2013 WORKSHOPS
SHARED DATES
Nov 30, 2012 Workshop proposal deadline
Dec 14, 2012 Notification of acceptance
NAACL-HLT 2013
Dec 21, 2012 Proposed 1st workshop CFP
Mar 01, 2013 Proposed paper due date
Mar 29, 2013 Proposed notification of acceptance
Apr 12, 2013 Camera-ready deadline
Jun 13-14, 2013 Workshops
ACL 2013
Jan 24, 2013 Proposed 1st workshop CFP
Apr 26, 2013 Proposed paper due date
May 24, 2013 Proposed notification of acceptance
Jun 7, 2013 Camera-ready deadline
Aug 8-9, 2013 Workshops
EMNLP 2013
Mar 1, 2013 Proposed 1st workshop CFP
Jul 1, 2013 Proposed paper due date
Aug 1, 2013 Proposed notification of acceptance
Sep 1, 2013 Camera-ready deadline
Oct TBD, 2013 Workshops
------------------------------------------------------------------------
WORKSHOP CO-CHAIRS Sujith Ravi, NAACL; Google Inc. 
Luke Zettlemoyer, NAACL; University of Washington Aoife Cahill, ACL; Educational Testing Service Qun Liu, ACL; Dublin City University & Chinese Academy of Sciences For inquiries, send email to the workshop organizers at naacl-acl-workshops-2013 at googlegroups.com From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:20:37 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Fri, 5 Oct 2012 21:20:37 +0200 Subject: Job: Information Extraction, IRISA/INRIA, Rennes, France Message-ID: Date: Fri, 05 Oct 2012 10:45:57 +0200 From: Vincent Claveau Message-ID: <506E9E45.1040704 at irisa.fr> POSTDOCTORAL/ENGINEERING POSITION at IRISA / INRIA Rennes, France Topic: Text-mining and information extraction in multimedia documents Information extraction and text-mining are well known domains of Natural Language Processing. Yet, dealing with low-quality texts, like automatic speech transcription or OCRized overlays, raises new challenges in terms of portability and robustness. In that context, the proposed project aims at developing new text-mining and information extraction approaches to overcome these difficulties. The goal is to rely on simple but robust description of the text and new machine learning techniques and paradigms (CRF, boosting, unsupervised and semi-supervised approaches...). The typical tasks concerned are term and named entity recognition and discovery, (ontological or semantic) relation recognition and discovery... 
The candidate is expected to implement these new approaches and to participate in evaluations and challenges in this field, both for well-formed texts and degraded texts (such as speech transcripts), and may also help in developing new evaluation datasets. This work takes place in the context of the Quaero project, funded by the French National Innovation Agency (www.quaero.org). The work will be performed at IRISA/INRIA Rennes, France (http://www.irisa.fr , http://www.inria.fr/centre/rennes ). The candidate will join the TexMex team, whose main research topics include large-scale multimedia indexing, speech processing and information retrieval.

QUALIFICATIONS AND POSITION
The successful candidate will have an engineering degree or PhD with a track record of research in Information Extraction, Text-Mining or Machine Learning for Natural Language Processing. Fluency in English is mandatory. This position is for 12 months and may begin as early as Nov 1st, 2012, and no later than mid-December. Salary follows INRIA scales and depends on the candidate's experience (the minimum monthly net salary is about 2000 €).
To apply, please send a cover letter describing how the applicant's knowledge and research background will contribute to the project, a CV, and the names and contact information of two referees to: Vincent Claveau (vincent.claveau at irisa.fr)

-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 5 19:23:43 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 5 Oct 2012 21:23:43 +0200
Subject: Appel: JeTou 2013, colloque international jeunes chercheurs en Sciences du Langage
Message-ID:
Date: Fri, 5 Oct 2012 11:46:09 +0200
From: JéTou 2013
Message-ID:
X-url: http://jetou2013.free.fr/

Hello,

We are contacting you about the fourth edition of the Journées d'études Toulousaines (JéTou), which will be held on May 16 and 17, 2013 at Université Toulouse II - Le Mirail (Toulouse, France). This international conference is aimed at young researchers in the Language Sciences, and its theme will be "Variation and variability in the Language Sciences: analyzing, measuring, contextualizing". You will find the call for papers and the conference poster (French and English versions) attached. The submission deadline, initially set for Friday, October 5, 2012, has been extended to *Friday, October 12, 2012* (final deadline). All necessary information is available on the conference website: http://jetou2013.free.fr/.
We thank you in advance for forwarding this information to your contacts.

Best regards,
The JéTou 2013 organizing committee:
Caroline Atallah (CLLE-ERSS)
Guillaume Carbou (LARA-CPST)
Marie-Mandarine Colle-Quesada (Octogone-Lordat)
Claire Del Olmo (Octogone-Lordat)
Marie Lacabanne (Octogone-Lordat)
Marine Lasserre (CLLE-ERSS)
Simon Leva (CLLE-ERSS)
Emilie Massa (Octogone-Lordat)
Cécile Viollain (CLLE-ERSS)

---------------------------------------------------------------------
Dear Sir, Dear Madam,

We are contacting you about the 4th edition of the JéTou (Journées d'Etudes Toulousaines), which will take place on May 16th and 17th, 2013 at Université Toulouse II - Le Mirail (Toulouse, France). This international conference aims to bring together doctoral students and young researchers in the Language Sciences to discuss a specific theme. This year, the theme of the conference is: "Variation and Variability in the Language Sciences: analyzing, measuring, contextualizing". Attached to this email you will find the call for papers as well as the poster for the conference (in both French and English versions). The submission deadline, originally set for Friday, October 5th, 2012, has been extended to *Friday, October 12th, 2012*. All necessary information regarding the event can be found on the conference website: http://jetou2013.free.fr/index-en. We thank you in advance for forwarding this information to your contacts.

Respectfully yours,
The JéTou 2013 organizing committee:
Caroline Atallah (CLLE-ERSS)
Guillaume Carbou (LARA-CPST)
Marie-Mandarine Colle-Quesada (Octogone-Lordat)
Claire Del Olmo (Octogone-Lordat)
Marie Lacabanne (Octogone-Lordat)
Marine Lasserre (CLLE-ERSS)
Simon Leva (CLLE-ERSS)
Emilie Massa (Octogone-Lordat)
Cécile Viollain (CLLE-ERSS)

-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:24:12 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:24:12 +0200
Subject: Appel: Revue TAL, Gestion des erreurs en traitement automatique des langues (extension de la date limite)
Message-ID:
Date: Fri, 05 Oct 2012 22:58:37 +0200
From: Francois Yvon
Message-ID: <506F49FD.3060703 at limsi.fr>
X-url: http://tal-53-3.sciencesconf.org/

FINAL CALL FOR CONTRIBUTIONS

NOISE IN THE SIGNAL: ERROR HANDLING IN NATURAL LANGUAGE PROCESSING

A SPECIAL ISSUE OF THE JOURNAL "TRAITEMENT AUTOMATIQUE DES LANGUES" (TAL)

** Abstract submission deadline: October 15, 2012
** Full paper deadline: October 29, 2012

See [http://tal-53-3.sciencesconf.org/]

The language that natural language processing applications have to handle bears little resemblance to the perfectly grammatical examples found in grammar books. In everyday use, the utterances to be processed come in imperfect form: typed texts contain typing errors as well as spelling and grammar mistakes; spoken utterances often correspond to incomplete sentences and contain disfluencies; the output of OCR systems contains many character confusions, and the output of speech recognition systems contains inaccurate transcriptions of what was actually said. Noise is therefore inherent in language data, and ignoring this reality can only harm the quality of our processing systems.

For some applications, the challenge is to develop mechanisms that are robust to these errors. For example, a dialogue system can use confidence measures on speech recognition hypotheses to decide whether to ask the user to repeat. For other applications, automatic error correction techniques will be needed; for example, an OCR system can post-process texts with contextual correction models to validate the spelling of words.

This special issue aims to bring together contributions on error handling in language processing. Many subfields of NLP need to take into account noise and errors in the linguistic signals they consider, but researchers from these various communities rarely have the opportunity to compare their methods and results. Our ambition is to put work from these different fields in perspective so as to encourage the cross-fertilization of ideas. For this special issue, we therefore consider relevant any work dealing with the automatic processing of noisy data.
The most developed subfields are probably spelling correction and, to a lesser extent, grammar correction; yet neither of these problems is completely solved, and the situation is even less satisfactory when one considers deeper errors, for example those affecting style or discourse organization. Robust processing, which aims to extract as much useful information as possible from potentially erroneous input, will also be favorably considered, whether the input is written or spoken; more generally, studies of error repair strategies, for example in dialogue systems or other similar systems, are also relevant to this issue.

We therefore invite contributions on any aspect of error handling in NLP, and in particular (non-exclusive list):
* automatic spelling and grammar correction
* semantic and logical errors
* correction of errors in style or discourse organization
* correction of "artificial" errors (OCR, speech recognition, etc.)
* automatic correction of search engine queries
* acquisition, annotation and analysis of errors in real texts
* error corpora
* error handling in controlled languages
* errors in language learning
* performance errors
* normalization of non-standard writing
* robust NLP
* processing of disfluent speech
* error handling in speech recognition
* learning from noisy data
* measures of error severity
* confidence measures
* error mining and analysis
* self-evaluation and error diagnosis

GUEST EDITORS
- Robert Dale (Macquarie University, Australia)
- François Yvon (LIMSI/CNRS and Univ. Paris Sud, France)

SCIENTIFIC COMMITTEE
Martine Adda (LPL/CNRS, Paris)
Delphine Bernhard (LiLPa, Université de Strasbourg)
Simon Charest (Druide informatique, Montréal)
Anne Dister (Facultés Universitaires Saint-Louis, Bruxelles)
Yannick Estève (LIUM, Université du Maine, Le Mans)
Thierry Fontenelle (Centre de Traduction des organes de l'Union Européenne, Luxembourg)
Alegria Inaki (University of the Basque Country)
Diana Inkpen (Université d'Ottawa)
Marie-Josée Hamel (Université d'Ottawa)
David Langlois (LORIA, Université de Lorraine, Nancy)
Alessandro Lenci (Università di Pisa)
Ryo Nagata (Konan University, Kobe)
Pierre Nugues (University of Lund)
Joel Tetrault (Educational Testing Service, Princeton)
Martin Raynaert (Tilburg University)
Christoph Ringlstetter (CIS, University of Munich)
Alla Rozovskaya (University of Illinois at Urbana-Champaign)
Benoit Sagot (ALPAGE/INRIA, Paris)
Michel Simard (NRC, Ottawa)
Khaled Shaalan (The British University in Dubai)
Serge Sharroff (University of Leeds)
Eric Werlhi (LATL, Université de Genève)

IMPORTANT DATES
- submission of contributions (abstracts): October 15, 2012
- submission of contributions (full paper): October 29, 2012
- first notification to authors: December 20, 2012
- deadline for revised versions: February 1, 2013
- final decisions: April 15, 2013
- final versions: June 15, 2013
- publication: summer 2013

THE JOURNAL
For 40 years, TAL (Traitement Automatique des Langues) has been an international journal published by ATALA (Association pour le Traitement Automatique des Langues) with the support of CNRS. For some years now it has been an online journal, with paper versions available on request. This does not affect the reviewing and selection process in any way.

PRACTICAL INFORMATION
Papers (about 25 pages, PDF format) must be uploaded to the platform http://tal-53-3.sciencesconf.org/. Style sheets are available on the journal's website (http://www.atala.org/-Revue-TAL). The journal publishes only original contributions, in French or in English.
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:28:41 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:28:41 +0200
Subject: Appel: PAKDD 2013, Deadline: Oct. 08, 2012
Message-ID:
Date: Sun, 7 Oct 2012 22:51:54 +0100
From: CFP PAKDD2013
Message-ID:
X-url: http://pakdd2013.pakdd.org/

[Apologies for multiple copies]
---------------------------------------------------
Call For Papers
PAKDD 2013
The 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Gold Coast, Australia

Conference Website http://pakdd2013.pakdd.org/
Submission System https://cmt.research.microsoft.com/PAKDD2013/

Important Dates
Paper submission due: Oct. 8 (Mon), 2012
Notification to author: Dec. 19 (Wed), 2012
Camera ready due: Jan. 6 (Sun), 2013
*[23:59:59 Pacific Time]
==============================================================
Conference Scope
The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference in the areas of data mining and knowledge discovery (KDD). It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, visualization, and decision-making systems.
The conference calls for research papers reporting original investigation results and industrial papers reporting real data mining applications and system development experience.
==============================================================
Topics
The topics of relevance for the conference papers include, but are not limited to, the following:
* Novel models and algorithms
* Clustering
* Classification
* Ranking
* Association analysis
* Anomaly detection
* Data pre-processing
* Feature extraction and selection
* Mining heterogeneous data
* Mining multi-source data
* Mining sequential data
* Mining spatial and temporal data
* Mining unstructured and semi-structured data
* Mining graph and network data
* Parallel, distributed, and high performance data mining on the cloud platform
* Privacy preserving data mining
* Mining high dimensional data
* Mining uncertain data
* Mining imbalanced data
* Mining dynamic/streaming data
* Statistical methods for data mining
* Visual data mining
* Interactive and online mining
* Mining behavioral data
* Mining multimedia data
* Mining scientific databases
* Ubiquitous knowledge discovery
* Agent-based data mining
* Mining social networks
* Financial data mining
* Fraud and risk analysis
* Security and intrusion detection
* Opinion mining and sentiment analysis
* Post-processing including quality assessment and validation
* Integration of data warehousing, OLAP and data mining
* Human, domain, organizational and social factors in data mining
* Applications to healthcare, bioinformatics, computational chemistry, eco-informatics, marketing, online gaming, etc.

All paper submissions will be handled electronically. Detailed instructions are provided on the conference home page.
==============================================================
Paper Submission
Each submitted paper should include an abstract of up to 200 words. It should also adhere to the double-blind review policy and be no longer than 12 single-spaced pages with 10pt font size.
Authors are strongly encouraged to use Springer LNCS/LNAI manuscript submission guidelines (available at http://www.springer.de/comp/lncs/authors.html) for their initial submissions. All papers must be submitted electronically through Microsoft's Conference Management Service (CMT) in PDF format only. The submitted papers must not have been previously published anywhere, and must not be under consideration by any other conference or journal during the PAKDD review process. Submitting a paper to the conference means that, if the paper is accepted, at least one author will attend the conference to present the paper. For no-show authors, their affiliations will receive a notification. To ensure a fair review process, the program committee chairs are not allowed to submit papers to the conference. All papers will be double-blind reviewed by the Program Committee on the basis of technical quality, relevance to data mining, originality, significance, and clarity. Papers that do not comply with the Submission Guidelines will be rejected without review. The best papers will be selected for inclusion in special issues of Knowledge and Information Systems (KAIS) and the International Journal of Data Mining and Bioinformatics (IJDMB). Before submitting your paper, please carefully read and agree with the PAKDD submission policy and no-show policy: http://pakdd.togaware.com/policy.html
==============================================================
Conference Officers

Honorary Co-chairs
* Jiawei Han, University of Illinois at Urbana-Champaign, USA
* Ramamohanarao Kotagiri, University of Melbourne, Australia
* Graham Williams, Australian Taxation Office, Australia

Conference Co-chairs
* Hiroshi Motoda, AFOSR/AOARD and Osaka University, Japan
* Longbing Cao, University of Technology, Sydney, Australia

Program Committee Co-chairs
* Jian Pei, Simon Fraser University, Canada
* Vincent S. Tseng, National Cheng Kung University, Taiwan

Local Arrangement Co-chairs
* Vladimir Estivill-Castro, Griffith University (Gold Coast), Australia
* Xue Li, University of Queensland, Australia
* Richi Nayak, Queensland University of Technology, Australia
* Xinhua Zhu, University of Technology, Sydney, Australia

Workshop Co-chairs
* Jiuyong Li, University of South Australia, Australia
* Kay Chen Tan, National University of Singapore, Singapore
* Bo Liu, Guangdong University of Technology, China

Tutorial Co-chairs
* Tu Bao Ho, Japan Advanced Institute of Science and Technology, Japan
* Mengjie Zhang, Victoria University of Wellington, New Zealand

Award Chair
* Chengqi Zhang, University of Technology, Sydney, Australia

Sponsorship Co-chair
* Yue Xu, Queensland University of Technology, Australia

Publicity Co-chairs
* P. Krishna Reddy, The International Institute of Information Technology, Hyderabad, India
* Yifeng Zeng, Aalborg University, Denmark
* Xin Wang, University of Calgary, Canada
* Zhihong Deng, Peking University, China
==============================================================
Further Information
For further information, please contact the Program Committee Chairs: pakdd13-program at pakdd.org
General inquiries
* Longbing Cao
University of Technology Sydney, Australia
Email: pakdd13 at pakdd.org
Phone: (61)2-9514-4477
Fax: (61)2-9514-1807

-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:30:12 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:30:12 +0200
Subject: Appel: RECITAL 2013
Message-ID:
Date: Mon, 08 Oct 2012 09:41:03 +0200
From: Florian Boudin
Message-ID: <5072838F.302 at univ-nantes.fr>

RECITAL 2013: First call for papers
---------------------------------------------
RECITAL 2013
15th Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (meeting of student researchers in computer science for natural language processing).
Les Atlantes conference centre, Les Sables d'Olonne (France), June 17-21, 2013.

Important dates
-----------------
- Submission deadline: Friday, March 15, 2013
- Notification to authors: Friday, April 19, 2013
- Final version: Friday, May 10, 2013

Presentation
------------
RECITAL 2013, the annual young researchers' conference associated with TALN, will take place in Les Sables d'Olonne (France) from June 17 to 21, 2013. RECITAL gives young researchers in Natural Language Processing (NLP) the opportunity to present their work and compare their approaches. It is open only to students (master's and doctoral) and to young researchers who obtained their doctorate less than one year ago.
Building on the success of the previous year, we encourage the submission of work even at a preliminary stage, thesis projects, and work from the first months of research (state of the art, initial directions, etc.). The primary objective of RECITAL is to support the work of young NLP researchers and to ease their integration into our community. To this end, we aim for:
- instructive reviews: authors must be able to understand any mistakes they may have made so that they can correct them and improve the quality of their work;
- positive reviews: it is never necessary to discourage a young researcher; the watchwords should be to encourage and to guide;
- direct exchange: reviews will be sent to the authors signed, hence non-anonymous. Authors (or, better still, reviewers) are free to discuss informally with each other during the conference.

A RECITAL best paper prize worth 500 euros will be awarded at the closing ceremony.

Main topics
-----------------
Papers may address the usual NLP topics:
- Analysis and generation in the following areas:
  + Phonetics
  + Phonology
  + Morphology
  + Syntax
  + Semantics
  + Discourse
- Development of linguistic resources for NLP:
  + Databases containing morphological, syntactic, semantic and/or phonological information
  + Grammars
  + Lexicons
  + Ontologies
  + Corpus linguistics
- NLP applications:
  + Sentiment or opinion analysis
  + Automatic categorization or classification
  + Word sense disambiguation
  + Natural-language human-machine dialogue
  + Computer-assisted teaching
  + Automatic indexing
  + Information retrieval and extraction
  + Automatic summarization
  + Anaphora resolution
  + Question answering systems
  + Machine translation
  + Semantic Web
- Approaches:
  + Formal linguistics intended to support automatic processing
  + Symbolic
  + Logical
  + Statistical
  + Based on machine learning

This list is not exhaustive, and the suitability of a submission for the conference will be judged by the program committee.

Selection criteria
---------------------
Authors must be students or young PhD holders who defended their thesis less than one year ago. Papers co-authored with established researchers (which includes thesis supervisors) must be submitted to TALN and not to RECITAL. Authors are invited to submit original research work that has not been previously published. Submissions will be reviewed by at least two specialists in the field. Particular consideration will be given to:
- the soundness of the scientific and technical content
- the positioning of the work in the context of international research
- the organization and clarity of the presentation
- the relevance to the conference topics

Selected papers will be published in the conference proceedings. Depending on the program committee's recommendation, presentations will be given either orally or as posters.

Submission procedure
-----------------------
Papers should be written in French by French speakers, and in English by those who are not proficient in French. Papers must be 8 to 14 pages long. A LaTeX style file, a Word template and a LibreOffice template will be available on the conference website (forthcoming).
Contact: florian.boudin at univ-nantes.fr and loic.barrault at lium.univ-lemans.fr

-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:34:49 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:34:49 +0200
Subject: Job: Developpeur d'applications Web, PROLIPSIA, Besancon, France
Message-ID:
Date: Mon, 08 Oct 2012 16:40:45 +0200
From: blandine.alecu at prolipsia.com
Message-ID: <20121008164045.84869g1ir4ubzx7x at webmail.prolipsia.com>
X-url: http://prolipsia.com/wp-content/uploads/2012/10/AnnonceDeveloppeurProlipsia2012.pdf

WEB APPLICATION DEVELOPER (M/F)
http://prolipsia.com/wp-content/uploads/2012/10/AnnonceDeveloppeurProlipsia2012.pdf

Contract: permanent (CDI)
Status: executive (cadre)
Salary: depending on profile

Prolipsia is a young software publisher specializing in Natural Language Processing (NLP) software for controlled languages. Founded in 2011 and based in Besançon (Temis Innovation), Prolipsia designs, publishes and markets software solutions that assist in designing and writing technical texts, for sectors such as healthcare, private security and industry. As part of our growth, we are looking for a developer (master's level, bac+5), passionate about development and motivated by innovation, to join our R&D team from November 2012.
MISSIONS:
Within the R&D team and in close collaboration with our linguistic engineers, you will work on the design and development of language processing applications. Highly autonomous, you are used to and aware of production constraints, so as to guarantee the best quality of service. Responsive, organized and dynamic, you have the skills and interpersonal qualities to cooperate within a multidisciplinary R&D team and to carry out the following missions:
- Management of the technical phases of projects: functional and technical specifications, planning, writing of technical reports, deployment
- Participation in internal R&D projects: design and development of applications (Web applications in particular), evolution of the applications already developed by Prolipsia
- Interaction with our customers: product follow-up, study and analysis of customer feedback
- Technology watch

To carry out your missions, you may be required to attend professional training courses.

WHY JOIN US?
- Take part in the development of an ambitious and innovative project
- Career development opportunities: your experience at Prolipsia could quickly lead you to head the development team.
- Start-up atmosphere, professionalism, open-mindedness and friendliness
- Flexible hours and schedule, part-time teleworking possible
- Recognition of skills and motivation

PROFILE SOUGHT:
- Significant experience in industry, ideally in project management
- Aptitude and taste for teamwork
- Excellent interpersonal and communication skills
- Creativity, initiative
- Taste for challenge and innovation

SKILLS:
You have mastered:
- the programming languages PHP 5, Javascript, XML, HTML, and AJAX technologies
- OOP (object-oriented programming)
- one or more PHP frameworks
- one or more Javascript frameworks (such as jQuery or ExtJS)
- one or more RDBMS (relational database management systems), including MySQL
- integrating tests into the development process
You are familiar with:
- version control software, such as Subversion
- agile methods
Also appreciated:
- knowledge of Perl, NoSQL
- knowledge of and experience with open-source tools/libraries
- knowledge of Natural Language Processing
- a liking for the French language

APPLICATION:
Please send a cover letter and CV, before October 26, 2012, to Julie RENAHY: julie.renahy at prolipsia.com

-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:37:25 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:37:25 +0200
Subject: Appel: CogALEX-III
Message-ID:
Date: Mon, 08 Oct 2012 23:19:07 +0200
From: Michael Zock
Message-ID: <5073434B.5090508 at lif.univ-mrs.fr>
X-url: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html

Apologies for multiple postings
==============================================================
Time flies: only one MORE week!
==============================================================
3rd and last Call for Papers: CogALex-3 (Cognitive Aspects of the Lexicon), a post-COLING workshop
Deadline for paper submission: October 15, 2012
More details: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html
==============================================================
3rd Workshop on "Cognitive Aspects of the Lexicon" (CogALex)
Post-conference workshop at COLING 2012 (December 15, Mumbai, India)
Invited speaker: Alain Polguère (Université de Lorraine & ATILF CNRS, France)
Submission deadline: October 15, 2012

AIMS and TARGET AUDIENCE
The aim of this workshop is to bring together researchers involved in the construction and application of electronic dictionaries to discuss modifications of existing resources in line with the users' needs, thereby fully exploiting the advantages of the digital form.
Given the breadth of the questions, we welcome reports on work from many perspectives, including but not limited to: computational lexicography, psycholinguistics, cognitive psychology, language learning and ergonomics. MOTIVATION The way we look at dictionaries, their creation and use, has changed dramatically over the past 30 years. (1) While being considered as an appendix to grammar in the past, they have in the meantime moved to centre stage. Indeed, there is hardly any task in NLP which can be conducted without them. (2) Also, many lexicographers work nowadays with huge digital corpora, using language technology to build and to maintain the lexicon. (3) Last, but not least, rather than being static entities (data-base view), dictionaries are now viewed as graphs, whose nodes and links (connection strengths) may change over time. Interestingly, properties concerning topology, clustering and evolution known from other disciplines (society, economy, human brain) also apply to dictionaries: everything is linked, hence accessible, and everything is evolving. Given these similarities, one may wonder what we can learn from these disciplines. In this 3rd edition of the CogALex workshop we therefore intend to also invite scientists working in these fields, our goals being to broaden the picture, i.e. to gain a better understanding concerning the mental lexicon and to integrate these findings into our dictionaries in order to support navigation. Given recent advances in neurosciences, it appears timely to seek inspiration from neuroscientists studying the human brain. There is also a lot to be learned from other fields studying graphs and networks, even if their object of study is something else than language, for example biology, economy or society. TOPICS OF INTEREST This workshop is about possible enhancements of existing electronic dictionaries.
To perform the groundwork for the next generation of electronic dictionaries we invite researchers involved in the building of such dictionaries. The idea is to discuss modifications of existing resources by taking the users' needs and knowledge states into account, and to capitalize on the advantages of the digital media. For this workshop we invite papers including but not limited to the following topics, which can be considered from various points of view: linguistics, neuro- or psycholinguistics (associations, tip-of-the-tongue problem), network-related sciences (complex graphs, network topology, small-world problem), etc.

1) Analysis of the conceptual input of a dictionary user
- What does a language producer start from (bag of words)?
- What is in the authors' minds when they are generating a message and looking for a word?
- What does it take to bridge the gap between this input and the desired output (target word)?

2) The meaning of words
- Lexical representation (holistic, decomposed)
- Meaning representation (concept based, primitives)
- Revelation of hidden information (vector-based approaches: LSA/HAL)
- Neural models, neurosemantics, neurocomputational theories of content representation.

3) Structure of the lexicon
- Discovering structures in the lexicon: formal and semantic point of view (clustering, topical structure)
- Creative ways of getting access to and using word associations
- Evolution, i.e. dynamic aspects of the lexicon (changes of weights)
- Neural models of the mental lexicon (distribution of information concerning words, organisation of the mental lexicon)

4) Methods for crafting dictionaries or indexes
- Manual, automatic or collaborative building of dictionaries and indexes (distributional semantics, crowd-sourcing, serious games, etc.)
- Impact and use of social networks (Facebook, Twitter) for building dictionaries, for organizing and indexing the data (clustering of words), and for tracking navigational strategies, etc.
- (Semi-) automatic induction of the link type (e.g. synonym, hypernym, meronym, association, collocation, ...)
- Use of corpora and patterns (data-mining) for getting access to words, their uses, and combinations (associations)

5) Dictionary access (navigation and search strategies), interface issues
- Semantic-based search
- Search (simple query vs multiple words)
- Context-dependent search (modification of users' goals during search)
- Recovery
- Navigation (frequent navigational patterns or search strategies used by people)
- Interface problems, data-visualisation

IMPORTANT DATES

- Deadline for paper submissions: October 15, 2012
- Notification of acceptance: November 5, 2012
- Camera-ready papers due: November 15, 2012
- Workshop date: December 15, 2012

SUBMISSION INSTRUCTIONS

See: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html

INVITED SPEAKER: Alain Polguère (Université de Lorraine & ATILF CNRS, France)

PROGRAM COMMITTEE

* Barbu, Eduard (Universidad de Jaén, Spain)
* Barrat, Alain (Centre de physique théorique, CNRS & Aix-Marseille University)
* Bilac, Slaven (Google Tokyo, Japan)
* Bel Enguix, Gemma (LIF, Aix-Marseille University, France)
* Bouillon, Pierrette (TIM, Faculty of Translation and Interpreting, Geneva, Switzerland)
* Cook, Paul (The University of Melbourne, Australia)
* Cristea, Dan (University of Iasi, Romania)
* Fairon, Cedrick (CENTAL, Université catholique de Louvain, Belgium)
* Fazly, Afsaneh (University of Toronto, Canada)
* Fellbaum, Christiane (University of Princeton, USA)
* Ferret, Olivier (CEA LIST, Palaiseau, France)
* Fontenelle, Thierry (Translation Centre for the Bodies of the European Union, Luxemburg)
* Granger, Sylviane (Université
Catholique de Louvain, Belgium)
* Grefenstette, Gregory (3DS Exalead, Paris, France)
* Hansen-Schirra, Silvia (University of Mainz, FTSK, Germany)
* Heid, Ulrich (University of Hildesheim, Germany)
* Hirst, Graeme (University of Toronto, Canada)
* Hovy, Ed (ISI, Los Angeles, USA)
* Joyce, Terry (Tama University, Kanagawa-ken, Japan)
* Kwong, Olivia (City University of Hong Kong, China)
* L'Homme, Marie Claude (OLST, University of Montreal, Canada)
* Lapalme, Guy (RALI, University of Montreal, Canada)
* Mititelu, Verginica (RACAI, Bucharest, Romania)
* Pirrelli, Vito (ILC, Pisa, Italy)
* Polguère, Alain (Université de Lorraine & ATILF CNRS, France)
* Rapp, Reinhard (University of Leeds, UK)
* Ruette, Tom (KU Leuven, Belgium)
* Schwab, Didier (LIG, Grenoble, France)
* Serasset, Gilles (IMAG, Grenoble, France)
* Sharoff, Serge (University of Leeds, UK)
* Sinopalnikova, Anna (FIT, BUT, Brno, Czech Republic)
* Sowa, John (VivoMind Research, LLC, USA)
* Tiberius, Carole (Institute for Dutch Lexicology, The Netherlands)
* Tokunaga, Takenobu (TITECH, Tokyo, Japan)
* Tufis, Dan (RACAI, Bucharest, Romania)
* Valitutti, Alessandro (University of Helsinki and HIIT, Finland)
* Vossen, Piek (Vrije Universiteit, Amsterdam, The Netherlands)
* Wehrli, Eric (LATL, University of Geneva, Switzerland)
* Zock, Michael (LIF, CNRS, Aix-Marseille University, France)
* Zweigenbaum, Pierre (LIMSI - CNRS, Orsay & ERTIM - INALCO, Paris, France)

WORKSHOP ORGANIZERS and CONTACT PERSONS

Michael Zock (LIF-CNRS, Marseille, France), michael.zock AT lif.univ-mrs.fr
Reinhard Rapp (University of Leeds, UK), reinhardrapp AT gmx.de

For more details see: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:39:24 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:39:24 +0200
Subject: Ecole: EARIA 2012, Ecole d'Automne en Recherche d'Information et Applications
Message-ID:

Date: Tue, 09 Oct 2012 13:15:54 +0200
From: Brigitte Grau
Message-ID: <5074076A.8080005 at limsi.fr>
X-url: http://www.asso-aria.org/earia2012

There are still places available for registration.

CALL FOR PARTICIPATION
=======================================================================
École d'Automne en Recherche d'Information et Applications (Autumn School in Information Retrieval and Applications)
EARIA 2012
Organized by: ARIA (Association Francophone de Recherche d'Information et Applications), École des Mines de Saint-Étienne and Université Jean Monnet
October 24, 25 and 26, 2012, at the Couvent de La Tourette (Éveux), near Lyon
http://www.asso-aria.org/earia2012
=======================================================================
Objectives

The main objective of EARIA (École d'Automne en Recherche d'Information et Applications) is to train PhD students in the field of Information Retrieval (IR). The courses are organized over 4 half-days (from late morning on Wednesday, October 24 to noon on Friday, October 26) and offer a friendly setting for discussing both the foundations of IR and innovative topics in the field, presented by leading European researchers. EARIA complements the ESSIR school (European Summer School on Information Retrieval), which has been held since 1990, roughly every three years before 2003 and every two years since 2003. EARIA is likewise intended to be held every two years, alternating with ESSIR, and offers a special opportunity for senior researchers in the field and young researchers to meet and talk, helping the latter to better position their research projects. The previous editions of EARIA took place in 2006 in Grenoble, in 2008 in Toulouse, and in 2010 in Saint-Germain-au-Mont-d'Or in Rhône-Alpes, and were very successful.

The school is intended for young researchers from different disciplines, and therefore has two components: a review of the fundamentals of the disciplines related to Information Retrieval, such as formal IR models, their implementation, evaluation methods, natural language processing methods, and machine learning models for IR. Beyond these aspects, fast-growing topics such as social IR through folksonomies, community-based IR, and distributed IR will be presented. Participants are also invited to present their research as a poster during dedicated sessions, in order to encourage exchange between participants and speakers.
=======================================================================
Lecture programme

1. Introduction to the field (Mohand Boughanem, IRIT, Université de Toulouse)
2. IR models (Eric Gaussier, LIG, Université de Grenoble)
3. Software for IR (Michel Beigbeder, École des Mines de Saint-Étienne)
4. Evaluation methods (Jacques Savoy, Université de Neuchâtel)
5. IR and machine learning (Massih Amini, LIG, Université de Grenoble)
6. Basic NLP techniques and their use in question answering and information extraction (Patrice Bellot, LSIS, Université Aix-Marseille)
7. Sentiment detection (Vincent Guigue, LIP6, Univ. Pierre&Marie Curie)
8. Contextual and mobile IR (Lynda Tamine-Lechani, IRIT, Université de Toulouse)
9. Social IR (Maarten de Rijke, Univ. Amsterdam)
=======================================================================
Registration fees are:
- PhD students: 250 euros, faculty members: 350 euros, for registration before September 21, 2012,
- PhD students: 300 euros, faculty members: 400 euros, for registration after September 21, 2012,
- ARIA membership fees are 50 euros for an individual and 100 euros for an organization.

Registration fees include accommodation, five meals and breaks. The registration form is available on the website.
=======================================================================
Scientific committee:
Chair: Brigitte Grau, LIMSI-CNRS and ENSIIE
Michel Beigbeder, École des Mines de Saint-Étienne
Mohand Boughanem, IRIT, Université de Toulouse
Sylvie Calabretto, LIRIS, INSA de Lyon
Éric Gaussier, LIG, Université de Grenoble

Organizing committee:
Chair: Michel Beigbeder, École des Mines de Saint-Étienne
Bissan Audeh, École des Mines de Saint-Étienne
Mathias Géry, LaHC, Université de Saint-Étienne
Philippe Jaillon, École des Mines de Saint-Étienne
Christine Largeron, LaHC, Université de Saint-Étienne
Mihaela Mathieu, École des Mines de Saint-Étienne
Yulian YANG, INSA de Lyon
=======================================================================

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:42:56 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:42:56 +0200
Subject: Job: 2 post-doc positions in multilingual text mining and media monitoring at the JRC (Reminder)
Message-ID:

Date: Tue, 09 Oct 2012 17:20:12 +0200
From: Ralf Steinberger
Message-id: <05cc01cda631$94044ad0$bc0ce070$@jrc.ec.europa.eu>
X-url: http://recruitment.jrc.ec.europa.eu/

REMINDER: Deadline for application is 17 October 2012

Readers on this list may be interested in the following three-year post-doc positions to work at the European Commission's Joint Research Centre (JRC) in Ispra, on Lago Maggiore in Italy.

Code: 2012-IPR-G-30-000-00741 - CAT 30 - ISPRA
Multi-lingual and multi-functional information extraction methods and tools

Code: 2012-IPR-G-30-000-00743 - CAT 30 - ISPRA
Engineering Media Monitoring Software Solutions

Applicants need to hold a Ph.D. or have at least five years of relevant post-graduate experience.
URL with job details: http://recruitment.jrc.ec.europa.eu/ (select IPSC institute)
URL with conditions: http://ec.europa.eu/dgs/jrc/index.cfm?id=4790
Application deadline: 17.10.2012
Duration: 36 months
Type of contract: category 30 grant holder
Action: Open Source Text Information Mining and Analysis (OPTIMA)
Scientific website: http://langtech.jrc.ec.europa.eu
EMM online applications: http://emm.newsbrief.eu/overview.html

Information on the team and its work:

The JRC's Global Security and Crisis Management Unit (GlobeSec) supports the Union's policies to strengthen the EU's resilience to crises and disasters as well as the EU's aim to promote stability and peace through its research in crisis management technologies and in information mining and analysis. The Unit's OPTIMA (Open Source Text Information Mining and Analysis) Action develops innovative solutions for retrieving and extracting information from the Internet, and especially from online news and social media. It serves many Commission Services, EU agencies and some EU Member State authorities. The core of this action is the Europe Media Monitor (EMM). EMM gathers and analyses about 150,000 online news articles per day in 50 languages.

The technologies that have been developed so far in the OPTIMA Action include multilingual tools for the following tasks: event extraction; automatic entity recognition, classification and disambiguation; name variant mapping; co-reference resolution; quotation recognition; opinion mining; multi-document summarisation; document clustering and classification; machine translation; information aggregation, including across languages; and more. Rule-based, Machine Learning and hybrid methods are all being used to achieve these goals. These techniques are already deployed to some extent in several operational applications (see http://emm.newsbrief.eu/overview.html) and part of the work would be in support of these applications.
The on-going research has a strong focus on applicability in a highly multi-lingual environment. The work is very practical and goal-oriented. Hands-on experience with developing tools is thus essential. Research results are expected to be used operationally. The candidate is expected to contribute to scientific publications of the research results.

Ralf Steinberger
European Commission - Joint Research Centre (JRC)
IPSC - GlobeSec - OPTIMA (OPensource Text Information Mining and Analysis)
21027 Ispra (VA), Italy

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 9 20:48:12 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 9 Oct 2012 22:48:12 +0200
Subject: Appel: ACL 2013 Student Research Workshop
Message-ID:

Date: Tue, 9 Oct 2012 19:53:41 +0200
From: "Vecchi, Eva Maria"
Message-ID: <799A1AC9-5CFC-4269-A8B0-976BC123FA2D at unitn.it>
X-url: http://sites.google.com/site/aclsrw2013/

Call for Papers
ACL 2013 Student Research Workshop
5-7 August, 2013, Sofia, Bulgaria
** Submission deadline: Sunday, March 3, 2013 **
http://sites.google.com/site/aclsrw2013/

General Invitation for Submissions

The ACL Student Session provides a venue for student researchers investigating topics in Computational Linguistics and Natural Language Processing to present their research, to meet potential advisors, and to receive feedback from the international research community.
The Student Session's goal is to aid students at multiple stages of their education: from those in the final stages of undergraduate training to those who are preparing their graduate thesis proposal. Towards this goal, we invite papers in two separate categories.

1. Thesis/Research Proposals: This category is appropriate for experienced students who wish to get feedback on their proposal and broader ideas for the field in order to strengthen their final research.

2. Research Papers: Most appropriate for students who are new to academic conferences. Papers in this category can describe completed work or work in progress with preliminary results.

Subject to the availability of established researcher volunteers, each accepted paper will be assigned a mentor, who will provide feedback on the work to the student at the conference. Separately, the committee will do its best to assign pre-submission mentors to students who wish to get feedback before the paper deadline. This service will be available on a first come, first served basis and does not guarantee acceptance into the workshop. Students who wish to take advantage of this opportunity should let the co-chairs know via email no later than Saturday, December 29, 2012 and should submit a paper draft no later than Friday, January 18, 2013.
Topics

Relevant topics for the workshop include, but are not limited to, the following areas (in alphabetical order):
- Cognitive modeling of language processing and psycholinguistics
- Dialogue and interactive systems
- Discourse, coreference and pragmatics
- Evaluation methods
- Information retrieval
- Language resources
- Lexical semantics and ontologies
- Low resource language processing
- Machine translation: methods, applications and evaluation
- Multilinguality in NLP
- NLP applications
- NLP and creativity
- NLP for the languages of Central and Eastern Europe and the Balkans
- NLP for the Web and social media
- Question answering
- Semantics
- Sentiment analysis, opinion mining and text classification
- Spoken language processing
- Statistical and Machine Learning methods in NLP
- Summarization and generation
- Syntax and parsing
- Tagging and chunking
- Text mining and information extraction
- Word segmentation

Submission Requirements

Thesis/Research Proposals may contain previously published work and must include specific research directions. They may also be in the style of a position paper that surveys and critiques existing literature, but must suggest future research directions. Proposals may only have one author, who must be a student.

Research Papers must describe original completed work or work in progress and should clearly indicate directions for future research wherever appropriate. The first author of multi-author papers MUST be a student, though it is not required that additional co-authors be students. Research Papers must not have been presented at any other meeting with publicly available published proceedings. Students who have already presented at a past ACL/EACL/NAACL Student Research Workshop may not be the first author on a Research Paper (though they may still be the first author of a Thesis/Research Proposal). They should instead submit their papers either to the main conference or to the Thesis/Research Proposal track.
Students must indicate whether a paper has been submitted to another conference or workshop.

Electronic Submission

Submission is electronic, using the Softconf submission software (URL to be announced in subsequent versions of this call).

Submission Format

Both paper and proposal submissions to the Student Session should follow the standard two-column format of the ACL 2013 proceedings. Submissions should have no more than six (6) pages excluding references (LaTeX and Microsoft Word style files will be available on the main conference website http://acl2013.org/). Submissions must conform to the official ACL 2013 style guidelines and they must be submitted as a PDF file.

The reviewing process will be double-blind; therefore, please ensure that the paper does not include the authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...". Further guidelines are provided in the template style files.

Multiple-submission policy

Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of papers accepted for presentation at ACL 2013 must notify the program chairs by April 21, 2013 as to whether the paper will be presented. All accepted papers must be presented at the conference in order for them to appear in the proceedings. We will not accept for publication or presentation papers that overlap significantly in content or results with papers that will be (or have been) published elsewhere. Authors submitting more than one paper to ACL must ensure that submissions do not overlap significantly (> 50%) with each other in content or results.
Important Dates

- Pre-submission mentoring service application: December 29, 2012
- Pre-submission mentoring paper deadline: January 18, 2013
- Submission deadline: March 3, 2013
- Notification of acceptance: April 24, 2013
- Camera-ready submission deadline: May 24, 2013
- Conference dates: August 5-7, 2013 (the session will be held during the main conference)

Organising Committee

Student Chairs
- Anik Dey, The Hong Kong University of Science & Technology
- Sebastian Krause, German Research Center for Artificial Intelligence
- Ivelina Nikolova, Bulgarian Academy of Sciences
- Eva Vecchi, Università di Trento

Faculty Advisors
- Steven Bethard, University of Colorado Boulder & KU Leuven
- Preslav I. Nakov, Qatar Computing Research Institute
- Feiyu Xu, German Research Center for Artificial Intelligence

Program Committee
(To be announced)

Contact
acl-srw-2013 at googlegroups.com

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:19:07 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:19:07 +0200
Subject: Job: CDD au TGE Adonis
Message-ID:

Date: Wed, 10 Oct 2012 08:53:06 +0200
From: Jean-Luc Minel
Message-ID: <50751B52.9060108 at u-paris10.fr>
X-url: http://dariah.eu/
X-url: http://www.tge-adonis.
X-url: http://www.rechercheisidore.fr/
X-url: http://www.narcis.nl/

As part of the implementation of DARIAH (http://dariah.eu/), TGE ADONIS (http://www.tge-adonis.fr/) is recruiting, for a 3- or 4-month assignment, a contract CNRS research engineer to act as project manager or expert in application development and deployment.

Mission:

The main objective of the assignment is to study the possibilities for document interoperability and querying between the two platforms ISIDORE (http://www.rechercheisidore.fr/) and Narcis (http://www.narcis.nl/), with the aim of defining one or more ways of querying the two platforms simultaneously and jointly (interconnection).

First, the task is to produce a functional description of the two systems and of their respective processing pipelines, and a description of the APIs available in each of the two systems.

Second, the task is to produce the technical and organizational proposals needed for an interconnection (queries, navigation, interaction between metadata).

Third, the task is to develop several prototypes to illustrate the different interconnection solutions and/or the problems raised by this project.

Skills:
- Good knowledge of Linked Data and Semantic Web concepts (RDF, RDFa, SPARQL);
- Good knowledge of software engineering and command of API concepts;
- Ability to prototype (programming in Java and/or Python, and/or JavaScript, and/or Perl, and/or PHP) in order to build tests and demonstrators;
- Ability to analyse detailed specifications and to carry out functional analyses (command of UML desirable);
- Knowledge of scientific and technical information: Dublin Core, OAI-PMH, microformats, etc.
- Command of spoken and written English. The work requires taking part in working meetings in English and writing documents in English.

Workplace: TGE Adonis, Paris, 5th arrondissement. The assignment will require travel and short stays (8 to 10 days) in The Hague (NL) and in Lyon.

Send a CV to Sophie David (sophie.david at tge-adonis.fr) and Jean-Luc Minel (jean-luc.minel at u-paris10.fr)

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:22:31 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:22:31 +0200
Subject: Appel: FGCS, Special Issue on Intelligent Big Data Processing
Message-ID:

Date: Wed, 10 Oct 2012 15:31:59 +0800
From: cfp at grid.chu.edu.tw
Message-Id: <201210100731.q9A7Vxr4004958 at grid.chu.edu.tw>
X-url: http://www.journals.elsevier.com/future-generation-computer-systems/calls-for-papers/special-issue-on-intelligent-big-data-processing/

Future Generation Computer Systems (http://ees.elsevier.com/fgcs/)
Special Issue on Intelligent Big Data Processing
http://www.journals.elsevier.com/future-generation-computer-systems/calls-for-papers/special-issue-on-intelligent-big-data-processing/

== Overview ==

Nowadays, data comes from sensors, lab experiments, simulations, individual archives, enterprises and the Internet, at all scales and in all formats. This data flood has outpaced our capability to process, analyze, store and understand these datasets. Such rapid expansion is also accelerated by the dramatic increase in acceptance of social media and networking applications.
Furthermore, it can be foreseen that Internet of Things (IoT) applications will raise the scale of data to an unprecedented level. People and devices (from home coffee machines to cars, to buses, railway stations and airports) are all loosely connected. Trillions of such connected components will generate a huge data ocean, and valuable information must be discovered from the data to help improve quality of life and make our world a better place. This special issue intends to tackle such data deluge issues intelligently, efficiently and effectively.

Areas of interest for this special issue include the following topics:
- Intelligent data mining techniques
- Dynamic data redistribution
- Scalable and distributed algorithms
- New programming models for large data
- Locality aware data processing
- NoSQL
- Data filtering techniques for Internet of Things
- DaaS, Data as a Service
- MapReduce in hybrid clouds
- Asynchronous data processing
- Opportunistic data processing in hybrid clouds
- Intelligent storage and load balancing
- Data migration and synchronization (between private and public clouds)
- Multi-tier MapReduce programming model
- Dynamic Mapper/Reducer join/leave
- Data decomposition based on GPU / CPU availability
- Dynamic provisioning for big data processing
- System issues related to large datasets

== Schedule ==

Manuscript due date: December 20, 2012
First round notification: March 1, 2013
Submission due date of revised paper: April 15, 2013
Notification of acceptance: May 15, 2013
Submission of final revised paper: June 10, 2013
Publication: September 2013 (tentative)

== Submission & Review Instruction ==

Submitted articles must not have been previously published or currently submitted for journal publication elsewhere. Submissions must be directly sent via the FGCS submission web site at http://ees.elsevier.com/fgcs/login.asp (please select the track - SS: Intelligent Big Data - Hsu).
Paper submissions must conform to the layout and format guidelines in Future Generation Computer Systems. For a full and complete Guide for Authors, please refer to: http://www.elsevier.com/fgcs

Each submitted paper will be reviewed by at least three Editorial reviewers.

Criteria and Evaluation for acceptance of paper:
- Significance to the journal's audience
- Relevance to this special issue
- Overall recommendation on the paper
- Optional confidential comments to the Editorial Committee
- Quality of the paper, including originality, technical depth, significance of results, adequacy of prior work referenced, overall organization, clarity and readability, satisfactory English writing, sufficient support for assertions and conclusions, an appropriate title, an abstract that adequately summarizes the paper, an introduction that provides proper orientation, and clear tables and figures.

Guest Editors

Ching-Hsien (Robert) Hsu
Department of Computer Science and Information Engineering, Chung Hua University, Taiwan
Email: chh at chu.edu.tw
http://www.chu.edu.tw/~chh

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:24:36 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:24:36 +0200
Subject: Appel: PAKDD 2013, Deadline Further Extended to 15 Oct.
2012
Message-ID:

Date: Wed, 10 Oct 2012 09:38:05 +0100
From: CFP PAKDD2013
Message-ID:
X-url: http://pakdd2013.pakdd.org/

------------------------------------------------------------------------
Due to many requests, the PAKDD2013 organizers have decided to extend the submission deadline to 23:59, October 15, 2012 (PDT).
------------------------------------------------------------------------

Call For Papers

PAKDD 2013
The 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining
Gold Coast, Australia

Conference Website
http://pakdd2013.pakdd.org/

Submission System
https://cmt.research.microsoft.com/PAKDD2013/

Important Dates
Paper submission due: Oct. 15 (Mon), 2012
Notification to author: Dec. 19 (Wed), 2012
Camera ready due: Jan. 6 (Sun), 2013
*[23:59:59 Pacific Time]

==============================================================
Conference Scope

The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference in the areas of data mining and knowledge discovery (KDD). It provides an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all KDD related areas, including data mining, data warehousing, machine learning, artificial intelligence, databases, statistics, knowledge engineering, visualization, and decision-making systems. The conference calls for research papers reporting original investigation results and industrial papers reporting real data mining applications and system development experience.
==============================================================
Topics

Topics of relevance for conference papers include, but are not limited to, the following:
* Novel models and algorithms
* Clustering
* Classification
* Ranking
* Association analysis
* Anomaly detection
* Data pre-processing
* Feature extraction and selection
* Mining heterogeneous data
* Mining multi-source data
* Mining sequential data
* Mining spatial and temporal data
* Mining unstructured and semi-structured data
* Mining graph and network data
* Parallel, distributed, and high-performance data mining on cloud platforms
* Privacy-preserving data mining
* Mining high dimensional data
* Mining uncertain data
* Mining imbalanced data
* Mining dynamic/streaming data
* Statistical methods for data mining
* Visual data mining
* Interactive and online mining
* Mining behavioral data
* Mining multimedia data
* Mining scientific databases
* Ubiquitous knowledge discovery
* Agent-based data mining
* Mining social networks
* Financial data mining
* Fraud and risk analysis
* Security and intrusion detection
* Opinion mining and sentiment analysis
* Post-processing including quality assessment and validation
* Integration of data warehousing, OLAP and data mining
* Human, domain, organizational and social factors in data mining
* Applications to healthcare, bioinformatics, computational chemistry, eco-informatics, marketing, online gaming, etc.

All paper submissions will be handled electronically. Detailed instructions are provided on the conference home page.

==============================================================
Paper Submission

Each submitted paper should include an abstract of up to 200 words. It should also adhere to the double-blind review policy and be no longer than 12 single-spaced pages in 10pt font.
Authors are strongly encouraged to use the Springer LNCS/LNAI manuscript submission guidelines (available at http://www.springer.de/comp/lncs/authors.html) for their initial submissions. All papers must be submitted electronically through Microsoft's Conference Management Service (CMT) in PDF format only.

Submitted papers must not have been previously published anywhere, and must not be under consideration by any other conference or journal during the PAKDD review process. Submitting a paper to the conference means that, if the paper is accepted, at least one author will attend the conference to present it. The affiliations of no-show authors will be notified. To ensure a fair review process, the program committee chairs are not allowed to submit papers to the conference.

All papers will be double-blind reviewed by the Program Committee on the basis of technical quality, relevance to data mining, originality, significance, and clarity. Papers that do not comply with the Submission Guidelines will be rejected without review. The best papers will be selected for inclusion in special issues of Knowledge and Information Systems (KAIS) and the International Journal of Data Mining and Bioinformatics (IJDMB).

Before submitting your paper, please carefully read and agree to the PAKDD submission policy and no-show policy: http://pakdd.togaware.com/policy.html

==============================================================
Conference Officers

Honorary Co-chairs
* Jiawei Han, University of Illinois at Urbana-Champaign, USA
* Ramamohanarao Kotagiri, University of Melbourne, Australia
* Graham Williams, Australian Taxation Office, Australia

Conference Co-chairs
* Hiroshi Motoda, AFOSR/AOARD and Osaka University, Japan
* Longbing Cao, University of Technology, Sydney, Australia

Program Committee Co-chairs
* Jian Pei, Simon Fraser University, Canada
* Vincent S. Tseng, National Cheng Kung University, Taiwan

Local Arrangement Co-chairs
* Vladimir Estivill-Castro, Griffith University (Gold Coast), Australia
* Xue Li, University of Queensland, Australia
* Richi Nayak, Queensland University of Technology, Australia
* Xinhua Zhu, University of Technology, Sydney, Australia

Workshop Co-chairs
* Jiuyong Li, University of South Australia, Australia
* Kay Chen Tan, National University of Singapore, Singapore
* Bo Liu, Guangdong University of Technology, China

Tutorial Co-chairs
* Tu Bao Ho, Japan Advanced Institute of Science and Technology, Japan
* Mengjie Zhang, Victoria University of Wellington, New Zealand

Award Chair
* Chengqi Zhang, University of Technology, Sydney, Australia

Sponsorship Co-chair
* Yue Xu, Queensland University of Technology, Australia

Publicity Co-chairs
* P. Krishna Reddy, International Institute of Information Technology, Hyderabad, India
* Yifeng Zeng, Aalborg University, Denmark
* Xin Wang, University of Calgary, Canada
* Zhihong Deng, Peking University, China

==============================================================
Further Information

For further information, please contact the Program Committee Chairs by email: pakdd13-program at pakdd.org
General inquiries
* Longbing Cao, University of Technology, Sydney, Australia
  Email: pakdd13 at pakdd.org
  Phone: (61)2-9514-4477
  Fax: (61)2-9514-1807

-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:29:03 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:29:03 +0200
Subject: Livre: Recherche d'information contextuelle, assistee et personnalisee
Message-ID:

Date: Wed, 10 Oct 2012 14:47:10 +0200
From: Patrice Bellot
Message-Id: <3E109703-5A05-4DBB-93B2-E86D7B77647F at univ-amu.fr>
X-url: http://www.eyrolles.com/Informatique/Livre/recherche-d-information-contextuelle-assistee-et-personnalisee-9782746225831

Hello,

Below you will find the table of contents and part of the introduction to each chapter of the book "Recherche d'information contextuelle, assistée et personnalisée" (Contextual, Assisted and Personalized Information Retrieval), published in the "Recherche d'information et web" series by Hermès-Lavoisier (302 pages, ISBN-13: 978-2-7462-2583-1):
http://www.eyrolles.com/Informatique/Livre/recherche-d-information-contextuelle-assistee-et-personnalisee-9782746225831

- Context and robustness
  - Contextual information retrieval: the case of queries
  - Robustness and syntactic parsing
  - Information retrieval with noisy corpora and queries
  - Question answering on audio documents
- Personalization and collaboration
  - Information retrieval and user modeling
  - Collaborative information retrieval
  - Reading difficulties, dyslexias and information retrieval
- Assistance and navigation support
  - Navigation in audio documents through automatic summarization
  - Interaction
  - Word prediction and query entry on limited interfaces: mobile devices and disability support

Best regards,
Patrice Bellot
Aix-Marseille Université (AMU) - LSIS / CNRS

========================================================================
Chapter 1: Contextual information retrieval: the case of queries
Josiane MOTHE (IRIT, Toulouse)
========================================================================
Current information retrieval (IR) systems are often general-purpose: they apply the same mechanisms and the same information-processing methods regardless of the search context, the user, the type of information need, and the use the user intends to make of the retrieved information. Contextual IR aims to model the different aspects of context, in all their variety, and to integrate them into the search process. The contextual aspect refers to implicit or explicit knowledge about the user's intentions, the user's environment and the system itself. The hypothesis is that making certain elements of the IR context explicit could improve the performance of IR systems. In this chapter we do not claim to cover every element associated with context; rather, we focus on one aspect of it, namely queries. Queries are the means by which the user explicitly expresses an information need to the system. This aspect of the search context alone has many facets, which we discuss in what follows.

========================================================================
Chapter 2: Robustness and syntactic parsing
Philippe BLACHE and Stephane RAUZY (LPL, Aix-en-Provence)
========================================================================
In natural language processing, the robustness of an application is measured by its ability to withstand errors. These may stem either from a system failure or from a linguistic difficulty inherent in the text or utterance being processed. In either case, a robust system must be able to continue processing despite the error. The question of robustness arises in a particular way in information retrieval [LEW 96, STR 94]. Many IR techniques ultimately exploit little linguistic information and do not really require detailed linguistic analysis. Progress in semantic processing, however, is beginning to rely on analyses that go beyond the lexical level and require more sophisticated techniques able to take into account syntactic units and the relations between them. IR is therefore also affected by this development. Moreover, the field raises specific problems that may call for finer-grained analyses (question understanding, multimodal queries, text comparison, etc.). In IR, as in other areas of language processing, we are thus now confronted with this question of robustness, which requires handling data that are heterogeneous, non-canonical, partial, and so on. In this chapter we address the question by first describing more precisely the situations that lead systems into error. Studying the needs specific to IR will allow us to identify more clearly the points that must be addressed in order to offer robust processing that supports fine-grained linguistic analysis. We concentrate on syntactic parsing, an essential step in deep processing. Parsing was long left aside in such systems, partly because of its cost but also because of its lack of robustness. We present some techniques that meet these needs. In particular, we describe a constraint-based approach that has the advantage of being robust, formally coherent, and able to accommodate future developments, notably the processing of multimodality.

========================================================================
Chapter 3: Information retrieval with noisy corpora and queries
Laurianne SITBON (QUT - Brisbane, Australia)
========================================================================
This chapter is concerned both with approaches to evaluating information retrieval systems that handle noisy corpora or queries and with the techniques proposed in the literature for integrating noise into information access models. In particular, the transition between transcription systems (from audio to text, from handwriting to text, from erroneous text to text) and the core of information retrieval systems must rely on a probabilistic interpretation of the two interconnected systems. Approaches suited to the evaluation and robust modeling of complex information retrieval systems, such as question answering systems, are presented. The volume and variety of accessible information are constantly increasing. The amount of available information encourages the development of increasingly complex and targeted approaches to information retrieval, such as question answering systems (chapter 4), recommender systems (chapter 6) and classification-based systems. The variety of information types reduces the certainty with which systems can interpret the available data, as data move away from standardized textual formats. Most systems nevertheless reduce their input to normalized text before analyzing or indexing data or queries. When the performance of systems evaluated under standardized conditions drops under real conditions, the share of noise in the decline in result quality is not always clearly established. In particular, a major question is what consequences noise in corpora or queries has for information retrieval systems. In this chapter, we examine the evaluations carried out and the solutions proposed for ad hoc information retrieval systems, and then propose evaluation and modeling methodologies suited to complex information systems. Question answering systems serve as an example for the processing of non-standard queries. After an introduction describing the nature of the noise encountered by modern information retrieval systems, the second section presents various analyses of the impact of noise on system effectiveness. The third section proposes a modular approach for analyzing the impact of noisy queries on a question answering system. The fourth section presents the different approaches proposed in the literature for taking probabilistic noise corrections into account.
The final section introduces a correction system for noisy queries as well as a probabilistic approach for complex information retrieval systems such as question answering systems. Finally, a new approach is proposed that sets out the conditions for evaluating transcription systems under uncertain interpretation.

========================================================================
Chapter 4: Question answering on audio documents
Olivier GALIBERT, Sophie ROSSET and Lori LAMEL (LIMSI, Paris Orsay)
========================================================================
The aim of this chapter is to take stock of the problem of precise information retrieval in audio documents. More and more documents and data are spoken and available. Whether radio and television news broadcasts, recordings of seminars or meetings, or podcasts, they are an important source of information. Enabling information retrieval in this type of data appears increasingly necessary. Question answering systems belong to the family of tools that support access to information. In this context, work has been carried out for some years now (since 2007) to enable effective search over this type of data. Question answering systems can be seen as an extension of information retrieval systems, which let a user search for information using keywords. In return, the user obtains a list of documents, or of pointers to documents, which must be consulted to find the precise information sought. Question answering systems, by contrast, aim to let a user ask a question precisely in natural language, in writing or orally, and to obtain a precise answer in return, possibly accompanied by a document or document excerpt that justifies or accompanies the answer. This assumes that question answering systems analyze the question, understand its meaning, analyze the documents and extract the appropriate answer from them.

========================================================================
Chapter 5: Information retrieval and user modeling
Guillaume CABANAC, Max CHEVALIER, Christine JULIEN, Gilles HUBERT, Chantal SOULE-DUPUY (IRIT, Toulouse) & Céline CLAVEL (LIMSI, Paris Orsay) & Alexandra CIACCIA (PPCC, Paris Nanterre) & André TRICOT (CLLE, Toulouse)
========================================================================
This chapter grew out of a joint reflection on the place of the user in the development of computerized information systems, conducted by members of two communities that can offer specific and complementary perspectives (computer science and cognitive ergonomics). Fundamentally, for everyone, a user is a person who, in a given context (professional, personal, ...), needs (or has) to use a computerized system (any piece of software or, in this case, an information retrieval system) to carry out a task with a specific goal. Designing such a system means answering at least the following basic questions: Who is the user? Where is the user? What does the user want to do or look for? How and why? However, in answering these questions and characterizing the user, each of the two communities views the user differently. This chapter is a synthesis of the current state of this reflection on user modeling in the context of information retrieval (IR). It offers general recommendations on taking the users of information retrieval systems (IRS) into account. At the same time, it aims to provide general knowledge useful for ergonomics, that is, knowledge useful for evaluating IRS from a cognitive point of view in order to improve them, or even to improve the design process of these tools. To illustrate how users are taken into account in IRS, section 2 covers the classical approaches to user modeling developed in computer science (and in IRS design), as well as applications of these models. Section 3 presents the results of studies in cognitive ergonomics on the influence of the characteristics of the task, the tool and the user on the use of an IRS. As a synthesis of sections 2 and 3, section 4 discusses the complementarity of the two approaches and their differences in viewpoint. It reviews the limits and challenges of taking the user into account in IR processes, based on observations drawn from the different points of view (computer science and cognitive science).

========================================================================
Chapter 6: Collaborative information retrieval
Nathalie DENOS (LIG, Grenoble)
========================================================================
Information retrieval has a very strong social dimension. We send an interesting reference to a colleague; we choose to watch the most frequently downloaded video first; faced with an information need in an unfamiliar field, we call on a competent person for help in formulating the query; we research a topic together with others in order to prepare a presentation; we rely on an online retailer's recommendations for ideas of books to buy. These are all manifestations of the social nature of information retrieval.
This chapter surveys advances in collaborative information retrieval in all its forms.

========================================================================
Chapter 7: Reading difficulties, dyslexias and information retrieval
Patrice BELLOT (LSIS, Marseille)
========================================================================
While there is a large body of work on taking context into account in information retrieval (see chapter 1) and on personalization (see chapter 5), major gaps remain regarding adaptation to users with limited reading abilities. These may be people with language disorders (for example, dyslexia that makes reading slow and complex), but also people who have insufficient command of the language of the document being consulted, or who face content that demands too much expertise to understand. Personalizing information retrieval while taking individual reading performance into account is one of the major issues for a society in which access to information increasingly takes place via the Internet, without human mediation that could attenuate differences between individuals. In this chapter, we first look at cognitive models of reading in order to identify the criteria that could best help estimate readability. We then review the main work on automatically estimating the readability of a text and propose a concrete way of exploiting readability within an information retrieval system. Next we define dyslexia, or rather the dyslexias, as a subject of study. Although there is an obvious continuum from the illiterate person to the expert reader, which can be reflected by the many reading tests available, we have chosen in this chapter to concentrate on dyslexias. They significantly affect every segment of the population and correspond to a disability for which there is no need to design overly heavy or invasive remediation devices. The proposals in the first sections of the chapter serve as the basis for defining a specific readability measure, which opens up interesting prospects for adapting information retrieval.

========================================================================
Chapter 8: Navigation in audio documents through automatic summarization
Benoit FAVRE (LIF, Marseille)
========================================================================
With the ease of recording and storing audio data, it is becoming urgent to be able to manipulate such data as easily as textual data. The advent of digital music players, for example, has given rise to listening to amateur radio programmes (podcasts) and audiobooks available on demand on the Internet. Even though these documents are often consumed like radio programmes, they are widely archived and there is no solution for retrieving them by content. Only metadata created by their authors provide access to them. In many fields, conversations are recorded and archived. Telephone customer services, for example, analyze the content of conversations between agents and customers after the fact in order to improve their services. In the legal and financial fields, many conversations are recorded to ensure the traceability of decisions. Any work meeting can potentially be recorded so that participants can retrieve something that was said, or so that others can keep up to date with the progress of the topics discussed.
Although the recording and archiving of audio documents are highly developed, there are few means of structuring, indexing and retrieving the information they contain. Navigation in audio documents is a pervasive problem owing to the ephemeral nature of sound. Sound plays back continuously over time, and whereas an object can be identified at a glance, a sound has to be listened to in its entirety to be identified. It seems harder to locate events in time than to use the continuous feedback of vision to locate objects in space. As a result, it is difficult to develop effective interfaces for accessing the content of audio documents. In this chapter, we first review the state of the art in navigation and summarization for audio documents, then describe in detail an experiment demonstrating the usefulness of speech summarization. Two applications are then presented to illustrate better capture of the user's need by means of keywords, and navigation in documents spanning a long period of time.

========================================================================
Chapter 9: Interaction
Mountaz HASCOET (LIRMM, Montpellier)
========================================================================
The rapid exploration of sets of unknown information, bringing out relations, structures, similarities, repetitions or differences within that information, can be approached through different interaction models. Interaction makes it possible to truly exploit precomputed overviews, because human beings are particularly adept at extracting information from an environment they can act upon, as opposed to one they can only observe passively.
According to the ecological approach to perception due to the psychologist Gibson [GIB 79], perception is inseparable from action: one must act in order to perceive and perceive in order to act. This is referred to as action-perception coupling (or the action-perception loop). Moreover, perceiving our environment consists in extracting invariants from perceived flows (such as the visual flow). For example, when we move, the direction of movement is given by the only motionless point in the visual flow. Through interaction with the data, users can act on what they perceive and, by extracting invariants, better understand the nature of the data or of the process that represents them. We begin with a brief overview of the analysis of interaction in the field of information retrieval and exploitation, and continue with a presentation of the interaction styles in use, from the most classical approaches to the most innovative: faceted interaction, dynamic filtering, brushing, zoomable interfaces, deformable interfaces and, finally, distributed interaction.

========================================================================
Chapter 10: Word prediction and query entry on limited interfaces: mobile devices and disability support
Jean-Yves ANTOINE (LI, Tours)
========================================================================
The Internet revolution is barely behind us, and already a new era is emerging just as swiftly: that of mobile and ubiquitous computing. In contrast to office or home computing, ubiquitous (or ambient) computing involves multiple systems at any time and in any place of everyday life. Information retrieval is directly affected by this development.
One of the most widespread uses of smartphones (first among them the iPhone) is searching for information or a service on the Web. If this search is initiated by a query made up of keywords or a natural-language utterance, we face a broader problem: text entry on a limited interface. By this we mean that the user does not have a standard keyboard because of the small size of the device: it may be, for example, a telephone keypad with a reduced number of keys, or a virtual keyboard displayed on a touch screen. In all cases, the speed of message composition is slowed by the limited nature of the input device. An increase in typing errors is also often observed. Language engineering can offer tools capable of compensating for these shortcomings. This is particularly true of linguistic prediction, the subject of this chapter: if the system can correctly predict the next letters or words the user wishes to enter, selecting the corresponding hypotheses speeds up message composition and avoids some errors. We first set out the problem of message entry assistance by describing the different input devices that can be used in these mobile settings. This study shows the importance of linguistic prediction for assisting message composition. We then present different prediction models in detail, with particular emphasis on the most advanced techniques for contextual adaptation of prediction. Our discussion draws on experimental evaluation results in order to assess the value of each technique studied.
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:32:46 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:32:46 +0200
Subject: Journee: Journee d'etude S'caladis et TAL, Structures enumeratives dans le discours, 8 novembre 2012, Toulouse
Message-ID:

Date: Thu, 11 Oct 2012 16:50:59 +0200
From: Josette Rebeyrolle
Message-ID: <5076DCD3.1090202 at univ-tlse2.fr>

On Thursday 8 November 2012, a study day open to all will be held at the Université de Toulouse-Le Mirail:

Structures énumératives dans le discours (Enumerative structures in discourse)
(organized by the S'caladis and TAL research axes of CLLE-ERSS, UMR 5263)

Venue: Université de Toulouse-Le Mirail, Maison de la Recherche, room D155

Programme:

9:30-10:00: M.-P. Péry-Woodley (CLLE-ERSS, CNRS & UTM)
Why take an interest in enumerative structures?

10:00-10:30: L. Tanguy and L.-M. Ho-Dac (CLLE-ERSS, CNRS & UTM)
Identifying complex markers of multi-scale structures

10:30-11:00: J. Rebeyrolle (CLLE-ERSS, CNRS & UTM)
Using the ANNODIS resource: the case of closures of enumerative structures

11:20-11:50: M. Vergez-Couret and M. Bras (CLLE-ERSS, CNRS & UTM)
Enumerative structures in SDRT

11:50-12:20: L. A. Johnsen (Universities of Neuchâtel and Fribourg)
The phrase "tout ça" at the end of a list in spoken language: between referential marker and discourse marker

14:00-15:00: C. Schnedecker (LiLPa, Université de Strasbourg)
Ordinal markers as a strong cue for enumerative structures

15:30-15:50: M. Bras (CLLE-ERSS, CNRS & UTM) and C. Schnedecker (LiLPa, Université de Strasbourg)
"Dans un premier temps" / "en premier lieu": markers of enumerative structures?

15:50-17:30: L.-M. Ho-Dac, M.-P. Péry-Woodley and J. Rebeyrolle (CLLE-ERSS, CNRS & UTM)
Workshop: presentation of the ANNODIS resource

Myriam Bras, Marie-Paule Péry-Woodley, Josette Rebeyrolle

-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:34:25 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:34:25 +0200
Subject: Appel: WIMS'13, Call for Papers & Proposals
Message-ID:

Date: Thu, 11 Oct 2012 17:04:02 +0200
From: Plantié Michel
Message-ID: <5076DFE2.4040000 at mines-ales.fr>
X-url: http://aida.ii.uam.es/wims13/

(Apologies for cross-posting!)

CALL FOR PAPERS

International Conference on Web Intelligence, Mining and Semantics (WIMS'13)
June 12-14, 2013
Madrid, Spain
http://aida.ii.uam.es/wims13/

About WIMS'13:

The 3rd International Conference on Web Intelligence, Mining and Semantics (WIMS'13) will be organised under the auspices of the Autonomous University of Madrid, Spain. The WIMS series of conferences is concerned with intelligent approaches to transform the World Wide Web into a global reasoning and semantics-driven computing machine.
The conference will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Web intelligence, Web mining and Web semantics.

The purpose of WIMS'13 is:
- To provide a forum for established researchers and practitioners to present past and current research contributing to the state of the art of Web technology research and applications.
- To give doctoral students an opportunity to present their research to a friendly and knowledgeable audience and receive valuable feedback.
- To provide an informal social event where Web technology researchers and practitioners can meet.

Conference Venue:

The conference will be hosted by the Autonomous University of Madrid.

Call for Papers/Tutorials/Posters/Workshop:

Authors are invited to submit full papers, tutorial proposals and posters on all related areas. Papers exploring new directions or areas will receive a thorough and encouraging review.

Areas of interest include, but are not limited to:

Semantics-driven information retrieval
Semantic agent systems
Semantic data search
Collective Intelligence
Social Networking and Semantic Technologies
Interaction paradigms for semantic search
Evaluation of semantic search
User interfaces
Web mining
Ubiquitous computing
Bio-inspired Models & the Web
Large Scale Data Mining
Semantic deep Web and intelligent e-Technology
Representation techniques for Web-based knowledge
Quality of Life Technology for Web Document Access
Rule markup languages and systems
Semantic 3D media and content
Scalability vs. expressivity of reasoning on the Web

The detailed call for contributed papers, tutorial/workshop proposals, and posters can be found at: http://aida.ii.uam.es/wims13/cfp.php

How to submit:

Length limits:
- research papers: at most 12 pages in ACM format
- tutorial/demonstration papers: 3 to 12 pages in ACM format
- posters: at most 2 pages in ACM format

Please note that the submission format is MS Word or PDF.
The papers must be written in English and formatted according to the ACM guidelines. Author instructions and style files can be downloaded at http://www.acm.org/sigs/publications/proceedings-templates

Authors of accepted papers are expected to attend the conference and present their work.

Tutorial/demonstration proposals, poster papers and full research paper submissions must be made electronically in MS Word or PDF format through the EasyChair submission system at https://www.easychair.org/conferences/?conf=wims13

Publication:

Accepted papers/tutorials/posters will be published by ACM and disseminated through the ACM Digital Library. Selected extended papers will be invited to appear in special issues of reputed journals in the field and also in a book published by Elsevier.

Important Dates

Electronic submission of research papers: December 23, 2012
Electronic submission of poster papers: December 23, 2012
Tutorial and Workshop proposals due: December 23, 2012
Notification of workshop acceptance: January 10, 2013
Notification of tutorial acceptance: January 10, 2013
Notification of paper/poster acceptance: February 17, 2013
Registration opens: February 18, 2013
Camera-ready of accepted papers/tutorials: March 4, 2013
Deadline for paper submissions for workshops: February 7, 2013
Acceptance of papers for workshops: March 10, 2013
Camera-ready workshop papers: March 30, 2013
Author registration deadline: March 30, 2013
Conference: June 12-14, 2013

Contact:

David Camacho
Escuela Politécnica Superior
Universidad Autónoma de Madrid
Francisco Tomás y Valiente, 11, 28049 Madrid, Spain
Tel/Fax: +34 91 497 51 00
E-mail: wims13 at uam.es

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 12 20:35:29 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 12 Oct 2012 22:35:29 +0200
Subject: Ressource: ANNODIS, un corpus enrichi d'annotations discursives
Message-ID:

Date: Fri, 12 Oct 2012 09:47:20 +0200
From: Marie-Paule PERY-WOODLEY
Message-ID: <5077CB08.4000600 at univ-tlse2.fr>
X-url: http://redac.univ-tlse2.fr/corpus/annodis

ANNODIS, a freely available discourse-level annotated corpus

We are pleased to announce that the ANNODIS resource, a discourse-level annotated corpus for French, is now available on-line: http://redac.univ-tlse2.fr/corpus/annodis. The corpus (687,000 words) is diversified with respect to genre, length and type of discourse organisation.
The annotated objects, which reflect two distinct approaches to discourse, are rhetorical relations and two types of multi-level structures: topical chains and enumerative structures. The corpus can be downloaded and, in the case of multi-level structures, explored on-line via a browser.

The ANNODIS resource team (CLLE-ERSS and IRIT, Université de Toulouse): Lydia-Mai Ho-Dac (contact), Stergos Afantenos, Nicholas Asher, Farah Benamara, Myriam Bras, Cécile Fabre, Anne Le Draoulec, Philippe Muller, Marie-Paule Péry-Woodley, Laurent Prévot, Josette Rebeyrolle, Ludovic Tanguy, Marianne Vergez-Couret, Laure Vieu.

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:12:53 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 17 Oct 2012 13:12:53 +0200
Subject: Appel: LATA 2013
Message-ID:

Date: Sun, 14 Oct 2012 18:13:39 +0200
From: "GRLMC"
Message-ID: <17CA16231E834D33BB719B63722D95C4 at Carlos1>
X-url: http://grammars.grlmc.com/LATA2013/

------------------------------------------------------------------------
7th INTERNATIONAL CONFERENCE ON LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS
LATA 2013
Bilbao, Spain
April 2-5, 2013
Organized by:
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University
http://grammars.grlmc.com/LATA2013/

AIMS:

LATA is a yearly conference in theoretical computer science and its applications.
Following the tradition of the International Schools in Formal Languages and Applications developed at Rovira i Virgili University in Tarragona since 2002, LATA 2013 will reserve significant room for young scholars at the beginning of their career. It will aim at attracting contributions from both classical theory fields and application areas (bioinformatics, systems biology, language technology, artificial intelligence, etc.).

VENUE:

LATA 2013 will take place in Bilbao, in the Basque Country in northern Spain. The venue will be the Basque Center for Applied Mathematics (BCAM).

SCOPE:

Topics of either theoretical or applied interest include, but are not limited to:

- algebraic language theory
- algorithms for semi-structured data mining
- algorithms on automata and words
- automata and logic
- automata for system analysis and programme verification
- automata, concurrency and Petri nets
- automatic structures
- cellular automata
- combinatorics on words
- computability
- computational complexity
- computational linguistics
- data and image compression
- decidability questions on words and languages
- descriptional complexity
- DNA and other models of bio-inspired computing
- document engineering
- foundations of finite state technology
- foundations of XML
- fuzzy and rough languages
- grammars (Chomsky hierarchy, contextual, multidimensional, unification, categorial, etc.)
- grammars and automata architectures
- grammatical inference and algorithmic learning
- graphs and graph transformation
- language varieties and semigroups
- language-based cryptography
- language-theoretic foundations of artificial intelligence and artificial life
- parallel and regulated rewriting
- parsing
- pattern recognition
- patterns and codes
- power series
- quantum, chemical and optical computing
- semantics
- string and combinatorial issues in computational biology and bioinformatics
- string processing algorithms
- symbolic dynamics
- symbolic neural networks
- term rewriting
- transducers
- trees, tree languages and tree automata
- weighted automata

STRUCTURE:

LATA 2013 will consist of:

- invited talks
- invited tutorials
- peer-reviewed contributions

INVITED SPEAKERS:

Jin-Yi Cai (Madison), Complexity Dichotomy for Counting Problems
Kousha Etessami (Edinburgh), Algorithms for Analyzing Infinite-state Recursive Probabilistic Systems
Luke Ong (Oxford), tutorial Languages and Automata for Higher-order Model Checking
Joël Ouaknine (Oxford), tutorial Discrete Linear Dynamical Systems
Thomas Schwentick (Dortmund), Applications of Automata in Database Theory -- Challenges to Automata Theory from Databases
Andrei Voronkov (Manchester), The Lazy Reviewer Assignment Problem in EasyChair

PROGRAMME COMMITTEE:

Parosh Aziz Abdulla (Uppsala)
Franz Baader (Dresden)
Jos Baeten (CWI, Amsterdam)
Christel Baier (Dresden)
Gerth Stølting Brodal (Aarhus)
John Case (Delaware)
Marek Chrobak (Riverside)
Mariangiola Dezani (Torino)
Rod Downey (Wellington)
Ding-Zhu Du (Dallas)
Ivo Düntsch (Brock)
E. Allen Emerson (Austin)
Javier Esparza (Technical University Munich)
Michael R. Fellows (Darwin)
Alain Finkel (ENS Cachan)
Dov M. Gabbay (King's, London)
Jürgen Giesl (Aachen)
Rob van Glabbeek (NICTA, Sydney)
Georg Gottlob (Oxford)
Annegret Habel (Oldenburg)
Reiko Heckel (Leicester)
Sanjay Jain (Singapore)
Charanjit S. Jutla (IBM Thomas J. Watson)
Ming-Yang Kao (Northwestern)
Deepak Kapur (Albuquerque)
Joost-Pieter Katoen (Aachen)
S. Rao Kosaraju (Johns Hopkins)
Evangelos Kranakis (Carleton)
Hans-Jörg Kreowski (Bremen)
Tak-Wah Lam (Hong Kong)
Gad M. Landau (Haifa)
Kim G. Larsen (Aalborg)
Richard Lipton (Georgia Tech)
Jack Lutz (Iowa State)
Ian Mackie (École Polytechnique, Palaiseau)
Rupak Majumdar (Max Planck, Kaiserslautern)
Carlos Martín-Vide (Tarragona, chair)
Paliath Narendran (Albany)
Tobias Nipkow (Technical University Munich)
David A.
Plaisted (Chapel Hill)
Jean-François Raskin (Brussels)
Wolfgang Reisig (Humboldt Berlin)
Michaël Rusinowitch (LORIA, Nancy)
Davide Sangiorgi (Bologna)
Bernhard Steffen (Dortmund)
Colin Stirling (Edinburgh)
Alfonso Valencia (CNIO, Madrid)
Helmut Veith (Vienna Tech)
Heribert Vollmer (Hannover)
Osamu Watanabe (Tokyo Tech)
Pierre Wolper (Liège)
Louxin Zhang (Singapore)

ORGANIZING COMMITTEE:

Adrian Horia Dediu (Tarragona)
Peter Leupold (Tarragona)
Carlos Martín-Vide (Tarragona, co-chair)
Magaly Roldán (Bilbao)
Bianca Truthe (Magdeburg)
Florentina Lilica Voicu (Tarragona)
Enrique Zuazua (Bilbao, co-chair)

SUBMISSIONS:

Authors are invited to submit papers presenting original and unpublished research. Papers should not exceed 12 single-spaced pages (including any appendices) and should be formatted according to the standard format for Springer Verlag's LNCS series (see http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0).

Submissions have to be uploaded to:

https://www.easychair.org/conferences/?conf=lata2013

PUBLICATIONS:

A volume of proceedings published by Springer in the LNCS series will be available by the time of the conference.

A special issue of a major journal will later be published containing peer-reviewed extended versions of some of the papers contributed to the conference. Submissions to it will be by invitation.

REGISTRATION:

The period for registration is open from August 6, 2012 to April 2, 2013. The registration form can be found at the website of the conference: http://grammars.grlmc.com/LATA2013/

FEES:

Early registration fees: 500 Euro
Early registration fees (PhD students): 400 Euro
Late registration fees: 540 Euro
Late registration fees (PhD students): 440 Euro
On-site registration fees: 580 Euro
On-site registration fees (PhD students): 480 Euro

At least one author per paper should register. Papers that do not have a registered author who paid the fees by January 2, 2013 will be excluded from the proceedings.
One registration gives the right to present only one paper.

Fees comprise access to all sessions, one copy of the proceedings volume, coffee breaks and lunches.

PAYMENT:

Early (resp. late) registration fees must be paid by bank transfer before January 2, 2013 (resp. March 23, 2013) to the conference bank account:

Uno-e Bank
bank's address: Julian Camarillo 4 C, 28037 Madrid, Spain
IBAN: ES3902270001820201823142
BIC/SWIFT: UNOEESM1
account holder: C. Martin - GRLMC
account holder's address: Av. Catalunya 35, 43002 Tarragona, Spain

Please mention LATA 2013 and your name in the subject. A receipt will be provided on site.

Remarks:

- Bank transfers should not involve any expense for the conference.
- People claiming early registration will be requested to prove that the bank transfer order was carried out by the deadline.
- PhD students will need to provide evidence of their status on site.

People registering on site must pay in cash. For the sake of local organization, however, registering earlier is strongly recommended.

DEADLINES:

Paper submission: November 9, 2012 (23:59h, CET)
Notification of paper acceptance or rejection: December 16, 2012
Final version of the paper for the LNCS proceedings: December 25, 2012
Early registration: January 2, 2013
Late registration: March 23, 2013
Start of the conference: April 2, 2013
End of the conference: April 5, 2013
Submission to the post-conference journal special issue: July 5, 2013

QUESTIONS AND FURTHER INFORMATION:

florentinalilica.voicu at urv.cat

POSTAL ADDRESS:

LATA 2013
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University
Av. Catalunya, 35
43002 Tarragona, Spain

Phone: +34-977-559543
Fax: +34-977-558386

ACKNOWLEDGEMENTS:

Basque Center for Applied Mathematics
Diputació
de Tarragona
Universitat Rovira i Virgili

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:30:21 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 17 Oct 2012 13:30:21 +0200
Subject: Appel: New Extended Deadline CFP - ML4HMT-12 Workshop at COLING 2012
Message-ID:

Date: Mon, 15 Oct 2012 16:20:39 +0200
From: Maite Melero
Message-ID: <16B44A00C6287E46A454C5F9A5914F6FE16646A7CE at FBMEC01.corp.barcelonamedia.org>
X-url: http://www.dfki.de/ml4hmt/

-----Apologies for duplicate postings-----

***CALL FOR PAPERS --- EXTENDED DEADLINE --- NEW DEADLINE: OCT 22nd***

"Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12 WS and Shared Task)" at COLING 2012

Mumbai (India), 9th December, 2012

URL: http://www.dfki.de/ml4hmt/

The workshop and associated shared task are an effort to trigger a systematic investigation on improving state-of-the-art hybrid machine translation, making use of advanced machine-learning (ML) methodologies. It follows the ML4HMT-11 workshop which took place last November in Barcelona. The first workshop also road-tested a shared task (and associated data set) and laid the basis for a broader reach in 2012.
Regular Papers ML4HMT-12
========================

We are soliciting original papers on hybrid MT, including (but not limited to):

* use of machine learning methods in hybrid MT;
* system combination: parallel in multi-engine MT (MEMT) or sequential in statistical post-editing (SPMT);
* combining phrases and translation units from different types of MT;
* syntactic pre-/re-ordering;
* using richer linguistic information in phrase-based or in hierarchical SMT;
* learning resources (e.g., transfer rules, transduction grammars) for probabilistic rule-based MT.

Full papers should be anonymous and follow the COLING full paper format (http://www.coling2012-iitb.org/call_for_papers.php). To submit contributions, please follow the instructions at the Workshop management system submission website: https://www.softconf.com/coling2012/ML4HMT12/. The contributions will undergo a double-blind review by members of the programme committee.

Shared Task ML4HMT-12
=====================

The main focus of the Shared Task is to address the question:

"Can Hybrid MT and System Combination techniques benefit from extra information (linguistically motivated, decoding, runtime, confidence scores, or other meta-data) from the systems involved?"

Participants are invited to build hybrid MT systems and/or system combinations by using the output of several MT systems of different types, as provided by the organisers. While participants are encouraged to use machine learning techniques to explore the additional meta-data information sources, other general improvements in hybrid and combination based MT are welcome to participate in the challenge. For systems that exploit additional meta-data information the challenge is that additional meta-data is highly heterogeneous and (individual) system specific.

Data: The ML4HMT-12 Shared Task involves (ES-EN) and (ZH-EN) data sets, in each case translating into EN.

* (ES-EN): Participants are given a development bilingual set aligned at a sentence level.
Each "bilingual sentence" contains: 1) the source sentence, 2) the target (reference) sentence and 3) the corresponding multiple output translations from four systems, based on different MT approaches (Apertium, Ramirez-Sanchez, 2006; Lucy, Alonso and Thurmair, 2003; Moses, Koehn et al., 2007). The output has been annotated with system-internal meta-data information derived from the translation process of each of the systems.

* (ZH-EN): A corresponding data set for ZH-EN with output translations from three systems (Moses; ICT_Chiero, Mi et al., 2009; and Huajian RBMT) will be provided. (Participants are required to fill out a shared task evaluation agreement form and obtain the ZH-EN data from LDC.)

Participants are challenged to build an MT mechanism that, where possible, makes effective use of the system-specific MT meta-data output. They can provide solutions based on open-source systems, or develop their own mechanisms. The development set can be used for tuning the systems during the development phase. Final submissions have to include translation output on a test set, which will be made available one week after training data release. Data will be provided to build language/reordering models, possibly re-using existing resources from MT research.

Participants can also make use of additional tools (linguistic analysis, confidence estimation, etc.) if their systems so require, but they have to explicitly declare this upon submission, so that they are judged as "unconstrained" systems. This will allow for a better comparison between participating systems.

Shared task results should be submitted via email attachment. Please compress your results as a .zip or .gz archive and send them to cfedermann at dfki.de. Use "ML4HMT-12 Shared Task Submission" as mail subject. Shared task results are due by October 28th.

System output will be judged via peer-based human evaluation as well as automatic evaluation.
During the evaluation phase, participants will be requested to rank system outputs of other participants through a web-based interface (Appraise, Federmann 2010). Automatic metrics include BLEU (Papineni et al., 2002), TER (Snover et al., 2006) and METEOR (Lavie, 2005).

Results from the automatic evaluation of submitted shared task results will be made available to participants on November 1st so that they can be referred to in system description papers. As the manual evaluation will take longer, its results will be presented and published at the workshop.

Workshop Participation
==================

If you are interested in our workshop and intend to participate, we would appreciate it if you could inform us of your intent beforehand so that we can better plan the workshop; to do so, send an email to cfedermann at dfki.de.

Important Dates 2012
===================

15th August: Shared task Training data release (updated ML4HMT corpus)
23rd August: Shared task Test data release
22nd October: Workshop full paper submission deadline
28th October: Shared task Translation results submission deadline
31st October: Workshop paper accept/reject notification
1st November: Shared task Evaluation results release
4th November: Shared Task system description paper submission
11th November: Shared Task system description paper accept/reject notification
18th November: Workshop and Shared task Camera ready paper due
9th December: ML4HMT-12 Workshop

Organizers
==========

- Prof. Josef van Genabith, Dublin City University (DCU) and Centre for Next Generation Localisation (CNGL)
- Prof. Toni Badia, Universitat Pompeu Fabra and Barcelona Media (BM)
- Christian Federmann, German Research Center for Artificial Intelligence (DFKI), contact person: cfedermann at dfki.de
- Dr. Maite Melero, Barcelona Media (BM)
- Dr. Marta R. Costa-jussà, Barcelona Media (BM)
- Dr.
Tsuyoshi Okita, Dublin City University (DCU)

Program committee
================

- Eleftherios Avramidis (German Research Center for Artificial Intelligence, Germany)
- Prof. Sivaji Bandyopadhyay (Jadavpur University, India)
- Dr. Rafael Banchs (Institute for Infocomm Research - I2R, Singapore)
- Prof. Loïc Barrault (LIUM - University of Le Mans, France)
- Prof. Antal van den Bosch (Centre for Language Studies, Radboud University Nijmegen, Netherlands)
- Dr. Grzegorz Chrupala (Saarland University, Saarbrücken, Germany)
- Prof. Jinhua Du (Xi'an University of Technology (XAUT), China)
- Dr. Andreas Eisele (Directorate-General for Translation (DGT), Luxembourg)
- Dr. Cristina España-Bonet (Technical University of Catalonia, TALP, Barcelona)
- Dr. Declan Groves (Center for Next Generation Localisation, Dublin City University, Ireland)
- Prof. Jan Hajic (Institute of Formal and Applied Linguistics, Charles University in Prague)
- Prof. Timo Honkela (Aalto University, Finland)
- Dr. Patrick Lambert (LIUM - University of Le Mans, France)
- Prof. Qun Liu (Institute of Computing Technology, Chinese Academy of Sciences, China)
- Dr. Maite Melero (Barcelona Media Innovation Center, Spain)
- Dr. Tsuyoshi Okita (Dublin City University, Ireland)
- Prof. Pavel Pecina (Institute of Formal and Applied Linguistics, Charles University in Prague)
- Dr. Marta R. Costa-jussà (Barcelona Media Innovation Center, Spain)
- Dr. Felipe Sanchez Martinez (Escuela Politecnica Superior, Universidad de Alicante, Spain)
- Dr. Nicolas Stroppa (Google, Zurich, Switzerland)
- Prof. Hans Uszkoreit (German Research Center for Artificial Intelligence, Germany)
- Dr. David Vilar (German Research Center for Artificial Intelligence, Germany)

The ML4HMT workshop is supported by the META-NET T4ME project (http://www.meta-net.eu/), funded by the DG INFSO of the European Commission through the Seventh Framework Programme, grant agreement no.: 249119.
From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:34:06 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 17 Oct 2012 13:34:06 +0200
Subject: Appel: 4e Colloque Res per nomen, Reims
Message-ID:

Date: Mon, 15 Oct 2012 22:12:28 +0200
From: EMILIA HILGERT
Message-ID: <16925_1350332248_507C6F57_16925_9217_1_20121015221228.83s7096nggc00cs4 at wmp.univ-reims.fr>
X-url: http://www.res-per-nomen.org/respernomen/colloque-2013/Accueil-2013.html

REMINDER

Dear colleagues,

We would like to remind you that the deadline for submitting paper proposals for the fourth Res per nomen conference, "Les théories du sens et de la référence. Hommage à Georges Kleiber", is October 30, 2012.

See the Res per nomen website: http://www.res-per-nomen.org/respernomen/colloque-2013/Accueil-2013.html

Best regards,

The organizers,
Emilia Hilgert
Silvia Palma
Pierre Frath
René
Daval

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 11:39:01 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 17 Oct 2012 13:39:01 +0200
Subject: Appel: Cogalex - Deadline extension: October 21, 2012 (no further extensions possible)
Message-ID:

Date: Tue, 16 Oct 2012 14:07:25 +0200
From: Michael Zock
Message-ID: <507D4DFD.8010100 at lif.univ-mrs.fr>
X-url: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html

==============================================================
Deadline extension: October 21, 2012 (no further extensions possible)
All other dates ('notification of acceptance' and 'camera-ready paper due') remain unchanged
==============================================================
CogALex-3 (Cognitive Aspects of the Lexicon), a post-COLING workshop
New deadline for paper submission: October 21, 2012
More details: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html
==============================================================

3rd Workshop on "Cognitive Aspects of the Lexicon" (CogALex)
Post-conference workshop at COLING 2012 (December 15, Mumbai, India)

Invited speaker: Alain Polguère (Université
de Lorraine & ATILF CNRS, France)

Submission deadline: October 21, 2012

AIMS and TARGET AUDIENCE

The aim of this workshop is to bring together researchers involved in the construction and application of electronic dictionaries to discuss modifications of existing resources in line with the users' needs, thereby fully exploiting the advantages of the digital form. Given the breadth of the questions, we welcome reports on work from many perspectives, including but not limited to: computational lexicography, psycholinguistics, cognitive psychology, language learning and ergonomics.

MOTIVATION

The way we look at dictionaries, their creation and use, has changed dramatically over the past 30 years. (1) Once considered an appendix to grammar, they have since moved to centre stage. Indeed, there is hardly any task in NLP which can be conducted without them. (2) Also, many lexicographers nowadays work with huge digital corpora, using language technology to build and to maintain the lexicon. (3) Last, but not least, rather than being static entities (database view), dictionaries are now viewed as graphs, whose nodes and links (connection strengths) may change over time. Interestingly, properties concerning topology, clustering and evolution known from other disciplines (society, economy, human brain) also apply to dictionaries: everything is linked, hence accessible, and everything is evolving. Given these similarities, one may wonder what we can learn from these disciplines.

In this 3rd edition of the CogALex workshop we therefore intend to also invite scientists working in these fields, our goals being to broaden the picture, i.e. to gain a better understanding concerning the mental lexicon and to integrate these findings into our dictionaries in order to support navigation. Given recent advances in neurosciences, it appears timely to seek inspiration from neuroscientists studying the human brain.
There is also a lot to be learned from other fields studying graphs and networks, even if their object of study is something other than language, for example biology, economy or society.

TOPICS OF INTEREST

This workshop is about possible enhancements of existing electronic dictionaries. To perform the groundwork for the next generation of electronic dictionaries we invite researchers involved in the building of such dictionaries. The idea is to discuss modifications of existing resources by taking the users' needs and knowledge states into account, and to capitalize on the advantages of digital media. For this workshop we invite papers including but not limited to the following topics, which can be considered from various points of view: linguistics, neuro- or psycholinguistics (associations, tip-of-the-tongue problem), network-related sciences (complex graphs, network topology, small-world problem), etc.

1) Analysis of the conceptual input of a dictionary user
- What does a language producer start from (bag of words)?
- What is in the authors' minds when they are generating a message and looking for a word?
- What does it take to bridge the gap between this input and the desired output (target word)?

2) The meaning of words
- Lexical representation (holistic, decomposed)
- Meaning representation (concept based, primitives)
- Revelation of hidden information (vector-based approaches: LSA/HAL)
- Neural models, neurosemantics, neurocomputational theories of content representation.

3) Structure of the lexicon
- Discovering structures in the lexicon: formal and semantic point of view (clustering, topical structure)
- Creative ways of getting access to and using word associations
- Evolution, i.e.
dynamic aspects of the lexicon (changes of weights)
- Neural models of the mental lexicon (distribution of information concerning words, organisation of the mental lexicon)

4) Methods for crafting dictionaries or indexes
- Manual, automatic or collaborative building of dictionaries and indexes (distributional semantics, crowd-sourcing, serious games, etc.)
- Impact and use of social networks (Facebook, Twitter) for building dictionaries, for organizing and indexing the data (clustering of words), for tracking navigational strategies, etc.
- (Semi-) automatic induction of the link type (e.g. synonym, hypernym, meronym, association, collocation, ...)
- Use of corpora and patterns (data-mining) for getting access to words, their uses, and combinations (associations)

5) Dictionary access (navigation and search strategies), interface issues
- Semantic-based search
- Search (simple query vs multiple words)
- Context-dependent search (modification of users' goals during search)
- Recovery
- Navigation (frequent navigational patterns or search strategies used by people)
- Interface problems, data-visualisation

IMPORTANT DATES

- Deadline for paper submissions: October 21, 2012 (extended from October 15)
- Notification of acceptance: November 5, 2012
- Camera-ready papers due: November 15, 2012
- Workshop date: December 15, 2012

SUBMISSION INSTRUCTIONS

see: http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html

INVITED SPEAKER: Alain Polguère (Université de Lorraine & ATILF CNRS, France)

PROGRAMME COMMITTEE

* Barbu, Eduard (Universidad de Jaén, Spain)
* Barrat, Alain (Centre de physique théorique, CNRS & Aix-Marseille University)
* Bilac, Slaven (Google Tokyo, Japan)
* Bel Enguix, Gemma (LIF, Aix-Marseille University, France)
* Bouillon, Pierrette (TIM, Faculty of Translation and Interpreting, Geneva, Switzerland)
* Cook, Paul (The University of Melbourne, Australia)
* Cristea, Dan (University of Iasi, Romania)
* Fairon, Cedrick (CENTAL, Université
catholique de Louvain, Belgium) * Fazly, Afsaneh (University of Toronto, Canada) * Fellbaum, Christiane (University of Princeton, USA) * Ferret, Olivier (CEA LIST, Palaiseau, France) * Fontenelle, Thierry (Translation Centre for the Bodies of the European Union, Luxemburg) * Granger, Sylviane (Université Catholique de Louvain, Belgium) * Grefenstette, Gregory (3DS Exalead, Paris, France) * Hansen-Schirra, Silvia (University of Mainz, FTSK, Germany) * Heid, Ulrich (University of Hildesheim, Germany) * Hirst, Graeme (University of Toronto, Canada) * Hovy, Ed (ISI, Los Angeles, USA) * Joyce, Terry (Tama University, Kanagawa-ken, Japan) * Kwong, Olivia (City University of Hong Kong, China) * L'Homme, Marie Claude (OLST, University of Montreal, Canada) * Lapalme, Guy (RALI, University of Montreal, Canada) * Mititelu, Verginica (RACAI, Bucharest, Romania) * Pirrelli, Vito (ILC, Pisa, Italy) * Polguère, Alain (Université de Lorraine & ATILF CNRS, France) * Rapp, Reinhard (University of Leeds, UK) * Ruette, Tom (KU Leuven, Belgium) * Schwab, Didier (LIG, Grenoble, France) * Serasset, Gilles (IMAG, Grenoble, France) * Sharoff, Serge (University of Leeds, UK) * Sinopalnikova, Anna (FIT, BUT, Brno, Czech Republic) * Sowa, John (VivoMind Research, LLC, USA) * Tiberius, Carole (Institute for Dutch Lexicology, The Netherlands) * Tokunaga, Takenobu (TITECH, Tokyo, Japan) * Tufis, Dan (RACAI, Bucharest, Romania) * Valitutti, Alessandro (University of Helsinki and HIIT, Finland) * Vossen, Piek (Vrije Universiteit, Amsterdam, The Netherlands) * Wehrli, Eric (LATL, University of Geneva, Switzerland) * Zock, Michael (LIF, CNRS, Aix-Marseille University, France) * Zweigenbaum, Pierre (LIMSI - CNRS, Orsay & ERTIM - INALCO, Paris, France) WORKSHOP ORGANIZERS and CONTACT PERSONS Michael Zock (LIF-CNRS, Marseille, France), michael.zock AT lif.univ-mrs.fr Reinhard Rapp (University of Leeds, UK), reinhardrapp AT gmx.de For more details see:
http://pageperso.lif.univ-mrs.fr/~michael.zock/cogalex-3.html ------------------------------------------------------------------------- Message distributed by the Langage Naturel list Information, subscription: http://www.atala.org/article.php3?id_article=48 English version : Archives : http://listserv.linguistlist.org/archives/ln.html http://liste.cines.fr/info/ln The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues) Information and membership: http://www.atala.org/ ------------------------------------------------------------------------- From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:00:03 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 21:00:03 +0200 Subject: Job: Stage de recherche, Technicolor (Rennes), Analysis of Web forums Message-ID: Date: Tue, 16 Oct 2012 16:56:58 +0200 From: Guegan Marie Message-ID: X-url: http://www.technicolor.com/ Internship position available at Technicolor R&D in Rennes. Title ------ "Which scene are you talking about?" Recognizing scenes discussed on cinema or TV forums. Context ------- For more info on Technicolor Research & Innovation, Rennes : https://research.technicolor.com/rennes/ The internship will be hosted at Technicolor R&I in Rennes, France (500 employees, of whom 130 are researchers), within the Media Computing Lab. Our lab aims at bringing modern trends in computing to the service of novel media engines in content creation (visual effects, animation) as well as content discovery and retrieval. More specifically, our team focuses on Web user comments posted on forums and social networks. We use various approaches such as data mining, social network analysis and natural language processing. Objective --------- This internship aims at designing, developing and evaluating an information extraction system for user comments. The domain is cinema and television.
Each comment is already attached to a particular piece of audiovisual content (movie, TV series, TV program). One of our goals is to detect within the comments the text segments which refer to a particular moment in the video. For instance, users may talk about their favorite scene or quote a famous dialogue. Task description ----------------- We will not analyze the audiovisual signal (image or audio), but solely the text of comments. Comments have already been collected and saved into a database. Hence the internship will focus on the analysis of the dataset rather than its collection. The intern will be responsible for choosing the best techniques, based on a survey he or she will conduct on state-of-the-art approaches. The developed system will be evaluated by the intern, both quantitatively and qualitatively. Depending on the results obtained and on innovative ideas, this work may lead to a research publication in a conference. Keywords --------- Natural language processing (NLP), machine learning, data mining, text mining Profile of the candidate ------------------------- * Final-year student in a master's programme or engineering school * Computer science (Python, Java) * Skills in machine learning, data mining and natural language processing * Strong interest in research (the intern will conduct a survey) * Interest in social networks * English mandatory * Enjoys working in a team Internship period & duration ----------------------------------- 6 months, starting preferably around February or March 2013, depending on the candidate's constraints. Please email your CV and cover letter to stage.rennes at technicolor.com with reference [TRDF-DM-029] in the subject. Marie Guégan Media Computing Lab Research & Innovation Technicolor R&D France 975 avenue des Champs Blancs, CS 17616 35576 Cesson-Sévigné Cedex, France www.technicolor.com Important: Technicolor R&D France is moving. As of 24 October 2012, our new address is: 975 avenue des Champs Blancs, CS 17616, 35576 Cesson-Sévigné, France - tel. (switchboard): +33 (0)2 99 27 30 00 (unchanged).
From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:11:25 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 21:11:25 +0200 Subject: Appel: ICICS 2013 Message-ID: Date: Tue, 16 Oct 2012 21:00:34 +0300 From: ICICS2013 Message-ID: <1168219207628 at CIT-SamerSuleiman-M2L-1.just.edu.jo> X-url: http://www.icics.info/icics2013 [Apologies if you receive multiple copies of this Call For Papers.] CALL FOR PAPERS ================= The 4th International Conference on Information and Communication Systems, ICICS 2013 Organized by Jordan University of Science and Technology April 23-25 2013, Irbid, Jordan http://www.icics.info/icics2013 IMPORTANT DATES ================= Full Paper Submission: Dec. 1st, 2012 Notification of Decision: Jan. 20th, 2013 Registration and Camera-Ready: Feb. 15th, 2013 Poster Presentation Submission: Feb. 15th, 2013 GENERAL INFORMATION ===================== The International Conference on Information and Communication Systems (ICICS 2013) is a forum for scientists, engineers, and practitioners to present their latest research results, ideas, developments, and applications in all areas of Computer and Information Sciences.
The topics that will be covered at ICICS 2013 include, but are not limited to: Artificial Intelligence, Mobile Computing, Networking, Information Security and Cryptography, Intrusion Detection and Computer Forensics, Web Content Mining, Bioinformatics and IT Applications, Database Technology, Systems Integration, Information Systems Analysis and Specification, Telecommunications, and Human-Computer Interaction. TOPICS ======= Researchers are encouraged to submit original research contributions in all major areas, which include, but are not limited to: Databases and Information Systems Integration Artificial Intelligence Machine Learning Bioinformatics Data Mining Robotics and Autonomous Systems Knowledge Management & Natural Language Processing E-Business E-Learning Health Information Systems Applications of Fuzzy Logic Applications of Neural Network Data Warehouses & Human Computer Interaction Systems Engineering Methodologies Embedded Systems Software Engineering Software Measurement Algorithms and Applications Computer Architecture Computer Graphics VLSI and its applications Computer Networks Wireless and Mobile Computing & Computer Simulation Information Security Information Systems Semantic Web Technologies Multimedia and Image Processing Parallel and Distributed Systems Cloud Computing Pervasive and Adaptive Systems Reliability and Fault-Tolerance Internet and Collaborative Computing Nano Technology INSTRUCTIONS FOR AUTHORS =========================== Prospective authors are invited to submit full papers following the guidelines posted on the conference website http://www.icics.info/icics2013. Submitted papers will be peer-reviewed and prospective authors are expected to present their papers at the conference. The papers that are accepted and presented at the conference will appear in CD proceedings, in the ACM Digital Library (pending approval) and in DBLP.
Best paper and best student paper will be selected by peer reviews and will be announced during the social event at the conference. Please submit your paper in PDF format via the electronic submission system available in EasyChair: https://www.easychair.org/conferences/?conf=icics13. Prospective authors are expected to present their papers at the conference. Extended versions of selected papers will be published in journals such as: - Journal of emerging technologies in web intelligence (JETWI, ISSN 1798-0461) - Network Protocols and Algorithms (ISSN 1943-3581) Please send any enquiry on ICICS 2013 to icics at just.edu.jo
From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:14:02 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 21:14:02 +0200 Subject: Appel: Journee d'etude annotation corpus oraux, Paris Message-ID: Date: Wed, 17 Oct 2012 11:48:44 +0200 From: Christophe Benzitoun Message-ID: <507E7EFC.8030505 at univ-nancy2.fr> *Syntactic annotation of spoken corpora* *Recent projects and perspectives* Call for papers Conscila study day (ENS Paris) *Friday 7 December 2012* A growing number of spoken French corpora are now freely available to the scientific community (the PFC corpus, the Corpus du Français Parlé Parisien, Valibel, CRDO, TCOF, etc.). However, these data have characteristics that most corpus processing tools do not take into account, which makes it difficult to apply these tools directly to spoken French. Likewise, spoken data are hard to integrate into traditional frameworks. Software and linguistic approaches alike have mainly been developed from written texts (or from invented examples) and with the processing of written language in mind. To adapt current systems, or simply to deepen our knowledge of French, it is therefore essential to produce annotations for spoken resources. Initiatives in this area are nevertheless still at an embryonic stage for French, even though there is already a considerable number of them. Examples include the work of Eshkol et al. (2010) and the PERCEO project (http://cnrtl.fr/corpus/perceo/), both on morphosyntactic annotation; the recent ATALA day /Annoter les corpus oraux/ (Paris, April 2011); the CID project in Aix-en-Provence (http://sldr.org/sldr000027); part of the ANR project /Colaje/ (on young children; http://colaje.risc.cnrs.fr/); the SYFRAP project (http://talc.loria.fr/HOME,288.html); and the CNRS thematic school on the annotation of language data (Sept. 2011). For syntax more specifically, one can mention, among others, the FNRS project of L. Degand and A.-C. Simon (2011-2013) on the /Périphérie gauche des unités de discours/ and the ANR project Rhapsodie (2008-2012) led by A. Lacheret. A new ANR project, ORFEO (Outils et Recherches sur le Français Ecrit et Oral), devoted to building and annotating corpora, will also start in early 2013 under the direction of J.-M. Debaisieux. Despite this work, to our knowledge no syntactically annotated corpus of spoken French is currently available.
One aim of this thematic day is to take stock of recent, ongoing and future initiatives in the syntactic annotation of spoken French corpora, showing in particular how systematic annotation raises fundamental questions for the description of French in general. It will also consider to what extent new models and tools can or must be developed to account for phenomena found in speech. Papers may address annotation protocols, tools, targeted studies, problems encountered, etc., and will raise a series of questions: which annotation standard for spoken language? What tools are available for exploiting the annotations? And so on. Demonstrations of software for annotation and exploitation are also welcome. The day will end with a round table open to all participants, which should make it possible to summarize the presentations, list some good practices and suggest avenues to explore in future projects. */Organization/* Christophe Benzitoun -- ATILF CNRS & Université de Lorraine Noalig Tanguy -- Lattice UMR 8094 ENS/Paris 3 & Valibel / Université Catholique de Louvain */Scientific committee/* Frédéric Béchet (Aix-Marseille Université / LIF UMR 7279) Marie-José Béguelin (Université de Neuchâtel) Alain Berrendonner (Université de Fribourg) Mireille Bilger (Université de Perpignan) Sandrine Caddéo (Aix-Marseille Université / Laboratoire Parole et Langage UMR 7309) Paul Cappeau (Université de Poitiers) Christophe Cerisara (Loria UMR 7503) Jeanne-Marie Debaisieux (Université Paris 3 Sorbonne Nouvelle / Lattice UMR 8094) Liesbeth Degand (Université catholique de Louvain / Valibel) José Deulofeu (Aix-Marseille Université
/ LIF UMR 7279) Anne Dister (Facultés universitaires Saint-Louis, Bruxelles) Iris Eshkol (Université d'Orléans / Laboratoire Ligérien Linguistique UMR 7270) Françoise Gadet (Université Paris Ouest Nanterre La Défense / Modyco UMR 7114) Kim Gerdes (Université Paris 3 Sorbonne Nouvelle / LPP / Institut d'Automation / Académie de Sciences Chinoise) Eva Havu (Université de Helsinki) Sylvain Kahane (Université Paris Ouest Nanterre La Défense / Modyco UMR 7114) Anne Lacheret (Université Paris Ouest Nanterre La Défense / Modyco UMR 7114) Florence Lefeuvre (Université Paris 3 Sorbonne Nouvelle / Clesthia) Michel Pierrard (Université Libre de Bruxelles) Paola Pietrandrea (Université Roma Tre / Lattice UMR 8094) Thierry Poibeau (Lattice UMR 8094 ENS/Paris 3) Sophie Prévost (Lattice UMR 8094 ENS/Paris 3) Nathalie Rossi-Gensane (Université Toulouse 2 / CLLE ERSS UMR 5263) Frédéric Sabio (Aix-Marseille Université / Laboratoire Parole et Langage UMR 7309) Catherine Schnedecker (Université de Strasbourg / Lilpa) Anne-Catherine Simon (Université catholique de Louvain / Valibel) Sandra Teston-Bonnard (Université de Lyon 2 / ICAR UMR 5191) Véronique Traverso (ICAR UMR 5191) Dan Van Raemdonck (Université Libre de Bruxelles) Dominique Willems (Université de Gand) Proposals (two pages maximum, including bibliography), in French or English, should be sent *before 20 October* to the following addresses: Christophe.Benzitoun at univ-lorraine.fr / noalig.tanguy at uclouvain.be
From thierry.hamon at UNIV-PARIS13.FR Wed Oct 17 19:15:29 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Wed, 17 Oct 2012 21:15:29 +0200 Subject: Appel: revue TAL - Note de lecture (MOOT) Message-ID: Date: Wed, 17 Oct 2012 13:20:07 +0200 (CEST) From: Denis Maurel Message-ID: <1033857014.5185527.1350472807249.JavaMail.root at mail10> Call: TAL journal - Book review (MOOT) The journal TAL regularly publishes book reviews. We are looking for a colleague willing to read the book: "Richard MOOT, Christian RETORÉ. The logic of categorial grammars: a deductive account of natural language syntax and semantics. LNCS 6850. Springer. 2012. 302 pages." and to write a review of it for TAL (the book will be sent free of charge in exchange). The review must be written in French (three pages maximum, in the journal's format) and sent by the end of December 2012. Other reviews are possible if you have recently read a book that interested you and are willing to share your reading with the community...
Denis Maurel From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:43:37 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Sat, 20 Oct 2012 15:43:37 +0200 Subject: Appel: CICLing 2013 Message-ID: Date: Thu, 18 Oct 2012 10:50:48 -0500 From: "Alexander Gelbukh $CICLing-2013$" Message-ID: CICLing 2013 14th International Conference on Intelligent Text Processing and Computational Linguistics Samos, Greece March 24-30, 2013 Springer LNCS www.CICLing.org/2013 TOPICS: All topics related to computational linguistics, natural language processing, human language technologies, information retrieval, etc. PUBLICATION: LNCS - Springer Lecture Notes in Computer Science; poster session: special issue of a journal KEYNOTE SPEAKERS: Sophia Ananiadou, Walter Daelemans, Roberto Navigli, Michael Thelwall CULTURAL PROGRAM: Three days of cultural activities: tours to Ephesus, Samos, and nearby islands AWARDS: Best paper, best student paper, best presentation, best poster, best software SUBMISSION DEADLINES: November 30: registration of tentative abstract, December 7: full text of registered papers See complete CFP and contact on www.CICLing.org/2013 This message is sent in good faith of its usefulness for you as an NLP researcher. If this is an error, kindly let me know.
Alexander Gelbukh www.Gelbukh.com From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:46:28 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Sat, 20 Oct 2012 15:46:28 +0200 Subject: Soft: ECDC-TM - A freely available translation memory in 25 languages Message-ID: Date: Fri, 19 Oct 2012 12:03:01 +0200 From: Ralf Steinberger Message-id: <015501cdade0$eca55470$c5effd50$@jrc.ec.europa.eu> X-url: http://langtech.jrc.ec.europa.eu/ECDC-TM.html ECDC-TM is a translation memory (sentences and their manually produced translations) in 25 languages. It is a multilingual parallel corpus covering 300 language pairs. Size: Up to 2500 translation units per language; 32,000 in total. Languages: All 300 language pairs involving the following 25 languages: Bulgarian, Czech, Danish, Dutch, English, Estonian, German, Greek, Finnish, French, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish, Swedish and Turkish. URL: http://langtech.jrc.ec.europa.eu/ECDC-TM.html Creator: European Centre for Disease Prevention and Control (ECDC http://www.ecdc.europa.eu/ ) and JRC WHAT IS ECDC-TM ECDC-TM was produced by professionally translating the English language web pages of the European Centre for Disease Prevention and Control (ECDC), an EU agency based in Stockholm. The results of the translation were stored in 24 bilingual translation memories.
The JRC post-processed these by cleaning the data and by producing one alignment for all 25 languages, resulting in parallel data for 300 language pairs. The major part of the documents talks about health-related topics (anthrax, botulism, cholera, dengue fever, hepatitis, etc.), but some of the web pages also describe the organisation ECDC (e.g. its organisation, job opportunities) and its activities (e.g. epidemic intelligence, surveillance). The ECDC Translation Memory (http://langtech.jrc.ec.europa.eu/ECDC-TM.html) is much smaller than the other multilingual resources distributed in the past by the European Commission's Joint Research Centre (JRC). Its main advantages are that (a) it covers even more languages and (b) it is based on texts from a very different domain (Public Health). MOTIVATION FOR THIS RELEASE The public data release is in line with the general effort of the European Commission to support multilingualism, language diversity and the re-use of Commission information. It follows the release of the JRC-Acquis (http://langtech.jrc.ec.europa.eu/JRC-Acquis.html) parallel corpus in 2006 (over 1 billion words in 22 languages), of the DGT-TM Translation Memory (http://langtech.jrc.ec.europa.eu/DGT-TM.html) in 2007 and 2011, the multilingual named entity resource JRC-Names (http://langtech.jrc.ec.europa.eu/JRC-Names.html) in 2011, the multi-label classification software JRC EuroVoc Indexer JEX (http://langtech.jrc.ec.europa.eu/Eurovoc.html) in 22 languages and further smaller multilingual resources. See http://langtech.jrc.ec.europa.eu/JRC_Resources.html for more information on these resources. WHAT ECDC-TM CAN BE USED FOR ECDC-TM can be fed into translation memory software to support human translators in their work. 
As it is a large parallel corpus in electronic form, it can furthermore be used by specialists in computational linguistics to train statistical machine translation software, to generate multilingual dictionaries, to train and test multilingual information extraction software, and more. WHAT NEXT? The JRC and collaborating services of the European Commission plan to release further large-scale linguistic resources in the near future. These include another 25-language translation memory and a paragraph-aligned full-text parallel corpus in 23 languages. Ralf Steinberger & Mohamed Ebrahim European Commission - Joint Research Centre (JRC) 21027 Ispra (VA), Italy URL - Applications: http://emm.newsbrief.eu/overview.html URL - The science behind them: http://langtech.jrc.ec.europa.eu/
From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:48:24 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Sat, 20 Oct 2012 15:48:24 +0200 Subject: Appel: Journee d'etudes jeunes chercheurs en lexicologie, terminologie, traduction, Bruxelles, 31 janvier 2013 Message-ID: Date: Fri, 19 Oct 2012 17:01:30 +0200 From: Mathieu Mangeot Message-Id: <2709D583-74D1-48F8-87A8-6B209EF94099 at imag.fr> X-url: http://www.ltt.auf.org/article.php3?id_article=728 FIRST STUDY DAY FOR YOUNG RESEARCHERS OF THE LEXICOLOGIE, TERMINOLOGIE, TRADUCTION NETWORK BRUSSELS, 31 JANUARY 2013 CALL FOR PAPERS http://www.ltt.auf.org/article.php3?id_article=728 On Thursday 31 January 2013, the Lexicologie, terminologie, traduction (LTT) network will hold its first study day for young researchers at the Institut supérieur de traducteurs et interprètes (ISTI, Haute École de Bruxelles), entitled "Lexicologie, terminologie, traduction : nouvelles recherches au cœur d'un système" (Lexicology, terminology, translation: new research at the heart of a system). Intended primarily to bring together doctoral students and postdoctoral researchers, the meeting aims to sustain the dynamic of exchange and transmission that has driven the network since its foundation. Following on from the 9th Journées scientifiques in Villetaneuse (2011) and the Journées d'animation scientifiques régionales in Tunis (2012), it will allow speakers to present research on the linguistic system that lies at the heart of the network's concerns, in particular: - equivalence and synonymy; - the evolution of specialized language; - the modelling of dictionaries and corpora; - translation support tools; - linguistic description in the service of intercomprehension. For this meeting, the scientific committee of the LTT network has opted for a call for papers aimed primarily at young researchers, doctoral or postdoctoral, working in teams affiliated with the network or interested in joining it. Proposals will be selected on the basis of an abstract of 600 to 1000 words, which must be entered in the dedicated form and sent by 10 November 2012 to ltt2013 at imag.fr. The final version of the papers must be submitted to the scientific committee by 28 February 2013 at the latest and must follow the presentation guidelines. Only papers approved by the committee will be published in the proceedings. Schedule - 10 November 2012: deadline for submitting proposals - 1 December 2012: decision of the scientific committee - 31 January 2013: study day - 28 February 2013: deadline for sending final versions of the papers (proceedings)
Scientific committee Coordinator: Mathieu Mangeot-Nagata (GETALP-LIG, Grenoble) Ibrahim Ben Mrad (VALD, Université de La Manouba, Tunis) Xavier Blanco Escoda (Laboratoire fLexSem, Université Autonome de Barcelone) Aïcha Bouhjar (Centre de l'aménagement linguistique de l'Institut royal de la culture amazighe, Rabat) Mame Thierno Cissé (Laboratoire SOLDILAF, Université Cheikh Anta Diop de Dakar) Anne Condamines (ERSS, CNRS-Université de Toulouse-le-Mirail) Marie-Claude L'Homme (OLST, Université de Montréal) Teresa Lino (Centro de Linguística, Universidade Nova de Lisboa) François Maniez (CRTT, Université Lumière Lyon 2) Salah Mejri (Laboratoire LDI, Université de Paris 13) Franck Neveu (Université de Paris IV and ILF) Patrice Pognan (Lalic, Institut national des langues et civilisations orientales) Philippe Thoiron (CRTT, Université Lumière Lyon 2) Amalia Todirascu (UR LILPA, Université
de Strasbourg) Marc Van Campenhoudt (Termisti, Institut supérieur de traducteurs et interprètes, Bruxelles) Mathieu MANGEOT GETALP, LIG-campus, BP 53 - 41 rue des mathématiques F-38041 Grenoble Cedex 9 - France Tel : +33 4 76 63 56 54 / +33 4 79 75 81 89
From thierry.hamon at UNIV-PARIS13.FR Sat Oct 20 13:49:30 2012 From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon) Date: Sat, 20 Oct 2012 15:49:30 +0200 Subject: Job: Ingenieur au LIPN, Universite Paris-Nord Message-ID: Date: Fri, 19 Oct 2012 21:20:47 +0200 From: Sylvie Salotti Message-ID: <0d314954f8237eab0405bb87021239b5 at lipn.univ-paris13.fr> Fixed-term contract: development engineer at LIPN, Université Paris 13 Subject: Semantic annotation of legal documents Context: You will work at LIPN (Laboratoire d'Informatique de Paris-Nord), in the RCLN team (Représentation des Connaissances et Langage Naturel), whose research areas are natural language processing and knowledge engineering. More specifically, you will work on the Legilocal project, in which the RCLN team is a partner and which aims to make the legal documents of local authorities more accessible to citizens. One task of this project is to enrich documents semantically with annotations that make information retrieval and navigation in the document collection easier. The annotations envisaged are of various kinds, but the focus will be on two types: - terms corresponding to instances or concepts of semantic resources such as ontologies and thesauri, - relations that can be identified between these concepts and instances, or between documents. Mission: Together with the members of the team and of the project, you will help draw up the specifications for these annotation needs, and will then be in charge of the various stages of developing the tools for annotating a document collection (development, testing, documentation). You will be able to build on annotation modules that already exist in the team, which may be modified and/or extended to meet the specifications. Integration into a UIMA processing pipeline currently being developed in the team may be considered. Profile: Engineer or second-year master's degree (master 2) in computer science Required skill: Good level in Java programming Desirable skills: Eclipse/RCP environment, Semantic Web, NLP Location: Université Paris 13, Villetaneuse Fixed-term contract of 9 to 12 months, starting early January 2013. CV and cover letter to be sent to
: sylvie.salotti at lipn.univ-paris13.fr

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:43:11 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 23 Oct 2012 21:43:11 +0200
Subject: Soft: ECDC-TM - A freely available translation memory in 25 languages - CORRECTION
Message-ID:

Date: Mon, 22 Oct 2012 11:11:41 +0200
From: Ralf Steinberger
Message-id: <04f301cdb035$403fca10$c0bf5e30$@jrc.ec.europa.eu>
X-url: http://www.ecdc.europa.eu/

This is a correction to the announcement sent on Friday 19 October. The email and the related ECDC-TM webpage contained wrong information regarding the languages covered and regarding the statistics on the corpus. Thanks to Raivis Skadiņš for pointing this out. I had mixed up the information of two different corpora. 25 languages seems to be more than my little brain can handle. ;-) Please accept my apologies. You will find the new summary below. The web page has now also been corrected.

Ralf

==========

ECDC-TM is a translation memory (sentences and their manually produced translations) in 25 languages. It is a multilingual parallel corpus covering 300 language pairs.

Size: Up to 3900 translation units per language; 64,000 in total.
Languages: All 300 language pairs involving the following 25 languages: Bulgarian, Czech, Danish, Dutch, English, Estonian, German, Greek, Finnish, French, Irish, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish and Swedish.
URL: http://langtech.jrc.ec.europa.eu/ECDC-TM.html
Creator: European Centre for Disease Prevention and Control (ECDC, http://www.ecdc.europa.eu/) and JRC

WHAT IS ECDC-TM

ECDC-TM was produced by professionally translating the English language web pages of the European Centre for Disease Prevention and Control (ECDC), an EU agency based in Stockholm. The results of the translation were stored in 24 bilingual translation memories. The JRC post-processed these by cleaning the data and by producing one alignment for all 25 languages, resulting in parallel data for 300 language pairs. The major part of the documents talks about health-related topics (anthrax, botulism, cholera, dengue fever, hepatitis, etc.), but some of the web pages also describe the organisation ECDC (e.g. its organisation, job opportunities) and its activities (e.g. epidemic intelligence, surveillance).

The ECDC Translation Memory (http://langtech.jrc.ec.europa.eu/ECDC-TM.html) is much smaller than the other multilingual resources distributed in the past by the European Commission's Joint Research Centre (JRC). Its main advantages are that (a) it covers even more languages and (b) it is based on texts from a very different domain (Public Health).

MOTIVATION FOR THIS RELEASE

The public data release is in line with the general effort of the European Commission to support multilingualism, language diversity and the re-use of Commission information.
It follows the release of the JRC-Acquis (http://langtech.jrc.ec.europa.eu/JRC-Acquis.html) parallel corpus in 2006 (over 1 billion words in 22 languages), of the DGT-TM Translation Memory (http://langtech.jrc.ec.europa.eu/DGT-TM.html) in 2007 and 2011, the multilingual named entity resource JRC-Names (http://langtech.jrc.ec.europa.eu/JRC-Names.html) in 2011, the multi-label classification software JRC EuroVoc Indexer JEX (http://langtech.jrc.ec.europa.eu/Eurovoc.html) in 22 languages and further smaller multilingual resources. See http://langtech.jrc.ec.europa.eu/JRC_Resources.html for more information on these resources.

WHAT ECDC-TM CAN BE USED FOR

ECDC-TM can be fed into translation memory software to support human translators in their work. As it is a large parallel corpus in electronic form, it can furthermore be used by specialists in computational linguistics to train statistical machine translation software, to generate multilingual dictionaries, to train and test multilingual information extraction software, and more.

WHAT NEXT?

The JRC and collaborating services of the European Commission plan to release further large-scale linguistic resources in the near future. These include another 25-language translation memory and a paragraph-aligned full-text parallel corpus in 23 languages.

Ralf Steinberger & Mohamed Ebrahim
European Commission - Joint Research Centre (JRC)
21027 Ispra (VA), Italy
URL - Applications: http://emm.newsbrief.eu/overview.html
URL -
The science behind them: http://langtech.jrc.ec.europa.eu/

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:49:06 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 23 Oct 2012 21:49:06 +0200
Subject: Appel: WSLST 2013
Message-ID:

Date: Sat, 20 Oct 2012 21:47:55 +0200
From: "GRLMC"
Message-ID:
X-url: http://grammars.grlmc.com/wslst2013/

*********************************************************************
2013 INTERNATIONAL WINTER SCHOOL IN LANGUAGE AND SPEECH TECHNOLOGIES

WSLST 2013
(formerly International PhD School in Language and Speech Technologies)

Tarragona, Spain
January 7-11, 2013

Organized by:
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University

http://grammars.grlmc.com/wslst2013/
*********************************************************************

AIM:

WSLST 2013 offers a broad and intensive series of lectures at different levels on selected topics in language and speech technologies. The students choose their preferred courses according to their interests and background. Instructors are top names in their respective fields. The School intends to help students initiate and foster their research career.

The previous event in this series was SSLST 2012: http://grammars.grlmc.com/sslst2012/

ADDRESSED TO:

Graduate (and advanced undergraduate) students from around the world. Most appropriate degrees include: Computer Science and Linguistics.
Other students (for instance, from Mathematics, Electrical Engineering, Logic, or Cognitive Science) are welcome too. The School is appropriate also for people more advanced in their career who want to keep themselves updated on developments in the field.

There will be no overlap in the class schedule.

COURSES AND PROFESSORS:

- Simon King (U Edinburgh), Speech Synthesis [introductory/intermediate, 8 hours]
- Constantine Kotropoulos (U Thessaloniki), Pattern Recognition Problems Related to Speech [intermediate, 6 hours]
- Lori Levin (Carnegie Mellon U), The Theory behind the Resources [introductory/intermediate, 8 hours]
- Rainer Martin (U Bochum), Signal Processing for Voice Communication Devices [intermediate, 8 hours]
- German Rigau (U Basque Country, Donostia), Knowledge Resources for Semantic Processing [introductory/intermediate, 8 hours]
- Marc Swerts (Tilburg U), Facial Expressions in Human-Human and Human-Machine Interactions [introductory/intermediate, 6 hours]
- Tomoki Toda (Nara Institute of Science and Technology), Statistical Voice Conversion [introductory/advanced, 8 hours]

REGISTRATION:

It has to be done on line at http://grammars.grlmc.com/wslst2013/Registration.php

FEES:

They are variable, depending on the number of courses each student takes. The rule is: 1 hour =
- 10 euros (for payments until November 15, 2012),
- 12.50 euros (for payments between November 16 and December 11, 2012),
- 15 euros (for payments after December 11, 2012).

PAYMENT PROCEDURE:

The fees must be paid to the School's bank account:

Uno-e Bank
bank's address: Julian Camarillo 4 C, 28037 Madrid, Spain
IBAN: ES3902270001820201823142
SWIFT/BIC: UNOEESM1
account holder: C. Martin - GRLMC
account holder's address: Av. Catalunya 35, 43002 Tarragona, Spain

Please mention WSLST 2013 and your name in the subject. A receipt will be provided on site.

Remarks:

- Bank transfers should not involve any expense for the School.
- People claiming early registration will be requested to prove that the bank transfer order was carried out by the deadline.
- The organizers reserve the right to cancel a course if fewer than 10 students have signed up for it.
- Students will be refunded only when a course is cancelled because the instructor is unavailable or because too few students have registered.

People registering on site at the beginning of the School must pay in cash. For the sake of local organization, however, registering earlier is strongly recommended.

ACCOMMODATION:

Information about accommodation will be available on the website of the School.

CERTIFICATE:

Students will receive a certificate stating the courses attended, their contents, and their duration.

IMPORTANT DATES:

Announcement of the programme: October 19, 2012
Very early registration deadline: November 15, 2012
Early registration deadline: December 11, 2012
Starting of the School: January 7, 2013
End of the School: January 11, 2013

QUESTIONS AND FURTHER INFORMATION:

Lilica Voicu: florentinalilica.voicu at urv.cat

WEBSITE:

http://grammars.grlmc.com/wslst2013/

POSTAL ADDRESS:

WSLST 2013
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University
Av. Catalunya, 35
43002 Tarragona, Spain

Phone: +34-977-559543
Fax: +34-977-558386

ACKNOWLEDGEMENTS:

Diputació
de Tarragona
Universitat Rovira i Virgili

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:54:11 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 23 Oct 2012 21:54:11 +0200
Subject: Revue: Langages, issue 187
Message-ID:

Date: Mon, 22 Oct 2012 10:05:33 +0200
From: Catherine SCHNEDECKER
Message-ID:
X-url: http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&ida

* Langages, N°187 (3/2012) * L'analyse de corpus face à l'hétérogénéité des données * September 2012 *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9824

Contents of no. 187

* GARRIC Nathalie, LONGHI Julien *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9823
L'analyse de corpus face à l'hétérogénéité des données : d'une difficulté méthodologique à une nécessité épistémologique

* PINCEMIN Bénédicte *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9822
Hétérogénéité
des corpus et textométrie

* BLASCO-DULBECCO Mylène, CAPPEAU Paul *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9821
Identifier et caractériser un genre : l'exemple des interviews politiques

* LONGHI Julien *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9820
Types de discours, formes textuelles et normes sémantiques : expression et doxa dans un corpus de données hétérogènes

* CISLARU Georgeta, SITRI Frédérique *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9819
De l'émergence à l'impact social des discours : hétérogénéités d'un corpus

* GARRIC Nathalie *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9818
Construire et maîtriser l'hétérogénéité par la variation des données, des corpus et des méthodes

* RATINAUD Pierre, MARCHAND Pascal *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9817
Recherche improbable d'une homogène diversité : le débat sur l'identité nationale

* ANTOINE Jean-Yves, VILLANEAU Jeanne, GOULIAN Jérôme *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9816
Influence du genre applicatif sur la réalisation des extractions en dialogue oral : constantes et variations

* LEFEUVRE Anaïs, VINOGRADOVA Natalia *
http://www.armand-colin.com/revues_article_info.php?idr=20&idnum=424131&idart=9815
Hétérogénéité
et extraction d'information factuelle dans un corpus de récits de voyage

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 19:58:54 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 23 Oct 2012 21:58:54 +0200
Subject: Seminaire: Karin Harbusch, talk on ellipsis, 27 October 2012, Paris
Message-ID:

Date: Tue, 23 Oct 2012 15:00:20 +0200
From: Anne Abeillé
Message-Id:
X-url: http://ellipse.linguist.univ-paris-diderot.fr/

As part of the project Approches typologiques des constructions elliptiques (Fédération TUL du CNRS)
http://ellipse.linguist.univ-paris-diderot.fr/

we are pleased to host, on Friday 27 October from 10:00 to 12:00,
at 175 rue du Chevaleret, 75013 Paris
4th floor, aquarium

the following talk:

ELLEIPO: Generating Clausal Coordinative Ellipsis in Dutch, Estonian, German, and Hungarian

Karin Harbusch
Computer Science Dept., University of Koblenz-Landau, GERMANY
harbusch at uni-koblenz.de

Abstract

In our talk, we present target-language independent syntactic rules to generate Clausal Coordinate Ellipsis (CCE), i.e. Gapping (including Long-Distance Gapping, Subgapping and Stripping), Forward and Backward Conjunction Reduction (FCR and BCR) and Subject Gap with Finite/Fronted Verb (SGF). The CCE rules, which are inspired by the psycholinguistic theory by Kempen (2009), have been implemented in Java (cf.
system ELLEIPO) so that tests for a new target language require setting up syntactic trees to be read in by the system. All CCE paraphrases for any input sentence (provided as output by the ELLEIPO system) have to be inspected by a native speaker with respect to overgeneration, i.e. does the list contain any ungrammatical sentence, and undergeneration, i.e. does the list lack any CCE paraphrase that is licensed in the currently investigated target language. We show the implementation for Dutch and German, two Indo-European languages, and for Estonian and Hungarian, two Finno-Ugric languages.

With respect to incremental production of ellipsis, we present results from four different corpus studies. After an account of our data extraction method, we will present a detailed overview of the incidence of four types of clausal coordinate ellipsis in the spoken and written treebanks in Dutch (ALPINO and CGN 2.0) and German (TIGER and VERBMOBIL). Based on the deviating numbers for the individual CCE types, we propose a theoretical explanation of the data pattern based on the assumption that during spontaneous speaking the scope ("window") of online grammatical planning is basically restricted to one (finite) clause. In producing clausal coordinations, checking the possibility of "forward" ellipsis (Gapping, Forward Conjunction Reduction) requires comparison of form and meaning of two adjacent clauses. As this overtaxes the online planning scope of the sentence production system, speakers prefer to plan the form of second or later conjoined clauses in isolation, that is, without taking the shape of preceding clauses into account and thereby eliminating elliptical options. RNR, the "backward" version of coordinate ellipsis, is more severely affected in spoken language because it requires the simultaneous presence within the planning window of (nearly) two complete clauses.
Indeed, whilst RNR is readily observable in written texts, in spoken language it is a rare phenomenon manifesting itself only in very short clauses.

From thierry.hamon at UNIV-PARIS13.FR Tue Oct 23 20:06:01 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 23 Oct 2012 22:06:01 +0200
Subject: Conf: WACAI, 15 and 16 November 2012, Grenoble
Message-ID:

Date: Tue, 23 Oct 2012 18:28:25 +0200
From: Alexandre Pauchet
Message-ID: <5086C5A9.606 at insa-rouen.fr>
X-url: http://wacai2012.imag.fr/

**********************************************************************
Call for participation
Reduced-rate registration until *2 November*

Workshop Affects, Compagnons Artificiels, Interaction (WACAI'12)
Grenoble, 15 and 16 November 2012
http://wacai2012.imag.fr/
**********************************************************************

The aim of the *WACAI 2012* workshop is to bring together ongoing research and development on the themes of /Artificial Companions (Embodied Conversational Agents, "Agents Conversationnels Animés" or ACA, and interactive robots)/ and /Affective Computing/, so that researchers from the scientific communities concerned can present their models, tools, technologies and research results.

The workshop will consist of:

- overview talks:
  * Michel Dubois: "Robotique et acceptabilité sociale : limites des modèles et perspectives."
  * Anna Tcherkassof: "Quel modèle psychologique pour l'interaction affective ?
Une théorie perceptive des émotions"
  * Rachid Alami: "Processus Décisionnels pour robots interactifs"
- more than 20 papers selected by the programme committee
- poster presentations and demonstrations

**********************************************************************

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:16:04 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 20:16:04 +0200
Subject: Job: NLP for Japanese, paid hourly work, LATTICE
Message-ID:

Date: Wed, 24 Oct 2012 08:32:50 +0200
From: Thierry Poibeau
Message-Id:

Fixed-term contract (paid hourly work)

For a one-off project, Lattice (http://www.lattice.cnrs.fr/) is looking for a Master's-level student with training in natural language processing (NLP) and a good command of Japanese.

The task is to annotate Japanese texts automatically using existing software (e.g. Cabocha, http://code.google.com/p/cabocha/) in order to extract, in particular, named entities and terms. The student must be able to work independently to handle the texts, clean the data (with a scripting language), run them through the chosen parser, and then clean the output (keeping only the text, with tags marking the elements of interest). It may be necessary to write a few additional programs to annotate elements that are important for the task but are not recognized in the texts (which will be provided).
The work can start very soon and must in any case be carried out during November (the workload is estimated at between 50 and 100 hours, but this is only a rough guide). It is possible to work from home once the task and the objectives have been agreed. The contract nevertheless involves a few meetings in Paris.

To apply: contact Thierry Poibeau by email as soon as possible, and in any case before 6 November (firstname.lastname at ens.fr). Send a CV and give (informally in the email, if you wish) any details relevant to the task (e.g. use of a Japanese annotation tool in a project).

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:18:36 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 20:18:36 +0200
Subject: Soft: TextCoop for discourse analysis
Message-ID:

date: Wed, 24 Oct 2012 14:07:23 +0200
from: "Patrick Saint-Dizier"
message-id: <4174-5087da00-5-256ba140 at 228118700>

Discourse analysis in logic

TextCoop is a platform for analysing a variety of discourse structures (generic, or specific to a genre or a domain). TextCoop derives from logic grammars and makes it possible to bring knowledge and reasoning into the analysis. Dislog, the language used to describe discourse structures, can encode rules written by hand as well as rules produced by a machine learning mechanism.
An archive and user documentation are now available free of charge on request. The archive contains some lexical resources as well as examples to get started.

contact: stdizier at irit.fr

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:19:25 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 20:19:25 +0200
Subject: Appel: BioNLP Shared Task 2013, 1st announcement, Sample data release
Message-ID:

Date: Wed, 24 Oct 2012 14:10:31 +0200
From: Claire Nedellec
Message-ID: <5087DAB7.3020403 at jouy.inra.fr>
X-url: http://2013.bionlp-st.org/

(apologies for duplicate posting)

==================================
BioNLP Shared Task 2013 -- First announcement
==================================

Release of sample data sets
-------------------------------

We are pleased to announce the upcoming BioNLP Shared Task, an information extraction task open to any interested participants. The task will be held in early 2013, and a workshop on the task is planned to be co-hosted with the ACL 2013 BioNLP workshop.

Sample data are now available at the BioNLP Shared Task 2013 website: http://2013.bionlp-st.org/

Participation in the task will be open to all interested parties.

The BioNLP Shared Task series represents a community-wide trend in text-mining for biology toward fine-grained information extraction (IE).
The two previous events, the BioNLP 2009 and 2011 shared tasks, attracted wide attention, with numerous teams submitting final results. The task setup and data have since served as the basis of numerous studies and published event extraction systems and datasets.

The BioNLP Shared Task 2013 (BioNLP-ST'13) follows the general outline and goals of the previous tasks. It identifies biologically relevant extraction targets and proposes a linguistically motivated approach to event representation. The BioNLP-ST'13 tasks also cover many new hot topics in the biology domain that are close to biologists' needs.

As in previous editions, manually annotated data will be provided for training, development and evaluation of the participating extraction methods.

The six BioNLP-ST 2013 event extraction tasks are
- [GE] Genia Event Extraction for NFkB knowledge base construction
- [CG] Cancer Genetics
- [PC] Pathway Curation
- [GRO] Corpus Annotation with Gene Regulation Ontology
- [RNB] Gene Regulation Network in Bacteria
- [BB] Microorganism biotope (semantic annotation by an ontology)

Tentative schedule is as follows:
* Sample Data Release 23 October 2012
* Training Data Release 8 January 2013
* Test Data Release March 2013 (Tentative)
* Result Submission March 2013 (Tentative)
* Results Notification March 2013 (Tentative)
* Manuscript Submission April 2013 (Tentative)
* BioNLP Shared task 2013 workshop Summer 2013 (subject to extension)

Scientific Advisory Committee
* Jun'ichi Tsujii (Microsoft) - Chair
* Sophia Ananiadou (NaCTeM)
* Kevin Cohen (Univ. Colorado)
* Sung-Pil Choi (KISTI)
* Tapio Salakoski (Univ. Turku)
* Pierre Zweigenbaum (Univ. Paris-Sud, CNRS)

Organizing Committee
* Claire Nédellec (INRA) - Organizing Chair
* Robert Bossy (INRA) - Task BB and RBB chair
* Jin-Dong Kim (DBCLS) - Task GE chair
* Jung-jae Kim (NTU, Singapore) - Task GRO chair
* Tomoko Ohta (NaCTeM) - Task PC chair
* Sampo Pyysalo (NaCTeM) - Task CG chair
* Julien Jourde (INRA)
* Pontus Stenetorp (Univ.
Tokyo)
* Yue Wang (DBCLS)

Program Committee
* TBD

Contact
* Web: http://2013.bionlp-st.org/
* Mailing List: bionlp-st at bionlp-st.org

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:24:07 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 20:24:07 +0200
Subject: Appel: revue TAL - Book review (PIOTROWSKI)
Message-ID:

Date: Wed, 24 Oct 2012 16:09:49 +0200 (CEST)
From: Denis Maurel
Message-ID: <1275077246.6835049.1351087789143.JavaMail.root at mail10>

Call: revue TAL - Book review (PIOTROWSKI)

The journal TAL regularly publishes book reviews. We are looking for a colleague willing to read the book:

"Michael PIOTROWSKI. Natural Language Processing for Historical Texts. Morgan & Claypool publishers. 2012. 145 pages"

and to write a review of it for TAL (the book will be sent free of charge in return).

The review must be written in French (three pages maximum, in the journal's format) and sent by the end of January 2013.

Other reviews are possible if you have recently read a book that interested you and are willing to share your reading with the community...
Denis Maurel

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:32:43 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 20:32:43 +0200
Subject: Job: 2 NLP engineers, LELIE
Message-ID:

date: Wed, 24 Oct 2012 16:28:16 +0200
from: "Patrick Saint-Dizier"
message-id: <4e1c-5087fb00-9-9cef710 at 196554100>
X-url: http://www.irit.fr/recherches/ILPL/lelie/accueil.html

LELIE - 2 NLP engineer positions

As part of the ANR project LELIE:
http://www.irit.fr/recherches/ILPL/lelie/accueil.html

A start-up is being created. It is headed by an engineer who graduated from a grande école. Several large corporate clients are interested in the product that has been developed. Besides the work on risk analysis, a fast-growing field, our work also covers analysis of the quality of requirements specifications.

In this context, we are looking for one or two NLP engineers:
- preferably holding a PhD in NLP, in applied linguistics, or possibly in artificial intelligence,
- good knowledge of syntax and discourse and of analysis technologies,
- good knowledge of English and, if possible, of another language,
- very good rapport with clients and users,
- location: preferably the Toulouse or Paris area, but teleworking is also possible.
The work will be fairly varied: extending the current system and taking part in the development of a product line, analysing industrial needs, promoting the product, representing domain knowledge, and analysing and setting up industrial deployments.

Collaborations with French and foreign companies are being set up.

Hiring: early 2013.

Send a CV and cover letter before 10 November to: stdizier at irit.fr, who will forward them.

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 18:49:33 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 20:49:33 +0200
Subject: Appel: Journees de Rochebrune 2013, La preuve et ses moyens
Message-ID:

Date: Wed, 24 Oct 2012 16:45:58 +0200
From: Thomas Louail
Message-Id: <8A60D591-6309-4BD3-B1E1-1B8AAD870C96 at irit.fr>
X-url: http://s4.csregistry.org/rochebrune

[With apologies for any duplicates.]

*Final call for papers*

*Journées de Rochebrune* 2013

"La preuve et ses moyens"

*13-19 January 2013*

*Timeline*

- Deadline for sending the title and abstract of the proposed paper: *4 November* 2012
- Deadline for sending the proposed paper (4 to 12 pages): *11 November* 2012
- Notification: *25 November* 2012
- Registration deadline for
Rochebrune: *15 December* 2012
- Journées de Rochebrune: *13 to 19 January* 2013
- URL of the event web page: http://s4.csregistry.org/tiki-index.php?page=rochebrune

Dear colleagues,

We are pleased to send you the call for papers for the next Journées de Rochebrune, which will take place from 13 to 19 *January* 2013.

"The notion of proof has evolved considerably over the history of science and has always been understood differently in the formal and axiomatic disciplines, the experimental sciences, and the humanities and social sciences. In the disciplines now grouped under the label 'complexity sciences', which engage with one another on the basis of a systemic analysis of phenomena, it refers to a vast corpus of means of proof. These means are competing forms of argument, heterogeneous methods of investigation and reasoning, and kinds of scientific expertise that all share the ambition of producing 'solid statements' through successive gradations of discourse (speculation, plausibility, truth)..."

The full call, the timeline, and a description of the principle and workings of the event are attached to this email. They are also available online at: http://s4.csregistry.org/rochebrune

Best regards,
For the organising committee,

------------------------------------------------------------------------
Thomas Louail
Postdoctoral associate, STAE Foundation
UMR 5505 IRIT, S.M.A.C.
team
Manufacture des Tabacs, Bâtiment E 21, allée de Brienne - 31000 Toulouse
Phone: +33 (0)682 291 952
Web: http://irit.academia.edu/ThomasLouail

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 19:03:04 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 21:03:04 +0200
Subject: Job: Professorship in semantic analysis of social media, Université de Montréal
Message-ID:

Date: Wed, 24 Oct 2012 15:32:38 -0400
From: Philippe Langlais
Message-Id:
X-url: http://www.nserc-crsng.gc.ca/Professors-Professeurs/CFS-PCP/IRC-PCI_fra.asp

The Département d'informatique et de recherche opérationnelle of the Université de Montréal invites applications for a full-time professorship, at the rank of assistant or associate professor, in semantic analysis of social media. The appointment is conditional on the candidate obtaining a chair under the Industrial Research Chairs program of NSERC (CRSNG). The application must be accompanied by forms 100, 101 and 183A, required by the funding agency and available at: http://www.nserc-crsng.gc.ca/Professors-Professeurs/CFS-PCP/IRC-PCI_fra.asp

Duties
The successful candidate will be expected to teach at the undergraduate, master's and doctoral levels, supervise graduate students, pursue research, publication and outreach activities, and contribute to the activities of the institution.
Requirements
- Doctorate in computer science or a related field.
- The candidate must obtain an industrial research chair.
- Industrial experience.
- The candidate will be expected to work in natural language processing and, more specifically, in semantic analysis of social media.
- Teaching experience desirable.
- Publication record.
- Proficiency in French (http://secretariatgeneral.umontreal.ca/fileadmin/user_upload/secretariat/doc_officiels/reglements/administration/adm10-34_politique-linguistique.pdf)

Salary
The Université de Montréal offers a competitive salary along with a full range of benefits.

Starting date
From 1 June 2013.

Application deadline
The application, consisting of a cover letter, a curriculum vitae, and a copy of recent publications or research work, must reach the address below no later than 31 December 2012. Candidates must also ask three people to send a letter of recommendation to the department chair at the following address:

Patrice Marcotte, directeur
Département d'informatique et de recherche opérationnelle
Université de Montréal
C. P. 6128, succursale Centre-ville
Montréal (Québec) H3C 3J7
CANADA

Those interested will find information about the Département d'informatique et de recherche opérationnelle on its website: www.iro.umontreal.ca.
From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 19:05:50 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 21:05:50 +0200
Subject: Seminar: François Rastier, INALCO, 15, 22, 29 November and 6 December 2012
Message-ID:

Date: Thu, 25 Oct 2012 09:32:42 +0200
From: Mathieu Valette
Message-Id:

The Equipe de Recherche Textes, Informatique, Multilinguisme (ERTIM) at INALCO is pleased to invite you to the sessions of its research seminar led by François Rastier: "Description sémantique et contexte culturel" (Semantic description and cultural context).

The seminar will take place on Thursdays 15, 22 and 29 November and 6 December 2012, from 5 pm to 7 pm, at INALCO-Recherche, 2 rue de Lille, 75007 Paris (Salle des marbres, door 124, first floor, staircase B in the courtyard).

Suggested reading:
- F. Rastier, La mesure et le grain. Sémantique de corpus, Champion, Paris, 2011.
- Revue Texto !
Textes & Cultures (http://www.revue-texto.net)

From thierry.hamon at UNIV-PARIS13.FR Fri Oct 26 19:08:25 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Fri, 26 Oct 2012 21:08:25 +0200
Subject: Job: Postdoctoral position in German Computational Linguistics at Voxygen (Rennes, France)
Message-ID:

Date: Thu, 25 Oct 2012 11:55:17 +0200
From: Chiara Mazza
Message-ID: <50890C85.5090208 at voxygen.fr>
X-url: http://www.voxygen.fr/

Postdoctoral position in German Computational Linguistics

Context
-------
Voxygen is a young and innovative company, created in September 2011, made up of experts in speech synthesis and linguistics, and located in the Lannion and Rennes areas, France. Voxygen offers speech synthesis products and services, mainly for the European, Arabic, and African markets, and is particularly experienced in creating expressive voices for industrial and entertainment purposes. For more information on Voxygen: http://www.voxygen.fr/

The speech synthesis solution is widely deployed in voice servers and mobile applications and runs in a wide range of environments: PCs and servers (Windows, Linux, MacOS) and mobile devices (Android, Windows Mobile, iPhone OS, Symbian). Voxygen is currently working on adding German to its catalog.

Task description
-----------------
Speech synthesis is a cross-disciplinary area requiring computer science, linguistics and speech processing skills.
The linguistic processing is a central component that analyzes text to determine its part-of-speech tags, pronunciation and intonation. This requires the candidate to have knowledge of phonetics, phonology, morphology, syntax and prosody. The candidate will be in charge of implementing the linguistic processing for German by using or adapting existing language processing tools, and by programming new ones in C/C++ or in scripting languages such as Python and Perl. He or she should bring innovative ideas to the technical team and may interact with the marketing and sales teams on needs concerning the German language.

Keywords
---------
German Language, Computational Linguistics, Natural Language Processing (NLP), Speech Synthesis

Profile of the candidate
-------------------------
* Native German speaker
* PhD in computational linguistics or a related field
* Skills in computer programming (C/C++, Perl, Python)
* Experience in natural language processing
* The working languages are English and French
* Enjoys working as part of a team

Duration & location
-----------------------------------
12 months with a possible extension of 6 months, starting as soon as possible.
The position will be located in the Rennes area (France).

Please email your CV and cover letter to Paul BAGSHAW (jobs at voxygen.fr)

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:05:53 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:05:53 +0100
Subject: Job: Named-entity annotator, ELDA
Message-ID:

Date: Mon, 29 Oct 2012 11:55:31 +0100
From: Leixa Jérémy
Message-ID: <508E60A3.5040704 at elda.org>
X-url: http://www.quaero.org

Job title
---------
Annotator

Context
-------
As part of the QUAERO research project (http://www.quaero.org), ELDA (Evaluations and Language resources Distribution Agency) is recruiting several people to help create a reference dataset for evaluating the performance of named-entity recognition software on text corpora.

Named entities (NEs) are textual objects (that is, a word or a group of words) that can be categorised into predefined classes (persons, organisation names, place names, quantities, distances, dates, etc.). Named-entity recognition is an essential subtask of systems that extract information from document collections.

ELDA (www.elda.org)
-------------------
Our main activity is the distribution and production of language resources (terminology databases, voice recordings, electronic dictionaries, ...)
and the evaluation of language technologies.

Location
--------
This assignment will be based at ELDA's offices in Paris (13th arrondissement).

Assignment
----------
Named-entity annotation of text documents. The task consists of detecting named entities in the text and labelling them according to a list of predefined categories. ELDA provides training in the annotation task and in the use of the software.

Profile sought
--------------
Native language: French. A licence or maîtrise degree in language sciences or in computer science (specialising in natural language processing or in information and document science) is a plus. Experience with the Linux environment is a plus.

Duration
--------
Half-time over 2 months. Start: as soon as possible.

Contact
-------
Jérémy Leixa
Email: leixa at elda.org
ELDA
55-57, rue Brillat Savarin
75013 Paris
http://www.elda.org

From thierry.hamon at UNIV-PARIS13.FR Wed Oct 31 12:14:18 2012
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Wed, 31 Oct 2012 13:14:18 +0100
Subject: Meeting: Consortium Corpus Ecrits, 23 and 24 November 2012, Paris
Message-ID:

[Registration form available on request from secretariat.ilf at ling.cnrs.fr or secretariat-general at ling.cnrs.fr - TH]

Date: Mon, 29 Oct 2012 17:19:04 +0100
From: Secretariat General
Message-ID: <508EAC78.3060306 at ling.cnrs.fr>
X-url: http://www.typologie.cnrs.fr
X-url: http://www.ilf.cnrs.fr

Dear colleagues,

The consortium "
Corpus écrits" (Corpus-IR) is holding its annual plenary meeting on _*Friday 23 November 2012*_, from 9:30 am to 6 pm, at the Campus des Cordeliers (15, rue de l'Ecole de Médecine, 75006 Paris).

The meeting will be devoted to *presenting the activities of the consortium's working groups*:
1. Use of corpora and authors' or publishers' rights (legal aspects)
2. Corpora of earlier states of the language (digitisation, encoding)
3. Digitisation (OCR, keying), correction
4. Plurality of writing systems
5. Multilingual corpora (parallel, comparable...)
6. Collaborative corpus description - metadata
7. Corpora of modern writing, taking account of new modes of communication (SMS, mail, blog, etc.)
8. Higher-level annotation: syntax, semantics, reference (collaborative annotation)
9. Surface annotation: lexical segmentation, morphosyntactic description, chunking, lemmatisation, named entities, etc.
10. Corpus exploration (methods, tools)
11. Scientific quality and accessibility of corpora (the place of corpora in evaluating the scientific output of research units)

Each presentation will be followed by about twenty minutes of discussion.

This day, which is very important for advancing our shared thinking on written corpora, will be followed the next day, _*Saturday 24 November*_, by an *information and discussion day on the legal aspects of corpus ownership and archiving*, the programme of which will be announced very shortly.

*The steering committee strongly encourages everyone interested in these two days to take part, whether or not they belong to a working group.*

Participation in these days is free of charge, but *registration is mandatory*. Attached you will find the *registration form*, to be
returned *no later than 12 November 2012* to the secretariat of the Institut de Linguistique Française, the institution that manages the Consortium, which will contact you if necessary to arrange your travel. The consortium will contribute to funding the travel of active participants in the working groups.

We look forward to welcoming many of you on *23 and 24 November*.

*For the steering committee of the Consortium "Corpus écrits"*
*Franck Neveu, Director of the ILF*

*The steering committee of the Consortium "Corpus écrits":*
*Franck Neveu* for the ILF, FR 2393 -- *Consortium lead*
*Sylvie Archaimbault* (alternate: *Bernard Colombat*) for HTL -- UMR 7597 - Université Denis Diderot - Paris 7
*Benoit Sagot* for ALPAGE -- INRIA - Université Denis Diderot - Paris 7
*Serge Heiden* for ICAR - UMR 5191 - Université Lumière Lyon 2
*Damon Mayaffre* (alternate: *Mahé Ben Hamed*) for BCL - UMR 6039 - Université Nice Sophia Antipolis
*Jean-Marie Pierrel* for ATILF - UMR 7118 -- Nancy - Université
*Clément Plancq* (alternate: *Olivier Bonami*) for LLF - UMR 7110 - Université Denis Diderot - Paris 7
*Céline Poudat* for LDI - UMR 7187 -- Université de Paris 13
*Catherine Schnedecker* (alternate: *Amalia Todirascu*) for LILPA -- EA 1339 -- Université de Strasbourg
*Agnès Tutin* (alternate: *Marie-Paule Jacques*) for LIDILEM -- EA 609 -- Université Grenoble 3

Véronique BRISSET-FONTANA
Secretary General
Fédérations de Linguistique
FR 2559 - www.typologie.cnrs.fr
FR 2393 - www.ilf.cnrs.fr
Tel. 01 43 13 56 45

-------------------------------------------------------------------------
Message diffuse par la liste Langage Naturel