In the big data paradigm, a potentially large number of data sources and data assets are considered for analytics. One needs to discover, integrate, and analyze large volumes of diverse data quickly. Finding relevant data for analytics is an important data discovery problem, and data diversity makes it difficult. The diversity can stem from the data model; the type of data (structured, semi-structured, or unstructured); enterprise data vs. open public data; the integration of social media data; etc. One also needs to handle data quality and data governance issues. In this workshop we invite demonstrations, from various industry domains, of techniques for identifying relevant sets of data, finding different kinds of relationships between structured, semi-structured, and unstructured data, curating the data for further analysis, integrating data using various join, union, and merge techniques, validating the integrated data, and analyzing it.

Topics of interest include (but are not limited to):
- Cleaning big data
- Integration of big heterogeneous data
- Metadata extraction
- Automated rule generation
- Curating data
- Data discovery
- Provisioning and data lineage

We welcome good demonstrations, including demonstrations of previously accepted papers/demos, for this workshop. Authors need to send a manuscript describing the demo in up to 2 pages (2-column format), inclusive of all references and figures. Manuscripts must be written in English and formatted according to the IEEE proceedings templates. Please see the workshop website for more details.

As part of our constant development, MyScript, formerly Vision Objects (Nantes, France), is looking for a: *Natural Language Processing (NLP) Engineer*
Within the R&D department "MyScript Labs", you will coordinate the development of new languages and put forward proposals for improving existing ones.
Your dual expertise in computer science and linguistics, combined with your R&D experience, enables you to carry out the following tasks successfully:
* Creation and maintenance of linguistic resources and management of their integration into the handwriting recognition engines.
* Research and development on language models (statistical, syntactic or semantic) and their applications to human-machine interfaces.
* Participation in the process of collecting handwriting samples.
* In liaison with the Support team, study and analysis of customer use cases.
Interested candidates who will be attending TALN are invited to come to the MyScript booth.
*Profile*
With a higher-education degree (engineering degree, Master 2 or PhD), you have at least 3 years of experience in NLP or in a related field (artificial intelligence, pattern recognition, statistical learning...). Your command of at least one programming language used in NLP (for example Perl or Python) makes you fully autonomous on all technical tasks. Rigorous, dynamic, determined and easy to get along with, you will quickly fit into the teams and demonstrate the leadership and expertise needed for your mission to succeed. Fluent English is mandatory. The following would be considered a plus: command of C, C++ or Java, command of scripting tools (bash, Unix/Linux commands), experience in automating/industrializing NLP processing pipelines, knowledge of one or more foreign languages. In an international context, within a fast-growing company, you wish to join a dynamic team working on high-technology projects. At MyScript, you will be able to identify the direct and concrete applications of your work, and you will join a human-scale organization that values creativity, initiative, experience sharing and conviviality.
With more than 90% of its revenue generated internationally and more than 100 million users worldwide, MyScript is a software publisher and world leader in the market for human-machine interfaces based on handwriting recognition. Available in more than 85 languages, its products address in particular the markets of mobility (text input, note taking, ...), education (learning handwriting, mathematics, geometry, ...), enterprise (note taking and form processing), and automotive (text input from a touch pad, interaction with GPS). The core of its technology is distributed as a software development kit (SDK) or as applications. MyScript's recognition engine regularly ranks among the top places in international scientific competitions. Position based in Nantes. Contact: job at
------------------------------------------------------------------------
As part of our constant development, MyScript (Nantes, France) is looking for a: *Natural Language Processing (NLP) Engineer*
Within the R&D department "MyScript Labs", you will coordinate the development of new languages and be proactive about improving current processes. Drawing on your dual Computer Science/Linguistics background, you will complete the following tasks:
* Creation and maintenance of language resources and supervision of their integration into the handwriting recognition engine.
* Research and development on language models (statistical, syntactic or semantic), and their application to human-machine interfaces.
* Participation in the process of collecting handwriting samples.
* In sync with the Support team, study and analysis of customer use cases.
*Profile*
Holding a Master's degree or a PhD, you have extensive experience in NLP or in a related field (artificial intelligence, pattern recognition, machine learning...).
Your hands-on experience with at least one programming language used in NLP (e.g. Perl or Python) gives you autonomy in completing all your technical tasks. Rigorous, dynamic, engaged and team-oriented, you will demonstrate the leadership and the expertise required for the success of your mission. Fluent English is mandatory. Any of the following would be a plus: knowledge of C, C++ or Java, good command of scripting tools (bash, Unix/Linux tools), experience with automation of NLP processing chains, knowledge of French, fluency in other languages. In an international context, in a fast-growing company, you want to join a dynamic team and work on high-tech projects. Within MyScript, you can identify the direct and concrete applications of your work and join a human-scale organization that values creativity, initiative and experience sharing. MyScript is the leading worldwide provider of handwriting recognition technology, with more than 100 million users. Available in more than 85 languages, MyScript covers a broad range of market applications, including Mobility (smartphones, tablets), Education (interactive digital whiteboard), Enterprise (notes management and form processing) and Automotive (GPS steering, for example). The heart of its technology is released as a software development kit (SDK) or in the form of applications. The recognition engine is consistently ranked among the top places in international scientific competitions.

It is therefore important for us to get to know this community ever better, as well as its expectations of the association. That is why we would like each of you to take a few minutes to answer the questionnaire we have set up for this purpose. Thank you in advance for your answers. Best regards.

You can now read the following texts; newly posted texts are marked with **.
*Research section*
- Mickaël Roy: Sentiment de présence et réalité virtuelle pour les langues – Une étude de l'émergence de la présence et de son influence sur la compréhension de l'oral en allemand langue étrangère
*Practice and research section*
** - Claire Chaplier and Élisabeth Crosnier: Dimension et autonomisation psycho-affectives dans deux dispositifs hybrides – Études de cas en master 2
*Book reviews section*
- Emmanuelle Artault Duchiron, Monique Marneffe and Christian Ollivier: review of Vers l'intégration des TIC dans l'enseignement des langues by Nicolas Guichon
** - Annick Rivens Mompean: review of Didactique des langues et technologies – De l'EAO aux réseaux sociaux by Muriel Grosbois
*Seminar: Numérique et langues*
- Isabelle Salengros-Iguenane: Le séminaire numérique et langues – Vision d'ensemble
- Françoise Demaizière and Muriel Grosbois: Numérique et enseignement-apprentissage des langues en Lansad – Quand, comment, pourquoi?
** - Isabelle Salengros-Iguenane: Internet pour une approche culturelle
- Laurence Vincent-Durroux and Cécile Poussard: Conception et utilisation d'un logiciel pédagogique, l'exemple de Macao
** - Eva Schaeffer-Lacroix: Utiliser des corpus numériques avec un public Lansad
The seminar videos are available on the UM3 account on Canal-U.
Proposals should be sent to Daniel Roulland (daniel.roulland at The next issue of the journal will appear at the end of July, on a new site after integration into the portal Kind regards, Gilles Col, Editorial Director of Corela
-------------------------------------------------------------------------
Message distributed by the Langage Naturel list
Information, subscription :
English version :
Archives :
The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership :
ATALA accepts no responsibility for the content of messages distributed on the LN list
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Jul 1 19:45:47 2014
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 1 Jul 2014 21:45:47 +0200
Subject: Call: Issues, methods and tools for improving the writing of technical texts
Message-ID:
date: Tue, 01 Jul 2014 11:11:37 +0200
from: "Patrick Saint-Dizier"
message-id: <626e-53b27b80-5-63e43800 at 89187629>

First announcement: Seminar-Workshop
Issues, methods and tools for improving the writing of technical texts
IRIT, Université Paul Sabatier, Toulouse, 13-14 November 2014

Objectives
Technical writing is a rapidly expanding sector owing, among other things, to the complexity of marketed products and industrial processes, growing safety requirements, and the development of specification approaches (requirements, business rules). The tasks assigned to technical writers are becoming increasingly heavy. They include greater attention to interactions with the business side and with operators, as well as regulatory constraints and substantially higher demands on writing quality.
This writing quality is different in nature from the checks offered by conventional text editors and requires a specific approach. The aim of this seminar is to bring together the various technical writing professions, researchers, teachers, and the industrial players who develop technical writing support tools, in order to deepen mutual knowledge of the writing professions and of the assistance that advanced systems in language processing, artificial intelligence and cognitive ergonomics can offer. The seminar will consist of presentations, demonstrations and case studies, with the aim of fostering links between professions and developing new synergies.

Topics (non-exhaustive)
- Technical writing practices (in linguistics, ergonomics, psycholinguistics), protocols for analyzing these practices
- The technical document: epistemological, functional, linguistic and conceptual aspects. The technical document, its evolution and its challenges, including the multimedia dimension
- Theories and recommendations for technical writing (e.g. minimalism, the various forms of controlled language (e.g. by domain, by task, by type of technical document)), professional standards
- The expectations and needs of technical writing, how the writing room operates; writing and document life cycles, the writer-operator relationship (e.g. REX, "retour d'expérience", i.e. experience feedback)
- Technical writing in a language foreign to the writer (English)
- Technical writing and industrial risk management: mathematical models, psychological models, etc.
- The challenges of analyzing document coherence and cohesion
- Technical writing support systems at the language level (Rat-RQA, Rubric, Lelie, etc.) and at the functional level (Scenari, XML-based platforms)
- Methods for error correction: strategies and processes, linguistic methods, capitalization of corrections, correction memory
- System demonstrations

Participation: free of charge. Register before 15 October. To give a talk, submit a one-page abstract in Word, free format, before 15 September. Notification on 30 September. Industrial authors of writing solutions and teams of technical writers are encouraged to take part.

This interdisciplinary initiative is a response to the growing popularity of Digital Humanities and an increased tendency to apply computer techniques to support and facilitate research in the Humanities. Nowadays, thanks to increasing efforts to digitize and open up historical sources, the science of History can greatly benefit from advances in Computer and Information Sciences, which deal with processing, organizing and making sense of data and information. As such, new Computer Science techniques can be applied to verify and validate historical assumptions based on text reasoning, image interpretation or memory understanding. Our objective is to give the two research communities a place to meet, exchange ideas and facilitate discussion. We hope the workshop will result in a survey of current problems and potential solutions, with a particular focus on exploring opportunities for collaboration and interaction among researchers working on various subareas within Computer Science and History Sciences.
The main topics of the workshop are supporting historical research and analysis through the application of Computer Science theories or technologies, analyzing and making use of historical texts, recreating past courses of action, analyzing collective memories, visualizing historical data, providing efficient access to the large wealth of accumulated historical knowledge, and so on. Topics of expected paper submissions include (but are not limited to):
- Natural language processing and text analytics applied to historical documents
- Analysis of longitudinal document collections
- Search and retrieval in document archives and historical collections, associative search
- Causal relationship discovery based on historical resources
- Named entity recognition and disambiguation
- Entity relationship extraction, detecting and resolving historical references in text
- Finding analogical entities over time
- Computational linguistics for old texts
- Analysis of language change over time
- Digitizing and archiving
- Modeling evolution of entities and relationships over time
- Automatic multimedia document dating
- Applications of Artificial Intelligence techniques to History
- Simulating and recreating the past, social relations, motivations, figurations
- Handling uncertain and fragmentary text and image data
- Automatic biography generation
- Mining Wikipedia for historical data
- OCR and transcription of old texts
- Effective interfaces for searching, browsing or visualizing historical data collections
- Studies on collective memory
- Studying and modeling forgetting and remembering processes
- Estimating credibility of historical findings
- Probing the limits of Histoinformatics
- Epistemologies in the Humanities and Computer Science

Full paper submissions are limited to 10 pages, while short paper submissions should be less than 5 pages. Submissions should be sent in English, in PDF, via the submission website.
They should be formatted according to the Springer LNCS paper formatting guidelines. They must be original and must not have been submitted for publication elsewhere. Submissions will be evaluated by at least three different reviewers who come from Computer Science and History Science backgrounds. The accepted papers will be published by Springer in Lecture Notes in Computer Science (LNCS). See the website for more details.

---------------------
---Important dates---
---------------------
- Paper submission deadline: September 1, 2014 (23:59 Hawaii Standard Time)
- Notification of acceptance: September 25, 2014
- Camera ready copy deadline: October 1, 2014 (23:59 Hawaii Standard Time)
- Workshop date: Nov 10, 2014

--------------------------
---Organizing Committee---
--------------------------
- Adam Jatowt (Kyoto University, Japan)
- Gaël Dias (Normandie University, France)
- Marten Düring (Centre for European Studies, Luxemburg)
- Antal van den Bosch (Radboud University Nijmegen, The Netherlands)

--------------------------
---Scientific Committee---
--------------------------
- Robert Allen (Yonsei University, South Korea)
- Frederick Clavert (Paris Sorbonne University, France)
- Antoine Doucet (Normandie University, France)
- Roger Evans (University of Brighton, United Kingdom)
- Christian Gudehus (University of Flensburg, Germany)
- Pedro Rangel Henriques (Minho University, Portugal)
- Pim Huijnen (Utrecht University, The Netherlands)
- Nattiya Kanhabua (LS3 Research Center, Germany)
- Tom Kenter (University of Amsterdam, The Netherlands)
- Mike Kestemont (University of Antwerp, Belgium)
- Günter Mühlberger (University of Innsbruck, Austria)
- Andrea Nanetti (Nanyang Technological University, Singapore)
- Daan Odijk (University of Amsterdam, The Netherlands)
- Marc Spaniol (Max Planck Institute for Informatics, Germany)
- Shigeo Sugimoto (University of Tsukuba, Japan)
- Nina Tahmasebi (Chalmers University of Technology, Sweden)
- Lars Wieneke (Centre for European Studies,
Luxemburg)
-------------------------------------------------------------------------
From thierry.hamon at UNIV-PARIS13.FR Tue Jul 1 19:42:13 2014
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 1 Jul 2014 21:42:13 +0200
Subject: Job: Post-doctoral contract, Digital humanities
Message-ID:
Date: Mon, 30 Jun 2014 18:41:14 +0200
From: Jean-Gabriel Ganascia
Message-ID: <53B1932A.7080704 at>
X-url:

*JOB TITLE: researcher in digital humanities*
Category: one-year post-doctoral contract
Start date: 1 October 2014
Reference structure: *LABEX OBVIL*
Location: *Université Paris-Sorbonne, Maison de la recherche, 28 rue Serpente, 75006 Paris*
Reporting line within the structure: *Director of LABEX OBVIL, Didier Alexandre*
Line manager (N+1): *Tutor or co-tutor of the post-doctoral research*
Working time for the position: *100%*

*Missions of the unit*
The OBVIL Laboratory of Excellence is part of the COMUE Sorbonne-Universités and brings together researchers from 7 host teams (équipes d'accueil), 2 UMRs (joint university/CNRS research units) and a cross-cutting programme of the UMS of the Maison de la Recherche of Paris-Sorbonne. It gathers faculty members and researchers from the universities Paris-Sorbonne and Pierre-et-Marie-Curie, some specializing in literature, others in cognitive science and computer science.
It intends to develop all the resources offered by computer applications and digital technology to examine French literature of the past as well as the most contemporary. It is also interested in the study of translations, transpositions and adaptations, in order to understand the phenomena of transmission and the way canons are formed. It is recruiting a recent PhD graduate who will be in charge, in the field of 19th- and 20th-century French literature, of developing research and digital tools in ontology and/or cartography and/or lexicography and/or text alignment and/or stylistics applied to literary corpora. See the labex OBVIL website, which presents these projects:

*Duties*
Take part in the digital humanities research project developed by the labex OBVIL (digitization, promotion, digital problematization):
- Starting from a French literature project and the problematization of the corpus, develop digital tools
- Contribute to the design of scholarly digital editions (EPUB)

Activities
- Within a one-year contract, carry the contract through to completion under the joint supervision of two tutors, one attached to a doctoral school of Université Paris-Sorbonne for the literary field, the other attached to a doctoral school of Université Pierre-et-Marie-Curie for the digital field.
- Provide information and training in digital humanities to Master's and doctoral students of the COMUE Sorbonne-Universités
- Contribute to the development of digital tools in close collaboration with the LABEX research engineers and the engineers and researchers of LIP6
- Take part in organizing the LABEX research seminars

Skills
- Hold a doctorate awarded with the distinction "Très Bien" in literary and/or computer science disciplines.
- Possibly have teaching experience (for example tutoring and/or teaching assistantship).
- Be able to integrate into a research team and to work as part of a team across several sites.
The application file comprises a CV, a research Master 2 diploma, an interdisciplinary scientific project, a cover letter, optionally letters of recommendation, and two letters, one from each co-supervisor.

Deadline is tomorrow, July 2nd 2014. Register for the Main Conference, 1 or 2 day Workshops and half day Tutorials!
**********************************************
Conference Programme

Monday - August 25th
09:00-10:15 Invited Speaker: Mary Harper, IARPA. Learning from 26 languages: Program Management and Science in the Babel Program
10:45-12:25 Modeling of Discourse and Dialogue I
10:45-12:25 Sentiment Analysis, Opinion Mining and Social Media I
10:45-12:25 Information Retrieval and Question Answering
10:45-12:25 Machine Learning for CL and NLP
15:45-17:25 Modeling of Discourse and Dialogue II
15:45-17:25 Sentiment Analysis, Opinion Mining and Social Media III
15:45-17:25 Semantic Processing, Distributional Semantics and Compositional Semantics I
15:45-17:25 Software, Tools

Tuesday - August 26th
09:00-10:15 Invited Speaker: Ted Gibson, MIT. Language for communication: Language as rational inference
10:45-12:25 Syntax, grammar induction, syntactic and semantic parsing I
10:45-12:25 Sentiment Analysis, Opinion Mining and Social Media III
10:45-12:25 Applications I
10:45-12:25 Modeling of Discourse and Dialogue III
15:45-17:25 Syntax, grammar induction, syntactic and semantic parsing II
15:45-17:25 Semantic Processing, Distributional Semantics and Compositional Semantics II
15:45-17:25 Applications II
15:45-17:25 Language Resources

Wednesday - August 27th
Excursion Day

Thursday - August 28th
09:00-10:15 Invited Speaker: Qun Liu, CNGL/DCU. Annotation Adaptation and Language Adaptation in NLP
10:45-12:25 IE/database linking I
10:45-12:25 Lexical Semantics and Ontologies I
10:45-12:25 Natural Language Generation and
Summarization I
10:45-12:25 Modeling of Discourse and Dialogue IV and Multimodal Processing
14:00-15:15 Semantic Processing, Distributional Semantics and Compositional Semantics III
14:00-15:15 Morphology, word segmentation, tagging and chunking I
14:00-15:15 Speech Recognition, Text-To-Speech, Spoken Language Understanding
14:00-15:15 Lesser Resourced Languages
15:45-17:25 Syntax, grammar induction, syntactic and semantic parsing III
15:45-17:25 Machine Translation I
15:45-17:25 Linguistic and Cognitive Issues in CL and NLP I
15:45-17:25 Natural Language Generation and Summarization II and Paraphrasing

Friday - August 29th
09:00-10:15 Invited Speaker: Martin Kay, XEROX. Does a Computational Linguist have to be a Linguist?
10:45-12:25 Machine Translation II
10:45-12:25 IE/database linking II
10:45-12:25 Linguistic and Cognitive Issues in CL and NLP II
10:45-12:25 Lexical Semantics and Ontologies II
14:00-15:15 Machine Translation III
14:00-15:15 Lexical Semantics and Ontologies III
14:00-15:15 IE/database linking III
14:00-15:15 Morphology, word segmentation, tagging and chunking II
15:45-17:25 Best Paper Talk and Closing

The conference committee and organisers take no responsibility for changes or inaccuracies to the conference programme. The above programme is subject to change.
*********************************************
Accommodation
Don't forget to book your accommodation when registering. Rooms on campus are limited and early booking is advisable! Just book on the registration form at the same time as your registration.
*********************************************
Ireland Inspires!

Regards,
Laurent Besacier

Last call for papers: special issue on spoken language processing for the journal TAL (Traitement Automatique des Langues)
----ENGLISH VERSION OF THIS CFP CAN BE FOUND AT THE END OF THIS MESSAGE------
Submissions possible until 15 July!
Editors: Laurent Besacier, Wolfgang Minker
Deadline: 30 June 2014 (papers may be updated until 15 July)

Oral communication remains the most natural means of conversing and interacting (with a machine or with another person). Spoken language processing (SLP) and dialogue now have many direct applications in varied domains such as (non-exhaustive list) information retrieval, natural-language interaction with mobile devices, social robotics, assistive technologies, language learning, etc. However, SLP raises specific problems tied to the very nature of the material being processed. One has to deal with more or less spontaneous speech utterances containing many paralinguistic features. For example, the presence of spoken disfluencies (repetitions, restarts, parentheticals...) reduces the syntactic regularity of utterances; spoken utterances are also rich in affect-related information, etc. Moreover, the automatic transcription step, often needed before higher-level processing (understanding, translation, analysis, etc.) is applied, yields noisy output (containing errors), which calls for robust analyses and tight coupling between processing stages.
We therefore invite contributions on any aspect (theoretical, methodological and practical) of spoken language processing and oral communication, and in particular (non-exclusive list):
- Automatic speech recognition
- Automatic speech understanding
- Speech translation
- Speech synthesis
- Spoken human-machine dialogue
- Robust analysis of spoken language
- Analysis of social affects or emotions in spoken utterances
- Mining of documents with a spoken component
- Applications with spoken components (information retrieval, interaction, robotics, etc.)
- Tools to support second-language learning
- Multilingual aspects of spoken language processing
- Evaluation of spoken language processing systems
- Corpora and resources for speech
- Analysis of spoken discourse
- Dialogue adaptive to the context and the user's profile
- Analysis of paralinguistic features in spoken utterances

GUEST EDITORS
Laurent Besacier
Wolfgang Minker

SCIENTIFIC COMMITTEE
ADDA Gilles, LIMSI, Paris
ANTOINE Jean-Yves, U. F. Rabelais, Tours
AUBERGE Véronique, LIG, Grenoble
BELLEGARDA Jérôme, APPLE, USA
BONNEAU-MAYNARD Hélène, LIMSI, Orsay
CERISARA Christophe, LORIA, Nancy
CERNOCKY Jan, Univ. Brno, Czech Republic
DAMNATI Géraldine, Orange Labs, Lannion
DEVILLERS Laurence, LIMSI, Orsay
DUTOIT Thierry, TCTS, Mons
ESTEVE Yannick, LIUM, Le Mans
ESKENAZI Maxine, CMU, Pittsburgh
FAVRE Benoit, LIF, Marseille
FERRANE Isabelle, IRIT, Toulouse
GRAVIER Guillaume, IRISA, Rennes
JOUVET Denis, LORIA, Nancy
KAHN Juliette, LNE, Paris
LECOUTEUX Benjamin, LIG, Grenoble
LEFEVRE Fabrice, LIA, Avignon
LINARES Georges, LIA, Avignon
MEIGNIER Sylvain, LIUM, Le Mans
PIETQUIN Olivier, Univ. Lille 1
POPESCU-BELIS Andrei, IDIAP, Martigny
ROSSET Sophie, LIMSI, Orsay

LANGUAGE
Articles are written in French or English. Submissions in English are accepted only from non-French-speaking authors.
SUBMISSION FORMAT
Articles must be submitted on the platform The journal publishes only original contributions, in French or English. Accepted papers will be at most 25 pages long in PDF. The style file is available for download on the TAL journal website

CONTACT
Laurent Besacier (Laurent.Besacier at Wolfgang Minker (Wolfgang.Minker at

==========CFP IN ENGLISH============
Special issue on spoken language processing
Guest editors: Laurent Besacier, Wolfgang Minker

Speech is the most natural way to communicate and interact (with a machine or with another person). Spoken language processing and dialogue now have many direct applications in various areas such as (but not limited to) information retrieval, natural language interaction with mobile devices, social robotics, assistive technologies, technologies for language learning, etc. However, spoken language processing poses specific problems related to the nature of the speech material itself. Spontaneous speech utterances have to be processed, and they contain many paralinguistic features. For instance, disfluencies (repetitions, false starts, etc.) reduce the syntactic regularity of utterances. Moreover, spontaneous utterances convey rich information related to emotions, etc. Furthermore, the automatic speech recognition (ASR) step, often required before the application of higher-level processing (understanding, translation, analysis, etc.), produces noisy outputs (with errors), which require robust analysis and tight coupling between modules.
We invite contributions on any aspect (theoretical, methodological and practical) of spoken language processing and oral communication; in particular (non-exclusive list):
- Automatic speech recognition
- Spoken language understanding
- Speech translation
- Text-to-Speech synthesis
- Man-machine dialogue
- Robust analysis of spoken language
- Analysis of social affects or emotions in spontaneous speech
- Mining spoken language documents
- Spoken language applications (mobile interaction, robotics, etc.)
- Technologies for language learning
- Multilingual aspects of spoken language processing
- Evaluation for spoken language processing
- Corpora and resources for spoken language
- (Spoken) discourse analysis
- Adaptive dialogue (context, user profile)
- Analysis of paralinguistic features in spoken language

IMPORTANT DATES
- Call: March 2014
- Submission of contributions: 30 June 2014 (possibility to update the paper until 15 July)
- First notification to authors: 15 September 2014
- Publication: end of 2014 / beginning of 2015

Submission format
LANGUAGE
Manuscripts may be submitted in English or French. French-speaking authors are requested to submit their contributions in French.
PAPER SUBMISSION
Papers must describe original, completed, and unpublished work. Each submission will be reviewed by two programme committee members. Papers must be submitted on the Sciencesconf platform Accepted papers will be at most 25 pages long in PDF.

On this occasion, ATALA is renewing one third of its Board of Directors and is calling on volunteers to take an active part in the development of the Association. ATALA will become what you make of it! Members of the outgoing third may stand for re-election, but they must express the wish to do so. Membership of the Board is open to any ATALA member. Those wishing to stand as candidates are asked to say so by e-mail to contact(@) before Tuesday 1 July, 9 pm.
Candidacies will be posted on the site as they are received. Please note that ATALA wishes to welcome onto its Board people who are motivated and ready to commit themselves to the administrative work of the association.

For those who are not familiar with his work, which covers mathematics, logic, linguistics and physics:

Michael Barr barr at 2014-06-24 01:55:48 GMT
I regret to inform you all that Jim died this afternoon. His son says it was congestive heart failure which is as good a way as any to describe dying of old age. He was still coming to seminar last fall and celebrated his 91st birthday in December in pretty good shape, but has been gradually going downhill since. I don't believe he came to the office since late fall. He had a good run.

In computational linguistics, he is best known for "The Mathematics of Sentence Structure" (1958), which establishes a deep link between formal grammar and logic, categorial grammars, and type-logical grammars. From the late 1990s he worked on pregroup grammars. Many people have worked on the Lambek calculus: J-P Desclés, M. Eytan, A. Lecomte, and more recently teams such as Calli/Sémagramme at LORIA, Signes in Bordeaux, LALLIC at Paris 4, and of course the people in Montréal. Others, such as Alain Lecomte or François Lamarche, are probably just as qualified as I am to publish this announcement. I first met him in 1988 at a categorical logic conference at Université Paris 7, then in Urbino, and we invited him to the LACL conferences. In 2001 I invited him to a workshop on learning categorial grammars in Nantes, where he broke his arm; after that he was reluctant to cross the ocean to visit us (he sent his report on my habilitation but declined the invitation).
URL: From hamon at LIMSI.FR Tue Jul 8 15:42:12 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 8 Jul 2014 17:42:12 +0200 Subject: Appel: Workshop TOTh, December 2014 Message-ID: Date: Wed, 2 Jul 2014 17:43:22 +0200 From: Luc Damas Message-ID: X-url: Workshop TOTh 2014 ------------------------------------------------------------------------ The 2014 TOTh Workshop is organised by the Royal Museums of Art and History of Brussels within the scope of the European Project AthenaPlus ( ------------------------------------------------------------------------ Title: Multilingual Thesaurus and Terminology ------------------------------------------------------------------------ Brussels - December 5th 2014 The Cinquantenaire Museum Parc du Cinquantenaire 10 The ever-increasing amount of open data and linked data raises questions concerning its access in a multilingual context. Due to the diversity of the collections, of the institutions that manage them, of the public that has access to them and of the technologies currently available, it is necessary to rethink the notions of thesaurus and terminology, as well as the ways to manage and access these collections. Some of the topics that will be covered by the TOTh 2014 Workshop include (this list is not exhaustive): Principles, Theories and Methods: Thesauri, Terminology, Ontology, Controlled Vocabulary, Semantic Network; Embracing and Managing Multilingualism; Mapping, Alignment and Harmonisation of terminologies, thesauri, ontologies; Indexing and Research; The compatibility of the ISO and W3C Standards concerning terminology, thesauri, knowledge systems and interchange format; Impact and contributions from new domains and technologies connected to Knowledge Engineering and the Semantic Web; Software Environments. Special attention will be given to the issue of cultural content management. 
Submission: Abstracts of one or two pages must be sent to: workshop-toth at
Official languages: English and French
Deadline for submission: September 30th 2014
Information: Eva Coudyzer e.coudyzer at
Scientific coordination: C. Roche, R. Costa, E. Coudyzer

From hamon at LIMSI.FR Tue Jul 8 15:40:44 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:40:44 +0200
Subject: Appel: SIMBig'2014, Extended Deadline
Message-ID:
Date: Wed, 2 Jul 2014 12:24:57 +0200
From: Juan Lossio
Message-ID:
X-url:

===========================================================
EXTENDING THE DEADLINE FOR SUBMISSION - SIMBig 2014
===========================================================
SIMBig'2014 - 1st Symposium on Information Management and Big Data
8-10 September 2014 - Cusco, PERU
Paper/Demos Submission Deadline: extended till July 11, 2014
simbig2014 at
===========================================================
CALL FOR PAPERS
===========================================================
The first edition of the International Symposium on Information Management and Big Data, SIMBig 2014, aims to bring together the main national and international actors in the decision-making field to take stock of new technologies dedicated to handling large amounts of information. SIMBig 2014 will be held in Cusco, Peru, over three days of conference. The city is renowned for its architecture, history and sincere culture, which you will undoubtedly appreciate during our scheduled guided tours.
On behalf of the Scientific Program Committee, we have great pleasure in inviting you to submit one or more papers (for oral or poster presentation) in accordance with the instructions that are provided in Paper Submission Guidelines
===========================================================
Important dates:
- Paper Submission Deadline: extended till July 11, 2014
- Exhibitions and Demos Submission Deadline: extended till July 11, 2014
- Notification of Acceptance: July 31, 2014
- Final Paper Submission Deadline: August 15, 2014
- Symposium: September 8-10, 2014
===========================================================
Scope and Topics
Authors are invited to submit original and innovative papers that break new ground or present insightful results based on their experience in Data Management and Big Data. SIMBig2014 has a broad scope, and specific topics of interest include (but are not limited to):
- Big Data Management
- Big Data Applications
- Text Analytics
- Information Retrieval
- Data mining
- OLAP and MDA Models
- Text mining
- Semantic Web
- Linked Data for data pre-processing: cleaning, sorting, filtering or enrichment
- Linked Data applied to Machine Learning
- Decision Support Systems
- Data warehousing
- Information management
- Business intelligence
- Data management
- Semi-structured and Unstructured Data
- Data governance
- Outsourcing
- Social media/Collaboration
- Spatiotemporal data
- Information Services and Resources
- Open Data
- Natural Language Processing
- Strategic uses of information systems
- Information technology management
===========================================================
Submission guidelines:
- The paper must follow the IEEE two-column format, single-spaced, with a 10-point font in the text
- The document should be formatted for standard A4-size paper
- Papers must be submitted only in Portable Document Format (PDF)
- The paper length should be between 4 and 8 pages (including references and figures)
- Follow the instructions in the Word document and LaTeX templates (ACL
templates)
===========================================================

From hamon at LIMSI.FR Tue Jul 8 16:09:56 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:09:56 +0200
Subject: Appel: Posters COLDOC 2014, Diversite des langues
Message-ID:
Date: Mon, 7 Jul 2014 15:12:27 +0200
From: ColDoc 2014
Message-ID:
X-url:

=====================================================
COLDOC is the annual conference organised by the doctoral students and young researchers in language sciences of the MoDyCo laboratory (UMR 7114 - CNRS/Université Paris Ouest Nanterre/Université Paris Descartes). It will take place on 13 and 14 November 2014 at Université Paris Ouest Nanterre la Défense. This year we are interested in the diversity of languages and in language contact, and we wish to address the questions they raise in terms of typological classification and reflection on linguistic universals. The call for papers is now closed, but the call for posters will remain open until 7 August 2014. Authors will be notified of acceptance or rejection on 10 August. These presentations will lead to the publication of a short article in the online proceedings of the conference. Submission details are online on the site (see the "Posters" section). We ask for a one-page explanatory outline.

Le colloque Sénélangues 2015 - Langues d’Afrique de l’ouest se tiendra les 24 et 25 avril à l’Université Cheikh Anta Diop de Dakar.

Thématique du colloque
Stimulée par divers projets collaboratifs soutenus par différentes agences ou fondations, la description des langues d’Afrique a pu bénéficier, au cours des dernières décennies, des développements récents des bonnes pratiques et des ressources informatiques en matière d’analyse linguistique, typologique et documentaire.
En ouvrant la problématique de la description linguistique à toute l’Afrique de l’ouest, l’objectif de ce colloque est de permettre aux linguistes qui travaillent sur les langues de cette région de se rencontrer pour faire le point sur leurs avancées scientifiques, partager leurs connaissances, leur savoir-faire et leurs interrogations, et d’accroître ainsi les connaissances sur les langues de cette région. Les contributions attendues doivent porter sur des langues vernaculaires d’Afrique de l’ouest (créoles inclus), sans exclure toutefois la description des phénomènes de contact avec des langues d’autres familles. Tous les niveaux de l’analyse linguistique (phonologie, morphologie, syntaxe, sémantique, énonciation et pragmatique) pourront être abordés.

Conférences plénières
Denis Creissels, Université Lumière Lyon 2
Felix Ameka, Université de Leiden

Modalités de soumission des communications
Les communications pourront se faire sous forme orale (durée 20 mn suivies de 10 mn de discussion) ou sous forme de poster (dimensions recommandées : format A0, H : 1,20 m - L : 0,80 m) dans le cadre d’une session spéciale (par choix des proposants ou décision des membres du comité de sélection). Dans les deux cas, les consignes pour l’envoi des propositions sont les suivantes:
- le résumé doit faire un maximum d’une page (titre, exemples et références compris), en Times 12 (simple interligne)
- il doit être envoyé anonymisé et aux formats rtf et pdf à l’adresse suivante: senelangues2015call at
- le nom du fichier pdf comportera simplement quelques mots clefs du titre de la communication
- sujet du message: communication Senelangues 2015
- dans le corps du texte du message, indiquer: nom, prénom, affiliation, adresse mail, titre de la proposition, format souhaité (poster vs.
oral) - les langues de la conférence sont le français, l’anglais et le portugais Adresse pour les soumissions et contact senelangues2015call at Calendrier Date limite d’envoi des résumés:15 novembre 2014 Notification aux auteurs: 15 janvier 2015 Lieu de la conférence Université de Cheikh Anta Diop, Dakar, Sénégal Comité Scientifique Felix Ameka Université de Leiden Larry Hyman U.C. Berkeley Valentin Vydrine INALCO, LLACAN, Paris Martine Vanhove LLACAN, CNRS & INALCO, Paris Koen Bostoen Ghent University Jérémie Kouadio N'Guessan Université de Cocody Comité d’organisation Sylvie Voisin DDL, CNRS & Université d’Aix Marseille Stéphane Robert LLACAN, CNRS & INALCO, Paris Alain-Christian Bassène FLSH UCAD, Dakar Denis Creissels DDL, CNRS & Lyon 2 Thierno Cissé FLSH UCAD, Dakar Noël Bernard Biagui CLAD UCAD, Dakar Nicolas Quint LLACAN, CNRS & INALCO, Paris Jeanne Zerner LLACAN, CNRS & INALCO, Paris Anna Marie Diagne IFAN UCAD, Dakar El Hadji Dièye FLSH UCAD, Dakar Dame Ndao FLSH UCAD, Dakar ________________________________________________________________________ Colloquium Senelangues 2015 West African Languages Call for Papers English Version First Call for Abstracts Colloquium Senelangues 2015 West African Languages 24-25 April 2015 Dakar, Senegal Deadline for submission: 15 November 2014 web site: contact: senelangues2015call at The Sénélangues project (, which aimed at the description and documentation of the languages of Senegal, was financed by the Agence Nationale de la Recherche française for a period of 4 years, involving linguists from the CNRS laboratories LLACAN and DDL in collaboration with the University Cheikh Anta Diop of Dakar. This scientific collaboration continues with the organisation of a double event, Sénélangues 2015, which consists of a Colloquium on the description of West African languages, and a thematic school with the same topic. 
The Colloque Sénélangues 2015 Langues d’Afrique de l’Ouest will take place on 24 and 25 April 2015 at the Cheikh Anta Diop University of Dakar.

Topics of the colloquium
In recent decades, the description of African languages has benefited greatly from developments in good practice in information technology and in linguistic analysis, including typology and language documentation. These developments have been stimulated by various collaborative projects and funding schemes. The aim of the Colloquium is to bring together linguists working on West Africa so that they can share their scientific results, insights, know-how and research questions in order to increase our understanding of the languages of the region. We welcome contributions on the analysis of West African languages, including Creole languages, as well as on phenomena of language contact with other language families. Contributions in all sub-disciplines of linguistic analysis are welcome, including phonology, morphology, syntax, semantics and pragmatics.

Plenary speakers
Denis Creissels, University of Lyon 2
Felix Ameka, University of Leiden

How to submit a contribution
Contributions can take the form either of an oral presentation of 20 minutes + 10 minutes of discussion or of a poster presentation (poster format A0, 120 by 80 cm). Presenters may indicate their preference (oral presentation or poster), but the selection committee reserves the right to decide otherwise. For both types of presentation the abstract should adhere to the following instructions:
- Maximum one page including title, examples and references, using a Times 12 point font.
- Send an anonymous version of your abstract in both rtf and pdf formats as an attachment to an email message to senelangues2015call at
- Use some key words of your title in the name of your pdf-file.
- Mention “communication Senelangues 2015” in the subject line of the email message
- Indicate in the body of your message: surname, first name, affiliation, email address, title of your paper, preferred presentation (poster or oral)
- The language of presentation should be either French, English or Portuguese.

Address for submissions and any contact
senelangues2015call at

Important dates
Deadline for submitting abstracts: 15 November 2014
Notification of decision of acceptance: 15 January 2015

Conference venue
Faculté de Lettres, Université de Cheikh Anta Diop, Dakar, Senegal

Scientific Committee
Felix Ameka, University of Leiden
Larry Hyman, U.C. Berkeley
Valentin Vydrine, INALCO, LLACAN, Paris
Martine Vanhove, LLACAN, CNRS & INALCO, Paris
Koen Bostoen, University of Ghent
Jérémie Kouadio N'Guessan, University of Cocody

Organizing Committee
Sylvie Voisin, DDL, CNRS & University of Aix-Marseille
Stéphane Robert, LLACAN, CNRS & INALCO, Paris
Alain-Christian Bassène, FLSH UCAD, Dakar
Denis Creissels, DDL, CNRS & Lyon 2
Thierno Cissé, FLSH UCAD, Dakar
Noël Bernard Biagui, CLAD UCAD, Dakar
Nicolas Quint, LLACAN, CNRS & INALCO, Paris
Jeanne Zerner, LLACAN, CNRS & INALCO, Paris
Anna Marie Diagne, IFAN UCAD, Dakar
El Hadji Dièye, FLSH UCAD, Dakar
Dame Ndao, FLSH UCAD, Dakar
________________________________________________________________________
Conferência Sénélangues 2015 Línguas da África Ocidental
Chamada para comunicação
Versão portuguesa

Primeira chamada para comunicação
Conferência Sénélangues 2015 Línguas da África Ocidental
24-25 Abril 2015
Dakar, Senegal
Prazo de entrega das submissões: 15 Novembro 2014
Web:
Contacto: senelangues2015call at

O projecto Sénélangues, financiado pela Agência Nacional [Francesa] para a Pesquisa, reuniu durante quatro anos linguistas das unidades de pesquisa LLACAN e DDL do CNRS [Centro Nacional [Francês] de Pesquisa Científica] em parceria com a Universidade Cheikh Anta Diop de Dakar no âmbito dum ambicioso projecto de descrição e documentação das
línguas do Senegal ( Na continuidade desta colaboração científica, os membros de Sénélangues decidiram organizar em Abril de 2015 um duplo evento, Sénélangues 2015, que combinará uma conferência sobre a descrição das línguas da África Ocidental com um minicurso dedicado ao mesmo tema. A conferência Sénélangues 2015 Línguas da África Ocidental terá lugar a 24 e 25 de Abril na Universidade Cheikh Anta Diop de Dakar.

Temática da conferência
Graças ao estímulo de vários projectos colaborativos apoiados por diversas agências ou fundações, a descrição das línguas africanas tem vindo a beneficiar, ao longo das últimas décadas, dos desenvolvimentos recentes das boas práticas e dos recursos informáticos no que tange aos processos de análise de cariz linguístico, tipológico e documental. Ao abrir a problemática da descrição linguística ao conjunto da África Ocidental, esta conferência tem como objectivo permitir aos linguistas que trabalham sobre as línguas dessa área encontrarem-se para fazer o balanço dos seus avanços científicos, compartilharem os seus respectivos conhecimentos, as suas experiências e dúvidas, assim como favorecer o aumento dos conhecimentos globais disponíveis sobre as línguas da África Ocidental. Esperamos contribuições que tratem das línguas vernáculas da África Ocidental (inclusive os crioulos) e também estamos interessados na descrição dos fenómenos de contacto que se produzem entre estas línguas e idiomas de outras famílias. Todos os níveis da análise linguística (fonologia, morfologia, sintaxe, semântica, enunciação e pragmática) serão contemplados.

Conferências plenárias
Denis Creissels, Universidade de Lyon 2
Felix Ameka, Universidade de Leiden

Modo de submissão das comunicações
Conforme o gosto dos conferencistas ou a decisão dos membros do comité de selecção, as comunicações far-se-ão de forma oral (20 mn mais 10 mn de perguntas) ou sob forma de póster (tamanho recomendado A0, H : 1,20 m - L : 0,80 m) no quadro de uma sessão especial.
Em ambos os casos, as consignas para o envio das propostas são as seguintes:
- o resumo não deve exceder uma página (título, exemplos e referências incluídos), em Times 12 (intervalo entre linhas simples)
- será enviado (versão anonimizada) em formato rtf e pdf para o endereço seguinte: senelangues2015call at
- o nome do ficheiro pdf constará simplesmente de algumas palavras-chave do título da comunicação
- assunto da mensagem: “communication Senelangues 2015”
- mencione no texto da mensagem: o seu apelido, nome, afiliação (universitária), endereço electrónico (e-mail), título da proposta, formato desejado (póster vs. oral)
- as línguas da conferência são o francês, o inglês e o português

Contacto para submissão de resumos - informações
senelangues2015call at

Calendário
Submissão dos resumos: até 15 de Novembro 2014
Notificação aos autores: 15 Janeiro 2015

Lugar da conferência
Universidade Cheikh Anta Diop, Dakar, Sénégal

Comité Científico
Felix Ameka, Universidade de Leiden
Larry Hyman, U.C.

Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences. This workshop follows in the footsteps of previous efforts to provide a forum for researchers to share and discuss their ongoing work. We invite submissions on topics that include, but are not limited to, the following:
* Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, sentiment analysis, Arabic dialect modeling, etc.
* Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media, etc.
* Resources: dictionaries, annotated data, specialized databases, etc.
Submissions may include work in progress as well as finished work.
Submissions must have a clear focus on specific issues pertaining to the Arabic language whether it is standard Arabic, dialectal, or mixed. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work. Submissions are expected to be 8 pages long plus 2 pages for references. Associated with the workshop will be a shared task on Arabic text error correction (see link to Shared Task Website above). IMPORTANT DATES Paper submission deadline: July 26, 2014 Author notification: August 26, 2014 Camera Ready: September 15, 2014 Workshop: October 25, 2014 ORGANIZERS Program Co-chairs Nizar Habash, Columbia University Stephan Vogel, Qatar Computing Research Institute Publication Co-chairs Nadi Tomeh, Paris 13 University Houda Bouamor, Carnegie Mellon University Qatar Website Committee Kareem Darwish, Qatar Computing Research Institute Noura Farra, Columbia University Shared Task Committee Behrang Mohit (co-chair), Carnegie Mellon University Qatar Alla Rozovskaya (co-chair), Columbia University Wajdi Zaghouani, Carnegie Mellon University Qatar Ossama Obeid, Carnegie Mellon University Qatar Nizar Habash (advisor), Columbia University Program Committee Members Abdelmajid Ben-Hamadou, University of Sfax, Tunisia Abdelhadi Soudi, Ecole Nationale de l’Industrie Minérale, Morocco Abdelsalam Nwesri, University of Tripoli, Libya Achraf Chalabi , Microsoft Research, Egypt Ahmed Ali, Qatar Computing Research Institute, Qatar Ahmed Rafea, The American University in Cairo, Egypt Alexis Nasr, University of Marseille, France Ali Farghaly, Monterey Peninsula College, USA Almoataz B. 
Al-Said, Cairo University, Egypt Alon Lavie, Carnegie Mellon University, USA Aly Fahmy, Cairo University, Egypt Azadeh Shakery, University of Tehran, Iran Azzeddine Mazroui, University Mohamed I, Morocco Bassam Haddad, University of Petra, Jordan Bayan Abu Shawar, Arab Open University, Jordan Behrang Mohit, Carnegie Mellon University Qatar, Qatar Eric Atwell, University of Leeds, UK Farhad Oroumchian, University of Wollongong, Australia Ghassan Mourad, Université Libanaise, Lebanon Hassan Sawaf, eBay Inc., USA Hazem Hajj, American University of Beirut, Lebanon Hend Alkhalifa, King Saud University, Saudi Arabia Houda Bouamor, Carnegie Mellon University Qatar, Qatar Imed Zitouni, Microsoft Research, USA Joseph Dichy, Université Lyon 2, France Karim Bouzoubaa, Mohammad V University, Morocco Karine Megerdoomian, The MITRE Corporation, USA Katrin Kirchhoff, University of Washington, USA Kemal Oflazer, Carnegie Mellon University Qatar, Qatar Khaled Shaalan, The British University in Dubai, UAE Khaled Shaban, Qatar University, Qatar Khalil Sima’an, Universiteit van Amsterdam, Netherlands Lamia Hadrich Belguith, University of Sfax, Tunisia Michael Rosner, University of Malta, Malta Mohamed Elmahdy, Qatar University, Qatar Mohsen Rashwan, Cairo University, Egypt Mona Diab, George Washington University, USA Mustafa Jarrar, Bir Zeit University, Palestine Nada Ghneim, Higher Institute for Applied Sciences and Technology, Syria Nadi Tomeh, University Paris 13, France Ossama Emam, IBM, USA Otakar Smrž, Džám-e Džam Language Institute, Czech Republic Owen Rambow, Columbia University, USA Preslav Nakov, Qatar Computing Research Institute, Qatar Ramzi Abbes, TECHLIMED, France Salwa Hamada, Cairo University, Egypt Shahram Khadivi, Tehran Polytechnic, Iran Sherri Condon, The MITRE Corporation, USA Taha Zerrouki, University of Bouira, Algeria Violetta Cavalli-Sforza, Al Akhawayn University, Morocco ------------------------------------------------------------------------- Message
distributed by the Langage Naturel list
Information, subscription:
English version:
Archives:
The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership:
ATALA accepts no responsibility for the content of messages distributed on the LN list
-------------------------------------------------------------------------

From hamon at LIMSI.FR Tue Jul 8 16:06:45 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:06:45 +0200
Subject: Appel: Colloque PhonoGenres and Speaking Styles, Geneve
Message-ID:
Date: Mon, 07 Jul 2014 15:10:38 +0200
From:
Message-ID:
X-url:

Dear colleague,

On 10 and 11 September 2014 we are organising, at the University of Geneva, the 3rd SWIP (Swiss Workshop on Prosody), entitled "PhonoGenres and Speaking Styles". For more information, please refer to the address:

Best regards,
The organising committee
-------------------------------------------------------------------------

From hamon at LIMSI.FR Tue Jul 8 15:45:55 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:45:55 +0200
Subject: Soft: Glozz 2.0 (plate-forme d'annotation)
Message-ID:
Date: Fri, 4 Jul 2014 15:42:14 +0200
From: Yann Mathet
Message-Id: <61B3F419-7CFF-45EB-814D-DA6A3B04BA53 at>
X-url:

Dear colleagues,

We are pleased to announce the online release of version 2.0.1-beta4 of the Glozz annotation platform. The main new features are listed below.
We would particularly highlight its forthcoming move to open source and its new architecture, which allows plug-ins to be added (these can be developed by the user community).

1) Forthcoming move to open source (licence being drawn up).

2) Plug-in mechanism allowing extensions to the software to be developed without modifying its core. Anyone can thus develop their own plug-ins, and possibly distribute them, while limiting the risk of incompatibility. We encourage interested developers to favour creating plug-ins over directly modifying the core (when this becomes possible). We will gladly answer requests for details on these points.

3) New "Concordancer" plug-in offering a concordancer-style view. This synthetic view lets you quickly consult annotated units in their context, a context made up of text but possibly also of other annotated objects. A dedicated manual is available at

4) Generalisation of the "Basket" principle to the whole application and to plug-ins. It lets you store specifically chosen annotations (selected by hand, returned by a GlozzQL query, or picked via the concordancer) in order to observe them more closely, save them to a specific file, or automatically assign them an attribute-value pair. See the dedicated manual at

5) Configurable keyboard shortcuts (Options/Préférences/Shortcuts) for quickly annotating a selected portion of text with a unit whose type and attribute-value pairs are specified for a given shortcut. For example, once a text segment has been selected with the mouse, the "F1" key could create a unit of type "Nom", with the feature "genre" set to "masculin" and the feature "nombre" set to "singulier", while "F2" could be used for a feminine singular noun.
These shortcuts can be saved and passed on to other users, so that a campaign manager can hand out a ready-to-use configuration to all annotators.

6) New CSV export and import, for use in a spreadsheet and as a possible bridge to other applications, as input to or output from Glozz.

7) Addition of a "Same Position" constraint in GlozzQL, which applies to Units and indicates whether two units are located at exactly the same place in the text. This can be useful in particular when there are several independent annotation layers and one wants to see whether units overlap across these layers.

8) Reworked graphical interface. In particular, tabs are now used for the various tools that are rarely used simultaneously.

9) SQL export fixed and improved.

You can retrieve an archive in TGZ format at the address:

The archive contains in particular:
- the application JAR (glozz-platform.jar);
- the concordancer plug-in (plugins/concordancer/glozz-concordancer-plugin.jar);
- various test files (data/ directory);
- a changelog (CHANGELOG.utf8) listing the main changes in this version and in previous ones;
- a launch script for Windows users who have difficulty using the JARs directly (StartGlozz.bat);
- the licence (licence.pdf).

Please note that Glozz has manuals, accessible at, which present most of the available features in a detailed and illustrated way. Do not hesitate to send us your comments and suggestions about the platform or its manual.

We are looking for outstanding young research scientists to join the group on several projects involving speech processing.

Open Positions

1.
PhD / Automatic speech recognition and machine assisted speech annotation for African Languages
You will work in the context of the ALFFA project, which is truly interdisciplinary: it brings together not only technology experts (LIG, LIA, VOXYGEN) but also fieldwork linguists/phoneticians (DDL). The PhD will focus on analysing the capabilities of existing automatic speech processing systems to investigate phonetic characteristics of languages or annotate speech (especially on mobile devices: tablets, glasses, etc.) in order to provide an innovative digital assistant to the fieldwork linguist.
Start: Fall 2014
Duration: 36 months
Particular aspect: co-supervision with DDL lab in Lyon
Contact: Laurent.Besacier at & Francois.Pellegrino at
Project Web Site:
Team/Lab Web Sites:

2. PhD / Speech interaction for socio-affective ubiquitous agents and robots in ambient assisted living environments
You will work on a research and development project (CASSIE) involving academic and industrial stakeholders in spoken dialogue, assistive technologies, affective sciences and social robotics. The PhD objective is to design a spoken dialogue system that will interact with a user in her/his home through a ubiquitous (physical and/or virtual) and personalized agent. This dialogue system will be corpus-based, with an iterative machine learning approach hybridized with bootstrap expert knowledge (observed from “intelligent” annotations) from spontaneous and ecological data collected in real or quasi-real environments (Smart Home) and situations (real scenarios). The system will focus on the socio-affective dimensions of the interaction (socio-affective prosody, paralinguistic events, imitation, synchrony, etc.), especially the dynamics (timing) of the dialogue. One aspect of this PhD will also be the comparison of the same character implemented as a robot versus a virtual agent for interaction (empathy aspects, etc.).
Start: Fall 2014
Duration: 36 months
Contact: Veronique.Auberge at & Benjamin.Lecouteux at (+ Laurent.Besacier at

3. PhD / Context-aware spoken dialogue in ambient assisted living environments
You will work on a research and development project (CASSIE) involving academic and industrial stakeholders in spoken dialogue, assistive technologies and social robotics. The PhD objective is to make a social cyber-physical agent "aware" of its environment through sensors and/or connected objects. This contextual information will drive the system interaction (natural language understanding and dialogue). The heart of the research will be to build probabilistic and logical models for multimodal situation analysis and understanding in a domestic and multilingual context. For experimental development and validation, the research will benefit from the fully equipped LIG smart home (DOMUS).
Start: Fall 2014
Duration: 36 months (PhD)
Contact: Francois.Portet at & Michel.Vacher at

Profiles
Applicants must hold a Master's degree in Computational Linguistics, Computer Science or Cognitive Sciences, preferably with experience in speech processing and/or natural language processing and/or machine learning. A good background in programming is also required. Applicants will also be involved in experiments with human participants who are either French or English speakers. For this reason a good level of English is required, as well as a good command of French. Finally, effective communication skills in English, both written and verbal, are mandatory.

Location
Grenoble is a high-tech city with 4 universities. It is located at the heart of the Alps, in outstanding scientific and natural surroundings. It is 3h by train from Paris, 2h from Geneva, 1h from Lyon, 2h from Torino, and less than 1h from Lyon international airport.

Research Group Website:

Dates
Interviews will be held in July 2014 (until September 2014 if needed).
It is organised by the CNRS laboratories LLACAN and DDL in partnership with Université Cheikh Anta Diop de Dakar, as a continuation of the ANR project Sénélangues ( The aims of this school are to build on the results of the Sénélangues project in order to pass on the latest theoretical, methodological and technological advances in the description of languages with an oral tradition, and to deliver training focused mainly on languages spoken in West Africa (Atlantic languages, Mande languages, creoles, and also African French). The working perspective will be primarily descriptive and typological. This two-week course, which is intended to complement existing Master's and doctoral programmes, should give trainees an overview of the scientific issues and analytical frameworks involved, of the various tasks to be undertaken, and of the methods and tools available when setting out to describe a language spoken in West Africa. It should also give them a first introduction to fieldwork practice. The thematic school will run over two blocks of four days each (week 1: 20-23 April 2015; week 2: 28 April-1 May 2015), with an international conference on the description of West African languages held in between (24-25 April 2015).

Course content
The training amounts to a total of 53 hours of teaching. All courses are compulsory. They will be taught in French, mostly as lectures, complemented by several tutorial sessions (organised in subgroups) to allow practice, under fieldwork conditions, in morphosyntactic analysis, in the perception and transcription of tones, and in the use of processing software.
The training is structured around 3 strands corresponding to (1) basic knowledge of general linguistics and the structural particularities of African languages, (2) the specifics of linguistic fieldwork practice and (3) the tools, techniques and methods for exploiting field data. Particular emphasis will be placed on languages of the Atlantic family, but specialists in Mande languages, Portuguese-based creoles and African French will round out the training.

List of courses:

Strand 1. Fundamentals
Semantics (2 sessions of 1h30)
Typology (1 session of 2h)
Morphosyntax (2 sessions of 1h30)
Tonology (2 sessions of 1h30)
Phonology (2 sessions of 1h30)
Sociolinguistics (1 session of 1h30)

Strand 1. Atlantic languages
Noun classes in Atlantic languages (1 session of 2h)
Atlantic languages: current knowledge and reconstruction (1 session of 2h)
Verbal inflection in Atlantic languages (1 session of 2h)
Verbal extension and valency in Atlantic languages (1 session of 2h)

Strand 1. Region-specific courses
African French (1 session of 2h)
Creoles (1 session of 1h30)
Description and endangered languages in West Africa (1 session of 2h)
Mande languages (2 sessions of 1h30)

Strand 2. Fieldwork
Recording techniques (1 session of 1h30)
Fieldwork practice and surveys (1 session of 1h30 for 2 subgroups)
Ethnolinguistics (1 session of 1h30)
The researcher in the field (1 session of 1h30)

Strand 3. Data exploitation
ELAN (software) (2 sessions of 1h30)
Metadata (ArBIL) (1 session of 1h30)
How to write a grammar (1 session of 1h30)
Lexicography (2 sessions of 1h30)

A certificate of participation (including the list of courses taken and the equivalent number of credits) will be issued to all participants so that the training can be validated, as an internship or otherwise depending on the university concerned.

List of instructors (to be completed) F.
Ameka (Prof., Leiden University) C. Chanard (IE, LLACAN) D. Creissels (Prof. emeritus, Université Lyon2) A. M. Diagne (CR-equivalent, IFAN, Dakar) J. Kouadio (MCF, Université Cocody, Abidjan) M. Mous (Prof., Leiden) P. A. Ndao (Prof., UCAD, Dakar) K. Pozdniakov (IUF - Prof., INALCO) N. Quint (DR, LLACAN) S. Robert (DR, LLACAN) P. Roulon-Doko (DR, LLACAN) S. Voisin (MCF, Aix Marseille Université) V. Vydrine (Prof., INALCO)

Target audience and admission criteria
The thematic school is designed to accommodate 70 trainees. It is open to anyone wishing to acquire knowledge of West African languages, with priority given to Master 1 and 2 students, doctoral students, post-docs, and early-career researchers and lecturers in language sciences who wish to undertake descriptive work on a language spoken in West Africa. Minimum level of study required: a Licence (bachelor's degree) in language sciences (or an equivalent level in linguistics).
How to apply: By 1 October 2014 at the latest, fill in the online application form on the website: Notification of acceptance will reach applicants on 1 December. Registration details will be provided at that time.

RATIONALE
As the Web rapidly evolves, Web users and Web contents are evolving with it. In an era of social connectedness, people are becoming increasingly enthusiastic about interacting, sharing, and collaborating through social networks, online communities, blogs, Wikis, and other online collaborative media. In recent years, this collective intelligence has spread to many different areas, with particular focus on fields related to everyday life such as commerce, tourism, education, and health, causing the size of the Web to expand exponentially.
The distillation of knowledge from such a large amount of unstructured information, however, is an extremely difficult task, as the contents of today's Web are perfectly suitable for human consumption, but remain hardly accessible to machines. To this end, biologically and linguistically motivated computational paradigms that go beyond syntax are needed. Intelligent and evolutionary systems have the potential to play an important role in natural language processing (NLP) research for tasks such as grammatical evolution, knowledge discovery, and rule learning. In this light, this Special Session focuses on the introduction, presentation, and discussion of novel NLP systems that are not merely based on domain-dependent corpora or word co-occurrence counts, but rather systems that can be considered intelligent and evolutionary. The main motivation for the Special Session, in particular, is to go beyond a mere word-level analysis of text and provide novel concept-level approaches to natural language processing that allow a more efficient passage from (unstructured) textual information to (structured) machine-processable data, in potentially any domain. Articles are thus invited in areas such as AI, Semantic Web, knowledge-based systems, machine learning, and computational intelligence for NLP research.
Topics include, but are not limited to:
- Intelligent and evolutionary systems for information extraction and retrieval
- Intelligent and evolutionary systems for text summarization and visualization
- Intelligent and evolutionary systems for topic modeling
- Intelligent and evolutionary systems for sentiment analysis
- Intelligent and evolutionary systems for knowledge acquisition
- Intelligent and evolutionary systems for social network analysis
- Intelligent and evolutionary systems for adaptive and transfer learning
- Intelligent and evolutionary systems for agents and complex systems
- Intelligent and evolutionary systems for evolutionary game theory
- Intelligent and evolutionary systems for bioinformatics

The Special Session also welcomes papers on specific application domains of natural language processing, e.g., social data mining, influence networks, customer experience management, computer mediated human-human communication, social media marketing, multimedia management, personalization and persuasion, enterprise feedback management, human-agent, -computer and -robot interaction, intelligent user interfaces, patient opinion mining, surveillance, art.
The authors will be required to follow the Author's Guide for manuscript submission to the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems (

TIMEFRAME
Submission Deadline: August 1st, 2014
Notification of Acceptance: September 1st, 2014
Final Manuscripts Due: October 1st, 2014
Session dates: November 10-12th, 2014

ORGANIZATION
Erik Cambria, Nanyang Technological University, Singapore
Amir Hussain, University of Stirling, UK
Yunqing Xia, Tsinghua University, China

From hamon at LIMSI.FR Tue Jul 8 16:23:00 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:23:00 +0200
Subject: Appel: JDMDH, Journal of Data Mining and Digital Humanities
Message-ID:

Date: 8 Jul 2014 16:47:59 +0200
From: "Nicolas Turenne"
Message-ID: <14716720fd3.5cd5.1b78e at>
X-url:
X-url:
X-url:

Dear Colleague,

The Journal of Data Mining and Digital Humanities ( is an OA journal dedicated to a range of research studies between the fields of data mining and digital humanities. It is hosted as an overlay journal on the Episciences platform ( Submissions are peer-reviewed and the journal is free of charge for both authors and readers. Accepted publications are immediately published on the JDMDH website. You or your colleagues may be interested in submitting an original manuscript to the journal. You can find all instructions to authors on the website and create an account to submit. See: . We look forward to seeing you contribute to this fully open access endeavor.
The JDMDH editorial board

From hamon at LIMSI.FR Tue Jul 8 16:33:37 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:33:37 +0200
Subject: Ressources: Corpus disponibles annotes avec FRMG
Message-ID:

Date: Tue, 08 Jul 2014 18:14:00 +0200
From: Eric De la clergerie
Message-ID: <53BC18C8.8060006 at>
X-url:
X-url:

In connection with the linguistic wiki FrmgWiki [] developed by the ALPAGE team (INRIA & Université Paris Diderot), we are pleased to make available to the community several corpora annotated with the FRMG syntactic parser. These are:
- Wikipedia Fr (178 million words)
- Wikisource Fr (64 million words)
- EuroParlement Fr (41 million words)
The annotated corpora are freely available, subject to the licences that apply to the original corpora. FrmgWiki also makes it possible to run queries (in the DPath language) on the annotated corpora. This service is, however, still experimental.

Message-ID:
Date: Tue, 8 Jul 2014 15:45:15 +0200 (CEST)
From: Karen Fort
Message-ID: <415602684.10106950.1404827115827.JavaMail.zimbra at>
X-url:
X-url:

Play (and get others to play) Zombilingo :

Zombilingo is a Game With A Purpose for annotating corpora with dependency syntax. The annotations created are freely available on the game's website. Producing large-scale language resources is very costly, particularly in terms of labour. For instance, the annotation cost of the Prague Dependency Treebank has been estimated at 600,000 dollars (Böhmová et al., 2001). An alternative way to produce resources is crowdsourcing, that is, calling on the "crowd" to carry out a task. Games with a purpose, for example, have been used for various NLP tasks: JeuxDeMots (Lafourcade, 2007) aims to build a lexical network; Phrase Detectives (Chamberlain et al., 2008) has players annotate a corpus with anaphora.
These two games have been highly successful and have made it possible to create resources of reasonable quality at reduced cost. The first relies on common sense and the second on school-level knowledge. In other fields, it has been possible to use a game for considerably more complex tasks that require training the participants. For example, in FoldIt (Cooper et al., 2010) players must manipulate 3D representations of proteins to study how they can interact. Zombilingo is inspired by these successes and aims to have players carry out an NLP task reputed to be complex: annotating syntactic dependencies.

Ref.:
Karën Fort, Bruno Guillaume and Valentin Stern. Zombilingo : manger des têtes pour annoter en syntaxe de dépendances. Actes de Traitement Automatique des Langues Naturelles (TALN), Marseille, France, July 2014 - Demonstration.
Karën Fort, Bruno Guillaume and Hadrien Chastant. Creating Zombilingo, a Game With A Purpose for dependency syntax annotation. Proceedings of the Gamification for Information Retrieval (GamifIR'14) Workshop, Amsterdam, The Netherlands, April 2014. (

Karën Fort ATER ENSMN Dép. Please distribute it among potentially interested colleagues.)*

Translating and the Computer – 36, London, 27/28 November 2014
Translating and the Computer attracts a unique amalgam of researchers, developers and users.
It brings together academics involved in language technology research and in teaching translation and terminology with those who develop and market tools for language transformation and both of these groups with users: translators, terminologists, interpreters, and voice-over specialists, whether freelancers or working in translation departments of large organisations such as those of the European Parliament, European courts and the European Patent Office, the United Nations family, international companies and other organisations, and Language Services Providers (LSPs), large and small. First the Computer, then the Internet and more recently the Cloud, are changing and remodelling expectations and processes in the Language and Localization industries. These changes are accompanied by new requirements for standards and interoperability. The digital age is modifying the concept of text and quality. Content is a key item together with strings, chunks, segments and words. In its 36th session Translating and the Computer has moved from ASLIB to ASLING. The conference often referred to as the “ASLIB Conference” is now the ASLING Translating and the Computer Conference. ASLING is working hard to ensure that this conference remains a key date in your calendar to help you keep in touch. To do this we need the support and contributions of all those who are interested in sharing their knowledge and ideas on the latest developments in an extremely stimulating sector, and equally interested in hearing other contributions. If you, or one of your colleagues, have something important to announce or discuss, we urge you to consider presenting a paper at this conference. Abstracts must be submitted using the START system at the following address: . The deadline for submissions is extended to July 14th; authors will be notified of acceptance by August 7th. For further details of the Call for Abstracts, please see Call for Abstracts at: . 
For any other information write to us at: info at . Chairs * Juliet Macan, Arancho Doc srl. (Lead Chair 2014) * João Esteves-Ferreira, Tradulex, International Association for Quality Translation * Ruslan Mitkov, University of Wolverhampton * Olaf-Michael Stefanov, United Nations (ret), JIAMCATT Programme Committee * David Chambers, World Intellectual Property Organisation (ret) * Gloria Corpas Pastor, University of Malaga * Alain Désilets, National Research Council of Canada (NRC) * David Filip, LRC, CNGL, LT-Web, University of Limerick * Pamela Mayorcas, Institute of Translating and Interpreting * Paola Valli, University of Trieste Conference Manager: * Nicole Adamides We look forward to welcoming you to London and a new start. Association internationale pour la promotion des technologies linguistiques International Association for Advancement in Language Technology Bologna, Genève, London, Wien, Wolverhampton ASLING is a new international not-for-profit association, set up by the conference chairs to renew the organisation and opportunities offered by the Translating and the Computer Conference series in London. Its main objectives are: “to promote the use of information technology in the fields of language, translation, terminology and related fields” and “provide the general public with a better understanding of the contribution of technology in the fields of language, translation, terminology and related fields”. Whether it is dedicated to interactive storytelling or not, it involves many algorithms: multi-modal input recognition (utterances, gestures, gazes, vocal inflections), natural language understanding and generation, dialogue management, planning and cognitive capacities, emotion modelling, prosodic speech generation, non-verbal behavior. 
In this context, the post-doc fellow’s research will focus on designing algorithms for modelling the interactions (turn of speech) during the conversation, starting from real data gathered during dialogues between adult (or Wizard of Oz) and child, that are annotated by psychologists. This task can be viewed as predicting a particular event in a sequence of itemsets. The model of the dialogue will be implemented in an embodied conversational agent. Application: The candidate must prepare a detailed CV including a complete bibliography, a motivation letter and recommendation letters as a single pdf file. This file should be sent by email to the contact below. French people may apply using French language. Contact: François Rioult CNRS UMR6072 GREYC Université de Caen Basse-Normandie francois.rioult at ------------------------------------------------------------------------- Message diffuse par la liste Langage Naturel Informations, abonnement : English version : Archives : La liste LN est parrainee par l'ATALA (Association pour le Traitement Automatique des Langues) Information et adhesion : ATALA décline toute responsabilité concernant le contenu des messages diffusés sur la liste LN ------------------------------------------------------------------------- From hamon at LIMSI.FR Tue Jul 8 15:57:38 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 8 Jul 2014 17:57:38 +0200 Subject: Appel: 3emes journees Unitex/Gramlab, 3rd UNITEX/GramLab Workshop Message-ID: Date: Fri, 4 Jul 2014 22:05:20 +0200 (CEST) From: Denis Maurel Message-ID: <837136366.3877602.1404504320808.JavaMail.zimbra at> X-url: English below. 3èmes journées Unitex/Gramlab 9-10 octobre 2014, Université François Rabelais Tours Unitex est une plate-forme open-source d’analyse de texte largement utilisée en recherche, dans l’enseignement et l’industrie. Le système GramLab/Unitex offre en plus aux équipes le travail collaboratif (partage de ressources, suivi de versions, etc.). 
Ces outils reposent sur des technologies de type « automates à états finis » et incorporent des ressources linguistiques à large couverture, disponibles dans de nombreuses langues. Les journées Unitex/GramLab sont un forum qui a pour ambition de rassembler la communauté des chercheurs et industriels utilisant Unitex et GramLab pour leurs travaux (tous domaines confondus) ou participant au développement de cette plate-forme et de ses ressources. L’objectif principal de cet événement est de favoriser le partage d’expérience et les contacts entre utilisateurs et développeurs. Le programme présentera un éventail des recherches en cours et offrira une série de formations pratiques de différents niveaux aux chercheurs qui souhaitent utiliser ce système ou découvrir de nouvelles fonctionnalités. Appel à communications, formations et démonstrations 1) Nous invitons les utilisateurs d’Unitex à soumettre des propositions de communication orale dans les domaines suivants (non exclusifs) : * Les recherches actuelles utilisant Unitex (dans tous les domaines). * Le développement de ressources à destination d'Unitex. * Le développement d'extensions, nouvelles fonctionnalités, etc. * Les développements industriels reposant sur Unitex. * Les initiatives pédagogiques utilisant Unitex (Unitex dans l'enseignement du TAL ou des ressources linguistiques, etc.). * L’intégration d’autres outils à Unitex. 2) Nous invitons les experts ayant une maîtrise particulière du système à proposer des tutoriels dans les domaines suivants ou d’autres : * Utilisation avancée des graphes. * Cascades de transducteurs avec CaSys. * Étiquetage en parties du discours, en parenthèsages minimaux.... * Développement de scripts Unitex / initiation à la programmation Unitex. * Programmation : persistance et optimisation. * Intégration Unitex-UIMA dans une chaine de traitement. * autres. 
3) Nous faisons également appel aux auteurs d’applications intégrant des fonctionnalités d’Unitex et leur proposons de faire la démonstration de leurs outils. Procédure de soumission Envoyer un résumé d'une page maximum (3000 caractères) à : j.unitex2014 at Ce document reprendra les informations suivantes : - Type de communication : * communication orale de 20 minutes * formation (tutoriel) de 2 heures * démonstration (10 minutes en séance plénière + rencontres) - Nom, prénom, affiliation, adresse de courrier électronique - Titre de la présentation - Résumé de la présentation - Pour les tutoriels : défendre l’intérêt de cette formation et identifier le public visé. Programme provisoire Mercredi 8 octobre Après-midi Accueil et visite de la région Jeudi 9 octobre Matin Présentations Après-midi Tutoriels Ateliers et Réunion développeurs (en parallèle) Vendredi 10 octobre Matin Présentations Après-midi Réunion plénière (questions aux développeurs, nouveautés, orientations, etc.) Publication et communication Les communications pouront être diffusées sous forme électronique sur le site de la conférence, à la demande des auteurs. Calendrier Appel à communication : début juillet Soumission : avant le 5 septembre 2014 Approbation/évaluation : 10-15 septembre Inscriptions : - Avant le 30 septembre : 25 euros/jours - Après le 30 septembre : 35 euros /jours - Inscription gratuite pour les formateurs. - Possibilité de bourse pour les étudiants ne disposant pas de financement (soumettre une lettre de demande) - Les frais d’inscription couvrent les pauses café, deux repas en restaurant d'entreprise et les documents pédagogiques distribués durant les formations. 
Comité de lecture Antonio Balvet, STL, Université de Lille 3 Anne Dister, Université Saint-Louis - Bruxelles Cédrick Fairon, CENTAL, Université catholique de Louvain Nathalie Friburger, LI, Université François Rabelais Tours Cvetana Krstev, Université de Belgrade Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée Denis Maurel, LI, Université François Rabelais Tours Agata Savary, LI, Université François Rabelais Tours Duško Vitas, Université de Belgrade Gilles Vollant, Ergonotics Comité d’organisation Nathalie Friburger, Université François Rabelais Tours Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée Claude Martineau, LIGM, Université Paris-Est Marne-la-Vallée Denis Maurel, Université François Rabelais Tours Agata Savary, LI, Université François Rabelais Tours Comité de pilotage Cédrick Fairon, CENTAL, Université catholique de Louvain Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée Éric Laporte, IGM, Université de Marne-la-Vallée Denis Maurel, Université François Rabelais Tours Gaelle Recource, Kwaga Duško Vitas, Université de Belgrade ------------------------------------------------------------------------ 3 rd UNITEX/GramLab Workshop October 9-10, 2014, Université François Rabelais Tours Unitex is an open-source text analysis software widely used in research, teaching and industry ( It relies on Finite State technologies and incorporates large-scale language resources for several languages. GramLab is an Integrated Development Environment (IDE), based on Unitex software components but specifically designed for industrial purposes. It integrates tools that help collaborative development, versioning, etc. The Unitex/GramLab Workshop aims at gathering academic and industrial users, developers and researchers contributing to the development of this open-source software and its resources or using them for developing new research or industrial applications. 
It is also a unique opportunity for new users to learn the basic and advanced Unitex techniques and discuss their projects with experts during hands-on sessions. The program will offer an overview of the current trends: the development of real-life industrial applications based on Unitex, the use of Unitex in various research projects, the improvement of Unitex through the development of new features and the extension of the language resources. Training sessions of various levels will also be offered.

Call for communication, training session, demonstrations

1) We invite submissions for communication in the following areas (not exclusive):
- Current research projects exploiting Unitex (all domains)
- Development of language resources for Unitex
- Development of new features or functionalities, optimization of existing programs, etc.
- Industrial developments based on Unitex
- Pedagogical initiatives using Unitex (for teaching NLP, Corpus linguistics, etc.)
- Integration of other tools into Unitex

2) We invite Unitex experts to offer tutorials in the following domains (or others):
* Advanced use of graphs
* Cascading Transducers with CaSys
* POS tagging
* Unitex Scripting (Introduction to Unitex programming)
* Persistence and optimization for Unitex-based applications
* Building on Unitex-UIMA
* other

3) We also call for developers to demonstrate Unitex-based applications.
Submission procedure
Send a 1-page abstract (max 3000 characters) to : j.unitex2014 at
This page will contain the following information:
- Type of communication:
  * Oral communication (20 minutes)
  * Tutorial (2 hours)
  * Demonstration (10 minutes in plenary session + poster/demo session)
- Name, First name, Institution, e-mail
- Title
- Abstract
- Oral presentation language (French or English)
- For tutorials: explain the rationale for this training and identify the target audience

Draft program
Wednesday 8 October
Afternoon: possibility to visit the region of Tours
Thursday 9 October
Morning: Presentations
Afternoon: Tutorials, Workshop and developers session (parallel sessions)
Friday 10 October
Morning: Presentations
Afternoon: Plenary session (questions to developers, innovations, new orientations, etc.)

Proceedings and Publication
The Workshop proceedings can be published electronically on the website of the workshop, if the authors wish.

Call for papers: Beginning of July
Submission deadline: September 5, 2014
Author notification: September 10-15, 2014

Registration
- Until September 30, 2014: 25 euros/day
- After September 30, 2014: 35 euros/day
- Free registration for experts in charge of tutorials
- A few grants are available for students who do not have access to funding (registration fees waived): submit an application letter.
- Registration fees cover: coffee breaks, lunches and handouts for the tutorials.
Please find below the CFP for the TRELA 2015 Conference: Areas of Research in Applied Linguistics (Paris Diderot University, Paris, France, 8-10 July 2015):

Aujourd'hui, au vingt et unième siècle, la linguistique appliquée est une discipline riche qui se décline en de multiples domaines, par ailleurs liés aux traditions scientifiques des divers pays dans lesquels elle s'est développée: acquisition / apprentissage, bi- / plurilinguisme, didactique des langues, lexicographie, linguistique de corpus, terminologie, traductologie, traitement automatique des langues, variation linguistique. Pourtant, les terrains de recherche de ces disciplines ou domaines sont souvent communs. On le voit dans ce que recouvrent les dénominations dans d'autres langues, comme /applied linguistics/ en anglais, /angewandte Linguistik/ en allemand ou /Lingüística aplicada/ en espagnol. Le vingt et unième siècle est celui de la transdisciplinarité et de la pluridisciplinarité, comme le montre l'émergence de nombreux domaines pluri- ou transdisciplinaires, combinant sciences humaines et sciences exactes par exemple. La linguistique appliquée, par les terrains de recherche qu'elle parcourt, accompagne le développement incontournable de cette pluridisciplinarité. Ce faisant, elle démontre qu'elle ne consiste pas en la seule application de connaissances théoriques mais qu'elle permet l'émergence de nouveaux champs d'investigation qui viennent alimenter la réflexion en sciences du langage. L'objectif du colloque international /TRELA/, qui s'inscrit dans la suite de la réflexion engagée lors du colloque CRELA à Nancy en 2013, est de permettre aux chercheurs et autres acteurs des différents domaines de la linguistique appliquée de se rencontrer sur des enjeux de recherche partagés, amenant ainsi à offrir un éclairage pluridisciplinaire ou transdisciplinaire sur des problématiques croisées.
Dans cette perspective sont attendues des communications portant sur l'un des cinq axes suivants :
- Notion de « terrain » en linguistique appliquée (analyse multicritères de terrains spécifiques, analyse comparative de terrains)
- Linguistique appliquée et sciences du langage: où situer le terrain ? Les terrains se croisent-ils ?
- Modélisation et approches théoriques multiples (combiner plusieurs approches pour étudier le même terrain)
- Ressources, outils et méthodologies d'approche du terrain
- Linguistique appliquée et linguistique théorique : comment construire la complémentarité ? Interrogations épistémologiques face à une complémentarité évidente ?
- Croisements entre linguistique appliquée et traductologie: une approche hybride?

Echéance : 15 octobre 2014
Notification : 20 janvier 2015

======================English CFP=======================

*Paris 8-10 July 2015*

Applied Linguistics in the 21st Century is a rich and varied discipline, with many sub-domains. Each of these has its own research tradition, often associated with the particular countries in which it developed: language acquisition / learning, bi- and multi-lingualism, didactics, lexicography, corpus linguistics, terminology, translation studies, computational linguistics, variation, etc. However, these disciplines often share similar research fields. A prime example of this is the range of areas covered by the term /Applied Linguistics/ and its equivalents /linguistique appliquée/, /angewandte Linguistik/, /Lingüística aplicada/ in their respective languages. Transdisciplinarity and multidisciplinarity are trademarks of the 21st Century, as can be seen in the emergence of so many multi- or transdisciplinary fields, including examples which combine 'humanities' and 'pure sciences'. Applied Linguistics, because of the variety of fields which it is involved in, has followed the inexorable development of this process of hybridization.
Furthermore, the practice of Applied Linguistics has come to involve not only the application of theoretical knowledge, but also the emergence of new fields of investigation, which then feed back into current debates within the language sciences. The aim of the international conference TRELA is to follow up on the CRELA conference in Nancy 2013, and to allow researchers and other practitioners in the different fields of Applied Linguistics to discuss and debate issues relating to common areas of research, pooling ideas on these topics from a multidisciplinary or transdisciplinary perspective. To this end, we invite submissions on any of the following areas:
- The notion of 'area' or 'field' in Applied Linguistics (multi-criteria analysis in specific areas, comparative analysis of different fields, etc.)
- Applied Linguistics and Linguistics: are they two distinct 'areas' or a single 'field', or can they not be divided?
- Models and multiple-theory approaches (combining several approaches to explore the same area)
- Resources, tools and methodologies to explore an 'area' (or conduct 'field' work).
- Applied Linguistics and Theoretical Linguistics: building a complementary approach? And what are the epistemological issues raised by an obvious complementarity?

The innovative Natural Language Processing (NLP) method used by OWI has received numerous awards (winner of the SFR Jeunes Talents competition, Prix TECHINNOV du Créateur Innovant, Entreprise Innovante du Pôle Cap Digital, Microsoft – Finance Innovation competition, Scientipôle Initiative label, etc.). Our clients are large and mid-sized accounts in various sectors, such as Canal +, Ikea, EDF, Bouygues Telecom, MGEN, BPCE Assurance, etc.

Context: Buoyed by its commercial successes in the processing of written text, OWI now intends to extend its business to voice.
To that end, we are looking for a "Research & Development Engineer" specialised in speech-to-text to strengthen our R&D team and work on voice projects that meet our clients' expectations. Within the Research and Development team, you will be in charge of the following voice-related activities:
- Experimenting with speech recognition technologies
- Developing algorithms to optimise these technologies, using the elements provided by the OWI engine
- Enabling OWI solutions to be fed from voice recordings or real-time conversations
- Helping to design "customer experience" and "quality monitoring" dashboards
- Experimenting with real-time assistance for call-centre agents ("help with answering and steering the conversation")
Candidate profile:
- PhD or engineering degree in computer science or NLP
- A good knowledge of speech recognition technologies is mandatory
- You are proficient in C++, Java and SQL
- Previous experience in a similar position is strongly desired
- Proficiency in another foreign language would be a plus
- Your potential and your personality will make the difference: motivation, commitment, rigour, and the ability to get involved in collaborative projects.
Terms:
- Position to be filled quickly in Bourg-la-Reine (92)
- Contract type: CDI (permanent contract)
- Salary: 38 to 50 k€ depending on experience
- CV and cover letter to recrut at
Thank you in advance for your help.
Best regards,
Xiaolu CHEN
Marketing Department
Tel: 01 78 16 12 10 | Email: xiaolu.chen at
OWI Technologies | 31, Av du Général Leclerc, 92340 Bourg-la-Reine
Follow our news:
-------------------------------------------------------------------------
Message distributed by the Langage Naturel list
Information, subscription:
English version:
Archives:
The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership:
ATALA accepts no responsibility for the content of messages posted on the LN list
-------------------------------------------------------------------------
From hamon at LIMSI.FR Sat Jul 12 09:43:13 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:43:13 +0200
Subject: Job: Post-doc IRIT/MELODI, Toulouse
Message-ID:
Date: Fri, 11 Jul 2014 16:32:15 +0200
From: philippe muller
Message-ID: <53BFF56F.5060607 at>
X-url:

As part of the ANR project Asfalda, we are offering a post-doctoral contract to work with the project's IRIT partner at Université Paul Sabatier (Toulouse). The Asfalda project aims to develop a semantically annotated corpus, together with natural language processing tools for semantic analysis that use the corpora built during the project. The semantic annotations rely on the FrameNet standard. FrameNet defines a set of semantic frames: prototypical situations and the characterisation of their participants. Frames are linked to one another in a hierarchical structure enriched with semantic links of various kinds. The goal of the post-doc is to enrich this structure, which is currently sparse.
The frame-to-frame relations created in this way are useful for discourse analysis, and for completing semantic annotations of frame structures that are only partially connected in context. The work will proceed in two stages: first, by focusing on typical links between the types of lemmas involved in the frames under consideration, using unsupervised corpus-based approaches; second, by disambiguating the linked lemmas in order to identify simultaneously the linked frames in their contexts of occurrence and the links between their participants. This second stage will rely on the annotation tools developed in the project's other tasks and on the annotated data collected. We are looking for candidates with skills in Natural Language Processing and machine learning, and ideally with expertise in the project's topics.
Keywords: semantic role labelling, discourse analysis, lexical semantics
Project coordinator: Marie Candito, Alpage, Univ Paris Diderot & INRIA
Applications will be accepted until 31 August 2014.
Duration: 1 year, starting 1 October 2014.

On completing the programme, students are able to analyse the needs of organisations (companies, public institutions, local authorities...) in terms of strategic intelligence (accessing, collecting, processing and communicating information). They can thus audit a monitoring system, design and set up an automated monitoring system, produce electronic information products and services, and lead a community of monitoring analysts. The programme gives students the conceptual, methodological, technical and practical skills needed to take responsibility for information monitoring and analysis projects in various fields: commercial and marketing intelligence, regulatory monitoring, documentary monitoring, image and e-reputation monitoring, competitive intelligence...
Partner companies
The programme relies on a network of partners from the information industry who take part in the teaching: software publishers (AMI Software, KB Crawl, TEMIS, Web Site Watcher...), press and content aggregators (Europresse), specialists in monitoring and /knowledge management/ (Histen Riller, the CCIR Nord Pas de Calais, Kurt Salmon, OTO Research...), the GFII (Groupement français de l'industrie de l'information), and a range of organisations that host interns (Cofidis, CCIR Nord Pas de Calais, Pas de Calais Habitat, SNCF, LVMH, Norauto, Decathlon, Carrefour, Lesaffre, Pierre Fabre, BNP Paribas...).
Modes of study
The programme is available as initial training, as a work-study programme (contrat de professionnalisation) and through VAE (Validation des Acquis d'Expérience, accreditation of prior experience).
Contact & registration
Programme director: Stéphane Chaudiron, Professor of information and communication sciences, stephane.chaudiron at
Administrative office: Mme Delerue, beatrice.delerue at
Website:

From hamon at LIMSI.FR Sat Jul 12 09:26:38 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:26:38 +0200
Subject: These: Anais Lefeuvre, Semantique des temps du francais, une formalisation compositionnelle
Message-ID:
Date: Wed, 9 Jul 2014 18:56:11 +0200
From: anais lefeuvre
Message-ID:
X-url:
X-url:

Dear colleagues, I am pleased to announce that I defended my thesis, entitled "Sémantique des temps du français: une formalisation
compositionnelle", on 23 June at LaBRI. The manuscript can be consulted at the following address:
Abstract: This thesis is part of the Région Aquitaine - INRIA project ITIPY. The project aims at the automatic extraction of itineraries from travel narratives set in the Pyrenees in the 19th and early 20th centuries. Our first task was to characterise the corpus as a sample of French, through a contrastive study of quantitative data on the one hand and of the structure of travel narratives on the other. We then turned to the study of time, and more specifically to the automatic analysis of the semantics of French verb tenses. Given a wide-coverage syntactic and semantic parser for French, based on categorial grammars and compositional semantics (λ-DRT), our task was to take verb tenses into account in order to reconstruct the temporality of events and states, two notions grouped under the term eventuality. This thesis focuses on building a semantic lexicon that handles French verb tenses. We propose an extension and adaptation of a system of compositional operators, designed for English verb tenses, to the tenses and aspect of the French verb from the 19th century to the present day. This formalisation is de facto operational, since it is defined in terms of λ-calculus operators whose composition and reduction, already implemented, automatically compute the desired semantic representations, namely many-sorted formulas of higher-order logic. The transition from an utterance containing a single eventuality to discourse, whose referential network is complex, is discussed, and we conclude with the prospects our work opens up for discourse analysis.
Best regards,
Anaïs Lefeuvre
A. T. E. R.

Rhyming dictionaries are a kind of reverse dictionary. They group words according to rhyming patterns.
Rhymes can share exact sequences of vowel and consonant sounds towards the end of a word (consonant rhyme) or just similar vowel sounds (assonant rhyme). Thus, these dictionaries are based on pronunciation, not on writing patterns. Also, since consonance and assonance depend on the stressed syllable, words which end with a stressed syllable are grouped together, those whose stressed syllable is the next to last appear together, and so on. In addition, word pronunciation may vary with time and across geographical and social dialects. In Spanish, this is particularly clear when loanwords (for instance, Anglicisms and Gallicisms) are considered. In fact, they tend to keep their original spelling, at least in the Mexican variant, which is the most widely spoken one. For example, the following loanwords, common in Mexican Spanish, rhyme: flash, collage, garage, cottage, squash. Their last syllable is stressed and they are ordered in reverse according to their sounds and not their letters (respectively, /fláʃ/, /ko.láʃ/, /ga.ráʃ/, /ko.táʃ/ and /es.kwáʃ/). The project described takes the current nomenclature of the Diccionario del español de México ( to generate automatically a rhyming dictionary. Also, since the results of an online query to such a dictionary can be quite large, a procedure was developed to rank them semantically. The idea is to measure the similarities of the query definition to each of the definitions of the rhyming words.
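
The two steps described above (grouping words by the sounds from the stressed vowel onwards, then ranking the rhyming words by how similar their definitions are to the query's definition) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's implementation: stress is assumed to be marked with an acute accent as in the transcriptions quoted above, Jaccard overlap of definition words stands in for whatever similarity measure the project uses, the English definitions are invented toy data, and all function names are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict

# Assumption: the stressed vowel carries an acute accent, as in /ko.láʃ/.
STRESSED = unicodedata.normalize("NFC", "áéíóú")

# Transcriptions quoted in the text above.
LEXICON = {
    "flash": "/fláʃ/",
    "collage": "/ko.láʃ/",
    "garage": "/ga.ráʃ/",
    "cottage": "/ko.táʃ/",
    "squash": "/es.kwáʃ/",
}

# Invented toy definitions, for illustration only.
DEFINITIONS = {
    "flash": "device that produces bright light",
    "collage": "picture made from glued paper pieces",
    "garage": "building where cars are kept",
    "cottage": "small building in the countryside",
    "squash": "racket sport played indoors",
}


def rhyme_key(transcription):
    """Return the sounds from the stressed vowel to the end of the word."""
    sounds = unicodedata.normalize("NFC", transcription).strip("/").replace(".", "")
    for i, ch in enumerate(sounds):
        if ch in STRESSED:
            return sounds[i:]
    return sounds  # no stress mark found: fall back to the whole word


def group_by_rhyme(lexicon):
    """Map each rhyme key to the list of words that share it."""
    groups = defaultdict(list)
    for word, transcription in lexicon.items():
        groups[rhyme_key(transcription)].append(word)
    return dict(groups)


def jaccard(a, b):
    """Word-overlap similarity between two definitions (a stand-in measure)."""
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def rank_rhymes(query, lexicon, definitions):
    """List the words rhyming with `query`, most similar definition first."""
    key = rhyme_key(lexicon[query])
    candidates = [w for w, t in lexicon.items() if w != query and rhyme_key(t) == key]
    return sorted(
        candidates,
        key=lambda w: jaccard(definitions[query], definitions[w]),
        reverse=True,
    )


if __name__ == "__main__":
    print(group_by_rhyme(LEXICON))
    print(rank_rhymes("garage", LEXICON, DEFINITIONS))
```

With the toy data, all five loanwords fall into a single rhyme group, and a query for "garage" puts "cottage" first because their definitions share a word. A real system would need a proper phonological transcription step and a more robust similarity measure than raw word overlap.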
Semantia offers a complete range of products dedicated to specific markets:
- classified ads with Semantia Classifieds
- e-commerce with Semantia eCommerce
- travel with Semantia Travel
- employment with Semantia Job
These services rely on business reference data and on robust, high-performance, specialised linguistic and semantic models: telecommunications, banking, insurance, energy, IT, hi-fi, video, textiles, tourism, household appliances, DIY, sport, employment, dating, furniture, food, photography, medical, real estate, automotive... Today, the services deployed at Semantia's clients process nearly 5,000,000 user questions and nearly 30,000,000 records per month. Semantia has offices in the Marseille area (head office), in Paris and in San Francisco.
Duties
As a member of the language processing team, your main duties will be:
1. in following up ongoing projects:
- develop and maintain the knowledge bases of existing clients
- track, classify and respond to client requests
- propose new topics
- produce monthly reports for clients
- communicate and liaise with clients
- update documentation
- share best practices with other team members
2. in developing new projects:
- determine the best linguistic strategy according to the client's domain, the languages to be developed and the project concerned
- determine the best configuration strategy for the linguistic engine
- develop the knowledge bases
- prepare or write the necessary documentation
3. in research and development:
- take an active part in the company's R&D activities
4.
in support of marketing and sales:
- contribute your experience and expertise to the design of new products and services
Required knowledge
- Regular expressions and symbolic systems
- Good knowledge of the syntax and morphology of the world's languages
- Ability to adapt to the languages being worked on
- Basic knowledge of web programming languages: PHP, SQL, HTML
- Proficiency in office suites
Desirable knowledge
- Agile management methods, Eclipse development environment, Mac OS X and Linux systems, graphics and Mind Mapping.
Qualities required
- Adaptability and efficiency, teamwork, logic, rigour and motivation, creativity.

Message-ID:
Date: Wed, 09 Jul 2014 21:25:45 +0200
From: pap
Message-ID: <53BD9739.9040401 at>
X-url:

Dear colleagues and other recipients,
Here is a new call for papers, this time not only to take stock of the transformations of terminology, that neighbouring and sister discipline of translation, but also to pay tribute to our colleague John Humbley, who is one of its eminent figures:
*Quo Vadis, Terminologia ?*, hence *Colloque international en hommage à John Humbley* (international conference in honour of John Humbley)
Website:
Dates: 18-20 February 2015
Deadline for submitting proposals: 15 September 2014
Notification: 20 December 2014
Since its birth, notably under the impetus of Eugen Wüster, terminology has undergone many changes, both technical and conceptual. It has been applied to a multitude of uses, from language policy to web indexing, which have led it in turn to confront or hybridise with a wide variety of disciplines: lexicography, translation, technical writing...
More strikingly still, it has given rise to high-level scientific controversies, for example over the true legacy of its founder, over the source to be favoured when searching for information, over its descriptive or normative character, or over its relationship with ontology. It seems to us particularly close to translation, in that both are, like the author in Flaubert, "present everywhere and visible nowhere". Like translation, its history is closely tied to that of the nation, since the survival or emergence of a language depends on that language's ability to name all the objects of reality (Michel Serres). Everyone does terminology and uses terminology, but often with a form of ignorance that ranges from outright naivety to outright denial. The same could be said of corpus linguistics, with which terminology has become closely linked over the past two decades in an approach that is contextual, dear to the British contextualists, and no longer merely conceptual. From the shoebox of yesteryear to today's corpora and databases, which will tomorrow be interconnected, we no longer classify or order the data of specialised knowledge (that is, concepts and terms, and the relations between them) as we did in the past. These are all reasons to organise an international, multidisciplinary conference to raise, with specialists from these various fields, the question of the unity of terminology or terminologies, to take stock of these various branches and hybridisations, and to discern their prospects for development. And if one had to look for a point of contact, a unifying element, a compass in this terminological ocean, perhaps one should turn, beyond the scientific aspects, to a person whose name would be a reference in each of these fields.
One name stands out here: John Humbley. Through the extreme richness of his work in the many fields mentioned above, the diversity of his professional and institutional experience, the contacts he has established and maintains worldwide, his breadth of vision and his availability on all these matters, and his participation in a multitude of review committees, our colleague John Humbley is and has been a leading actor in, and witness to, these upheavals. It is in tribute to him and to his work that we have decided to organise this major international conference, the results of which will, after peer review, be published in a thematic issue of a prestigious international journal of translation studies and terminology.

Several search methods are available there: by keyword, by author, by year or by full text.

Theme: Data-to-Text Generation
Funded by the ITEA ModelWriter Project
Main topic: Natural Language Generation for the Semantic Web
Description: There is a growing need in the semantic web (SW) community for technologies that give humans easy access to the machine-oriented Web of data. Because it maps data to text, Natural Language Generation (NLG) provides a natural means of presenting this data in an organized, coherent and accessible way. Conversely, the representation languages used by the semantic web (e.g., OWL ontologies and RDF data) are a natural starting ground for NLG systems. The aim of the PhD thesis will be to explore the interaction between the semantic web, the textual web and Natural Language Generation (NLG). More precisely, the goal will be to develop generic weakly supervised methods for generating text from semantic web data, in particular content selection and verbalisation methods.
The project will build on an ongoing collaboration between LORIA (Nancy, France), the KRDB group at (Bolzano, Italy) and Stanford Research International (USA), bringing together high-level academic partners with internationally recognised expertise in both NLG (LORIA) and knowledge processing (KRDB, SRI).
Profile: We are looking for outstanding young research scientists with a good honours degree in Computational Linguistics or Computer Science, with programming skills and with a strong interest in Natural Language Processing.
Required skills:
- Master's degree in Computational Linguistics or Computer Science
- experience in Natural Language Processing
- good command of the English language
Desirable skills:
- experience in natural language generation
Supervisor:
- Claire Gardent,
Research Environment: LORIA is a computer science research unit which conducts most of its scientific activities in partnership with the Inria Nancy - Grand-Est Centre, the French National Centre for Scientific Research (CNRS) and the University of Lorraine. We also maintain close ties with research institutes and universities from the wider region, notably in Saarbrücken and Luxembourg. With around 500 staff and 27 research teams, it is one of the biggest research units in Lorraine. It conducts research in Algorithms, Computation, Image & Geometry; Formal methods; Networks, Systems and Services; Knowledge & Natural Language Processing; and Complex Systems & Artificial Intelligence. The PhD will be funded by the ITEA3 ModelWriter project ( for a period of 36 months. Including industrial and academic partners from France, Belgium and Turkey, this project targets the development of an integrated authoring environment combining a semantic parser, a data-to-text generator and Knowledge Capture Tools. The PhD Candidate will be working in collaboration with the members of the Synalp team (, a Research Group in Computational Linguistics.
Synalp research focuses on hybrid, symbolic and statistical approaches to natural language processing and applications built thereon, including NLP for Man-Machine Dialog, for language learning and for Data Verbalisation.
Location: Nancy ( is a high-tech city located at the heart of the Lorraine Region, in outstanding scientific and natural surroundings. It is 1h30 by train from Paris, Germany and Luxembourg and 1h from Paris Roissy international airport.

JOB REQUIREMENTS
- Ph.D. in Computer Science, Natural Language Processing, Information Retrieval, Information Extraction
- Solid programming skills in Java environment
- Strong publication record
- Participation in international evaluation campaigns such as TREC or KBP is a plus
- English and French speaking
Please send your resume to: Eric Charton eric.charton at

From hamon at LIMSI.FR Sat Jul 12 09:35:39 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:35:39 +0200
Subject: Appel: Special Session: Environmental and geo-spatial data analytics (EnGeoData) - DSAA'2014
Message-ID:
Date: Thu, 10 Jul 2014 09:32:00 +0200
From: Mathieu Roche
Message-ID: <6f398480685853b8b5519f758d7cb42d at>
X-url:

Call for Papers
Special Session: Environmental and geo-spatial data analytics (EnGeoData)
DSAA'2014 - IEEE International Conference on Data Science and Advanced Analytics
with ACM SIGKDD and technically co-sponsored by IEEE Computational Intelligence Society
30
October - 1 November, 2014, Shanghai, China
Contact: engeodata at
Web:

AIM AND SCOPE
Environmental and, more generally, geo-spatial information is now provided by crowdsourcing but also by public administrations in the context of open data policies. Analysing such data is still challenging, firstly because of their heterogeneity (structural, semantic, spatial and temporal), and secondly because of the difficulty in choosing the "best" knowledge discovery process to apply, according to the needs of the experts in the field. This special session aims at discussing and assessing some of these strategies covering all or part of the issues mentioned above, from a theoretical or experimental point of view.

TOPICS
- Pre and Post Data processing
- Data Quality, Result Evaluation
- Data Mining or Data Warehousing Applications
- Text-Mining
- Visual Analytics
- KDD real use-cases dedicated to environmental and geo-spatial Data

PAPER SUBMISSION
- Papers should be submitted via the DSAA submission site, choosing the Special Session on "Environmental and geo-spatial data analytics (EnGeoData)", before 22nd July 2014 (PST).
- Conference paper submissions should be limited to a maximum of seven (7) pages, in the IEEE 2-column format (see the IEEE Proceedings Author Guidelines:
- All submissions will be blind reviewed by the Program Committee on the basis of technical quality, relevance to conference topics of interest, originality, significance, and clarity. Author names and affiliations must not appear in the submissions, and bibliographic references must be adjusted to preserve author anonymity.
- Accepted conference papers will be published in the conference proceedings by IEEE, included in the IEEE Xplore Digital Library, and submitted for EI indexing through INSPEC by IEEE.

[Apologies for any cross-postings.]
CALL FOR PARTICIPATION
WoLLIC 2014
21st Workshop on Logic, Language, Information and Computation
September 1st to 4th, 2014
Valparaiso, Chile
(Co-located with ISR 2014 - 7th International School on Rewriting)

SCIENTIFIC SPONSORSHIP
Interest Group in Pure and Applied Logics (IGPL)
The Association for Logic, Language and Information (FoLLI)
Association for Symbolic Logic (ASL)
European Association for Theoretical Computer Science (EATCS)
European Association for Computer Science Logic (EACSL)
Sociedade Brasileira de Computação (SBC)
Sociedade Brasileira de Lógica (SBL)

ORGANISATION
Department of Computer Science, Universidad de Chile, Chile
Department of Computer Science, Pontificia Universidad Católica de Chile, Chile
Centro de Informática, Universidade Federal de Pernambuco, Brazil

HOSTED BY
Department of Informatics, Universidad Técnica Federico Santa María, Chile

CALL FOR PARTICIPATION
WoLLIC is an annual international forum on inter-disciplinary research involving formal logic, computing and programming theory, and natural language and reasoning. Each meeting includes invited talks and tutorials as well as contributed papers. The twenty-first WoLLIC will be held at the Universidad Técnica Federico Santa María, from September 1st to 4th, 2014. It is sponsored by the Association for Symbolic Logic (ASL), the Interest Group in Pure and Applied Logics (IGPL), the Association for Logic, Language and Information (FoLLI), the European Association for Theoretical Computer Science (EATCS), the European Association for Computer Science Logic (EACSL), the Sociedade Brasileira de Computação (SBC), and the Sociedade Brasileira de Lógica (SBL).
INVITED TALKS
Verónica Becher (Universidad de Buenos Aires): *On Normal Numbers*
Juha Kontinen (University of Helsinki): *Dependence Logic*
Aarne Ranta (University of Gothenburg): *Syntax and Semantics for Translation*
Kazushige Terui (Kyoto University): *Intersection Types for Normalization and Verification*
Luca Vigano (Università di Verona): *Modal and Temporal Deduction Systems for Quantum State Transformations*
Thomas Wilke (Christian-Albrechts-Universität zu Kiel): *Backward Deterministic Büchi Automata*

TUTORIAL LECTURES
Aarne Ranta (University of Gothenburg)
Luca Vigano (Università di Verona)

EARLY REGISTRATION (UNTIL AUGUST 20TH)
General: US$ 300
Latin American students: US$ 150

LATE REGISTRATION
General: US$ 350
Latin American students: US$ 200

PROGRAMME COMMITTEE
Ulrich Kohlenbach (Technische Universität Darmstadt) - Chair
Natasha Alechina (University of Nottingham)
Eric Allender (Rutgers University)
Marcelo Arenas (Pontificia Universidad Católica de Chile)
Steve Awodey (Carnegie Mellon University)
Stefano Berardi (Università di Torino)
Julian Bradfield (University of Edinburgh)
Xavier Caicedo (Universidad de los Andes de Chile)
Olivier Danvy (University of Aarhus)
Hans van Ditmarsch (LORIA)
Marcus Kracht (University of Bielefeld)
Michiel van Lambalgen (University of Amsterdam)
Klaus Meer (Technische Universität Cottbus)
George Metcalfe (University of Bern)
Dale Miller (INRIA/LIX)
Russell Miller (City University of New York)
Sara Negri (University of Helsinki)
Grigory Olkhovikov (Urals State University)
Nicole Schweikardt (Goethe-University Frankfurt am Main)
Sebastiaan Terwijn (Radboud University Nijmegen)

STEERING COMMITTEE
Samson Abramsky, Johan van Benthem, Anuj Dawar, Joe Halpern, Wilfrid Hodges, Daniel Leivant, Leonid Libkin, Angus Macintyre, Grigori Mints (in memoriam), Luke Ong, Hiroakira Ono, Ruy de Queiroz.

ORGANISING COMMITTEE
Pablo Barceló (Universidad de Chile) (Local chair)
Anjolina G.
de Oliveira (U Fed Pernambuco)
Ruy de Queiroz (U Fed Pernambuco) (co-chair)
Juan Reutter (Pontificia Universidad Católica de Chile)
Cristián Riveros (Pontificia Universidad Católica de Chile)

FURTHER INFORMATION
Contact one of the Co-Chairs of the Organising Committee.

The focus of this second workshop is on definition practices in either human or machine-assisted ontology development.

PRESENTATION
A current problem in ontology development is constructing the needed definitions of terms, whether logical or in natural language. For example, ontologies built using OBO Foundry principles are advised to include both logical and natural language definitions, but ontology developers too often focus on only one of these, or they pay insufficient attention to whether the two are equivalent. Explicit definitions of terms in ontologies serve a number of purposes. Logical definitions allow reasoners to create inferred hierarchies, lessening the burden of asserting and checking the validity of subsumptions. Natural language definitions help to ameliorate the pervasive problem of low inter-annotator agreement. In specialized domains, experts will know their own field well, but may only have limited knowledge of adjacent disciplines. Good definitions make it possible for non-experts to understand unfamiliar terms and thereby allow more confident reuse of terms by external ontologies, which in turn facilitates data integration. The goal of this workshop is to bring together interested researchers and developers to explore these issues by presenting case studies in a biomedical domain and discussing the difficulties that arise when constructing definitions, with a view to sharing strategies in the future. Even in the seemingly narrow domain of definition construction, cross-fertilization from related disciplines should yield benefits in quality and help to identify novel approaches.
Papers submitted should include one or more case studies and raise specific questions related to definitions with a link to a biomedical domain. Reports on successful or unsuccessful methods are both appropriate. TOPICS - experiences in formulating definitions - tools that assist in definition editing, including collaborative systems - coordination of logical and textual definitions - validation and quality control of definitions, e.g., checking that definitions comply with the all/some form - methods for constructing definitions from multiple sources - use of controlled languages such as Rabbit or ACE for more user-friendly logical definition creation - use of templates to systematize definition creation FORMAT AND OUTCOMES This will be a half-day workshop with a selected mix of presentations based on accepted papers. In order to promote discussion, each presentation will be followed by a short response by a participant of the workshop to be arranged in advance of the workshop. This workshop will document findings on the workshop’s website ( We expect accepted papers to be published in the Journal of Biomedical Semantics (JBS). 
Papers should be between 5 and 10 pages long (rendered), excluding references, formatted using the JBS templates at, and submitted via EasyChair ( IMPORTANT DATES Workshop paper submission EXTENDED DEADLINE: July 25, 2014 Notification of paper acceptance: August 15, 2014 Camera-ready copies for the proceedings: September 15, 2014 Workshops: October 6-7, 2014 ORGANIZING COMMITTEE Selja Seppälä (University at Buffalo, USA) Patrick Ray (University at Buffalo, USA) Alan Ruttenberg (University at Buffalo, USA) PROGRAM COMMITTEE Nathalie Aussenac-Gilles (National Center for Scientific Research (CNRS), France) Mélanie Courtot (MBB Department Simon Fraser University and BC Public Health Microbiology & Reference Laboratory, Canada) Natalia Grabar (Université de Lille 3, France) Janna Hastings (European Bioinformatics Institute, Cambridge, UK) James Malone (European Bioinformatics Institute, Cambridge, UK) Alexis Nasr (Aix Marseille Université, France) Richard Power (The Open University, UK) Allan Third (The Open University, UK) SUPPORTED BY The Swiss National Science Foundation (SNSF) The State University of New York at Buffalo From hamon at LIMSI.FR Tue Jul 15 19:49:23 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 15 Jul 2014 21:49:23 +0200 Subject: Appel: AICCSA 2014, The 11th ACS/IEEE International Conference on Computer Systems and Applications Message-ID: Date: Mon, 14 Jul 2014 09:56:25 -0700 From: Abdelmounaam Rezgui Message-ID: X-url: ACS/IEEE AICCSA2014 International Conference 10-13 November, 2014
Doha, Qatar ------------------- The ACS/IEEE International Conference on Computer Systems and Applications is a premier Computer Science and Engineering Conference. It is an international forum for researchers and practitioners interested in the advances of computer systems and their applications. The 11th edition of AICCSA will be organized in Doha, Qatar by the Department of Computer Science and Engineering (CSE), College of Engineering at Qatar University on November 10-13, 2014. We are pleased to invite you to submit your papers to AICCSA’14. Any theoretical, conceptual or applicative paper, or a survey of the state of the art, is welcome. Topics of interest include (but are not limited to) the following areas: * Cloud and Distributed Computing * Networking, Sensor Networks, Mobile Computing * High Performance Computing * Multimedia, Computer Vision and Image Processing * Big Data, Business Intelligence, Analytics * IR, Data and Knowledge Management * BPM, Web Services, SOA * Natural Language Processing * Interoperability, Semantic Web and Future Internet Technologies * Social Computing * Security and Privacy * E-Learning, M-Learning Proceedings Papers selected for presentation will appear in the Conference Proceedings, which will be published by the IEEE Computer Society and be submitted to IEEE Xplore for inclusion. Regular Papers Papers must be submitted electronically by the deadline below to All papers will be reviewed and judged on merits including originality, significance, interest, correctness, clarity, and relevance to the broader community. Beyond great feedback on their work, they will have a unique opportunity to be part of an international PhD network. Tutorial Proposals Proposals for tutorials and panels should be submitted to the tutorial chair with a copy to the program chairs. Conference Awards Awards will be given to the best Conference Paper and the best Doctoral Symposium presentation. 
Selection for Journals Best papers of the conference will be selected and proposed for publication in indexed international journals such as Cluster Computing Journal, International Journal of Secure Software Engineering, International Journal of Product Lifecycle Management, International Journal Engineering Applications of Artificial Intelligence (EAAI), International Journal of Computer Vision and Image Processing (IJCVIP) Industrial Sessions A half-day will be dedicated to industrial sessions and panels, managed by the Industrial Advisory board of the conference. Research Collaboration and Networking Take a unique opportunity to attend the dedicated Panel on collaboration possibilities with Qatar research teams and get the latest news on currently running projects and funding possibilities through the Qatar National Research Fund (QNRF) (NPRP, exceptional projects, etc.). Rich Social Program Discover Qatari culture through organized visits to famous local museums, the picturesque Corniche, Katara Cultural Village, and the fantastic Souk Waqif! Experience an unforgettable complimentary 4x4 Desert Safari and Bedouin campsite in the heart of the wonderful dunes of Qatar! Important Dates Research Paper Submissions July 21, 2014 (extended from July 1, 2014) Notification of Acceptance September 1, 2014 Camera Ready September 8, 2014 Author Registration September 8, 2014 PhD Symposium Submissions September 8, 2014 Tutorial Proposals July 1, 2014 For more information about the conference, please visit: For any inquiries, please use this email: aiccsa2014 at Please kindly redistribute this CFP to all relevant research venues. 
------------------------------------------------------------------------- From hamon at LIMSI.FR Tue Jul 15 19:56:11 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 15 Jul 2014 21:56:11 +0200 Subject: Appel: Deadline extension, ToRPorEsp (at Propor 2014) Message-ID: Date: Tue, 15 Jul 2014 09:07:36 -0300 From: Muntsa Padró Message-ID: X-url: ---------------------------------- Extended deadline for abstract submission: 24th of July ---------------------------------- The workshop on Tools and Resources for Automatically Processing Portuguese and Spanish aims to be a forum for the presentation and discussion of language-specific developments for Portuguese and Spanish. We expect to join together researchers and developers with a focus on the creation of tools and linguistic resources for these two languages. A special interest of the workshop is to facilitate access to technologies and resources that are specific to Portuguese and Spanish. We intend that the workshop will contribute to make tools and resources easily available to the local community. To that aim, we encourage submissions that are oriented to simplify the integration of a given tool or resource to address specific needs in these regions. Detailed descriptions, motivation of utility with specific scenarios, even tutorial-like approaches will be highly appreciated. All workshop-related materials will be readily available from the workshop webpage, to promote adoption. Important Dates: - NEW! 
July 24th, 2014: Abstract submission deadline - September 1st, 2014: Notification of acceptance - October 9th, 2014: Workshop Visit for more information ------------------------------------------------------------------------- From hamon at LIMSI.FR Tue Jul 15 19:53:30 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 15 Jul 2014 21:53:30 +0200 Subject: Ressources: ELRA - Language Resources Catalogue - Update Message-ID: Date: Tue, 15 Jul 2014 10:42:00 +0200 From: ELRA ELDA Information Message-ID: <53C4E958.4070605 at> X-url: X-url: X-url: [Our apologies if you have received multiple copies of this announcement.] ***************************************************************** ELRA - Language Resources Catalogue - Update ***************************************************************** We are happy to announce that 3 new Speech resources are now available in our catalogue. *ELRA-S0366 LECTRA (LECture TRAnscriptions in European Portuguese)* This corpus is composed of the audio and the manual transcriptions from seven 1-semester University courses in Portuguese. The corpus contains a total of 28 hours of audio speech that were manually transcribed by several trained annotators. The corpus is comprised of technical University lectures. For more information, see: *ELRA-S0367 CORAL Corpus* The CORAL Corpus is a collection of spoken dialogues in European Portuguese. It consists of 56 dialogues about a predetermined subject: maps. 
One of the participants (giver) has a map with some landmarks and a route drawn between them; the other (follower) has also landmarks, but no route and consequently must reconstruct it. Only orthographic transcription was done for the whole corpus. A pilot recording was annotated in several levels. For more information, see: *ELRA-S0370 MoveOn Speech and Noise Corpus* The MoveOn Speech and Noise Corpus is a corpus recorded under the extreme conditions of the motorcycle environment within the MoveOn project. Please send it to interested colleagues and students. Thanks! CALL FOR EXTENDED ABSTRACTS, PAPERS, WORKSHOPS and TUTORIALS! ************************************************************************ International Conference on Information Society (i-Society 2014) Technical Co-Sponsored by IEEE UK/RI Computer Chapter 10-12 November, 2014 Venue: London Heathrow Marriott Hotel London, UK ************************************************************************ The i-Society 2014 is Technical Co-Sponsored by UK/RI Computer Chapter. The i-Society is a global knowledge-enriched collaborative effort that has its roots from both academia and industry. The conference covers a wide spectrum of topics that relate to information society, which includes technical and non-technical research areas. The mission of i-Society 2014 conference is to provide opportunities for collaboration of professionals and researchers to share existing and generate new knowledge in the field of information society. The conference encapsulates the concept of interdisciplinary science that studies the societal and technological dimensions of knowledge evolution in digital society. The i-Society bridges the gap between academia and industry with regards to research collaboration and awareness of current development in secure information management in the digital society. 
The topics in i-Society 2014 include but are not confined to the following areas: *New enabling technologies - Internet technologies - Wireless applications - Mobile Applications - Multimedia Applications - Protocols and Standards - Ubiquitous Computing - Virtual Reality - Human Computer Interaction - Geographic information systems - e-Manufacturing *Intelligent data management - Intelligent Agents - Intelligent Systems - Intelligent Organisations - Content Development - Data Mining - e-Publishing and Digital Libraries - Information Search and Retrieval - Knowledge Management - e-Intelligence - Knowledge networks *Secure Technologies - Internet security - Web services and performance - Secure transactions - Cryptography - Payment systems - Secure Protocols - e-Privacy - e-Trust - e-Risk - Cyber law - Forensics - Information assurance - Mobile social networks - Peer-to-peer social networks - Sensor networks and social sensing *e-Learning - Collaborative Learning - Curriculum Content Design and Development - Delivery Systems and Environments - Educational Systems Design - e-Learning Organisational Issues - Evaluation and Assessment - Virtual Learning Environments and Issues - Web-based Learning Communities - e-Learning Tools - e-Education *e-Society - Global Trends - Social Inclusion - Intellectual Property Rights - Social Infonomics - Computer-Mediated Communication - Social and Organisational Aspects - Globalisation and developmental IT - Social Software *e-Health - Data Security Issues - e-Health Policy and Practice - e-Healthcare Strategies and Provision - Medical Research Ethics - Patient Privacy and Confidentiality - e-Medicine *e-Governance - Democracy and the Citizen - e-Administration - Policy Issues - Virtual Communities *e-Business - Digital Economies - Knowledge economy - eProcurement - National and International Economies - e-Business Ontologies and Models - Digital Goods and Services - e-Commerce Application Fields - e-Commerce Economics - e-Commerce 
Services - Electronic Service Delivery - e-Marketing - Online Auctions and Technologies - Virtual Organisations - Teleworking - Applied e-Business - Electronic Data Interchange (EDI) *e-Art - Legal Issues - Patents - Enabling technologies and tools *e-Science - Natural sciences in digital society - Biometrics - Bioinformatics - Collaborative research *Industrial developments - Trends in learning - Applied research - Cutting-edge technologies * Research in progress - Ongoing research from undergraduates, graduates/postgraduates and professionals Important Dates: *Extended Abstract (Work in Progress) Submission Date: August 20, 2014 *Notification of Extended Abstract (Work in Progress) Acceptance/Rejection: August 31, 2014 *Research Paper, Student Paper, Case Study, Report Submission Date: August 31, 2014 *Notification of Research Paper, Student Paper, Case Study, Report Acceptance/Rejection: September 15, 2014 *Camera Ready Paper Due: October 10, 2014 *Proposal for Workshops: September 01, 2014 *Notification of Workshop Acceptance/Rejection: September 10, 2014 *Poster/Demo Proposal Submission: August 31, 2014 *Notification of Poster/Demo Acceptance: September 10, 2014 *Participant(s) Registration (Open): May 01, 2014 *Early Bird Registration Deadline: September 30, 2014 *Late Bird Registration Deadline (Authors only): October 01 to October 15, 2014 *Late Bird Registration Deadline (Participants only): October 01 to November 03, 2014 *Conference Dates: November 10-12, 2014 For more details, please visit From hamon at LIMSI.FR Tue Jul 15 20:05:06 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 15 Jul 2014 22:05:06 +0200 Subject: Appel: SSST-8, 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (EMNLP 2014) Message-ID: Date: Tue, 15 Jul 2014 16:06:09 +0100 From: Eva Maria Vecchi Message-Id: <8AEBA765-3C7B-4998-AEF8-E82AB7F38E08 at> X-url: Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) EMNLP 2014 / 
SIGMT / SIGLEX Workshop Oct 2014, Doha, Qatar *** Special theme: Compositional Distributional Semantics and Machine Translation *** The Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) seeks to bring together a large number of researchers working on diverse aspects of structure, semantics and representation in relation to statistical machine translation. Since its first edition in 2006, its program each year has comprised high-quality papers discussing current work spanning topics including: new grammatical models of translation; new learning methods for syntax- and semantics-based models; formal properties of synchronous/transduction grammars (hereafter S/TGs); discriminative training of models incorporating linguistic features; using S/TGs for semantics and generation; and syntax- and semantics-based evaluation of machine translation. We invite two types of submissions this year: 1. Extended abstracts for poster or hands-on presentations on the special theme 2. Full papers spanning all areas of interest for SSST =========================== Special Theme Extended Abstracts =========================== This year, the special theme of semantics of the past three editions of SSST takes a new step with a "working workshop" bringing together researchers interested in compositional distributional semantics, distributed representations, and continuous vector space models in MT, with tutorials bridging both directions, as well as discussions and hands-on work on relevant tasks with real data. Such models have proven beneficial for a number of NLP tasks, for example phrasal similarity, lexical entailment, modeling semantic deviance, detecting order restrictions in recursive structures, or improving NP bracketing in parsing. However, they have not received as much attention in MT. 
Extended abstracts of at most two (2) pages should describe poster or hands-on presentations that will stimulate discussions on the special theme of compositional distributional semantics and machine translation, including position papers, recent work, pilot studies, negative results. We encourage the presentation of relevant work that has been published or submitted elsewhere, as well as new work in progress. ========= Full Papers ========= The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is now wide consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and similar notational equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages. Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations. 
At the same time, SMT has seen a movement toward semantics over the past few years, which has been reflected at recent SSST workshops, including the last three editions which had semantics for SMT as a special theme. The issues of deep syntax and shallow semantics are closely linked and SSST-8 continues to encourage submissions on semantics for MT in a number of directions, including semantic role labeling, sense disambiguation, and compositional distributional semantics for translation and evaluation. We invite papers on: syntax-based / semantics-based / tree-structured SMT machine learning techniques for inducing structured translation models algorithms for training, decoding, and scoring with semantic representation structure empirical studies on adequacy and efficiency of formalisms creation and usefulness of syntactic/semantic resources for MT formal properties of synchronous/transduction grammars learning semantic information from monolingual, parallel or comparable corpora unsupervised and semi-supervised word sense induction and disambiguation methods for MT lexical substitution, word sense induction and disambiguation, semantic role labeling, textual entailment, paraphrase and other semantic tasks for MT semantic features for MT models (word alignment, translation lexicons, language models, etc.) 
evaluation of syntactic/semantic components within MT (task-based evaluation) scalability of structured translation methods to small or large data applications of S/TGs to related areas including: speech translation formal semantics and semantic parsing paraphrases and textual entailment information retrieval and extraction syntactically- and semantically-motivated evaluation of MT compositional distributional semantics in MT distributed representations and continuous vector space models in MT ========= Organizers ========= Dekai WU, Hong Kong University of Science and Technology (HKUST) Marine CARPUAT, National Research Council (NRC) Canada Xavier CARRERAS, Universitat Politècnica de Catalunya (UPC) Eva Maria VECCHI, Cambridge University ============= Important Dates ============= Submission deadline for papers and extended abstracts: 26 Jul 2014 Notification to authors: 26 Aug 2014 Camera copy deadline: 15 Sep 2014 For more information ------------------------------------------------------------------------- From hamon at LIMSI.FR Tue Jul 15 20:15:08 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 15 Jul 2014 22:15:08 +0200 Subject: Appel: TSD 2014, Call for Demonstrations and Participation, 8-12 September 2014, Brno, Czech Republic Message-ID: Date: Tue, 15 Jul 2014 22:06:24 +0200 From: TSD 2014 Message-Id: X-url: ********************************************************* TSD 2014 - CALL FOR DEMONSTRATIONS AND PARTICIPATION ********************************************************* Seventeenth International Conference on TEXT, SPEECH and DIALOGUE (TSD 2014) Brno, Czech 
Republic, 8-12 September 2014 SUBMISSION OF DEMONSTRATION ABSTRACTS Authors are invited to present actual projects, developed software and hardware or interesting material relevant to the topics of the conference. The authors of the demonstrations should provide an abstract not exceeding one page as plain text. The submission must be made using the online form available on the conference web pages. The accepted demonstrations will be presented during a special Demonstration Session (see the Demo Instructions at Demonstrators can present their contribution on their own notebook, with an Internet connection provided by the organisers, or the organisers can prepare a PC with multimedia support for demonstrators. IMPORTANT DATES August 3 2014 ............ Submission of demonstration abstracts August 10 2014 ............ Notification of acceptance for demonstrations sent to the authors September 8-12 2014 ........ Conference dates The demonstration abstracts will not appear in the Proceedings of TSD 2014 but they will be published electronically at the conference website. KEYNOTE SPEAKERS Ralph Grishman, New York University, USA Active Learning for Information Extraction Bernardo Magnini, FBK - Fondazione Bruno Kessler, Italy Entailment graphs for text analytics Salim Roukos, IBM, USA Recent Progress in Statistical Machine Translation: Algorithms and Applications The conference is organized by the Faculty of Informatics, Masaryk University, Brno, and the Faculty of Applied Sciences, University of West Bohemia, Pilsen. The conference is supported by the International Speech Communication Association. Venue: Brno, Czech Republic TSD SERIES The TSD series has evolved as a prime forum for interaction between researchers in both spoken and written language processing from all over the world. Proceedings of TSD form a book published by Springer-Verlag in their Lecture Notes in Artificial Intelligence (LNAI) series. 
TSD Proceedings are regularly indexed by Thomson Reuters Conference Proceedings Citation Index. Moreover, LNAI series are listed in all major citation databases such as DBLP, SCOPUS, EI, INSPEC or COMPENDEX. TOPICS Topics of the conference will include (but are not limited to): Corpora and Language Resources (monolingual, multilingual, text and spoken corpora, large web corpora, disambiguation, specialized lexicons, dictionaries) Speech Recognition (multilingual, continuous, emotional speech, handicapped speaker, out-of-vocabulary words, alternative way of feature extraction, new models for acoustic and language modelling) Tagging, Classification and Parsing of Text and Speech (morphological and syntactic analysis, synthesis and disambiguation, multilingual processing, sentiment analysis, credibility analysis, automatic text labeling, summarization, authorship attribution) Speech and Spoken Language Generation (multilingual, high fidelity speech synthesis, computer singing) Semantic Processing of Text and Speech (information extraction, information retrieval, data mining, semantic web, knowledge representation, inference, ontologies, sense disambiguation, plagiarism detection) Integrating Applications of Text and Speech Processing (machine translation, natural language understanding, question-answering strategies, assistive technologies) Automatic Dialogue Systems (self-learning, multilingual, question-answering systems, dialogue strategies, prosody in dialogues) Multimodal Techniques and Modelling (video processing, facial animation, visual speech synthesis, user modelling, emotions and personality modelling) Papers on processing of languages other than English are strongly encouraged. 
PROGRAM COMMITTEE Hynek Hermansky, USA (general chair) Eneko Agirre, Spain Genevieve Baudoin, France Paul Cook, Australia Jan Cernocky, Czech Republic Simon Dobrisek, Slovenia Karina Evgrafova, Russia Darja Fiser, Slovenia Radovan Garabik, Slovakia Alexander Gelbukh, Mexico Louise Guthrie, GB Jan Hajic, Czech Republic Eva Hajicova, Czech Republic Yannis Haralambous, France Ludwig Hitzenberger, Germany Jaroslava Hlavacova, Czech Republic Ales Horak, Czech Republic Eduard Hovy, USA Maria Khokhlova, Russia Daniil Kocharov, Russia Ivan Kopecek, Czech Republic Valia Kordoni, Germany Steven Krauwer, The Netherlands Siegfried Kunzmann, Germany Natalija Loukachevitch, Russia Vaclav Matousek, Czech Republic Diana McCarthy, United Kingdom France Mihelic, Slovenia Hermann Ney, Germany Elmar Noeth, Germany Karel Oliva, Czech Republic Karel Pala, Czech Republic Nikola Pavesic, Slovenia Fabio Pianesi, Italy Maciej Piasecki, Poland Adam Przepiorkowski, Poland Josef Psutka, Czech Republic James Pustejovsky, USA German Rigau, Spain Leon Rothkrantz, The Netherlands Anna Rumshisky, USA Milan Rusko, Slovakia Mykola Sazhok, Ukraine Pavel Skrelin, Russia Pavel Smrz, Czech Republic Petr Sojka, Czech Republic Stefan Steidl, Germany Georg Stemmer, Germany Marko Tadic, Croatia Tamas Varadi, Hungary Zygmunt Vetulani, Poland Pascal Wiggers, The Netherlands Yorick Wilks, GB Marcin Wolinski, Poland Victor Zakharov, Russia FORMAT OF THE CONFERENCE The conference program will include presentation of invited papers, oral presentations, and poster/demonstration sessions. Papers will be presented in plenary or topic oriented sessions. Social events including a trip in the vicinity of Brno will allow for additional informal interactions. OFFICIAL LANGUAGE The official language of the conference is English. ACCOMMODATION The organizing committee will arrange discounts on accommodation in the 4-star hotel at the conference venue. 
The current prices of the accommodation are available at the conference website. ADDRESS All correspondence regarding the conference should be addressed to Ales Horak, TSD 2014 Faculty of Informatics, Masaryk University Botanicka 68a, 602 00 Brno, Czech Republic phone: +420-5-49 49 18 63 fax: +420-5-49 49 18 20 email: tsd2014 at The official TSD 2014 homepage is: LOCATION Brno is the second largest city in the Czech Republic, with a population of almost 400,000, and is the country's judiciary and trade-fair center. Brno is the capital of South Moravia, which is located in the south-east part of the Czech Republic and is known for a wide range of cultural, natural, and technical sights. South Moravia is a traditional wine region. Brno has been a Royal City since 1347, and with its six universities it forms a cultural center of the region. Brno can be reached easily by direct flights from London, Moscow, and Eindhoven, and by trains or buses from Prague (200 km) or Vienna (130 km). For participants with some extra time, nearby places may also be of interest. Local ones include: Brno Castle, now called Spilberk, Veveri Castle, the Old and New City Halls, the Augustine Monastery with St. Thomas Church and the crypt of the Moravian Margraves, the Church of St. James, the Cathedral of St. Peter & Paul, the Carthusian Monastery in Kralovo Pole, and the famous Villa Tugendhat designed by Mies van der Rohe, along with other important buildings of interwar Czech architecture. For those willing to venture out of Brno, the Moravian Karst with Macocha Chasm and Punkva caves, the battlefield of the Battle of the Three Emperors (Napoleon, Russian Alexander and Austrian Franz - Battle of Austerlitz), the Chateau of Slavkov (Austerlitz), Pernstejn Castle, Buchlov Castle, Lednice Chateau, Buchlovice Chateau, Letovice Chateau, Mikulov with one of the largest Jewish cemeteries in Central Europe, Telc - a town on the UNESCO heritage list, and many others are all within easy reach. 
The workshop continues the successful series of workshops held previously in cooperation with ACM SIGSPATIAL and in conjunction with SIGIR and CIKM conferences. The purpose of the workshop is to bring together members of the vibrant and growing community of researchers and practitioners working in the field of geographic information retrieval to discuss current research activity and potential future research directions. The subject and format of the workshop ----------------------------------------------------- There is a vast quantity of information in text documents and other media that is referenced to geographic space. The discipline of Geographical Information Retrieval (GIR) is concerned with developing methods to gain access to this geographical information, with a particular focus on the content of web documents and social media. Because much of the information is in the form of unstructured or semi-structured text, there is a challenge to develop methods that can automatically recognise and interpret the geographical terminology and spatial or spatio-temporal concepts that people use when recording and querying the information. GIR falls at the intersection of Information Retrieval (IR) and Geographical Information Science (GIScience) resulting in research and systems development that benefits from the fusion of text-based methods for information extraction, natural language processing, indexing and search with GIS methods for spatial data management, analysis and visualization. 
The workshop invites contributions on the following topics, and other research related to GIR: - Detection, disambiguation and geocoding of geographical references in text; - User needs for geographic search; - Classification of web documents and social media with regard to their geographic foci; - Interpretation of spatial natural language in documents and queries; - Extraction of geographically-specific facts and events from text documents and social media; - Spatial and spatio-temporal indexing of documents and other media objects; - Modelling, construction and integration of ontologies, gazetteers and geographic thesauri; - Reasoning with geo-spatial facts for purposes of information retrieval; - Geographical query interfaces for search on the web; - Geographic question / answering systems; - Geographic search engine architectures; - Relevance ranking of geographical information; - Evaluation methods for geographic search. We invite both long papers (8 pages) and short papers (2 pages). Long papers are expected to report on relatively mature research results, while short papers may also cover more speculative or early stage research that may stimulate discussion at the workshop. All submissions will be reviewed by three members of the programme committee and all accepted papers will be published in the ACM Digital Library. Please note that we welcome contributions both from academic researchers and from practitioners working in industry and in public agencies engaged in GIR-related activities. The workshop programme will ensure opportunity for discussion of the presented papers and of the broader agenda for research in GIR. Submission procedure ------------------------------ You should prepare your paper in accordance with the ACM camera-ready instructions ( and submit it using the EasyChair system ( by 29th August 2014. Decisions on acceptance will be announced by 15th September 2014. 
Camera ready versions of accepted papers to be submitted by 29th September 2014. At least one author of accepted papers will be required to register for the workshop before the paper is published in the ACM Digital Library, and to present the paper at the workshop. Please note that attendance at the workshop also requires registration for the main ACM SIGSPATIAL GIS conference ( in addition to registering for the workshop. Further details of the workshop can be found at From hamon at LIMSI.FR Sun Jul 20 20:21:02 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Sun, 20 Jul 2014 22:21:02 +0200 Subject: Appel: Special Session, Environmental and geo-spatial data analytics (EnGeoData), DSAA'2014 Message-ID: Date: Wed, 16 Jul 2014 06:01:35 +0200 From: Mathieu Roche Message-ID: <5b38f2179a7ac5d1034ad52f3221bb1b at> X-url: X-url: ################### ########## 2nd Call for Papers ####### ##### Special Session: Environmental and geo-spatial data analytics #### (EnGeoData) ### ## DSAA'2014 - IEEE International Conference on Data Science and ## Advanced Analytics # with ACM SIGKDD and technically co-sponsored by IEEE Computational # Intelligence Society # # 30 October - 1 November, 2014, Shanghai, China # # Contact: engeodata at # Web: # # Deadline: 22nd July 2014 AIM AND SCOPE Environmental and more generally geo-spatial information is now provided by crowdsourcing but also by public administrations in the context of the open data policies. Analyses of such data are still challenging. Firstly because of their heterogeneity (structural, semantic, spatial and temporal), and secondly because of the difficulty in choosing the “best” knowledge discovery process to apply, according to the needs of the experts in the field. This special session aims at discussing and assessing some of these strategies covering all or part of the issues mentioned above, from a theoretical or experimental point of view. 
TOPICS

- Pre and Post Data processing
- Data Quality, Result Evaluation
- Data Mining or Data Warehousing Applications
- Text Mining
- Visual Analytics
- KDD real use-cases dedicated to environmental and geo-spatial Data

PAPER SUBMISSION

- Papers should be submitted via the DSAA submission site, choosing the Special Session on "Environmental and geo-spatial data analytics (EnGeoData)", before 22nd July 2014 (PST).
- Conference paper submissions should be limited to a maximum of seven (7) pages, in the IEEE 2-column format (see the IEEE Proceedings Author Guidelines).
- All submissions will be blind reviewed by the Program Committee on the basis of technical quality, relevance to conference topics of interest, originality, significance, and clarity. Author names and affiliations must not appear in the submissions, and bibliographic references must be adjusted to preserve author anonymity.
- Accepted conference papers will be published in the conference proceedings by IEEE, included in the IEEE Xplore Digital Library, and submitted for EI indexing through INSPEC by IEEE.

WEB SITE AND SUBMISSION CHAIRS

- Maguelonne Teisseire (Irstea, TETIS, France)
- Mathieu Roche (Cirad, TETIS, France)

PROGRAM COMMITTEE (to be completed)

- Gloria Bordogna, CNR Milan, Italy
- Mete Celik, Erciyes University, Turkey
- Pierre Gançarski, University of Strasbourg, France
- Diana Inkpen, University of Ottawa, Canada
- Eric Kergosien, University Lille 3, France
- Florence Le Ber, ENGEES, France
- Corrado Loglisci, University of Bari, Italy
- Donato Malerba, University of Bari, Italy
- Stan Matwin, Dalhousie University, Canada
- Jordi Nin, Polytechnic University of Catalonia, Spain
- François Petitjean, Monash University, Australia
- Julien Velcin, University Lyon 2, France
- Osmar R.

The sud4science project started in January 2011 and is part of a large international project, sms4science, initiated by Belgian researchers (Cental, UCL) in 2004.
The "88milSMS" corpus is being released from 26 June 2014. It is a large corpus of authentic, anonymised text messages (SMS) in French. It is produced by Université Paul-Valéry Montpellier 3 and the CNRS, in collaboration with the Université catholique de Louvain, and is funded with the support of the MSH-M and the Ministère de la Culture (Délégation générale à la langue française et aux langues de France), with the participation of Praxiling, Lirmm, Lidilem, Tetis and Viseo. We have obtained permission to make it available on the Huma-Num service grid. The terms of use and the downloads are available here:

It is a great day for all the members of the project. We take this opportunity to thank our public research institutions, our companies, our legal departments, our research laboratories, our partners and our 8 student interns, who have worked with us throughout these past years.

We would like to end this message with very warm thanks to the legal department of Université Paul-Valéry, the SAJI, headed by Stéphanie Delaunay. If the sud4science project has been able to succeed on the legal side, and if we can make the "88milSMS" corpus available today, it is thanks to the enormous investment in the project by the whole department and, in particular, by our data protection officer (correspondant Informatique et libertés, CIL), Nicolas Hvoinsky. Our legal adviser and CIL was very active from the start of the project in 2011: taking part in our scientific seminars to understand what was at stake in the project, drafting a great many legal documents, exchanging hundreds of emails, advising on the anonymisation of the SMS, answering our incessant questions, and so on. The time and energy devoted to the project, and the unfailing patience of Nicolas Hvoinsky, contributed very largely to its success.
As stated above, the "88milSMS" corpus is being released from 26 June 2014, and we are delighted and proud to be able to make it available to all.

Kind regards,
Rachel Panckhurst, Catherine Détrie, Cédric Lopez, Claudine Moïse, Mathieu Roche, Bertrand Verine.

---------- Announcement: ----------

The French-language SMS corpus 88milSMS is available!
Terms of use, downloads:
© Panckhurst R., Détrie C., Lopez C., Moïse C., Roche M., Verine B. (2014) "88milSMS.

Posters are expected to present ongoing and not necessarily completed research, teaching or training activity, practical work, software programs, projects or developments in general related to translation, interpretation and terminology, and to the related industries.

The Translating and the Computer conference is a unique forum for researchers, developers and users. It brings together academics involved in language technology research and in teaching translation and terminology with those who develop and market tools for language transformation, and both of these groups with users: translators, terminologists, interpreters, and voice-over specialists, whether freelancers or working in translation departments of large organisations such as those of the European Parliament, European courts and the European Patent Office, the United Nations family, international companies and other organisations, and Language Services Providers (LSPs), large and small.

In its 36th session, Translating and the Computer has moved from ASLIB to ASLING. The conference, often referred to as the "ASLIB Conference", is now the ASLING Translating and the Computer Conference. One of the new developments is the launch of a poster session in addition to the regular presentation slots.
Poster proposals in the form of poster abstracts not exceeding 500 words (the final versions of the accepted posters can be up to 1,500 words) must be submitted using the START system, adding the text "Poster:" at the start of the "Title of Submission:" field in the online submission form. Accepted poster papers will be included in the conference proceedings (and will have the same status as regular papers) only after the registration fee for at least one presenter of the paper has been paid.

Important dates

Deadline for poster submissions: 8 August 2014
Notification of acceptance or rejection: 22 August 2014
Camera-ready poster papers due: 3 October
Conference: 29 and 30 November 2014

Chairs

* Juliet Macan, Arancho Doc srl. (Lead Chair 2014)
* João Esteves-Ferreira, Tradulex, International Association for Quality Translation
* Ruslan Mitkov, University of Wolverhampton
* Olaf-Michael Stefanov, United Nations (ret), JIAMCATT

Programme Committee

* David Chambers, World Intellectual Property Organisation (ret)
* Gloria Corpas Pastor, University of Malaga
* Estelle Delpeche, Nomao
* Alain Désilets, National Research Council of Canada (NRC)
* David Filip, LRC, CNGL, LT-Web, University of Limerick
* Pamela Mayorcas, FITI
* Paola Valli, University of Trieste

Conference Manager:

* Nicole Adamides

Association internationale pour la promotion des technologies linguistiques
International Association for Advancement in Language Technology
Bologna, Genève, London, Wien, Wolverhampton

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list
Information, subscription:
English version:
Archives:
The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership:
ATALA accepts no responsibility for the content of messages distributed on the LN list
-------------------------------------------------------------------------
From hamon at LIMSI.FR Tue Jul 22 19:59:04 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 22 Jul 2014 21:59:04 +0200
Subject: Job: Post-doc LPL

Date: Tue, 22 Jul 2014 14:28:01 +0200
From: Joëlle Lavaud
Message-ID: <53CE58D1.1050603 at>

*POST-DOCTORAL POSITION AT LPL (AIX-EN-PROVENCE) - ANR PROJECT PhonIACog*
************************************************************************

A one-year post-doctoral contract is offered at the Laboratoire Parole et Langage (Aix-Marseille Université, CNRS, UMR 7309) as part of the ANR project PhonIACog ("Rôle de l'Accentuation Initiale dans la structuration prosodique en français - de la phonologie au traitement de la parole"; principal coordinator: Corine Astésano, Université de Toulouse 2).

*Project description*

The PhonIACog project is funded by the Agence Nationale de la Recherche (ANR, Programme Blanc). The project aims to describe the accentual characteristics of French in order to bring to light the underlying phonological structure of the language. This question is addressed through the bipolar pattern /AI-AF/ (Accent Initial - Accent Final), considered as the basic metrical structure of French. We propose to apply a single analysis grid to a series of corpora ranging from laboratory speech to semi-controlled speech and spontaneous dialogic interaction. Production experiments on various speech styles will allow us to refine the acoustic-phonetic characterisation of AI and AF in order to improve systems for the automatic detection of prosodic events in large corpora. For more information, see the project website.

*Job description*

The post-doctoral fellow will mainly be involved in data processing. He/she will take part in the acoustic analyses and will then implement the statistical analyses planned in the project.
*Prerequisites*

A PhD in language sciences (experimental phonetics/prosody) or in natural language processing, together with solid experience in statistics and data processing, is expected. Knowledge of the processing and analysis of spoken corpora is also welcome.

*Procedure*

Candidates should send a detailed CV with a list of publications, together with a brief letter stating their scientific interests and specifying the nature of their experience in data processing. Please send the documents to: roxane.bertrand at (Roxane Bertrand, LPL scientific lead, Aix-en-Provence, France).

Application deadline: 30 September 2014
Expected start date: November 2014 (flexible)
Contract duration: 12 months
Salary: approximately 2000 EUR/month

------------------

*POST-DOCTORAL POSITION FOR THE PROJECT PhonIACog (LPL - AIX-EN-PROVENCE, FRANCE)*
************************************************************************

We invite applications for a one-year post-doctoral position at the Laboratoire Parole et Langage (LPL, Aix-Marseille Université, CNRS, UMR 7309, France), to work on the project PhonIACog ("The role of the Initial Accent in prosodic structuring in French: from phonology to speech processing"; main coordinator: Corine Astésano, Université de Toulouse 2).

*Description*

The PhonIACog project is funded by the French National Research Agency (ANR). The project aims at describing the characteristics of the French accentual system in order to bring to light the underlying phonological structure of this language. It addresses the status of the bipolar pattern /IA FA/ (initial accent - final accent), considered as the basic metric pattern in French. We propose to apply the same analyses to different corpora, from laboratory speech to semi-controlled speech and dialogic spontaneous interaction.
The production studies will allow us to refine the acoustic-phonetic characterization of IA and FA, with potential application to automatic detection of prosodic cues on large, spontaneous corpora. More information is available on the project website.

*Job description*

The post-doctoral fellow will be mainly involved in data processing. He/she will participate in the acoustic analyses and will then have to implement the statistical analyses planned in the project.

*Qualifications*

A Ph.D. in linguistics (experimental phonetics/prosody) or in computer science, and solid competence/experience in statistics and data analysis, are required. Experience in the processing and analysis of large speech databases is also welcome.

*Application procedure*

Candidates should send a detailed CV with a list of publications, and a cover letter with a statement of research interests and details of their experience in data analysis. Please e-mail documents to: roxane.bertrand at (Roxane Bertrand, Scientific coordinator LPL, Aix-en-Provence, France).

Deadline for submission: September 30, 2014
Expected start date: November 2014 (with some flexibility)

Kindly email this call for papers to your colleagues, faculty members and postgraduate students.

Call for Papers, Extended Abstracts, Posters, Tutorials and Workshops!

*********************************************************************************************
Ireland International Conference on Education (IICE-2014)
October 27-29
Dublin, Ireland
*********************************************************************************************

The Ireland International Conference on Education (IICE-2014) is an international refereed conference dedicated to the advancement of the theory and practices in education. The IICE promotes collaborative excellence between academicians and professionals from Education.
The aim of IICE is to provide an opportunity for academicians and professionals from various educational fields with cross-disciplinary interests to bridge the knowledge gap, promote research esteem and the evolution of pedagogy. The IICE 2014 invites research papers that encompass conceptual analysis, design implementation and performance evaluation. All the accepted papers will appear in the proceedings and modified version of selected papers will be published in special issues peer reviewed journals. Topics: The topics in IICE-2014 include but are not confined to the following areas: * Academic Advising and Counselling * Art Education * Adult Education * APD/Listening and Acoustics in Education Environment * Business Education * Counsellor Education * Curriculum, Research and Development * Competitive Skills * Continuing Education * Distance Education * Early Childhood Education * Education for Sustainable Development * Educational Administration * Educational Foundations * Educational Psychology * Educational Technology * Education Policy and Leadership * Elementary Education * E-Learning * E-Manufacturing * ESL/TESL * E-Society * Geographical Education * Geographic information systems * Health Education * Higher Education * History * Home Education * Human Computer Interaction * Human Resource Development * Inclusive Education * Indigenous Education * ICT Education * Internet technologies * Imaginative Education * Kinesiology and Leisure Science * K12 * Language Education * Mathematics Education * Mobile Applications * Multi-Virtual Environment * Music Education * Pedagogy * Physical Education (PE) * Reading Education * Writing Education * Religion and Education Studies * Research Assessment Exercise(RAE) * Rural Education * Science Education * Secondary Education * Second life Educators * Social Studies Education * Special Education * Student Affairs * Teacher Education * Cross-disciplinary areas of Education * Ubiquitous Computing * Virtual Reality * 
Wireless applications * Other Areas of Education Submission: - You can submit your research paper at http:// or email it to papers-2014october at Important Dates: * Extended Abstract (Work in Progress) Submission Date: August 20, 2014 * Notification of Extended Abstract (Work in Progress) Acceptance/Rejection: August 31, 2014 * Research Paper, Student Paper, Case Study, Report Submission Date: August 25, 2014 * Notification of Research Paper, Student Paper, Case Study, Report Acceptance/Rejection: September 05, 2014 * Proposal for Workshops Submission Date: July 25 2014 * Notification of Workshop Acceptance/Rejection: July 30, 2014 * Posters Proposal Submission Date: August 01, 2014 * Notification of Posters Acceptance/Rejection: August 10, 2014 * Camera Ready Paper Due: September 20, 2014 * Early Bird Registration Deadline (Authors and Participants): May 31, 2014 - September 10, 2014 * Late Bird Registration Deadline (Authors only): September 11, 2014 - October 10, 2014 * Late Bird Registration Deadline (Participants only): August 31, 2014 - October 20, 2014 * Conference Dates: October 27- 29, 2014 For further information please visit IICE-2014 at From hamon at LIMSI.FR Sun Jul 20 20:46:11 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Sun, 20 Jul 2014 22:46:11 +0200 Subject: Ecole: BigDat 2015, 23 July registration deadline Message-ID: Date: Fri, 18 Jul 2014 22:55:33 +0200 From: "GRLMC - URV" Message-ID: <001301cfa2ca$a3821ae0$6400a8c0 at GRLMC.local> X-url: ***************************************************** INTERNATIONAL WINTER SCHOOL ON BIG DATA BigDat 2015 Tarragona, Spain January 26-30, 2015 Organized by Rovira i Virgili University ***************************************************** --- 2nd registration deadline: July 23, 2014 --- ***************************************************** AIM: BigDat 2015 is a research training event for graduates and postgraduates in the first steps of their academic career. 
It aims at updating them about the most recent developments in the fast developing area of big data, which covers a large spectrum of current exciting research, development and innovation with an extraordinary potential for a huge impact on scientific discoveries, medicine, engineering, business models, and society itself. Renowned academics and industry pioneers will lecture and share their views with the audience. All big data subareas will be displayed, namely: foundations, infrastructure, management, search and mining, security and privacy, and applications. Main challenges of analytics, management and storage of big data will be identified through 4 keynote lectures and 24 six-hour courses, which will tackle the most lively and promising topics. The organizers believe outstanding speakers will attract the brightest and most motivated students. Interaction will be a main component of the event. ADDRESSED TO: Graduate and postgraduates from around the world. There are no formal pre-requisites in terms of academic degrees. However, since there will be differences in the course levels, specific knowledge background may be required for some of them. BigDat 2015 is also appropriate for more senior people who want to keep themselves updated on recent developments and future trends. They will surely find it fruitful to listen and discuss with major researchers, industry leaders and innovators. REGIME: In addition to keynotes, 3 courses will run in parallel during the whole event. Participants will be able to freely choose the courses they will be willing to attend as well as to move from one to another. VENUE: BigDat 2015 will take place in Tarragona, located 90 kms. to the south of Barcelona. The venue will be: Campus Catalunya Universitat Rovira i Virgili Av. Catalunya, 35 43002 Tarragona KEYNOTE SPEAKERS: Ian Foster (Argonne National Laboratory), tba Geoffrey C. Fox (Indiana University, Bloomington), Mapping Big Data Applications to Clouds and HPC C. 
Lee Giles (Pennsylvania State University, University Park), Scholarly Big Data: Information Extraction and Data Mining William D. Gropp (University of Illinois, Urbana-Champaign), tba COURSES AND PROFESSORS: Hendrik Blockeel (Katholieke Universiteit Leuven), [intermediate] Decision Trees for Big Data Analytics Diego Calvanese (Free University of Bozen-Bolzano), [introductory/intermediate] End-User Access to Big Data Using Ontologies Jiannong Cao (Hong Kong Polytechnic University), [introductory/intermediate] Programming with Big Data Edward Y. Chang (HTC Corporation, New Taipei City), [introductory/advanced] From Design of Distributed and Online Algorithms to Hands-on Code Lab Practice on Real Datasets Ernesto Damiani (University of Milan), [introductory/intermediate] Process Discovery and Predictive Decision Making from Big Data Sets and Streams Gautam Das (University of Texas, Arlington), [intermediate/advanced] Mining Deep Web Repositories Maarten de Rijke (University of Amsterdam), tba Geoffrey C. Fox (Indiana University, Bloomington), [intermediate] Using Software Defined Systems to Address Big Data Problems Minos Garofalakis (Technical University of Crete, Chania) [intermediate/advanced], Querying Continuous Data Streams Vasant G. Honavar (Pennsylvania State University, University Park) [introductory/intermediate], Learning Predictive Models from Big Data Mounia Lalmas (Yahoo! 
Research Labs, London), [introductory] Measuring User Engagement Tao Li (Florida International University, Miami), [introductory/intermediate] Data Mining Techniques to Understand Textual Data Kwan-Liu Ma (University of California, Davis), [intermediate] Big Data Visualization Christoph Meinel (Hasso Plattner Institute, Potsdam), [introductory/intermediate] New Computing Power by In-Memory and Multicore to Tackle Big Data David Padua (University of Illinois, Urbana-Champaign), [intermediate] Data Parallel Programming Manish Parashar (Rutgers University, Piscataway), [intermediate] Big Data in Simulation-based Science Srinivasan Parthasarathy (Ohio State University, Columbus), [intermediate] Scalable Data Analysis Evaggelia Pitoura (University of Ioannina), [intermediate] Online Social Networks Vijay V. Raghavan (University of Louisiana, Lafayette), [introductory/intermediate] Visual Analytics of Time-evolving Large-scale Graphs Pierangela Samarati (University of Milan), [intermediate], Data Security and Privacy in the Cloud Peter Sanders (Karlsruhe Institute of Technology), [introductory/intermediate] Algorithm Engineering for Large Data Sets Johan Suykens (Katholieke Universiteit Leuven), [introductory/intermediate] Fixed-size Kernel Models for Big Data Domenico Talia (University of Calabria, Rende), [intermediate] Scalable Data Mining on Parallel, Distributed and Cloud Computing Systems Jieping Ye (Arizona State University, Tempe), [introductory/advanced] Large-Scale Sparse Learning and Low Rank Modeling ORGANIZING COMMITTEE: Adrian Horia Dediu (Tarragona) Carlos Martín-Vide (Tarragona, chair) Florentina Lilica Voicu (Tarragona) REGISTRATION: It has to be done at The selection of up to 8 courses requested in the registration template is only tentative and non-binding. For the sake of organization, it will be helpful to have an approximation of the respective demand for each course. 
Since the capacity of the venue is limited, registration requests will be processed on a first come, first served basis. The registration period will be closed and the on-line registration facility disabled once the venue reaches capacity. It is strongly recommended to register prior to the event.

FEES: As far as possible, participants are expected to stay full-time. Fees are a flat rate covering the attendance to all courses during the week. There are several early registration deadlines. Fees depend on the registration deadline.

ACCOMMODATION: Suggestions of accommodation will be provided in due time.

CERTIFICATE: Participants will receive a certificate of attendance.

QUESTIONS AND FURTHER INFORMATION: florentinalilica.voicu at

POSTAL ADDRESS:
BigDat 2015
Lilica Voicu
Rovira i Virgili University
Av. Catalunya, 35
43002 Tarragona, Spain
Phone: +34 977 559 543
Fax: +34 977 558 386

ACKNOWLEDGEMENTS: Universitat Rovira i Virgili

From hamon at LIMSI.FR Tue Jul 22 19:55:51 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 22 Jul 2014 21:55:51 +0200
Subject: Journal: Corela, Vol. 12, num. 1

Date: Mon, 21 Jul 2014 12:31:04 +0200
From: Gilles Col

Dear colleagues,

The journal Corela is now hosted on the (OpenEdition / Cléo) portal at this new address:

On this occasion, the journal is publishing two new issues, one a special issue, "Calcul du sens et contexte" (edited by J. Benoist, G. Col, T.
Poibeau); and the other in its regular series (Vol. 12, no. 1), whose table of contents I invite you to discover:

- Joasha Boutault: Enough et too : expression de la suffisance et de l'excès dans les constructions « tough » en anglais
- Gilles Corminboeuf and Christophe Benzitoun: Evaluation critique des modèles graduels et non graduels de l'intégration syntaxique
- Lise Hamelin: Vers une analyse des marqueurs yet et still : There is still much to say
- Gilbert Ghio: Temporalité et aspectualité en anglais : opérations, représentations, cognition
- Sonia Benamsil: Les stéréotypes de la femme dans la caricature de Dilem Ali
- Rudy Loock and Cyril Auran: Magnitude Estimation: can it do something for your pragmatics?
- Tchaa Pali: L'item « y? » du miyobé (Togo/Bénin) : verbe plein, auxiliaire ou auxiliant ?

We wish you a pleasant read.

For more details, the job description is available at the following link:

The company promises a wide range of varied projects and good long-term career prospects. If you are interested and would like to know more, please send me your up-to-date CV along with your availability for an initial telephone conversation. Alternatively, feel free to pass this offer on to anyone who might be interested.

Kind regards,
Ali RIAD
Business Consultant
Experis IT Luxembourg

It concerns the automatic classification of messages on social networks (Facebook and Twitter).

JOB DESCRIPTION: The work programme will focus on the automatic classification of tweets using classifiers. A first experiment has been carried out (Naive Bayes classifier, SVM). The initial results lead us, on the one hand, to investigate further which types of linguistic features to consider and, on the other hand, to develop a methodology for defining the classes according to the communication actions to be observed.
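For readers unfamiliar with the kind of experiment mentioned above, the following is a minimal sketch of a Naive Bayes text classifier. It is not the project's code: the toy messages, the two class labels ("announcement", "question"), the whitespace tokenizer and the names `NaiveBayes`, `tokenize` and `TRAIN` are all assumptions made for illustration, and a real study would rely on a toolkit such as Weka or scikit-learn with richer linguistic features.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real linguistic features."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Return the unnormalised log-probability of each class for `text`."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data; the labels are illustrative "communication actions".
TRAIN = [
    ("join us tonight for the product launch", "announcement"),
    ("new store opening this friday", "announcement"),
    ("does anyone know when the store opens", "question"),
    ("how do i reset my password", "question"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
print(model.predict("how do i join"))
print(model.predict("new product launch tonight"))
```

Changing `tokenize` is where different feature types (n-grams, hashtags, punctuation) would be plugged in, which is the kind of choice the job description says remains to be explored.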
EXPECTED SKILLS

- Command of machine learning techniques and software (Weka, scikit-learn);
- Knowledge of practices on social networks;
- Good writing skills in French (writing of a summary report);
- Ability to work autonomously and to report on one's own activity.

ADMISSION REQUIREMENTS: For an IR (research engineer) profile, a doctorate in computer science or NLP. For an IE (study engineer) profile, a master's degree in computer science or NLP.

LOCATION: The position is based at UMR MoDyCo, Université Paris Ouest Nanterre La Défense, 200 avenue de la République, 92200 Nanterre.

The term SENTIRE comes from the Latin for "to feel" and is the root of words such as sentiment and sensation. SENTIRE aims to provide an international forum for researchers in the field of opinion mining and sentiment analysis to share information on their latest investigations in social information retrieval and their applications both in academic research areas and industrial sectors. The broader context of the workshop encompasses Web mining, AI, Semantic Web, information retrieval and natural language processing. The workshop will be held in Shenzhen on 14th December 2014. For more information, please visit:

RATIONALE

Memory and data capacities double approximately every two years and, apparently, the Web is following the same rule. User-generated contents, in particular, are an ever-growing source of opinions and sentiments, which are continuously spread worldwide through blogs, wikis, fora, chats and social networks. The distillation of knowledge from such sources is a key factor for applications in fields such as commerce, tourism, education and health, but the quantity and the nature of the contents they generate make it a very difficult task.
Due to such challenging research problems and the wide variety of practical applications, opinion mining and sentiment analysis have become very active research areas in the last decade. Our understanding and knowledge of the problem and its solution are still limited, as natural language understanding techniques are still rather weak. Most current research in sentiment analysis, in fact, merely relies on machine learning algorithms. Such algorithms, although most of them are very effective, produce results that are not human-understandable, so that we know little about how and why output values are obtained. All such approaches, moreover, rely on the syntactic structure of text, which is far from the way the human mind processes natural language. Next-generation opinion mining systems should employ techniques capable of better grasping the conceptual rules that govern sentiment and the clues that can convey these concepts from realization to verbalization in the human mind.

TOPICS
Topics of interest include but are not limited to: - Sentiment identification & classification - Opinion and sentiment summarization & visualization - Explicit & latent semantic analysis for sentiment mining - Concept-level opinion and sentiment analysis - Sentic computing - Opinion and sentiment search & retrieval - Time evolving opinion & sentiment analysis - Semantic multidimensional scaling for sentiment analysis - Multidomain & cross-domain evaluation - Domain adaptation for sentiment classification - Multimodal sentiment analysis - Multimodal fusion for continuous interpretation of semantics - Multilingual sentiment analysis & re-use of knowledge bases - Knowledge base construction & integration with opinion analysis - Transfer learning of opinion & sentiment with knowledge bases - Sentiment corpora & annotation - Affective knowledge acquisition for sentiment analysis - Biologically inspired opinion mining - Sentiment topic detection & trend discovery - Big social data analysis - Social ranking - Social network analysis - Social media marketing - Comparative opinion analysis - Opinion spam detection TIMEFRAME - August 1st, 2014: Submission deadline - September 26th, 2014: Notification of acceptance - October 20th, 2014: Final manuscripts due - December 14th, 2014: Workshop date SUBMISSIONS AND PROCEEDINGS Authors are required to follow IEEE Computer Society Press Proceedings Author Guidelines. The paper length is limited to 10 pages, including references, diagrams, and appendices, if any. Manuscripts are to be submitted through EasyChair. Each submitted paper will be evaluated by three PC members with respect to its novelty, significance, technical soundness, presentation, and experiments. Accepted papers will be published in IEEE ICDM proceedings. Selected, expanded versions of papers presented at the workshop will be invited to a forthcoming Special Issue of Cognitive Computation on opinion mining and sentiment analysis. 
ORGANIZERS

- Erik Cambria, Nanyang Technological University (Singapore)
- Bing Liu, University of Illinois at Chicago (USA)
- Yunqing Xia, Tsinghua University (China)
- Yongzheng Zhang, LinkedIn Inc. (USA)

From hamon at LIMSI.FR Fri Jul 25 19:51:25 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Fri, 25 Jul 2014 21:51:25 +0200
Subject: Call: Deadline extension for SSST-8 (EMNLP 2014)

Date: Wed, 23 Jul 2014 13:56:57 -0400
From: "Carpuat, Marine"

Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8)
EMNLP 2014 / SIGMT / SIGLEX Workshop
Oct 2014, Doha, Qatar

* New submission deadline for papers and abstracts: August 1st, 2014 *
* Special theme: Compositional Distributional Semantics and Machine Translation *

The Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) seeks to bring together a large number of researchers working on diverse aspects of structure, semantics and representation in relation to statistical machine translation.

Thanks to progress in the training of SMT engines, machine translation has become good enough that it is now advantageous for translators to post-edit machine outputs rather than translate from scratch. However, current methods for enhancing SMT systems from human post-edition (PE) are rather basic: the post-edited output is added to the training corpus and the translation model and language model are re-trained, with no clear view of how much has been improved and how much is left to be improved. Moreover, the final PE result is the only feedback used: available technologies do not take advantage of logged sequences of post-edition actions, which provide information about the cognitive processes of the post-editor. The proposed thesis aims at using the post-edition process as a demonstration of how an expert translator modifies the SMT result to produce a perfect translation.
Learning from demonstration is an emerging field in machine learning, mostly applied to robotics [1], that will thus be explored further in the particular framework of SMT. Topic of research A novel approach to SMT training will be adopted in this thesis, i.e. considering the post-edition process as a sequential decision making process performed by human experts who should be imitated. This thesis’ first fundamental contribution to SMT will be to reformulate the problem of post-edition in SMT as a sequential decision making problem [4]. Indeed, the hypothesis selection and ranking process occurring in an SMT system can be seen as an action selection strategy, choosing after each post-edition step amongst a large number of actions (all possible hypotheses and rankings). This strategy has to be modified according to post-edition results that arise sequentially and are influenced by previous actions (hypothesis selection) of the system. From this, SMT will be cast as an imitation learning problem, that is, learning from demonstrations made by an expert: post-edition results can be seen as examples of what the system should do, again in a sequential decision making process and not in a static one such as supervised learning. Indeed, SMT decoding, whether it is based on phrases or chunks, can be seen as a sequential decision making process. The sequences of decisions taken by an expert during the post-edition process can be seen as a target for the system, which will try to imitate them in similar situations. To do so, we will extend the work described in [2], which modelled semantic parsing as an Inverse Reinforcement Learning (IRL) problem [3]. In addition, the question of automatically selecting the sentences that should be used for post-edition and further learning will be addressed. In particular, this will be studied under the active learning paradigm.
Large and diversified amounts of post-edited data, collected in an industrial setting, will be made available for the research project. Profile The applicants must hold an Engineering or Master's degree in Computational Linguistics or computer science, preferably with experience in the fields of statistical machine learning and/or natural language processing. A good background in programming will also be required. He/she will also be involved in a research project, funded by the French National Agency for Research, involving 2 research labs (LIFL in Lille and LIG in Grenoble) and a company (Lingua & Machina). For this reason, a good level of English is required (a good command of French being a plus). Finally, effective communication skills in English, both written and verbal, are mandatory. Context The candidate will be hired by University Lille 1 in the framework of a national research project. S/he will mainly be hosted in the SequeL (Sequential Learning) team of the Laboratoire d’Informatique Fondamentale de Lille (LIFL). SequeL is also a joint project-team with INRIA (national institute for research in computer science and mathematics) and especially the INRIA Lille - Nord Europe Center. The group involves around 25 researchers working on sequential learning and is internationally recognized. Lille is the largest city of the north of France, a metropolis with 1 million inhabitants, with excellent train connections to Brussels (30 min), Paris (1h) and London (1h30). This thesis will be supervised in strong collaboration with the GETALP team of Laboratoire d’Informatique de Grenoble (LIG), widely renowned for its research on natural language and speech processing. Grenoble is a high-tech city with 4 universities. It is located at the heart of the Alps, in outstanding scientific and natural surroundings. It is 3h by train from Paris; 2h from Geneva; 1h from Lyon; 2h from Torino and is less than 1h from Lyon international airport.
The PhD thesis will be co-supervised by Olivier Pietquin in Lille and Laurent Besacier in Grenoble. Contacts Interviews will be held in Sept 2014. Meetings during Interspeech 2014 in Singapore can also be arranged. For further info, please contact: Olivier Pietquin : olivier.pietquin at Laurent Besacier : laurent.besacier at References [1] Brenna D. Argall, Sonia Chernova, Manuela Veloso, and Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5):469–483, May 2009. [2] Gergely Neu and Csaba Szepesvári. Training parsers by inverse reinforcement learning. Machine Learning, 77(2-3):303–337, 2009. [3] Andrew Y. Ng and Stuart J. Russell. Algorithms for inverse reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning, ICML ’00, pages 663–670, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc. [4] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. This includes both the main workshop papers and the shared task descriptions. (2) We extend the page length of the main workshop papers to up to 9 pages + any number of reference pages. The shared task descriptions are still up to 4 pages with 2 additional pages of references.
Regards, Workshop Organizers ======================================================= Last Call for Papers and Participation EMNLP Workshop on Arabic Natural Language Processing Including Shared Task on Automatic Arabic Error Correction Apologies for multiple postings Please distribute to colleagues ======================================================= Last Call for Papers and Participation Arabic Natural Language Processing Workshop collocated with EMNLP 2014, Doha, Qatar Workshop date: Saturday October 25, 2014 Paper submission deadline: July 26, 2014 Workshop Website: Shared Task Website: ======================================================= WORKSHOP DESCRIPTION There has been a lot of progress in the last 15 years in the area of Arabic Natural Language Processing (NLP). Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences. This workshop follows in the footsteps of previous efforts to provide a forum for researchers to share and discuss their ongoing work. We invite submissions on topics that include, but are not limited to, the following: * Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, sentiment analysis, Arabic dialect modeling, etc. * Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media, etc. * Resources: dictionaries, annotated data, specialized databases etc. Submissions may include work in progress as well as finished work. Submissions must have a clear focus on specific issues pertaining to the Arabic language whether it is standard Arabic, dialectal, or mixed. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work. 
Submissions are expected to be 8 pages long plus 2 pages for references. Associated with the workshop will be a shared task on Arabic text error correction (see link to Shared Task Website above). IMPORTANT DATES Paper submission deadline: July 26, 2014 => July 28 11:59pm (UTC/GMT -11 hours) Author notification: August 26, 2014 Camera Ready: September 15, 2014 Workshop: October 25, 2014 ORGANIZERS Program Co-chairs Nizar Habash, Columbia University Stephan Vogel, Qatar Computing Research Institute Publication Co-chairs Nadi Tomeh, Paris 13 University Houda Bouamor, Carnegie Mellon University Qatar Website Committee Kareem Darwish, Qatar Computing Research Institute Noura Farra, Columbia University Shared Task Committee Behrang Mohit (co-chair), Carnegie Mellon University Qatar Alla Rozovskaya (co-chair), Columbia University Wajdi Zaghouani, Carnegie Mellon University Qatar Ossama Obeid, Carnegie Mellon University Qatar Nizar Habash (advisor), Columbia University Program Committee Members Abdelmajid Ben-Hamadou, University of Sfax, Tunisia Abdelhadi Soudi, Ecole Nationale de l’Industrie Minérale, Morocco Abdelsalam Nwesri, University of Tripoli, Libya Achraf Chalabi , Microsoft Research, Egypt Ahmed Ali, Qatar Computing Research Institute, Qatar Ahmed Rafea, The American University in Cairo, Egypt Alexis Nasr, University of Marseille, France Ali Farghaly, Monterey Peninsula College, USA Almoataz B. Workshops. Tutorials. Doctoral Consortium. Best papers awards. KEYNOTES: - Vladimir Vapnik (NEC Lab): inventor of SVM - John Sowa (VivoMind Research; IBM): inventor of conceptual graphs - Bing Liu (U. of Illinois): opinion mining - Oscar Castillo (Tijuana IT): fuzzy logic Proceedings: Springer LNAI (IE, ISI); IEEE CPS, special issues of journals (including ISI JCR). Venue: Tuxtla Gutiérrez, Chiapas, Mexico. Cultural program and tours: Sumidero canyon; El Chiflón waterfalls; Tenam Puente ancient pyramids; San Cristobal de las Casas colonial city (anticipated). 
Dates (late submissions are possible): July 31: draft abstract -- just a general idea of what your paper will be about (takes 1 minute; you can change the text later); Aug 6: full text for blind review. See complete CFP at PLEASE CIRCULATE this CFP among your colleagues and students. We apologize if you receive multiple copies. URL: From hamon at LIMSI.FR Sun Jul 27 18:29:26 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Sun, 27 Jul 2014 20:29:26 +0200 Subject: Appel: JeTou 2015 Message-ID: Date: Sat, 26 Jul 2014 18:01:23 +0200 From: Contact Jetou Message-ID: X-url: Hello, PhD students from the Toulouse laboratories CLLE-ERSS, OCTOGONE-Lordat and IRIT are organizing, on May 28 and 29, 2015, an international conference for young researchers (JéTou 2015: Journées d'Etudes Toulousaines) on the following theme: "Le(s) discours en Sciences du Langage : unités et niveaux d'analyse" (Discourse(s) in the Language Sciences: units and levels of analysis). For more information: Thank you in advance. Best regards. The JéTou 2015 organizing committee ------------------------------------------------------------------------- Message distributed by the Langage Naturel list Information, subscription: English version: Archives: The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues) Information and membership: ATALA declines all responsibility for the content of messages distributed on the LN list ------------------------------------------------------------------------- From hamon at LIMSI.FR Tue Jul 29 20:15:31 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 29 Jul 2014 22:15:31 +0200 Subject: Ressources: Page dedicated to transcription formats and metadata for spoken corpora, IRCOM Message-ID: Date: Mon, 28 Jul 2014 11:12:58 +0200 From: Christophe Benzitoun Message-ID: <53D6141A.6050008 at> X-url: Dear colleagues, Following the round table held in Paris on June 23, we are pleased to announce the creation of the following page, which summarizes a number
of the presentations: The main aims of this round table were to address the questions of transcription formats and metadata for spoken corpora. The goal was to take stock of completed and ongoing projects (mainly in France and in French-speaking countries). In particular, the day resulted in a summary document of needs and existing data, which can be consulted at the address above. IJBDI 2014 Vol. 1 No. 1/2 *Big data (lost) in the cloud Beniamino Di Martino; Rocco Aversa; Giuseppina Cretella; Antonio Esposito; Joanna Kołodziej *Designing and implementing a cloud-hosted SaaS for data movement and sharing with SlapOS Walid Saad; Heithem Abbes; Mohamed Jemni; Christophe Cérin *Multi-source streaming-based data accesses for MapReduce systems Jiadong Wu; Bo Hong *A new approach for accurate distributed cluster analysis for Big Data: competitive K-Means Rui Máximo Esteves; Thomas Hacker; Chunming Rong *Peculiarities of numerical algorithms parallel implementation for exa-flops multicomputers Victor E. Malyshkin *Towards quality-of-service driven consistency for Big Data management Álvaro García-Recuero; Sérgio Esteves; Luís Veiga *D-CEP4CMA: a dynamic architecture for cloud performance monitoring and analysis via complex event processing Afef Mdhaffar; Riadh Ben Halima; Mohamed Jmaiel; Bernd Freisleben *An extended analytical study of Arabic sentiments Nawaf A. Abdulla; Mahmoud Al-Ayyoub; Mohammed Naji Al-Kabi *Health big data analytics: current perspectives, challenges and potential solutions Mu-Hsing Kuo; Tony Sahama; Andre W. Kushniruk; Elizabeth M. Borycki; Daniel K. Grunwell IJBDI is a peer-reviewed journal. It provides a vehicle to help professionals, academics, researchers, scientists, engineers, educators, and policy makers working in the field of data science and management to demonstrate and explore current advances in all aspects of big data.
IJBDI aims to be a leading journal in the interdisciplinary field of big data intelligence. It encourages/publishes high-quality submissions of articles on the following subjects in this field: big data science and foundations, big data infrastructure, big data management, big data intelligence, big data privacy/security and big data applications. We are writing to invite you to submit an article to IJBDI which provides a rapid forum for the dissemination of original research articles. The IJBDI has a distinguished Editorial Board with extensive academic qualifications, ensuring that the journal maintains high scientific standards and has a broad international coverage. All published articles will be arranged for abstracting and indexing services. Manuscripts should be submitted to the journal online at Once a manuscript has been accepted for publication, it will undergo language copyediting, typesetting, and reference validation in order to provide the highest publication quality possible. Please do not hesitate to contact us if you have any questions about the journal. We look forward to reading and publishing your work! In addition, the student will be financially supported to attend conferences and schools, and to spend a period of 6-12 months abroad. The PhD programme lasts three years. The official language of the programme and of the faculty is English. Research in the Faculty of Computer Science is divided in three research centres (see below). Candidates are strongly advised to get in contact with the desired research centre before applying. Bolzano is the city’s Italian name while Bozen is its German name: it is the capital of the multilingual province of Alto Adige / Südtirol. Near Italy’s northern border with Austria, the city is a gateway to the Dolomites, the majestic white mountain peaks that are part of the Alps. Bolzano is an Italian city with Austrian flair. 
Its two lifestyles, one Northern European and the other more Mediterranean, combine to make the perfect union, which can be clearly seen in the historic and artistic treasures of this city. Bolzano is constantly among the top-ranked cities in Italy when it comes to quality of life. It has one of Europe's lowest unemployment rates, excellent services and a wonderful landscape. The landscape of the surrounding mountain sides is characterised by old wine hamlets and villages nestling amid vineyards watched over by 200 castles, stately country houses and ruins (see the wonderful video). Instructions for application (pre-enrollment): The deadline for the applications is the 29th August 2014. RESEARCH CENTRES: KRDB RESEARCH CENTRE FOR KNOWLEDGE AND DATA - web page: - contact person: Enrico Franconi franconi at - Logic based languages for knowledge representation - Intelligent access to databases - Semantic technologies - Visual and verbal paradigms for information exploration - Temporal aspects of data and knowledge - Extending database technologies - Inter-operation, verification, and composition of business processes The research topics in knowledge representation are focused on foundational and practical aspects of knowledge representation technologies applied to information systems. The whole life cycle ranging from the design to the deployment of such technologies is covered: the conceptual modelling of various types of knowledge, the linguistic and logical aspects of knowledge, the integration of heterogeneous knowledge sources, including information coming from the Internet, the usage of knowledge to support the intelligent retrieval of information, and the usage of knowledge to create virtual services on the net. 
INFORMATION AND DATABASE SYSTEMS ENGINEERING - Spatial and temporal databases - Approximation Techniques in databases - Query optimisation in databases - Cooperative interfaces for information access and filtering - Data mining techniques for preference elicitation and recommendation - Cloud computing and big data - Agile development & human aspects of software engineering - Software startups and open science - Design based Hardware engineering - Technology enhanced learning The research activities in the area of database and information systems focus on key aspects of applied computer science, including data warehousing and data mining, the integration of heterogeneous and distributed databases, time-varying information, data models, and query processing. The research approach is primarily constructive in its outset, and it includes substantial experimental and analytical elements. The development activities cover the design of data models and structures, and the development of algorithms, data structures, languages, and systems. The experimental activities verify real world artifacts with the help of prototypes and simulations. The analytic activities include the analysis of the algorithmic complexity and the evaluation of languages. The main goal is theoretically sound results that solve real world problems. SOFTWARE ENGINEERING - Agile methods, lean management, and open source - Measurement and study of software quality, reliability, evolution and reuse - Distributed computing and service-oriented architectures (mobile and distributed services) - IT and business alignment - Software reuse and component based development - Interoperability in collaborative systems - IT for automation - Energy-aware systems The research topics in software engineering are focused on the empirical and quantitative study of innovative models for software development.
The target analysis techniques include both traditional statistics, and new approaches, such as computational intelligence, Bayesian models, and meta-analytical systems. Following the tradition of the diverse PhD training events in the field developed at Rovira i Virgili University in Tarragona since 2002, LATA 2015 will reserve significant room for young scholars at the beginning of their career. It will aim at attracting contributions from classical theory fields as well as application areas. VENUE: LATA 2015 will take place in Nice, the second largest French city on the Mediterranean coast. The venue will be the University Castle at Parc Valrose. SCOPE: Topics of either theoretical or applied interest include, but are not limited to: algebraic language theory algorithms for semi-structured data mining algorithms on automata and words automata and logic automata for system analysis and programme verification automata networks automata, concurrency and Petri nets automatic structures cellular automata codes combinatorics on words computational complexity data and image compression descriptional complexity digital libraries and document engineering foundations of finite state technology foundations of XML fuzzy and rough languages grammars (Chomsky hierarchy, contextual, unification, categorial, etc.) 
grammatical inference and algorithmic learning graphs and graph transformation language varieties and semigroups language-based cryptography parallel and regulated rewriting parsing patterns power series string and combinatorial issues in bioinformatics string processing algorithms symbolic dynamics term rewriting transducers trees, tree languages and tree automata unconventional models of computation weighted automata STRUCTURE: LATA 2015 will consist of: invited talks invited tutorials peer-reviewed contributions INVITED SPEAKERS: to be announced PROGRAMME COMMITTEE: Andrew Adamatzky (West of England, Bristol, UK) Andris Ambainis (Latvia, Riga, LV) Franz Baader (Dresden Tech, DE) Rajesh Bhatt (Massachusetts, Amherst, US) José-Manuel Colom (Zaragoza, ES) Bruno Courcelle (Bordeaux, FR) Erzsébet Csuhaj-Varjú (Eötvös Loránd, Budapest, HU) Aldo de Luca (Naples Federico II, IT) Susanna Donatelli (Turin, IT) Paola Flocchini (Ottawa, CA) Enrico Formenti (Nice, FR) Tero Harju (Turku, FI) Monika Heiner (Brandenburg Tech, Cottbus, DE) Yiguang Hong (Chinese Academy, Beijing, CN) Kazuo Iwama (Kyoto, JP) Sanjay Jain (National Singapore, SG) Maciej Koutny (Newcastle, UK) Antonín Kučera (Masaryk, Brno, CZ) Thierry Lecroq (Rouen, FR) Salvador Lucas (Valencia Tech, ES) Veli Mäkinen (Helsinki, FI) Carlos Martín-Vide (Rovira i Virgili, Tarragona, ES, chair) Filippo Mignosi (L’Aquila, IT) Victor Mitrana (Madrid Tech, ES) Ilan Newman (Haifa, IL) Joachim Niehren (INRIA, Lille, FR) Enno Ohlebusch (Ulm, DE) Arlindo Oliveira (Lisbon, PT) Joël Ouaknine (Oxford, UK) Wojciech Penczek (Polish Academy, Warsaw, PL) Dominique Perrin (ESIEE, Paris, FR) Alberto Policriti (Udine, IT) Sanguthevar Rajasekaran (Connecticut, Storrs, US) Jörg Rothe (Düsseldorf, DE) Frank Ruskey (Victoria, CA) Helmut Seidl (Munich Tech, DE) Ayumi Shinohara (Tohoku, Sendai, JP) Bernhard Steffen (Dortmund, DE) Frank Stephan (National Singapore, SG) Paul Tarau (North Texas, Denton, US) Andrzej Tarlecki (Warsaw, PL) Jacobo 
Torán (Ulm, DE) Frits Vaandrager (Nijmegen, NL) Jaco van de Pol (Twente, Enschede, NL) Pierre Wolper (Liège, BE) Zhilin Wu (Chinese Academy, Beijing, CN) Slawomir Zadrozny (Polish Academy, Warsaw, PL) Hans Zantema (Eindhoven Tech, NL) ORGANIZING COMMITTEE: Sébastien Autran (Nice) Adrian Horia Dediu (Tarragona) Enrico Formenti (Nice, co-chair) Sandrine Julia (Nice) Carlos Martín-Vide (Tarragona, co-chair) Christophe Papazian (Nice) Julien Provillard (Nice) Pierre-Alain Scribot (Nice) Bianca Truthe (Giessen) Florentina Lilica Voicu (Tarragona) SUBMISSIONS: Authors are invited to submit non-anonymized papers in English presenting original and unpublished research. Papers should not exceed 12 single-spaced pages (including any appendices, references, etc.) and should be prepared according to the standard format for Springer Verlag's LNCS series (see Submissions have to be uploaded to: PUBLICATIONS: A volume of proceedings published by Springer in the LNCS series will be available by the time of the conference. A special issue of a major journal will later be published containing peer-reviewed, substantially extended versions of some of the papers contributed to the conference. Submissions to it will be by invitation. REGISTRATION: The period for registration is open from July 21, 2014 to March 2, 2015. The registration form can be found at: DEADLINES: Paper submission: October 10, 2014 (23:59 CET) Notification of paper acceptance or rejection: November 18, 2014 Early registration: November 25, 2014 Final version of the paper for the LNCS proceedings: November 26, 2014 Late registration: February 16, 2015 Submission to the journal special issue: June 6, 2015 QUESTIONS AND FURTHER INFORMATION: florentinalilica.voicu at POSTAL ADDRESS: LATA 2015 Research Group on Mathematical Linguistics (GRLMC) Rovira i Virgili University Av.
Catalunya, 35 43002 Tarragona, Spain Phone: +34 977 559 543 Fax: +34 977 558 386 ACKNOWLEDGEMENTS: Nice Sophia Antipolis University Rovira i Virgili University From hamon at LIMSI.FR Tue Jul 29 21:04:36 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 29 Jul 2014 23:04:36 +0200 Subject: Appel: IEEE Big Data 2014 Call for Workshop Papers and Posters Message-ID: Date: Mon, 28 Jul 2014 09:51:06 -0400 From: Call For Papers Message-ID: X-url: X-url: X-url: Call for Workshop Papers and Posters 2014 IEEE International Conference on Big Data (IEEE BigData 2014) October 27-30, 2014, Washington DC, USA In recent years, “Big Data” has become a new ubiquitous term. Big Data is transforming science, engineering, medicine, healthcare, finance, business, and ultimately society itself. The IEEE Big Data conference has established itself as the top-tier research conference in Big Data. The first conference, IEEE Big Data 2013, was held in Santa Clara, CA, on Oct 6-7, 2013, and received 259 paper submissions for the main conference and 32 paper submissions for the industry and government program. Of those, 44 regular papers and 53 short papers were accepted, which translates into a selectivity that is on par with top-tier conferences. Also, there were 14 workshops associated with IEEE Big Data 2013 covering important topics related to various aspects of Big Data research, development and applications, and more than 400 participants from 40 countries attended the 4-day event.
The IEEE International Conference on Big Data 2014 (IEEE BigData 2014) continues the success of IEEE BigData 2013. We expect to have an exciting program: IEEE Big Data 2014 has received 271 paper submissions for the main conference and 37 paper submissions for the industry and government program. In addition, 21 associated workshops cover many emerging research areas. If you missed the deadline to submit a paper to the conference, you are encouraged to submit your research work to one of the workshops or the poster program. I. 21 Workshops (most of the workshop paper submission deadlines are in late August) 1. Scholarly Big Data: Challenges & Issues 2. The 2nd Workshop on Scalable Machine Learning: Theory and Applications 3. 1st International Workshop on High Performance Big Graph Data Management, Analysis, and Mining 4. Big Data in Motion and Big Data at Rest 5. Workshop on Enterprise Big Data Semantic and Analytics Modeling 6. The Second Workshop on Distributed Storage Systems and Coding for Big Data 7. First IEEE International Workshop on Big Data Security and Privacy (BDSP 2014) 8. The 2nd International Workshop of BigData in Bioinformatics and Healthcare Informatics 9. Solar Astronomy Big Data (SABiD) – 1st Workshop on Management, Search and Mining of Massive Repositories of Solar Astronomy Data 10. Using Big Data to Understand Spatial Connectivity 11. CASK-14: 1st International Workshop on Collaborative methodologies to Accelerate Scientific Knowledge discovery in big data 12. Rapid Response Cyber Forensics Workshop 13. First Hands-On Workshop on Leveraging High Performance Computing Resources for Managing Large Datasets 14. Workshop on BigData and Service Discovery 15. Workshop on Advances in Software and Hardware for Big Data to Knowledge Discovery (ASH) 16. IEEE Big Data Workshop on Semantics for Big Data on the Internet of Things (SemBIoT 2014) 17. Big Data in Computational Epidemiology 18.
Large Scale Data Analytics in Transportation and Railway Infrastructure 19. 2nd Workshop on Scalable Cloud Data Management 20. Big Humanities Data 21. Complexity for Big Data II. Details are available for the Main Conference, the 1 or 2 day Workshops, the half day Tutorials and the Demos. Not forgetting the Excursion Day! Conference Programme Monday - August 25th 09:00 - 10:15 Invited Speaker: Mary Harper, IARPA Learning from 26 languages: Program Management and Science in the Babel Program 10:45 - 12:25 Modeling of Discourse and Dialogue I 10:45 - 12:25 Sentiment Analysis, Opinion Mining and Social Media I 10:45 - 12:25 Information Retrieval and Question Answering 10:45 - 12:25 Machine Learning for CL and NLP 15:45 - 17:25 Modeling of Discourse and Dialogue II 15:45 - 17:25 Sentiment Analysis, Opinion Mining and Social Media III 15:45 - 17:25 Semantic Processing, Distributional Semantics and Compositional Semantics I 15:45 - 17:25 Software, Tools Tuesday - August 26th 09:00 - 10:15 Invited Speaker: Ted Gibson, MIT Language for communication: Language as rational inference 10:45 - 12:25 Syntax, grammar induction, syntactic and semantic parsing I 10:45 - 12:25 Sentiment Analysis, Opinion Mining and Social Media III 10:45 - 12:25 Applications I 10:45 - 12:25 Modeling of Discourse and Dialogue III 15:45 - 17:25 Syntax, grammar induction, syntactic and semantic parsing II 15:45 - 17:25 Semantic Processing, Distributional Semantics and Compositional Semantics II 15:45 - 17:25 Applications II 15:45 - 17:25 Language Resources Wednesday - August 27th 09:00 - 10:15 Excursion Day Thursday - August 28th 09:00 - 10:15 Invited Speaker: Qun Liu, CNGL/DCU Annotation Adaptation and Language Adaptation in NLP 10:45 - 12:25 IE/database linking I 10:45 - 12:25 Lexical Semantics and Ontologies I 10:45 - 12:25 Natural Language Generation and Summarization I 10:45 - 12:25 Modeling of Discourse and Dialogue IV and Multimodal Processing 14:00 - 15:15 Semantic
Processing, Distributional Semantics and Compositional Semantics III 14:00 - 15:15 Morphology, word segmentation, tagging and chunking I 14:00 - 15:15 Speech Recognition, Text-To-Speech, Spoken Language Understanding 14:00 - 15:15 Lesser Resourced Languages 15:45 - 17:25 Syntax, grammar induction, syntactic and semantic parsing III 15:45 - 17:25 Machine Translation I 15:45 - 17:25 Linguistic and Cognitive Issues in CL and NLP I 15:45 - 17:25 Natural Language Generation and Summarization II and Paraphrasing Friday - August 29th 09:00 - 10:15 Invited Speaker: Martin Kay, XEROX Does a Computational Linguist have to be a Linguist? 10:45 - 12:25 Machine Translation II 10:45 - 12:25 IE/database linking II 10:45 - 12:25 Linguistic and Cognitive Issues in CL and NLP II 10:45 - 12:25 Lexical Semantics and Ontologies II 14:00 - 15:15 Machine Translation III 14:00 - 15:15 Lexical Semantics and Ontologies III 14:00 - 15:15 IE/database linking III 14:00 - 15:15 Morphology, word segmentation, tagging and chunking II 15:45 - 17:25 Best Paper Talk and Closing The conference committee and organisers take no responsibility for changes or inaccuracies to the conference programme. You will work in collaboration with Weborama's R&D and marketing departments. Required skills ==================== - American English as native language - Knowledge of linguistics (lexicology, semantics) Type of contract =============== The assignment consists first of building the clusters (about 1 month at half-time), then of occasional work on updates. This work can therefore be carried out under a part-time fixed-term contract (CDD), with teleworking if necessary.
Start of contract ================ August/September 2014 Location ===================== 75019 Paris Teleworking possible Compensation ========= Depending on profile How to apply ================ Please send your application to the following address: rachid at From hamon at LIMSI.FR Tue Jul 29 20:39:13 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 29 Jul 2014 22:39:13 +0200 Subject: Appel: First Call for Participation, SemEval - Task 4, TimeLine: Cross-Document Event Ordering (pilot task) Message-ID: Date: Mon, 28 Jul 2014 13:32:44 +0000 From: "Erp, M.G.J. van" Message-ID: X-url: X-url:!forum/semeval-task4-timeline SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering (pilot task) First Call for Participation Website: Google Group:!forum/semeval-task4-timeline Evaluation period: November 15 - 30, 2014 Paper submission: January 2015 *Introduction* In any domain, professionals need to have access to knowledge in order to make well-informed decisions. An insightful way of presenting information in an easily updatable and complete manner is to present it on a timeline that is continuously updated with new information. The aim of the task is to build timelines from written news in English. More specifically, the goal is to order on a timeline all the events in which a target entity is involved. We focus mainly on cross-document event coreference resolution and cross-document temporal relation extraction.
Temporal relation extraction has been the topic of the three past TempEval tasks as part of SemEval:
- TempEval-1 (2007): Temporal Relation Identification
- TempEval-2 (2010): Evaluating Events, Time Expressions, and Temporal Relations
- TempEval-3 (2013): Temporal Annotation

In addition, temporal relation extraction was the focus of the 6th i2b2 NLP Challenge for clinical records, but the cross-document aspect has not often been explored. At RANLP 2009 there was a cross-document temporal relation extraction task, in which the goal was to link pre-defined events involving the same centroid entities (i.e. entities frequently participating in events) on a timeline. Nominal coreference resolution was the topic of the SemEval 2010 Task on Coreference Resolution in Multiple Languages. Partially motivated by the work in the NewsReader project, TimeLine goes beyond these tasks by addressing coreference resolution for events and temporal relation identification across documents.

*Task Description*

Given a set of documents and a target entity, the task is to build an event TimeLine related to that entity, i.e. to detect the events involving the target entity, anchor them in time and order them.

As input data, we provide a set of documents and a set of target entities (person, organisation, product or financial entity); only entities of interest will be selected as target entities, i.e. entities that are involved in many events across different documents and for which it is relevant to build a timeline.

There are two tracks in this task, based on the data used as input. For Track A only raw text is provided to the participants, while for Track B gold-standard event mentions are also given. For both tracks the expected output is one TimeLine for each target entity. Each TimeLine consists of an ordered list of events in which each event is associated with a time anchor. For both tracks, a sub-track is proposed in which the events are not associated with a time anchor.
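
As a rough illustration of the expected output described above (an ordered list of events, each optionally tied to a time anchor), here is a minimal Python sketch. The `Event` fields, the `build_timeline` helper, the use of ISO date strings as anchors and the sample data are all assumptions made for illustration; this is not the official task format or the evaluation tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One event mention involving the target entity (hypothetical layout)."""
    mention: str           # the event mention as found in a document
    doc_id: str            # identifier of the source document
    anchor: Optional[str]  # ISO date string (YYYY-MM-DD), or None if unanchored


def build_timeline(events: List[Event]) -> List[Event]:
    """Return the events ordered by time anchor.

    Assumption for this sketch: events without an anchor are kept at the
    end, in their input order. ISO date strings sort chronologically.
    """
    anchored = sorted((e for e in events if e.anchor is not None),
                      key=lambda e: e.anchor)
    unanchored = [e for e in events if e.anchor is None]
    return anchored + unanchored


if __name__ == "__main__":
    # Made-up events from three made-up documents.
    sample = [
        Event("e2", "doc2", "2010-01-27"),
        Event("e3", "doc3", None),
        Event("e1", "doc1", "2007-06-29"),
    ]
    for position, event in enumerate(build_timeline(sample), start=1):
        print(position, event.anchor, event.doc_id, event.mention)
```

A real system would also have to decide which mentions across documents refer to the same event before ordering them; the sketch only covers the final ordering step.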
Participants can choose to participate in any track and subtrack. Participants can submit up to two runs for each track/subtrack.

*Data*

The trial data consists of a set of 30 documents collected from Wikinews about Apple Inc. A set of target entities (the input) and the corresponding ordered lists of events (the output timelines) are provided with the set of documents. The trial data have been annotated with the extents of event mentions and are available from The evaluation tool can also be found there.

The evaluation data will consist of 3 sets of documents annotated with event mentions and a set of target entities. Each set will contain around 30 documents from Wikinews, totalling around 30,000 tokens. For each set of documents, one file is provided containing the list of target entities.

No training corpus will be provided for this task.

*Evaluation Methodology*

Participants will submit the TimeLines produced by their system for all target entities.

The focus of this second workshop is on definition practices in either human or machine-assisted ontology development.

PRESENTATION

A current problem in ontology development is constructing the needed definitions of terms, whether logical or in natural language. For example, ontologies built using OBO Foundry principles are advised to include both logical and natural language definitions, but ontology developers too often focus on only one of these, or pay insufficient attention to whether the two are equivalent.

Explicit definitions of terms in ontologies serve a number of purposes. Logical definitions allow reasoners to create inferred hierarchies, lessening the burden of asserting and checking the validity of subsumptions. Natural language definitions help to ameliorate the pervasive problem of low inter-annotator agreement. In specialized domains, experts will know their own field well, but may have only limited knowledge of adjacent disciplines.
Good definitions make it possible for non-experts to understand unfamiliar terms and thereby allow external ontologies to reuse terms with more confidence, which in turn facilitates data integration.

The goal of this workshop is to bring together interested researchers and developers to explore these issues by presenting case studies in a biomedical domain that discuss the difficulties arising when constructing definitions, with a view to sharing strategies in the future. Even in the seemingly narrow domain of definition construction, cross-fertilization from related disciplines should yield benefits in quality and help to identify novel approaches.

Papers submitted should include one or more case studies and raise specific questions related to definitions with a link to a biomedical domain. Reports on successful or unsuccessful methods are both appropriate.

TOPICS

- experiences in formulating definitions
- tools that assist in definition editing, including collaborative systems
- coordination of logical and textual definitions
- validation and quality control of definitions, e.g., checking that definitions comply with the all/some form
- methods for constructing definitions from multiple sources
- use of controlled languages such as Rabbit or ACE for more user-friendly logical definition creation
- use of templates to systematize definition creation

FORMAT AND OUTCOMES

This will be a half-day workshop with a selected mix of presentations based on accepted papers. In order to promote discussion, each presentation will be followed by a short response from a workshop participant, to be arranged in advance of the workshop.

This workshop will document findings on the workshop's website. We expect accepted papers to be published in the Journal of Biomedical Semantics (JBS).
INTENDED AUDIENCE

- ontologists, tool developers, and domain experts whose work encounters issues regarding definitions
- tool developers building definition- or ontology-authoring tools
- philosophers and logicians
- biomedical researchers working on definitions in nomenclatures such as SNOMED
- computer scientists addressing these issues in languages like OWL
- NLP researchers working on definition extraction, generation, or checking
- NLP/IR researchers reusing definitions produced for ontologies

SUBMISSIONS

All papers should include one or more case studies and raise specific questions related to definitions with a link to a biomedical domain.

To get data ready for analysis, ETL (extract-transform-load) is used. It involves reading data from different sources, cleaning the data, converting the format of the input data so that it conforms to the target database, and writing it to the target database. The big data paradigm is changing this problem because of the three V's: volume, velocity, and variety.

In the big data paradigm, a potentially large number of data sources and data assets are considered for analytics. One needs to discover, integrate, and analyze large volumes of diverse data quickly. Finding relevant data for analytics is an important data discovery problem, and data diversity makes this problem difficult. The diversity can stem from the data model; the type of data (structured, semi-structured, or unstructured); enterprise data vs. open public data; the integration of social media data, etc. One also needs to handle data quality and data governance issues.

In this workshop we invite demonstrations, from various industry domains, of techniques for identifying relevant sets of data, finding different kinds of relationships between structured, semi-structured, and unstructured data, curating the data for further analysis, integrating data using various join, union, and merge techniques, validating the integrated data, and analyzing it.
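
The four ETL steps named above (read, clean, convert, write) can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions: the `etl` function, the `sales` table, the `name`/`amount` columns and the in-memory CSV input are all hypothetical, and a real pipeline would read from several heterogeneous sources.

```python
import csv
import io
import sqlite3


def etl(csv_text: str, conn: sqlite3.Connection) -> int:
    """Load cleaned rows from CSV text into a 'sales' table (hypothetical schema).

    Returns the number of rows written.
    """
    # Extract: read the source records.
    reader = csv.DictReader(io.StringIO(csv_text))

    cleaned = []
    for row in reader:
        name = row["name"].strip()
        amount = row["amount"].strip()
        # Clean: drop records with a missing name or amount.
        if not name or not amount:
            continue
        # Convert: make the value conform to the target column type.
        cleaned.append((name, float(amount)))

    # Load: write the converted rows to the target database.
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    conn.commit()
    return len(cleaned)


if __name__ == "__main__":
    source = "name,amount\n alice ,10.5\n,3\nbob,\ncarol,2\n"
    connection = sqlite3.connect(":memory:")
    print(etl(source, connection), "rows loaded")
```

The sketch deliberately ignores volume, velocity and variety; it only shows the shape of the classical single-source case that the text contrasts with the big data setting.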
Topics of interest include (but are not limited to):
- Cleaning big data
- Integration of big heterogeneous data
- Metadata extraction
- Automated rule generation
- Curating data
- Data discovery
- Provisioning and data lineage

We welcome good demonstrations for this workshop, including demonstrations of previously accepted papers/demos. Authors need to send a manuscript describing the demo in up to 2 pages (2-column format), inclusive of all references and figures. Manuscripts must be written in English and formatted according to the IEEE proceedings templates. Please see the workshop website for more details.

As part of our constant development, MyScript, formerly Vision Objects (Nantes, France), is looking for a:

*Natural Language Processing (NLP) Software Engineer*

Within the "MyScript Labs" R&D department, you will coordinate the development of new languages and put forward proposals for improving the existing ones. Your dual expertise in computer science and linguistics, combined with your R&D experience, enables you to carry out the following tasks successfully:

* Creating and maintaining language resources and managing their integration into the handwriting recognition engines.
* Research and development on language models (statistical, syntactic or semantic) and their applications to human-machine interfaces.
* Taking part in the collection of handwriting samples.
* Together with the Support team, studying and analysing customer use cases.

Interested candidates who will be attending TALN are invited to come to the MyScript stand.

*Profile*

With a higher-education degree (engineering degree, Master 2 or PhD), you have at least 3 years of experience in NLP or in a related field (artificial intelligence, pattern recognition, statistical learning...). Your command of at least one programming language used in NLP (for example Perl or Python) makes you fully autonomous on all technical tasks. Rigorous, dynamic, determined and easy to work with, you will quickly fit into the teams and show the leadership and expertise needed for the success of your mission. Fluent English is essential.

The following would be considered a plus: command of C, C++ or Java, command of scripting tools (bash, Unix/Linux commands), experience in automating/industrialising NLP processing chains, knowledge of one or more foreign languages.

In an international context, within a fast-growing company, you wish to join a dynamic team working on high-technology projects. At MyScript, you will be able to see the direct, concrete applications of your work and will join a human-scale organisation that values creativity, initiative, experience sharing and conviviality.

With more than 90% of its revenue generated internationally and more than 100 million users worldwide, MyScript is a software publisher and world leader in the market for human-machine interfaces based on handwriting recognition. Available in more than 85 languages, its products address in particular the markets of mobility (text input, note taking, ...), education (learning handwriting, mathematics, geometry, ...), enterprise (note taking and form processing) and automotive (text input from a touch pad, interaction with GPS). The core of its technology is distributed as a software development kit (SDK) or as applications. MyScript's recognition engine regularly ranks among the top places in international scientific competitions.

Position based in
Nantes
Contact : job at

------------------------------------------------------------------------

As part of our constant development, MyScript (Nantes, France) is looking for a:

*Natural Language Processing (NLP) Engineer*

Within the R&D department "MyScript Labs", you will coordinate the development of new languages and be proactive about improving current processes. Taking advantage of your dual Computer Science/Linguistics background, you will complete the following tasks:

* Creation and maintenance of language resources and supervision of their integration into the handwriting recognition engine.
* Research and development on language models (statistical, syntactic or semantic), and their application to human-machine interfaces.
* Participation in the process of collecting handwriting samples.
* In sync with the Support team, study and analysis of customer use cases.

*Profile*

Holding a Master's degree or a PhD, you have extensive experience in NLP or in a related field (artificial intelligence, pattern recognition, machine learning...). Your hands-on experience of at least one programming language used in NLP (e.g. Perl or Python) gives you autonomy in completing all your technical tasks. Rigorous, dynamic, engaged and team-oriented, you will demonstrate the leadership and the expertise required for the success of your mission. Fluent English is mandatory.

Any of the following would be a plus: knowledge of C, C++ or Java, good command of scripting tools (bash, Unix/Linux tools), experience with automation of NLP processing chains, knowledge of French, fluency in other languages.

In an international context, in a fast-growing company, you want to join a dynamic team and work on high-tech projects. Within MyScript, you can identify the direct and concrete applications of your work and join a human-scale organisation that values creativity, initiative and experience sharing.
MyScript is the leading worldwide provider of handwriting recognition technology, with more than 100 million users. Available in more than 85 languages, MyScript covers a broad range of market applications, including Mobility (smartphones, tablets), Education (interactive digital whiteboards), Enterprise (notes management and form processing) and Automotive (GPS steering, for example). The heart of its technology is released as a software development kit (SDK) or in the form of applications. The recognition engine consistently ranks among the top places in international scientific competitions.

It is therefore important for us to keep getting to know this community better, as well as its expectations of the association. This is why we would like each of you to take a few minutes to answer the questionnaire we have set up for this purpose. Thank you in advance for your answers.

Best wishes.

You can now read the following texts; newly published texts are marked with **.

*Research section*
- Mickaël Roy
  Sentiment de présence et réalité virtuelle pour les langues - Une étude de l'émergence de la présence et de son influence sur la compréhension de l'oral en allemand langue étrangère

*Practice and research section*
** - Claire Chaplier and Élisabeth Crosnier
  Dimension et autonomisation psycho-affectives dans deux dispositifs hybrides - études de cas en master 2

*Book reviews section*
- Emmanuelle Artault Duchiron, Monique Marneffe and Christian Ollivier
  Review of "Vers l'intégration des TIC dans l'enseignement des langues" by Nicolas Guichon
** - Annick Rivens Mompean
  Review of "Didactique des langues et technologies - De l'EAO aux réseaux sociaux" by Muriel Grosbois

*Seminar "Numérique et langues"*
- Isabelle Salengros-Iguenane
  Le séminaire numérique et langues - Vision d'ensemble
- Françoise Demaizière and Muriel Grosbois
  Numérique et enseignement-apprentissage des langues en Lansad -
  Quand, comment, pourquoi ?
** - Isabelle Salengros-Iguenane
  Internet pour une approche culturelle
- Laurence Vincent-Durroux and Cécile Poussard
  Conception et utilisation d'un logiciel pédagogique, l'exemple de Macao
** - Eva Schaeffer-Lacroix
  Utiliser des corpus numériques avec un public Lansad

The seminar videos are available on the UM3 account on Canal-U.

Journal *ALSIC*
*Apprentissage des langues et systèmes d'information et communication*

-------------------------------------------------------------------------

From thierry.hamon at UNIV-PARIS13.FR Tue Jul 1 19:54:46 2014
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 1 Jul 2014 21:54:46 +0200
Subject: Appel: Corela
Message-ID:

Date: Tue, 1 Jul 2014 17:21:39 +0200
From: Gilles Col
Message-Id:
X-url:

Dear colleagues,

The deadline for receipt of publication proposals for the journal Corela (issue 12/2, to appear in December 2014) is 15 September 2014. Proposals are to be sent to

Paul Sabatier, Toulouse, 13-14 November 2014

Objectives

Technical writing is a rapidly expanding sector, owing among other things to the complexity of commercial products and industrial processes, growing safety requirements and the development of specification-based approaches (requirements, business rules). The tasks assigned to technical writers are becoming ever heavier.
These tasks include greater attention to interactions with the business domain and with operators, as well as to regulatory constraints, and a significant rise in the demands on writing quality. This writing quality is different in nature from the checks offered by conventional text editors and requires a specific approach.

The aim of this seminar is to bring together the various technical writing professions, researchers, teachers and the companies that develop technical writing support tools, in order to deepen mutual knowledge of the writing professions and of the kinds of assistance that advanced systems in language processing, artificial intelligence and cognitive ergonomics can offer. The seminar will consist of presentations, demonstrations and case studies, with the aim of fostering links between professions and developing new synergies.

Topics (non-exhaustive)

- Technical writing practices (in linguistics, ergonomics, psycholinguistics), protocols for analysing these practices
- The technical document: epistemological, functional, linguistic and conceptual aspects. The technical document, its evolution and its challenges, including the multimedia dimension
- Theories and recommendations for technical writing (e.g. minimalism, the various forms of controlled language (e.g. by domain, by task, by type of technical document)), professional standards
- The expectations and needs of technical writing, how the writing room operates; writing and document life cycles, the writer-operator relationship (e.g. REX, feedback from experience)
- Technical writing in a language foreign to the writer (English)
- Technical writing and the control of industrial risks: mathematical, psychological and other models
- Challenges in analysing the coherence and cohesion of documents
- Technical writing support systems at the language level (Rat-RQA, Rubric, Lelie, etc.) and at the functional level (Scenari, XML-based platforms)
- Error correction methods: strategies and processes, linguistic methods, capitalising on corrections, correction memory
- System demonstrations

Participation: free. Register before 15 October.
To give a talk, submit a one-page abstract in Word, free format, before 15 September. Notification on 30 September. Companies that produce writing solutions, as well as teams of writers, are encouraged to take part.

This interdisciplinary initiative is a response to the growing popularity of Digital Humanities and an increased tendency to apply computer techniques to support and facilitate research in the Humanities. Nowadays, thanks to increasing activity in digitizing and opening up historical sources, the Science of History can benefit greatly from advances in Computer and Information sciences, which deal with processing, organizing and making sense of data and information. As such, new Computer Science techniques can be applied to verify and validate historical assumptions based on text reasoning, image interpretation or memory understanding.

Our objective is to provide the two research communities with a place to meet, exchange ideas and facilitate discussion. We hope the workshop will result in a survey of current problems and potential solutions, with particular focus on exploring opportunities for collaboration and interaction among researchers working on various subareas within Computer Science and History Sciences.
The main topics of the workshop are supporting historical research and analysis through the application of Computer Science theories or technologies, analyzing and making use of historical texts, recreating past courses of action, analyzing collective memories, visualizing historical data, providing efficient access to the large wealth of accumulated historical knowledge, and so on.

The detailed topics of expected paper submissions include (but are not limited to):

- Natural language processing and text analytics applied to historical documents
- Analysis of longitudinal document collections
- Search and retrieval in document archives and historical collections, associative search
- Causal relationship discovery based on historical resources
- Named entity recognition and disambiguation
- Entity relationship extraction, detecting and resolving historical references in text
- Finding analogical entities over time
- Computational linguistics for old texts
- Analysis of language change over time
- Digitizing and archiving
- Modeling evolution of entities and relationships over time
- Automatic multimedia document dating
- Applications of Artificial Intelligence techniques to History
- Simulating and recreating the past, social relations, motivations, figurations
- Handling uncertain and fragmentary text and image data
- Automatic biography generation
- Mining Wikipedia for historical data
- OCR and transcription of old texts
- Effective interfaces for searching, browsing or visualizing historical data collections
- Studies on collective memory
- Studying and modeling forgetting and remembering processes
- Estimating credibility of historical findings
- Probing the limits of Histoinformatics
- Epistemologies in the Humanities and Computer Science

Full paper submissions are limited to 10 pages, while short paper submissions should be less than 5 pages. Submissions should be sent in English, in PDF, via the submission website.
They should be formatted according to the Springer LNCS paper formatting guidelines. They must be original and must not have been submitted for publication elsewhere. Submissions will be evaluated by at least three different reviewers from Computer Science and History Science backgrounds.

The accepted papers will be published in Springer Lecture Notes in Computer Science (LNCS). See the website for more details.

---------------------
---Important dates---
---------------------

- Paper submission deadline: September 1, 2014 (23:59 Hawaii Standard Time)
- Notification of acceptance: September 25, 2014
- Camera ready copy deadline: October 1, 2014 (23:59 Hawaii Standard Time)
- Workshop date: Nov 10, 2014

--------------------------
---Organizing Committee---
--------------------------

- Adam Jatowt (Kyoto University, Japan)
- Gaël Dias (Normandie University, France)
- Marten Düring (Centre for European Studies, Luxemburg)
- Antal van Den Bosch (Radboud University Nijmegen, The Netherlands)

--------------------------
---Scientific Committee---
--------------------------

- Robert Allen (Yonsei University, South Korea)
- Frederick Clavert (Paris Sorbonne University, France)
- Antoine Doucet (Normandie University, France)
- Roger Evans (University of Brighton, United Kingdom)
- Christian Gudehus (University of Flensburg, Germany)
- Pedro Rangel Henriques (Minho University, Portugal)
- Pim Huijnen (Utrecht University, The Netherlands)
- Nattiya Kanhabua (LS3 Research Center, Germany)
- Tom Kenter (University of Amsterdam, The Netherlands)
- Mike Kestemont (University of Antwerp, Belgium)
- Günter Mühlberger (University of Innsbruck, Austria)
- Andrea Nanetti (Nanyang Technological University, Singapore)
- Daan Odijk (University of Amsterdam, The Netherlands)
- Marc Spaniol (Max Planck Institute for Informatics, Germany)
- Shigeo Sugimoto (University of Tsukuba, Japan)
- Nina Tahmasebi (Chalmers University of Technology, Sweden)
- Lars Wieneke (Centre for European Studies,
Luxemburg)

-------------------------------------------------------------------------

From thierry.hamon at UNIV-PARIS13.FR Tue Jul 1 19:42:13 2014
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 1 Jul 2014 21:42:13 +0200
Subject: Job: Contrat post-doctoral, Humanites numeriques
Message-ID:

Date: Mon, 30 Jun 2014 18:41:14 +0200
From: Jean-Gabriel Ganascia
Message-ID: <53B1932A.7080704 at>
X-url:

*JOB TITLE: researcher in digital humanities*

/Category: one-year postdoctoral contract/
/Start date: 1 October 2014/

Host structure: *LABEX OBVIL*
Location: *Université Paris-Sorbonne, Maison de la recherche, 28 rue Serpente, 75006 Paris*
Reporting line within the structure: *Director of LABEX OBVIL, Didier Alexandre*
Line manager (N+1): *Supervisor or co-supervisor of the postdoctoral research*
Working time for the position: *100%*

*Missions of the unit*

The OBVIL Laboratory of Excellence is part of the COMUE Sorbonne-Universités and brings together researchers from 7 host teams (équipes d'accueil), 2 UMRs (joint university/CNRS research units) and a cross-cutting programme of the UMS of the Maison de la Recherche at Paris-Sorbonne. Its members are faculty and researchers from the Paris-Sorbonne and Pierre-et-Marie-Curie universities, some specialising in literature, others in cognitive science and computer science. It aims to develop all the resources offered by computer applications and digital technology in order to examine French literature of the past as well as the most contemporary. It is also interested in the study of translations, transpositions and adaptations, in order to understand phenomena of transmission and the way canons are formed.

It is recruiting a recent PhD graduate who will be responsible, in the field of 19th- and 20th-century French literature, for developing research and digital tools in ontology and/or cartography and/or lexicography and/or text alignment and/or stylistics applied to literary corpora. See the labex OBVIL website, which presents these projects:

*Duties*

Take part in the digital humanities research project developed by the labex OBVIL (digitisation, promotion, digital problematisation):
- Starting from a French literature project and the research questions raised by the corpus, develop digital tools
- Contribute to the design of scholarly digital editions (EPUB)

Activities
- Within a one-year contract, bring the contract to completion under the joint supervision of two tutors, one attached to a doctoral school of Université Paris-Sorbonne for the literary field and one attached to a doctoral school of Université Pierre-et-Marie-Curie for the digital field.
- Carry out information and training duties in digital humanities for Master's and doctoral students of the COMUE Sorbonne-Universités
- Contribute to the development of digital tools in close collaboration with the LABEX research engineers and the engineers and researchers of LIP6
- Take part in organising the LABEX research seminars

Skills
- Hold a doctorate awarded with the distinction "Très Bien" in literary and/or computing disciplines.
- Possibly have teaching experience (for example tutoring and/or teaching assistantships).
- Be able to fit into a research team and to work as part of a team, across several sites.
*The application file comprises a CV, a research Master 2 diploma, an interdisciplinary scientific project, a cover letter, optionally letters of recommendation, and two letters, one from each co-supervisor. After reading and selecting a project, the panel may decide on the choice of co-supervisors and on the attachment of the doctoral candidate to two laboratories.*

*Application deadline: 7 September 2014 (07/09/2014).*

*Contacts:*
/Didier ALEXANDRE, director of LABEX OBVIL/
/didier.alexandre at
/Jean-Gabriel GANASCIA, UPMC, LIP 6/
/jean-gabriel at
/copy to Clarisse Barthélemy/
/clarisse.barthelemy at
/copy to

Deadline is tomorrow July 2nd 2014

Register for Main Conference, 1 or 2 day Workshops and half day Tutorials!

**********************************************
Conference Programme

Monday - August 25th
09:00-10:15 Invited Speaker: Mary Harper, IARPA
            Learning from 26 languages: Program Management and Science in the Babel Program
10:45-12:25 Modeling of Discourse and Dialogue I
10:45-12:25 Sentiment Analysis, Opinion Mining and Social Media I
10:45-12:25 Information Retrieval and Question Answering
10:45-12:25 Machine Learning for CL and NLP
15:45-17:25 Modeling of Discourse and Dialogue II
15:45-17:25 Sentiment Analysis, Opinion Mining and Social Media III
15:45-17:25 Semantic Processing, Distributional Semantics and Compositional Semantics I
15:45-17:25 Software, Tools

Tuesday - August 26th
09:00-10:15 Invited Speaker: Ted Gibson, MIT
            Language for communication: Language as rational inference
10:45-12:25 Syntax, grammar induction, syntactic and semantic parsing I
10:45-12:25 Sentiment Analysis, Opinion Mining and Social Media III
10:45-12:25 Applications I
10:45-12:25 Modeling of Discourse and Dialogue III
15:45-17:25 Syntax, grammar induction, syntactic and semantic parsing II
15:45-17:25 Semantic Processing, Distributional Semantics and Compositional Semantics II
15:45-17:25 Applications II
15:45-17:25 Language Resources

Wednesday - August 27th
Excursion Day

Thursday - August 28th
09:00-10:15 Invited Speaker: Qun Liu, CNGL/DCU
            Annotation Adaptation and Language Adaptation in NLP
10:45-12:25 IE/database linking I
10:45-12:25 Lexical Semantics and Ontologies I
10:45-12:25 Natural Language Generation and Summarization I
10:45-12:25 Modeling of Discourse and Dialogue IV and Multimodal Processing
14:00-15:15 Semantic Processing, Distributional Semantics and Compositional Semantics III
14:00-15:15 Morphology, word segmentation, tagging and chunking I
14:00-15:15 Speech Recognition, Text-To-Speech, Spoken Language Understanding
14:00-15:15 Lesser Resourced Languages
15:45-17:25 Syntax, grammar induction, syntactic and semantic parsing III
15:45-17:25 Machine Translation I
15:45-17:25 Linguistic and Cognitive Issues in CL and NLP I
15:45-17:25 Natural Language Generation and Summarization II and Paraphrasing

Friday - August 29th
09:00-10:15 Invited Speaker: Martin Kay, XEROX
            Does a Computational Linguist have to be a Linguist?
10:45-12:25 Machine Translation II
10:45-12:25 IE/database linking II
10:45-12:25 Linguistic and Cognitive Issues in CL and NLP II
10:45-12:25 Lexical Semantics and Ontologies II
14:00-15:15 Machine Translation III
14:00-15:15 Lexical Semantics and Ontologies III
14:00-15:15 IE/database linking III
14:00-15:15 Morphology, word segmentation, tagging and chunking II
15:45-17:25 Best Paper Talk and Closing

The conference committee and organisers take no responsibility for changes or inaccuracies to the conference programme. The above programme is subject to change.

*********************************************
Accommodation

Don't forget to book your accommodation at the time of registering. Rooms are limited on campus and early booking is advisable! Accommodation can be booked on the registration form at the same time as your registration.

*********************************************
Ireland Inspires!

Best regards,
Laurent Besacier

Last call for
communications: num?ro sp?cial sur le traitement automatique du langage parl? pour la revue TAL (Traitement Automatique des Langues) ----ENGLISH VERSION OF THIS CFP CAN BE FOUND AT THE END OF THIS MESSAGE------ possibilit? de soumettre jusqu'au 15 juillet! Direction : Laurent Besacier, Wolfang Minker Date limite : 30 juin 2014 (possibilit? de mettre ? jour article jq'au 15 juillet) La communication orale reste le moyen le plus naturel pour dialoguer et interagir (avec la machine ou avec une autre personne). Le traitement automatique du langage parl? (TALP) et le dialogue trouvent d?sormais de nombreuses applications directes dans des domaines divers tels que (liste non exhaustive) la recherche d'information, l'interaction en langue naturelle avec des dispositifs mobiles, la robotique sociale, les technologies d'assistance ? la personne, l'apprentissage des langues, etc. Cependant, le TALP pose des probl?mes sp?cifiques li?s ? la nature m?me du mat?riau trait?. En effet, on est amen? ? traiter des ?nonc?s de parole plus ou moins spontan?e et contenant de nombreux traits paralinguistiques. Par exemple, la pr?sence de disfluences orales (r?p?titions, reprises, incises...) r?duit la r?gularit? syntaxique des ?nonc?s ; les ?nonc?s oraux sont ?galement riches d'informations li?s aux affects, etc. Par ailleurs, l'?tape de transcription automatique, souvent n?cessaire avant l'application de traitements de plus haut niveau (compr?hension, traduction, analyse, etc.) rend des sorties bruit?es (contenant des erreurs) qui n?cessitent des analyses robustes et un couplage ?troit entre ?tapes de traitement. Nous invitons donc les contributions portant sur tout aspect (th?orique, m?thodologique et pratique) relatif au traitement automatique du langage parl? et ? 
la communication orale, et en particulier (liste non exclusive) : - Reconnaissance automatique de la parole - Compr?hension automatique de la parole - Traduction de parole - Synth?se de la parole - Dialogue oral homme - machine - Analyse robuste de la langue parl?e - Analyse des affects sociaux ou des ?motions dans des ?nonc?s oraux - Fouille de documents ? composante orale - Applications ? composantes orales (recherche d'information, interaction, robotique, etc) - Outils d'aide ? l'apprentissage d?une langue seconde - Aspects multilingues du traitement automatique du langage parl? - Evaluation de syst?mes de traitement du langage parl? - Corpus et ressources pour l'oral - Analyse du discours oral - Dialogue adaptatif au contexte et au profil de l'utilisateur - Analyse des traits paralinguistiques dans des ?nonc?s oraux ?DITEURS INVIT?S Laurent Besacier Wolfang Minker COMITE SCIENTIFIQUE ADDA Gilles LIMSI, Paris ANTOINE Jean-Yves U. F. Rabelais, Tours AUBERGE V?ronique LIG, Grenoble BELLEGARDA J?r?me APPLE, USA BONNEAU-MAYNARD H?l?ne, LIMSI, Orsay CERISARA Christophe LORIA, Nancy CERNOCKY Jan Univ. Brno, Tcheque Republic DAMNATI G?raldine Orangs Labs, Lannion DEVILLERS Laurence LIMSI, Orsay DUTOIT Thierry TCTS, Mons ESTEVE Yannick LIUM, Le Mans ESKENAZI Maxine CMU, Pittsburgh FAVRE Benoit LIF, Marseille FERRANE Isabelle IRIT, Toulouse GRAVIER Guillaume IRISA, Rennes JOUVET Denis LORIA, Nancy KAHN Juliette LNE, Paris LECOUTEUX Benjamin LIG, Grenoble LEFEVRE Fabrice LIA, Avignon LINARES Georges LIA, Avignon MEIGNIER Sylvain LIUM, Le Mans PIETQUIN Olivier Univ. Lille 1 POPESCU-BELIS Andrei IDIAP, Martigny ROSSET Sophie LIMSI, Orsay LANGUE Les articles sont ?crits en fran?ais ou en anglais. Les soumissions en anglais ne sont accept?es que pour les auteurs non francophones. FORMAT DE LA SOUMISSION Les articles doivent ?tre d?pos?s sur la plateforme La revue ne publie que des contributions originales, en fran?ais ou en anglais. 
Accepted papers will be at most 25 pages long in PDF. The style file is available for download on the TAL journal website.

CONTACT
Laurent Besacier (Laurent.Besacier at)
Wolfgang Minker (Wolfgang.Minker at)

==========CFP IN ENGLISH============
Special issue on spoken language processing
Guest editors: Laurent Besacier, Wolfgang Minker

Speech is the most natural way to communicate and interact (with a machine or with another person). Spoken language processing and dialogue now have many direct applications in various areas such as (but not limited to) information retrieval, natural language interaction with mobile devices, social robotics, assistive technologies, technologies for language learning, etc. However, spoken language processing poses specific problems related to the nature of the speech material itself. Spontaneous speech utterances have to be processed, and they contain many paralinguistic features. For instance, disfluencies (repetitions, false starts, etc.) reduce the syntactic regularity of utterances. Moreover, spontaneous utterances convey rich information related to emotions, etc. Furthermore, the automatic speech recognition (ASR) step, often required before higher-level processing (understanding, translation, analysis, etc.) is applied, produces noisy output (with errors), which requires robust analysis and tight coupling between modules.

We invite contributions on any aspect (theoretical, methodological and practical) of spoken language processing and oral communication; in particular (non-exclusive list):
- Automatic speech recognition
- Spoken language understanding
- Speech translation
- Text-to-Speech synthesis
- Man-machine dialogue
- Robust analysis of spoken language
- Analysis of social affects or emotions in spontaneous speech
- Mining spoken language documents
- Spoken language applications (mobile interaction, robotics, etc.)
- Technologies for language learning
- Multilingual aspects of spoken language processing
- Evaluation for spoken language processing
- Corpora and resources for spoken language
- (Spoken) discourse analysis
- Adaptive dialogue (context, user profile)
- Analysis of paralinguistic features in spoken language

IMPORTANT DATES
- Call: March 2014
- Submission of contributions: 30 June 2014 (papers may be updated until 15 July)
- First notification to authors: 15 September 2014
- Publication: end of 2014 / beginning of 2015

SUBMISSION FORMAT
LANGUAGE
Manuscripts may be submitted in English or French. Style sheets are available for download on the website of the TAL journal.

-------------------------------------------------------------------------
Message distributed by the Langage Naturel (LN) list. The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues). ATALA declines all responsibility for the content of messages distributed on the LN list.
-------------------------------------------------------------------------

From thierry.hamon at UNIV-PARIS13.FR Tue Jul 1 19:37:59 2014
From: thierry.hamon at UNIV-PARIS13.FR (Thierry Hamon)
Date: Tue, 1 Jul 2014 21:37:59 +0200
Subject: ATALA: Call for candidates for the Board of Directors / General Assembly

Date: Mon, 30 Jun 2014 03:24:53 +0200
From: pap

Hello,

This is a reminder that ATALA is holding its General Assembly on Friday 4 July, after the closing session of TALN. On this occasion, ATALA renews one third of its Board of Directors (Conseil d'Administration, CA) and is looking for volunteers willing to take an active part in the development of the Association. ATALA will become what you make of it!

Outgoing board members are eligible for re-election, but they must state that they wish to stand. Board membership is open to any member of ATALA. Those wishing to stand as candidates are asked to say so by email to contact(@) before Tuesday 1 July, 9 pm. Candidacies will be posted on the website as they are received.

Please note that ATALA wishes to welcome to its Board people who are motivated and ready to commit themselves to the administrative activities of the association.

Looking forward to meeting you soon at TALN 2014,
Patrick Paroubek
President of ATALA

From hamon at LIMSI.FR Thu Jul 3 14:36:03 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Thu, 3 Jul 2014 16:36:03 +0200
Subject: Obituary: Death of Joachim (Jim) Lambek

Date: Thu, 3 Jul 2014 15:38:58 +0200
From: retore

Dear colleagues,

Below you will find the message from Michael Barr, circulated on the category theory list, informing us of the passing of Jim Lambek on Monday 23 June, at the age of 91. For those who are not familiar with his work, which spans mathematics, logic, linguistics and physics:

Michael Barr (barr at), 2014-06-24 01:55:48 GMT

I regret to inform you all that Jim died this afternoon. His son says it was congestive heart failure which is as good a way as any to describe dying of old age. He was still coming to seminar last fall and celebrated his 91st birthday in December in pretty good shape, but has been gradually going downhill since.
I don't believe he came to the office since late fall. He had a good run.

In computational linguistics, he is best known for "The Mathematics of Sentence Structure" (1958), which established a deep link between formal grammar and logic, through categorial grammars and type-logical grammars. Since the late 1990s he had been working on pregroup grammars. Many people have worked on the Lambek calculus: J-P Desclés, M. Eytan, A. Lecomte, and more recently teams such as Calli/Sémagramme at LORIA, Signes in Bordeaux, LALLIC at Paris 4, and of course the people in Montréal. Others, such as Alain Lecomte or François Lamarche, are probably just as qualified as I am to publish this announcement.

I met him in 1988 at a categorical logic conference at Université Paris 7, then in Urbino, and we invited him to the LACL conferences. In 2001 I invited him to a workshop on learning categorial grammars in Nantes, where he broke his arm; since then he had been hesitant to cross the ocean to visit us (he sent his report on my habilitation, but declined the invitation).

A book of tributes was published for his 90th birthday, and his latest book has just appeared. He remained very active until last autumn and, according to his son, he passed away peacefully, surrounded by his family.

From SpecialIssue at JOURNALBOOKSERVICE.NET Mon Jul 7 12:58:48 2014
From: SpecialIssue at JOURNALBOOKSERVICE.NET (SpecialIssue)
Date: Mon, 7 Jul 2014 20:58:48 +0800
Subject: To Be a Special Issue Lead Guest Editor

An HTML attachment was scrubbed...

From hamon at LIMSI.FR Tue Jul 8 15:42:12 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:42:12 +0200
Subject: Call: Workshop TOTh, December 2014

Date: Wed, 2 Jul 2014 17:43:22 +0200
From: Luc Damas

Workshop TOTh 2014
------------------------------------------------------------------------
The 2014 TOTh Workshop is organised by the Royal Museums of Art and History of Brussels within the scope of the European Project AthenaPlus.
------------------------------------------------------------------------
Title: Multilingual Thesaurus and Terminology
------------------------------------------------------------------------
Brussels - December 5th 2014
The Cinquantenaire Museum, Parc du Cinquantenaire 10

The ever-increasing amount of open data and linked data raises questions about access to it in a multilingual context. Given the diversity of the collections, of the institutions that manage them, of the public that has access to them and of the technologies currently available, it is necessary to rethink the notions of thesaurus and terminology, as well as the ways to manage and access these collections.
Some of the topics that will be covered by the TOTh 2014 Workshop include (this list is not exhaustive):
- Principles, Theories and Methods: Thesauri, Terminology, Ontology, Controlled Vocabulary, Semantic Network;
- Embracing and Managing Multilingualism;
- Mapping, Alignment and Harmonisation of terminologies, thesauri, ontologies;
- Indexing and Research;
- The compatibility of the ISO and W3C Standards concerning terminology, thesauri, knowledge systems and interchange format;
- Impact and contributions from new domains and technologies connected to Knowledge Engineering and the Semantic Web;
- Software Environments.

Special attention will be given to the issue of cultural content management.

Submission: Abstracts of one or two pages must be sent to: workshop-toth at
Official languages: English and French
Deadline for submission: September 30th 2014
Information: Eva Coudyzer, e.coudyzer at
Scientific coordination: C. Roche, R. Costa, E. Coudyzer

From hamon at LIMSI.FR Tue Jul 8 15:40:44 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:40:44 +0200
Subject: Call: SIMBig'2014, Extended Deadline

Date: Wed, 2 Jul 2014 12:24:57 +0200
From: Juan Lossio

===========================================================
EXTENDING THE DEADLINE FOR SUBMISSION - SIMBig 2014
===========================================================
SIMBig'2014 - 1st Symposium on Information Management and Big Data
8-10 September 2014 - Cusco, PERU
Paper/Demos Submission Deadline: extended till July 11, 2014
simbig2014 at
===========================================================
CALL FOR PAPERS
===========================================================
The first edition of the International Symposium on Information Management and Big Data, SIMBig 2014, aims to bring together the main national and international actors in the decision-making field to take stock of new technologies dedicated to handling large amounts of information.
The first edition of SIMBig will be held in Cusco, Peru, as a three-day conference. The city is renowned for its architecture, history and culture, which you will undoubtedly appreciate during our scheduled guided tours. On behalf of the Scientific Program Committee, we have great pleasure in inviting you to submit one or more papers (for oral or poster presentation) in accordance with the instructions provided in the Paper Submission Guidelines.
===========================================================
Important dates:
- Paper Submission Deadline: extended till July 11, 2014
- Exhibitions and Demos Submission Deadline: extended till July 11, 2014
- Notification of Acceptance: July 31, 2014
- Final Paper Submission Deadline: August 15, 2014
- Symposium: September 8-10, 2014
===========================================================
Scope and Topics
Authors are invited to submit original and innovative papers that break new ground or present insightful results based on their experience in Data Management and Big Data.
SIMBig 2014 has a broad scope, and specific topics of interest include (but are not limited to):
- Big Data Management
- Big Data Applications
- Text Analytics
- Information Retrieval
- Data mining
- OLAP and MDA Models
- Text mining
- Semantic Web
- Linked Data for data pre-processing: cleaning, sorting, filtering or enrichment
- Linked Data applied to Machine Learning
- Decision Support Systems
- Data warehousing
- Information management
- Business intelligence
- Data management
- Semi-structured and Unstructured Data
- Data governance
- Outsourcing
- Social media/Collaboration
- Spatiotemporal data
- Information Services and Resources
- Open Data
- Natural Language Processing
- Strategic uses of information systems
- Information technology management
===========================================================
Submission guidelines:
- The paper must follow the IEEE two-column format, single-spaced, with a 10 point font in the text
- The document should be formatted for standard A4-size paper
- Papers must be submitted in PDF (Portable Document Format) only
- The paper length should be between 4 and 8 pages (including references and figures)
- Follow the instructions in the Word document and LaTeX templates (ACL templates)
===========================================================

From hamon at LIMSI.FR Tue Jul 8 16:09:56 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:09:56 +0200
Subject: Call: Posters COLDOC 2014, Diversity of languages

Date: Mon, 7 Jul 2014 15:12:27 +0200
From: ColDoc 2014

=====================================================
COLDOC is the annual colloquium organised by the doctoral students and young researchers in Language Sciences of the MoDyCo laboratory (UMR 7114, CNRS/Université Paris Ouest Nanterre/Université Paris Descartes). It will take place on 13 and 14 November 2014 at Université Paris Ouest Nanterre la Défense.

This year, we are interested in language diversity and language contact, and we wish to address the questions they raise in terms of typological classification and of reflection on linguistic universals.

The call for papers is now closed, but the call for posters will remain open until 7 August 2014. Authors will receive their notification of acceptance or rejection on 10 August. These contributions will lead to the publication of a short article in the online proceedings of the colloquium. The submission procedure is available online on the website (see the "Posters" section). We ask for a one-page description of the project.

We look forward to reading your submissions,
The COLDOC 2014 organising committee

From hamon at LIMSI.FR Tue Jul 8 15:52:59 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:52:59 +0200
Subject: Call: Colloque Senelangues 2015

Date: Fri, 04 Jul 2014 16:18:13 +0200
From: Stephane Robert

Colloque Sénélangues 2015 - Langues d'Afrique de l'Ouest
Colloquium Senelangues 2015 - West African Languages
Conferência Sénélangues 2015 - Línguas da África Ocidental

English version below / Portuguese version: see below

First call for papers
Colloque Sénélangues 2015 - Langues d'Afrique de l'Ouest
24-25 April 2015, Dakar, Senegal
Submission deadline: 15 November 2014
Website:
Contact: senelangues2015call at

The Sénélangues project, funded by the French Agence Nationale de la Recherche, brought together for four years linguists from the CNRS laboratories LLACAN and DDL, in collaboration with Université Cheikh Anta Diop de Dakar, in an ambitious project to describe and document the languages of Senegal. Continuing this scientific collaboration, the members of Sénélangues are organising a double event in April 2015, Sénélangues 2015, which will combine a colloquium on the description of West African languages with a thematic school on the same topic. The colloquium Sénélangues 2015 - Langues d'Afrique de l'Ouest will be held on 24 and 25 April at Université Cheikh Anta Diop de Dakar.

Theme of the colloquium
Stimulated by various collaborative projects supported by different agencies and foundations, the description of African languages has benefited over recent decades from developments in good practices and in computing resources for linguistic, typological and documentary analysis. By opening the question of linguistic description to the whole of West Africa, the colloquium aims to let linguists working on the languages of this region meet to take stock of their scientific advances, share their knowledge, know-how and questions, and thereby increase what is known about the languages of the region.

Contributions are expected to deal with vernacular languages of West Africa (creoles included), without excluding the description of contact phenomena with languages of other families. All levels of linguistic analysis (phonology, morphology, syntax, semantics, enunciation and pragmatics) may be addressed.

Plenary lectures
Denis Creissels, Université
Lumière Lyon 2
Felix Ameka, University of Leiden

Submission procedure
Contributions may be presented orally (20 minutes followed by 10 minutes of discussion) or as a poster (recommended dimensions: A0 format, H: 1.20 m, W: 0.80 m) within a special session (by choice of the proposers or by decision of the selection committee). In both cases, the instructions for sending proposals are as follows:
- the abstract must be at most one page long (title, examples and references included), in Times 12 (single line spacing)
- it must be sent anonymised, in both rtf and pdf formats, to the following address: senelangues2015call at
- the name of the pdf file should simply consist of a few key words from the title of the paper
- subject of the message: communication Senelangues 2015
- in the body of the message, indicate: surname, first name, affiliation, email address, title of the proposal, preferred format (poster vs. oral)
- the languages of the conference are French, English and Portuguese

Address for submissions and contact
senelangues2015call at

Calendar
Deadline for sending abstracts: 15 November 2014
Notification to authors: 15 January 2015

Conference venue
Université de Cheikh Anta Diop, Dakar, Senegal

Scientific Committee
Felix Ameka, University of Leiden
Larry Hyman, U.C. Berkeley
Valentin Vydrine, INALCO, LLACAN, Paris
Martine Vanhove, LLACAN, CNRS & INALCO, Paris
Koen Bostoen, Ghent University
Jérémie Kouadio N'Guessan, University of Cocody

Organising Committee
Sylvie Voisin, DDL, CNRS & University of Aix-Marseille
Stéphane Robert, LLACAN, CNRS & INALCO, Paris
Alain-Christian Bassène, FLSH UCAD, Dakar
Denis Creissels, DDL, CNRS & Lyon 2
Thierno Cissé
FLSH UCAD, Dakar
Noël Bernard Biagui, CLAD UCAD, Dakar
Nicolas Quint, LLACAN, CNRS & INALCO, Paris
Jeanne Zerner, LLACAN, CNRS & INALCO, Paris
Anna Marie Diagne, IFAN UCAD, Dakar
El Hadji Dièye, FLSH UCAD, Dakar
Dame Ndao, FLSH UCAD, Dakar
________________________________________________________________________
Colloquium Senelangues 2015 - West African Languages
Call for Papers, English Version

First Call for Abstracts
Colloquium Senelangues 2015 - West African Languages
24-25 April 2015, Dakar, Senegal
Deadline for submission: 15 November 2014
Website:
Contact: senelangues2015call at

The Sénélangues project, which aimed at the description and documentation of the languages of Senegal, was financed by the French Agence Nationale de la Recherche for a period of four years and involved linguists from the CNRS laboratories LLACAN and DDL in collaboration with the Cheikh Anta Diop University of Dakar. This scientific collaboration continues with the organisation of a double event, Sénélangues 2015, which consists of a colloquium on the description of West African languages and a thematic school on the same topic. The Colloque Sénélangues 2015 - Langues d'Afrique de l'Ouest will take place on 24 and 25 April 2015 at the Cheikh Anta Diop University of Dakar.

Topics of the colloquium
In recent decades, the description of African languages has benefited greatly from developments in good practices in information technology and in linguistic analysis, including typology and language documentation. These developments have been stimulated by various collaborative projects and funding schemes. The aim of the colloquium is to gather linguists working on West Africa so that they can share their scientific results, insights, know-how and research questions in order to increase our understanding of the languages of the region.
We welcome contributions on the analysis of West African languages, including Creole languages, as well as on phenomena of language contact with other language families. Contributions in all sub-disciplines of linguistic analysis are welcome, including phonology, morphology, syntax, semantics and pragmatics.

Plenary speakers
Denis Creissels, University of Lyon 2
Felix Ameka, University of Leiden

How to submit a contribution
Contributions can take the form of either an oral presentation (20 minutes + 10 minutes of discussion) or a poster presentation (poster format A0, 120 by 80 cm). Presenters may indicate their preference (oral presentation or poster), but the selection committee reserves the right to decide otherwise. For both types of presentation, the abstract should adhere to the following instructions:
- Maximum one page including title, examples and references, using a Times 12 point font.
- Send an anonymous version of your abstract in both rtf and pdf formats as an attachment to an email message to senelangues2015call at
- Use some key words of your title in the name of your pdf file.
- Mention "communication Senelangues 2015" in the subject line of the email message.
- Indicate in the body of your message: surname, first name, affiliation, email address, title of your paper, preferred presentation (poster or oral).
- The language of presentation should be French, English or Portuguese.

Address for submissions and any contact
senelangues2015call at

Important dates
Deadline for submitting abstracts: 15 November 2014
Notification of acceptance: 15 January 2015

Conference venue
Faculté de Lettres, Université de Cheikh Anta Diop, Dakar, Senegal

Scientific Committee
Felix Ameka, University of Leiden
Larry Hyman, U.C.
Berkeley
Valentin Vydrine, INALCO, LLACAN, Paris
Martine Vanhove, LLACAN, CNRS & INALCO, Paris
Koen Bostoen, University of Ghent
Jérémie Kouadio N'Guessan, University of Cocody

Organizing Committee
Sylvie Voisin, DDL, CNRS & University of Aix-Marseille
Stéphane Robert, LLACAN, CNRS & INALCO, Paris
Alain-Christian Bassène, FLSH UCAD, Dakar
Denis Creissels, DDL, CNRS & Lyon 2
Thierno Cissé, FLSH UCAD, Dakar
Noël Bernard Biagui, CLAD UCAD, Dakar
Nicolas Quint, LLACAN, CNRS & INALCO, Paris
Jeanne Zerner, LLACAN, CNRS & INALCO, Paris
Anna Marie Diagne, IFAN UCAD, Dakar
El Hadji Dièye, FLSH UCAD, Dakar
Dame Ndao, FLSH UCAD, Dakar
________________________________________________________________________
Conferência Sénélangues 2015 - Línguas da África Ocidental
Call for Papers, Portuguese Version

First call for papers
Conferência Sénélangues 2015 - Línguas da África Ocidental
24-25 April 2015, Dakar, Senegal
Submission deadline: 15 November 2014
Web:
Contact: senelangues2015call at

The Sénélangues project, funded by the [French] National Research Agency, brought together for four years linguists from the LLACAN and DDL research units of the CNRS [French National Centre for Scientific Research], in partnership with the Cheikh Anta Diop University of Dakar, in an ambitious project to describe and document the languages of Senegal. Continuing this scientific collaboration, the members of Sénélangues have decided to organise a double event in April 2015, Sénélangues 2015, which will combine a conference on the description of West African languages with a short course devoted to the same theme. The conference Sénélangues 2015 - Línguas da África Ocidental will take place on 24 and 25 April at the Cheikh Anta Diop University of Dakar.
Theme of the conference
Thanks to the stimulus of several collaborative projects supported by various agencies and foundations, the description of African languages has been benefiting over recent decades from developments in good practices and in computing resources for linguistic, typological and documentary analysis. By opening the question of linguistic description to the whole of West Africa, this conference aims to let linguists working on the languages of the area meet to take stock of their scientific advances, share their knowledge, experience and doubts, and help increase the overall knowledge available on the languages of West Africa.

We expect contributions dealing with the vernacular languages of West Africa (creoles included), and we are also interested in the description of contact phenomena between these languages and languages of other families. All levels of linguistic analysis (phonology, morphology, syntax, semantics, enunciation and pragmatics) will be considered.

Plenary lectures
Denis Creissels, University of Lyon 2
Felix Ameka, University of Leiden

Submission procedure
According to the preference of the speakers or the decision of the selection committee, contributions will be presented orally (20 minutes plus 10 minutes of questions) or as a poster (recommended size A0, H: 1.20 m, W: 0.80 m) within a special session. In both cases, the instructions for sending proposals are as follows:
- the abstract must not exceed one page (title, examples and references included), in Times 12 (single line spacing)
- it is to be sent (anonymised version) in rtf and pdf formats to the following address: senelangues2015call at
- the name of the pdf file should simply consist of a few key words from the title of the paper
- subject of the message: "communication Senelangues 2015"
- state in the body of the message: your surname, first name, (university) affiliation, email address, title of the proposal, preferred format (poster vs. oral)
- the languages of the conference are French, English and Portuguese

Contact for abstract submission and information
senelangues2015call at

Calendar
Submission of abstracts: by 15 November 2014
Notification to authors: 15 January 2015

Conference venue
Cheikh Anta Diop University, Dakar, Senegal

Scientific Committee
Felix Ameka, University of Leiden
Larry Hyman, U.C. Berkeley
Valentin Vydrine, INALCO, LLACAN, Paris
Martine Vanhove, LLACAN, CNRS & INALCO, Paris
Koen Bostoen, University of Ghent
Jérémie Kouadio N'Guessan, University of Cocody

Organising Committee
Sylvie Voisin, DDL, CNRS & University of Aix-Marseille
Stéphane Robert, LLACAN, CNRS & INALCO, Paris
Alain-Christian Bassène, FLSH UCAD, Dakar
Denis Creissels, DDL, CNRS & Lyon 2
Thierno Cissé, FLSH UCAD, Dakar
Noël-Bernard Biagui, CLAD UCAD, Dakar
Nicolas Quint, LLACAN, CNRS & INALCO, Paris
Jeanne Zerner, LLACAN, CNRS & INALCO, Paris
Anna Marie Diagne, IFAN UCAD, Dakar
El Hadji Dieye, FLSH UCAD, Dakar
Dame Ndao, FLSH UCAD, Dakar

From hamon at LIMSI.FR Tue Jul 8 16:18:52 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:18:52 +0200
Subject: Call: EMNLP 2014 Workshop on Arabic Natural Language Processing

Date: Tue, 8 Jul 2014 01:11:45 -0700
From: Wajdi Zaghouani

=======================================================
Last Call for Papers and Participation
EMNLP Workshop on Arabic Natural Language Processing
Including Shared Task on Automatic Arabic Error Correction
Apologies for multiple postings
Please distribute to colleagues
=======================================================
Last Call for Papers and Participation
Arabic Natural Language Processing Workshop, co-located with EMNLP 2014, Doha, Qatar
Workshop date: Saturday October 25, 2014
Paper submission deadline: July 26, 2014
Workshop Website:
Shared Task Website:
=======================================================
WORKSHOP DESCRIPTION
There has been a lot of progress in the last 15 years in the area of Arabic Natural Language Processing (NLP). Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences. This workshop follows in the footsteps of previous efforts to provide a forum for researchers to share and discuss their ongoing work. We invite submissions on topics that include, but are not limited to, the following:
* Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, sentiment analysis, Arabic dialect modeling, etc.
* Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media, etc.
* Resources: dictionaries, annotated data, specialized databases, etc.

Submissions may include work in progress as well as finished work. Submissions must have a clear focus on specific issues pertaining to the Arabic language, whether it is standard Arabic, dialectal, or mixed. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work. Submissions are expected to be 8 pages long plus 2 pages for references.

Associated with the workshop will be a shared task on Arabic text error correction (see link to Shared Task Website above).

IMPORTANT DATES
Paper submission deadline: July 26, 2014
Author notification: August 26, 2014
Camera Ready: September 15, 2014
Workshop: October 25, 2014

ORGANIZERS

Program Co-chairs
Nizar Habash, Columbia University
Stephan Vogel, Qatar Computing Research Institute

Publication Co-chairs
Nadi Tomeh, Paris 13 University
Houda Bouamor, Carnegie Mellon University Qatar

Website Committee
Kareem Darwish, Qatar Computing Research Institute
Noura Farra, Columbia University

Shared Task Committee
Behrang Mohit (co-chair), Carnegie Mellon University Qatar
Alla Rozovskaya (co-chair), Columbia University
Wajdi Zaghouani, Carnegie Mellon University Qatar
Ossama Obeid, Carnegie Mellon University Qatar
Nizar Habash (advisor), Columbia University

Program Committee Members
Abdelmajid Ben-Hamadou, University of Sfax, Tunisia
Abdelhadi Soudi, Ecole Nationale de l'Industrie Minérale, Morocco
Abdelsalam Nwesri, University of Tripoli, Libya
Achraf Chalabi, Microsoft Research, Egypt
Ahmed Ali, Qatar Computing Research Institute, Qatar
Ahmed Rafea, The American University in Cairo, Egypt
Alexis Nasr, University of Marseille, France
Ali Farghaly, Monterey Peninsula College, USA
Almoataz B.
Al-Said, Cairo University, Egypt
Alon Lavie, Carnegie Mellon University, USA
Aly Fahmy, Cairo University, Egypt
Azadeh Shakery, University of Tehran, Iran
Azzeddine Mazroui, University Mohamed I, Morocco
Bassam Haddad, University of Petra, Jordan
Bayan Abu Shawar, Arab Open University, Jordan
Behrang Mohit, Carnegie Mellon University Qatar, Qatar
Eric Atwell, University of Leeds, UK
Farhad Oroumchian, University of Wollongong, Australia
Ghassan Mourad, Université Libanaise, Lebanon
Hassan Sawaf, eBay Inc., USA
Hazem Hajj, American University of Beirut, Lebanon
Hend Alkhalifa, King Saud University, Saudi Arabia
Houda Bouamor, Carnegie Mellon University Qatar, Qatar
Imed Zitouni, Microsoft Research, USA
Joseph Dichy, Université Lyon 2, France
Karim Bouzoubaa, Mohammad V University, Morocco
Karine Megerdoomian, The MITRE Corporation, USA
Katrin Kirchhoff, University of Washington, USA
Kemal Oflazer, Carnegie Mellon University Qatar, Qatar
Khaled Shaalan, The British University in Dubai, UAE
Khaled Shaban, Qatar University, Qatar
Khalil Sima'an, Universiteit van Amsterdam, Netherlands
Lamia Hadrich Belguith, University of Sfax, Tunisia
Michael Rosner, University of Malta, Malta
Mohamed Elmahdy, Qatar University, Qatar
Mohsen Rashwan, Cairo University, Egypt
Mona Diab, George Washington University, USA
Mustafa Jarrar, Bir Zeit University, Palestine
Nada Ghneim, Higher Institute for Applied Sciences and Technology, Syria
Nadi Tomeh, University Paris 13, France
Ossama Emam, IBM, USA
Otakar Smrž, Džám-e Džam Language Institute, Czech Republic
Owen Rambow, Columbia University, USA
Preslav Nakov, Qatar Computing Research Institute, Qatar
Ramzi Abbes, TECHLIMED, France
Salwa Hamada, Cairo University, Egypt
Shahram Khadivi, Tehran Polytechnic, Iran
Sherri Condon, The MITRE Corporation, USA
Taha Zerrouki, University of Bouira, Algeria
Violetta Cavalli-Sforza, Al Akhawayn University, Morocco

From hamon at LIMSI.FR Tue Jul 8 16:06:45 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:06:45 +0200
Subject: Appel: Colloque PhonoGenres and Speaking Styles, Geneve

Date: Mon, 07 Jul 2014 15:10:38 +0200

Dear colleagues,

On 10 and 11 September 2014 we are organizing, at the University of Geneva, the 3rd SWIP (Swiss Workshop on Prosody), entitled "PhonoGenres and Speaking Styles".

For more information, please refer to the address:

Best regards,
The organizing committee

From hamon at LIMSI.FR Tue Jul 8 15:45:55 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:45:55 +0200
Subject: Soft: Glozz 2.0 (plate-forme d'annotation)

Date: Fri, 4 Jul 2014 15:42:14 +0200
From: Yann Mathet
Message-Id: <61B3F419-7CFF-45EB-814D-DA6A3B04BA53 at>

Dear colleagues,

We are pleased to announce the online release of version 2.0.1-beta4 of the Glozz annotation platform. The main new features are listed below. We would particularly highlight its forthcoming move to open source and its new architecture, which allows plug-ins to be added (and developed by the user community).

1) Forthcoming move to open source (licence currently being drawn up).

2) Plug-in mechanism allowing extensions to the software to be developed without modifying its core. Anyone can thus develop their own plug-ins, and possibly distribute them, while limiting the risk of incompatibility. We encourage interested developers to favour creating plug-ins over modifying the core directly (once this becomes possible). We will gladly answer requests for details on these aspects.

3) New "Concordancer" plug-in offering a concordancer-style view. This condensed view makes it possible to quickly consult annotated units in their context, which consists of text but may also include other annotated objects. A dedicated manual is available on

4) Generalization of the "Basket" principle to the whole application and to plug-ins. It makes it possible to store specifically chosen annotations (selected by hand, returned by a GlozzQL query, or picked via the concordancer) in order to examine them more closely, to save them to a separate file, or to assign them an attribute-value pair automatically. See the dedicated manual on

5) Configurable keyboard shortcuts (Options/Préférences/Shortcuts) for quickly annotating a selected portion of text with a unit whose type and attribute values are specified for a given shortcut. For example, once a text segment has been selected with the mouse, the "F1" key can create a unit of type "Nom" with the feature "genre" set to "masculin" and the feature "nombre" set to "singulier", while "F2" can be used for a feminine singular noun. These shortcuts can be saved and passed on to other users, so that a campaign manager can deliver a ready-to-use configuration to all annotators.

6) New CSV export and import, for use in a spreadsheet and as a possible bridge to other applications, as input to or output from Glozz.

7) "Same Position" constraint added to GlozzQL. It applies to Units and indicates whether two units are located at exactly the same place in the text. This can be useful in particular when there are several independent annotation layers and one wants to see whether units overlap exactly across these layers.

8) Reworked graphical interface. In particular, tabs are now used for the various tools that are rarely used simultaneously.

9) SQL export fixed and improved.

You can download an archive in TGZ format at the address:

The archive contains in particular:
- the application JAR (glozz-platform.jar);
- the concordancer plug-in (plugins/concordancer/glozz-concordancer-plugin.jar);
- various test files (data/ directory);
- a changelog (CHANGELOG.utf8) listing the main changes in this version and in previous ones;
- a launch script for Windows users who have difficulty running the JARs directly (StartGlozz.bat);
- the licence (licence.pdf).

We remind you that Glozz has manuals, available on, which present most of the available features in a detailed and illustrated way.

Do not hesitate to send us your comments and suggestions concerning the platform or its manual.
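To illustrate what the "Same Position" constraint in item 7 is described as checking, here is a minimal Python sketch. It is not GlozzQL and does not use any Glozz API: the `Unit` record, the layer names and the character offsets are all hypothetical, chosen only to show the idea of finding units from different annotation layers that cover exactly the same span of text.

```python
from collections import defaultdict
from typing import NamedTuple


class Unit(NamedTuple):
    """Hypothetical annotated unit (not the Glozz data model)."""
    layer: str  # name of the annotation layer the unit belongs to
    type: str   # unit type, e.g. "Nom"
    start: int  # character offset where the unit begins
    end: int    # character offset where the unit ends


def same_position_groups(units):
    """Return spans covered by units from at least two different layers."""
    by_span = defaultdict(list)
    for unit in units:
        by_span[(unit.start, unit.end)].append(unit)
    # Keep only spans where more than one layer contributes a unit.
    return {
        span: group
        for span, group in by_span.items()
        if len({u.layer for u in group}) > 1
    }


# Made-up example data: two layers annotating the same text.
units = [
    Unit("syntax", "Nom", 0, 5),
    Unit("semantics", "Agent", 0, 5),
    Unit("syntax", "Verbe", 6, 11),
    Unit("semantics", "Event", 6, 12),  # ends one character later: no match
]
overlaps = same_position_groups(units)
```

In this sketch only the span (0, 5) is reported, because the two units starting at offset 6 do not end at the same offset. Requiring identical start and end offsets is one reading of "exactly the same place"; the actual GlozzQL semantics should be checked in the Glozz manual.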
Best regards,
Yann Mathet and Antoine Widlöcher

From hamon at LIMSI.FR Tue Jul 8 16:00:54 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:00:54 +0200
Subject: Sujet de these: 3 PhD Positions in Speech Processing, LIG/Grenoble (France)

Date: Mon, 7 Jul 2014 09:23:24 +0200
From: Jean-Claude MARTIN

3 PhD Positions in Speech Processing at LIG/Grenoble (France)

The Study Group for Machine Translation and Automated Processing of Languages and Speech (GETALP) of LIG (Laboratory of Informatics of Grenoble) offers 3 PhD positions in speech processing. We are looking for outstanding young research scientists to join the group on several projects involving speech processing.

Open Positions

1. PhD / Automatic speech recognition and machine assisted speech annotation for African Languages

You will work in the context of the ALFFA project, which is genuinely interdisciplinary: it brings together not only technology experts (LIG, LIA, VOXYGEN) but also fieldwork linguists and phoneticians (DDL). The PhD will focus on analysing the capabilities of existing automatic speech processing systems to investigate phonetic characteristics of languages or to annotate speech (especially on mobile devices: tablets, glasses, etc.), in order to provide an innovative digital assistant to the fieldwork linguist.
Start: Fall 2014
Duration: 36 months
Particular aspect: co-supervision with DDL lab in Lyon
Contact: Laurent.Besacier at & Francois.Pellegrino at
Project Web Site:
Team/Lab Web Sites:

2. PhD / Speech interaction for socio-affective ubiquitous agents and robots in ambient assisted living environments

You will work on a research and development project (CASSIE) involving academic and industrial stakeholders in spoken dialog, assistive technologies, affective sciences and social robotics. The PhD objective is to design a spoken dialogue system that will interact with a user in her/his home through a ubiquitous (physical and/or virtual) and personalized agent. This dialogue system will be corpus-based, using an iterative machine learning approach hybridized with bootstrapped expert knowledge (observed from "intelligent" annotations) drawn from spontaneous and ecological data collected in a real or quasi-real environment (Smart Home) and situation (real scenario). The system will focus on the socio-affective dimensions of the interaction (socio-affective prosody, paralinguistic events, imitation, synchrony, etc.), especially the dynamics (timing) of the dialog. One aspect of this PhD will also focus on comparing the same character implemented as a robot versus a virtual agent for interaction (empathy aspects, etc.).

Start: Fall 2014
Duration: 36 months
Contact: Veronique.Auberge at & Benjamin.Lecouteux at (+ Laurent.Besacier at)

3. PhD / Context-aware spoken dialogue in ambient assisted living environments

You will work on a research and development project (CASSIE) involving academic and industrial stakeholders in spoken dialog, assistive technologies and social robotics. The PhD objective is to make a social cyber-physical agent "aware" of its environment through sensors and/or connected objects. This contextual information will drive the system interaction (natural language understanding and dialog).
The heart of the research will be to build probabilistic and logical models for multimodal situation analysis and understanding in a domestic and multilingual context. For the experimental development and validation, the research will benefit from the fully-equipped LIG smart home (DOMUS).

Start: Fall 2014
Duration: 36 months (PhD)
Contact: Francois.Portet at & Michel.Vacher at

Profiles

Applicants must hold a Master's degree in Computational Linguistics, Computing Sciences or Cognitive Sciences, preferably with experience in speech processing and/or natural language processing and/or machine learning. A good background in programming is also required. The successful candidate will also be involved in testing the technology with human participants who are either French or English speakers. For this reason a good level of English is required, as well as a good command of French. Finally, effective communication skills in English, both written and verbal, are mandatory.

Location

Grenoble is a high-tech city with 4 universities. It is located at the heart of the Alps, in outstanding scientific and natural surroundings. It is 3h by train from Paris, 2h from Geneva, 1h from Lyon and 2h from Torino, and is less than 1h from Lyon international airport.

Research Group Website:

Dates

Interviews will be held in July 2014 (until September 2014 if needed). Meetings during Interspeech 2014 in Singapore can also be organized.

From hamon at LIMSI.FR Tue Jul 8 15:48:08 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:48:08 +0200
Subject: Appel: Ecole Thematique, Langues d'Afrique de l'Ouest

Date: Fri, 04 Jul 2014 16:17:42 +0200
From: Stephane Robert

Thematic school "Description of the languages of West Africa"

The thematic school "Description of the languages of West Africa" will be held at Université Cheikh Anta Diop in Dakar (Senegal) from 20 April to 1 May 2015. It is organized by the CNRS laboratories LLACAN and DDL in partnership with Université Cheikh Anta Diop in Dakar, as a continuation of the ANR project Sénélangues.

The aims of the school are to build on the results of the Sénélangues project in order to pass on the latest theoretical, methodological and technological advances in the description of languages with an oral tradition, and to provide training focused mainly on languages spoken in West Africa (Atlantic languages, Mande languages, creoles, and also African French). The working perspective will be primarily descriptive and typological. This two-week course, which is intended to complement existing Master's and doctoral programmes, should give trainees an overview of the scientific issues and existing analytical frameworks, of the various tasks to be undertaken, and of the methods and tools available when setting out to describe a language spoken in West Africa. It should also give them a first introduction to fieldwork practice.

The thematic school will run over two periods of four days each (week 1: 20-23 April 2015; week 2: 28 April-1 May 2015), between which an international conference on the description of the languages of West Africa will take place (24-25 April 2015).

Course content

The training represents a total of 53 hours of teaching. All courses are compulsory. They will be given in French, most of them as lectures, complemented by several tutorial sessions (organized in sub-groups) to allow practice, under fieldwork conditions, in morphosyntactic analysis, in the perception and transcription of tones, and in the use of processing software. The training is structured around 3 axes corresponding to (1) basic knowledge of general linguistics and the structural particularities of African languages, (2) the specifics of linguistic fieldwork practice and (3) the tools, techniques and methods for exploiting field data. Particular emphasis will be placed on the languages of the Atlantic family, but specialists in Mande languages, Portuguese-based creoles and African French will complete the training.

List of courses:

Axis 1. Fundamentals
Semantics (2 sessions of 1h30)
Typology (1 session of 2h)
Morphosyntax (2 sessions of 1h30)
Tonology (2 sessions of 1h30)
Phonology (2 sessions of 1h30)
Sociolinguistics (1 session of 1h30)

Axis 1. Atlantic languages
Noun classes in Atlantic languages (1 session of 2h)
Atlantic languages: state of knowledge and reconstruction (1 session of 2h)
Verbal inflection in Atlantic languages (1 session of 2h)
Verbal extension and valency in Atlantic languages (1 session of 2h)

Axis 1. Region-specific courses
African French (1 session of 2h)
Creoles (1 session of 1h30)
Description and endangered languages in West Africa (1 session of 2h)
Mande languages (2 sessions of 1h30)

Axis 2. Fieldwork
Recording techniques (1 session of 1h30)
Fieldwork practice and surveys (1 session of 1h30 for 2 sub-groups)
Ethnolinguistics (1 session of 1h30)
The researcher in the field (1 session of 1h30)

Axis 3. Data exploitation
ELAN (software) (2 sessions of 1h30)
Metadata (ArBIL) (1 session of 1h30)
How to write a grammar (1 session of 1h30)
Lexicography (2 sessions of 1h30)

A certificate of attendance (including the list of courses taken and the equivalent number of credits) will be issued to all participants so that the training can be validated, as an internship or otherwise depending on the university concerned.

List of teachers (to be completed)
F. Ameka (Pr., Leiden University)
C. Chanard (IE, LLACAN)
D. Creissels (Pr. emeritus, Université Lyon 2)
A. M. Diagne (CR equivalent, IFAN, Dakar)
J. Kouadio (MCF, Université Cocody, Abidjan)
M. Mous (Pr., Leiden)
P. A. Ndao (Pr., UCAD, Dakar)
K. Pozdniakov (IUF - Pr., INALCO)
N. Quint (DR, LLACAN)
S. Robert (DR, LLACAN)
P. Roulon-Doko (DR, LLACAN)
S. Voisin (MCF, Aix Marseille Université)
V. Vydrine (Pr., INALCO)

Target audience and eligibility criteria

The thematic school can accommodate 70 trainees. It is open to anyone wishing to acquire knowledge of the languages of West Africa, with priority given to Master 1 and 2 students, doctoral students, post-doctoral researchers, and early-career researchers and lecturers in language sciences who wish to carry out descriptive work on a language spoken in West Africa.

Minimum level of study required: Licence (bachelor's degree) in language sciences, or an equivalent level in linguistics.

How to apply: by 1 October 2014 at the latest, fill in the online application form on the site:

Notification of acceptance will reach applicants on 1 December. Registration arrangements will be specified to them at that time.
For fees and accommodation: see the website

Important dates
- School: from 20 April to 1 May 2015
- Application deadline: 1 October 2014
- Notification of acceptance: 1 December 2014

Venue
Université Cheikh Anta Diop, Dakar, Senegal

Website and contact
Website:
Contact: senelangues2015et at

From hamon at LIMSI.FR Tue Jul 8 16:03:31 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:03:31 +0200
Subject: Appel: IES'14 special session on Intelligent and Evolutionary Systems for Natural Language Processing

Date: Mon, 7 Jul 2014 04:45:00 -0500 (EST)
From: SenticNet

Apologies for cross-posting,

Submissions are invited for a special session on "Intelligent and Evolutionary Systems for Natural Language Processing" of the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES'14), to be held from 10th to 12th November 2014 in Singapore.

RATIONALE

As the Web rapidly evolves, Web users and Web contents are evolving with it. In an era of social connectedness, people are becoming increasingly enthusiastic about interacting, sharing, and collaborating through social networks, online communities, blogs, Wikis, and other online collaborative media. In recent years, this collective intelligence has spread to many different areas, with particular focus on fields related to everyday life such as commerce, tourism, education, and health, causing the size of the Web to expand exponentially.
The distillation of knowledge from such a large amount of unstructured information, however, is an extremely difficult task, as the contents of today's Web are perfectly suitable for human consumption but remain hardly accessible to machines. To this end, biologically and linguistically motivated computational paradigms that go beyond syntax are needed. Intelligent and evolutionary systems could play an important role in natural language processing (NLP) research for tasks such as grammatical evolution, knowledge discovery, and rule learning.

In this light, this Special Session focuses on the introduction, presentation, and discussion of novel NLP systems that are not merely based on domain-dependent corpora or word co-occurrence counts, but that can be considered intelligent and evolutionary. The main motivation for the Special Session, in particular, is to go beyond a mere word-level analysis of text and provide novel concept-level approaches to natural language processing that allow a more efficient passage from (unstructured) textual information to (structured) machine-processable data, in potentially any domain. Articles are thus invited in areas such as AI, Semantic Web, knowledge-based systems, machine learning, and computational intelligence for NLP research.
Topics include, but are not limited to:
- Intelligent and evolutionary systems for information extraction and retrieval
- Intelligent and evolutionary systems for text summarization and visualization
- Intelligent and evolutionary systems for topic modeling
- Intelligent and evolutionary systems for sentiment analysis
- Intelligent and evolutionary systems for knowledge acquisition
- Intelligent and evolutionary systems for social network analysis
- Intelligent and evolutionary systems for adaptive and transfer learning
- Intelligent and evolutionary systems for agents and complex systems
- Intelligent and evolutionary systems for evolutionary game theory
- Intelligent and evolutionary systems for bioinformatics

The Special Session also welcomes papers on specific application domains of natural language processing, e.g., social data mining, influence networks, customer experience management, computer-mediated human-human communication, social media marketing, multimedia management, personalization and persuasion, enterprise feedback management, human-agent, -computer and -robot interaction, intelligent user interfaces, patient opinion mining, surveillance, and art.
The authors will be required to follow the Author's Guide for manuscript submission to the 18th Asia Pacific Symposium on Intelligent and Evolutionary Systems ( TIMEFRAME Submission Deadline: August 1st, 2014 Notification of Acceptance: September 1st, 2014 Final Manuscripts Due: October 1st, 2014 Session dates: November 10-12th, 2014 ORGANIZATION Erik Cambria, Nanyang Technological University, Singapore Amir Hussain, University of Stirling, UK Yunqing Xia, Tsinghua University, China From hamon at LIMSI.FR Tue Jul 8 16:23:00 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 8 Jul 2014 18:23:00 +0200 Subject: Appel: JDMDH, Journal of Data Mining and Digital Humanities Message-ID: Date: 8 Jul 2014 16:47:59 +0200 From: "Nicolas Turenne" Message-ID: <14716720fd3.5cd5.1b78e at> X-url: X-url: X-url: Dear Colleague, The Journal of Data Mining and Digital Humanities ( is an OA journal dedicated to a range of research studies between the fields of data mining and digital humanities. It is hosted as an overlay journal on the Epiciences platform ( Submissions are peer-reviewed and the journal is free of charge for both authors and readers. Accepted publications are immediately published on the JDMDH website. You or your colleagues may be interested in submitting an original manuscript to the journal. You can find all instructions to authors on the website and create an account to submit. See: . Looking forward to see you contribute to this full open access endeavor. The JDMDH editorial board From hamon at LIMSI.FR Tue Jul 8 16:33:37 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 8 Jul 2014 18:33:37 +0200 Subject: Ressources: Corpus disponibles annotes avec FRMG Message-ID: Date: Tue, 08 Jul 2014 18:14:00 +0200 From: Eric De la clergerie Message-ID: <53BC18C8.8060006 at> X-url: X-url: En relation avec le wiki linguistique FrmgWiki [] d?velopp? par l'?quipe ALPAGE (INRIA & Universit? Paris Diderot), nous avons le plaisir de mettre ? disposition de la communaut? 
plusieurs corpus annot?s avec l'analyseur syntaxique FRMG. Il s'agit de: - Wikipedia Fr (178 millions de mots) - Wikisource Fr ( 64 millions de mots) - EuroParlement Fr ( 41 millions de mots) Les corpus annot?s sont librement disponibles, modulo les licences s'appliquant sur les corpus originels. FrmgWiki offre ?galement la possibilit? de lancer des requ?tes (en langage DPath) sur les corpus ainsi annot?s. Ce service est cependant encore exp?rimental. Nous rappelons que FrmgWiki offre ?galement un service de traitement de petits corpus (jusqu'? 1 million de mots) pour les membres enregistr?s. Contacts: Paul Bui-Quang (paul.bui-quang at Eric de la Clergerie (eric.de_la_clergerie at ------------------------------------------------------------------------- Message diffuse par la liste Langage Naturel Informations, abonnement : English version : Archives : La liste LN est parrainee par l'ATALA (Association pour le Traitement Automatique des Langues) Information et adhesion : ATALA d?cline toute responsabilit? concernant le contenu des messages diffus?s sur la liste LN ------------------------------------------------------------------------- From hamon at LIMSI.FR Tue Jul 8 16:20:01 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Tue, 8 Jul 2014 18:20:01 +0200 Subject: Info: ZOMBILINGO, Manger des tetes pour annoter en syntaxe de dependances ! Message-ID: Date: Tue, 8 Jul 2014 15:45:15 +0200 (CEST) From: Karen Fort Message-ID: <415602684.10106950.1404827115827.JavaMail.zimbra at> X-url: X-url: Jouez (et faites jouer) ? Zombilingo : Zombilingo est un jeu ayant un but (ou Game With A Purpose) permettant d?annoter des corpus en syntaxe de d?pendances. Les annotations cr??es sont librement disponibles sur le site du jeu. La production de ressources linguistiques de grande taille est tr?s co?teuse, en particulier en main d??uvre. Ainsi, le co?t d?annotation du Prague Dependency Treebank a ?t? estim? ? 600 000 dollars (B?hmov? et al., 2001). 
Une alternative pour produire des ressources est l?utilisation de la myriadisation (crowdsourcing), c?est-?-dire le recours ? la "foule" pour r?aliser une t?che. Les jeux ayant un but, par exemple, ont ?t? utilis?s pour diff?rentes t?ches en TAL : JeuxDeMots (Lafourcade, 2007) a pour but de cr?er un r?seau lexical ; Phrase Detectives (Chamberlain et al., 2008) fait annoter un corpus en anaphores. Ces deux jeux ont eu un succ?s consid?rable et ont permis de cr?er des ressources de qualit? raisonnable pour un co?t r?duit. Le premier fait appel au sens commun et le deuxi?me ? des connaissances scolaires. Dans d?autres domaines, il a ?t? possible d?utiliser un jeu pour des t?ches nettement plus complexes et qui n?cessitent une formation des personnes qui participent. Ainsi, dans FoldIt (Cooper et al., 2010) les joueurs doivent manipuler des repr?sentations 3D de prot?ines pour ?tudier la fa?on dont elle peuvent interagir. Zombilingo est inspir? de ces succ?s et a pour but de faire r?aliser ? des joueurs une t?che de TAL r?put?e complexe : annoter des d?pendances syntaxiques. R?f. : Kar?n Fort, Bruno Guillaume et Valentin Stern. Zombilingo : manger des t?tes pour annoter en syntaxe de d?pendances. Actes de Traitement Automatique des Langues Naturelles (TALN), Marseille, France, juillet 2014 - D?monstration. Kar?n Fort, Bruno Guillaume and Hadrien Chastant. Creating Zombilingo, a Game With A Purpose for dependency syntax annotation. Proceedings of the Gamification for Information Retrieval (GamifIR'14) Workshop, Amsterdam, Pays-Bas, avril 2014. ( Kar?n Fort ATER ENSMN D?p. 
Information et Systèmes - Responsable Projets
Loria, équipe Sémagramme
Bureau C303
+33 (0)3 54 95 86 54

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:02:30 +0200
Subject: Call: 2nd call for abstracts, Translating and the Computer

Date: Mon, 7 Jul 2014 07:47:17 +0000
From: "Stajner, Sanja"

2nd call for Abstracts
________________________________
*(Apologies if you receive multiple copies. Please distribute this call among potentially interested colleagues.)*

Translating and the Computer - 36, London, 27/28 November 2014

Translating and the Computer attracts a unique amalgam of researchers, developers and users. It brings together academics involved in language technology research and in teaching translation and terminology with those who develop and market tools for language transformation, and both of these groups with users: translators, terminologists, interpreters, and voice-over specialists, whether freelancers or working in translation departments of large organisations such as those of the European Parliament, European courts and the European Patent Office, the United Nations family, international companies and other organisations, and Language Services Providers (LSPs), large and small.

First the Computer, then the Internet and more recently the Cloud are changing and remodelling expectations and processes in the Language and Localization industries.
These changes are accompanied by new requirements for standards and interoperability. The digital age is modifying the concept of text and quality. Content is a key item together with strings, chunks, segments and words.

In its 36th session Translating and the Computer has moved from ASLIB to ASLING. The conference often referred to as the "ASLIB Conference" is now the ASLING Translating and the Computer Conference. ASLING is working hard to ensure that this conference remains a key date in your calendar to help you keep in touch. To do this we need the support and contributions of all those who are interested in sharing their knowledge and ideas on the latest developments in an extremely stimulating sector, and equally interested in hearing other contributions. If you, or one of your colleagues, have something important to announce or discuss, we urge you to consider presenting a paper at this conference.

Abstracts must be submitted using the START system at the following address: . The deadline for submissions is extended to July 14th; authors will be notified of acceptance by August 7th. For further details of the Call for Abstracts, please see Call for Abstracts at: . For any other information write to us at: info at .

Chairs
* Juliet Macan, Arancho Doc srl. (Lead Chair 2014)
* João Esteves-Ferreira, Tradulex, International Association for Quality Translation
* Ruslan Mitkov, University of Wolverhampton
* Olaf-Michael Stefanov, United Nations (ret), JIAMCATT

Programme Committee
* David Chambers, World Intellectual Property Organisation (ret)
* Gloria Corpas Pastor, University of Malaga
* Alain Désilets, National Research Council of Canada (NRC)
* David Filip, LRC, CNGL, LT-Web, University of Limerick
* Pamela Mayorcas, Institute of Translating and Interpreting
* Paola Valli, University of Trieste

Conference Manager:
* Nicole Adamides

We look forward to welcoming you to London and a new start.
Association internationale pour la promotion des technologies linguistiques
International Association for Advancement in Language Technology
Bologna, Genève, London, Wien, Wolverhampton

ASLING is a new international not-for-profit association, set up by the conference chairs to renew the organisation and opportunities offered by the Translating and the Computer Conference series in London. Its main objectives are: "to promote the use of information technology in the fields of language, translation, terminology and related fields" and "provide the general public with a better understanding of the contribution of technology in the fields of language, translation, terminology and related fields".

Our new website: (coming soon)

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 18:10:32 +0200
Subject: Job: Postdoc position, Event prediction for dialogue modelling

Date: Mon, 07 Jul 2014 16:25:09 +0200
From: François Rioult

Postdoc position available at GREYC - University of Caen Basse-Normandie - France

Event prediction for dialogue modelling in embodied conversational agent

Location: Caen (France)
Duration: 15 months (potentially 18 months)
Keywords: sequence mining, knowledge discovery, event prediction, dialogue modelling

Description of the task:
Designing a model of dialogue is a difficult and often multidisciplinary task.
Whether it is dedicated to interactive storytelling or not, it involves many algorithms: multi-modal input recognition (utterances, gestures, gazes, vocal inflections), natural language understanding and generation, dialogue management, planning and cognitive capacities, emotion modelling, prosodic speech generation, non-verbal behavior. In this context, the post-doc fellow's research will focus on designing algorithms for modelling the interactions (turns of speech) during the conversation, starting from real data gathered during dialogues between an adult (or Wizard of Oz) and a child, annotated by psychologists. This task can be viewed as predicting a particular event in a sequence of itemsets. The model of the dialogue will be implemented in an embodied conversational agent.

Application:
The candidate must prepare a detailed CV including a complete bibliography, a motivation letter and recommendation letters as a single PDF file. This file should be sent by email to the contact below. French speakers may apply in French.

Contact:
François Rioult
CNRS UMR6072 GREYC
Université de Caen Basse-Normandie
francois.rioult at

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:57:38 +0200
Subject: Call: 3emes journees Unitex/Gramlab, 3rd UNITEX/GramLab Workshop

Date: Fri, 4 Jul 2014 22:05:20 +0200 (CEST)
From: Denis Maurel

English below.
3èmes journées Unitex/Gramlab
9-10 octobre 2014, Université François Rabelais Tours

Unitex est une plate-forme open-source d'analyse de texte largement utilisée en recherche, dans l'enseignement et l'industrie. Le système GramLab/Unitex offre en plus aux équipes le travail collaboratif (partage de ressources, suivi de versions, etc.). Ces outils reposent sur des technologies de type « automates à états finis » et incorporent des ressources linguistiques à large couverture, disponibles dans de nombreuses langues.

Les journées Unitex/GramLab sont un forum qui a pour ambition de rassembler la communauté des chercheurs et industriels utilisant Unitex et GramLab pour leurs travaux (tous domaines confondus) ou participant au développement de cette plate-forme et de ses ressources. L'objectif principal de cet événement est de favoriser le partage d'expérience et les contacts entre utilisateurs et développeurs. Le programme présentera un éventail des recherches en cours et offrira une série de formations pratiques de différents niveaux aux chercheurs qui souhaitent utiliser ce système ou découvrir de nouvelles fonctionnalités.

Appel à communications, formations et démonstrations

1) Nous invitons les utilisateurs d'Unitex à soumettre des propositions de communication orale dans les domaines suivants (non exclusifs) :
* Les recherches actuelles utilisant Unitex (dans tous les domaines).
* Le développement de ressources à destination d'Unitex.
* Le développement d'extensions, nouvelles fonctionnalités, etc.
* Les développements industriels reposant sur Unitex.
* Les initiatives pédagogiques utilisant Unitex (Unitex dans l'enseignement du TAL ou des ressources linguistiques, etc.).
* L'intégration d'autres outils à Unitex.

2) Nous invitons les experts ayant une maîtrise particulière du système à proposer des tutoriels dans les domaines suivants ou d'autres :
* Utilisation avancée des graphes.
* Cascades de transducteurs avec CaSys.
* Étiquetage en parties du discours, en parenthésages minimaux...
* Développement de scripts Unitex / initiation à la programmation Unitex.
* Programmation : persistance et optimisation.
* Intégration Unitex-UIMA dans une chaîne de traitement.
* autres.

3) Nous faisons également appel aux auteurs d'applications intégrant des fonctionnalités d'Unitex et leur proposons de faire la démonstration de leurs outils.

Procédure de soumission
Envoyer un résumé d'une page maximum (3000 caractères) à : j.unitex2014 at
Ce document reprendra les informations suivantes :
- Type de communication :
  * communication orale de 20 minutes
  * formation (tutoriel) de 2 heures
  * démonstration (10 minutes en séance plénière + rencontres)
- Nom, prénom, affiliation, adresse de courrier électronique
- Titre de la présentation
- Résumé de la présentation
- Pour les tutoriels : défendre l'intérêt de cette formation et identifier le public visé.

Programme provisoire
Mercredi 8 octobre
  Après-midi : Accueil et visite de la région
Jeudi 9 octobre
  Matin : Présentations
  Après-midi : Tutoriels, Ateliers et Réunion développeurs (en parallèle)
Vendredi 10 octobre
  Matin : Présentations
  Après-midi : Réunion plénière (questions aux développeurs, nouveautés, orientations, etc.)

Publication et communication
Les communications pourront être diffusées sous forme électronique sur le site de la conférence, à la demande des auteurs.

Calendrier
Appel à communication : début juillet
Soumission : avant le 5 septembre 2014
Approbation/évaluation : 10-15 septembre

Inscriptions :
- Avant le 30 septembre : 25 euros/jour
- Après le 30 septembre : 35 euros/jour
- Inscription gratuite pour les formateurs.
- Possibilité de bourse pour les étudiants ne disposant pas de financement (soumettre une lettre de demande)
- Les frais d'inscription couvrent les pauses café, deux repas en restaurant d'entreprise et les documents pédagogiques distribués durant les formations.

Comité de lecture
Antonio Balvet, STL, Université
de Lille 3
Anne Dister, Université Saint-Louis - Bruxelles
Cédrick Fairon, CENTAL, Université catholique de Louvain
Nathalie Friburger, LI, Université François Rabelais Tours
Cvetana Krstev, Université de Belgrade
Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée
Denis Maurel, LI, Université François Rabelais Tours
Agata Savary, LI, Université François Rabelais Tours
Duško Vitas, Université de Belgrade
Gilles Vollant, Ergonotics

Comité d'organisation
Nathalie Friburger, Université François Rabelais Tours
Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée
Claude Martineau, LIGM, Université Paris-Est Marne-la-Vallée
Denis Maurel, Université François Rabelais Tours
Agata Savary, LI, Université François Rabelais Tours

Comité de pilotage
Cédrick Fairon, CENTAL, Université catholique de Louvain
Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée
Éric Laporte, IGM, Université de Marne-la-Vallée
Denis Maurel, Université François Rabelais Tours
Gaelle Recource, Kwaga
Duško Vitas, Université de Belgrade

------------------------------------------------------------------------

3rd UNITEX/GramLab Workshop
October 9-10, 2014, Université François Rabelais Tours

Unitex is an open-source text analysis software widely used in research, teaching and industry. It relies on Finite State technologies and incorporates large-scale language resources for several languages. GramLab is an Integrated Development Environment (IDE), based on Unitex software components but specifically designed for industrial purposes. It integrates tools that help collaborative development, versioning, etc.

The Unitex/GramLab Workshop aims at gathering academic and industrial users, developers and researchers contributing to the development of this open-source software and its resources or using them for developing new research or industrial applications.
It is also a unique opportunity for new users to learn the basic and advanced Unitex techniques and discuss their projects with experts during hands-on sessions. The program will offer an overview of the current trends: the development of real-life industrial applications based on Unitex, the use of Unitex in various research projects, the improvement of Unitex through the development of new features and the extension of the language resources. Training sessions of various levels will also be offered.

Call for communications, training sessions and demonstrations

1) We invite submissions for communication in the following areas (not exclusive):
- Current research projects exploiting Unitex (all domains)
- Development of language resources for Unitex
- Development of new features or functionalities, optimization of existing programs, etc.
- Industrial developments based on Unitex
- Pedagogical initiatives using Unitex (for teaching NLP, Corpus linguistics, etc.)
- Integration of other tools into Unitex

2) We invite Unitex experts to offer tutorials in the following domains (or others):
* Advanced use of graphs
* Cascading Transducers with CaSys
* POS tagging
* Unitex Scripting (Introduction to Unitex programming)
* Persistence and optimization for Unitex-based applications
* Building on Unitex-UIMA
* other

3) We also call for developers to demonstrate Unitex-based applications.
Submission procedure
Send a 1 page abstract (max 3000 characters) to: j.unitex2014 at
This page will contain the following information:
- Type of communication:
  * Oral communication (20 minutes)
  * Tutorial (2 hours)
  * Demonstration (10 minutes in plenary session + poster/demo session)
- Name, First name, Institution, e-mail
- Title
- Abstract
- Oral presentation language (French or English)
- For tutorials: explain the rationale for this training and identify the target audience

Draft program
Wednesday 8 October
  Afternoon: possibility to visit the region of Tours
Thursday 9 October
  Morning: Presentations
  Afternoon: Tutorials, Workshop and developers session (parallel sessions)
Friday 10 October
  Morning: Presentations
  Afternoon: Plenary session (questions to developers, innovations, new orientations, etc.)

Proceedings and Publication
The Workshop proceedings can be published electronically on the website of the workshop, if the authors wish.

Call for papers: Beginning of July
Submission deadline: September 5, 2014
Author notification: September 10-15, 2014

Registration
- Until September 30, 2014: 25 euros/day
- After September 30, 2014: 35 euros/day
- Free registration for experts in charge of tutorials
- A few grants are available for students who do not have access to funding (registration fees waived): submit an application letter.
- Registration fees cover: coffee breaks, lunches and handouts for the tutorials.

Scientific Committee
Antonio Balvet, STL, Université de Lille 3
Anne Dister, Université Saint-Louis - Bruxelles
Cédrick Fairon, CENTAL, Université catholique de Louvain
Nathalie Friburger, LI, Université François Rabelais Tours
Cvetana Krstev, Université de Belgrade
Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée
Denis Maurel, LI, Université François Rabelais Tours
Agata Savary, LI, Université François Rabelais Tours
Duško Vitas, Université de Belgrade
Gilles Vollant, Ergonotics

Organizing Committee
Nathalie Friburger, Université
François Rabelais Tours
Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée
Claude Martineau, LIGM, Université Paris-Est Marne-la-Vallée
Denis Maurel, Université François Rabelais Tours
Agata Savary, LI, Université François Rabelais Tours

Steering Committee
Cédrick Fairon, CENTAL, Université catholique de Louvain
Tita Kyriacopoulou, IGM, Université de Marne-la-Vallée
Éric Laporte, IGM, Université de Marne-la-Vallée
Denis Maurel, Université François Rabelais Tours
Gaelle Recource, Kwaga
Duško Vitas, Université de Belgrade

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 8 Jul 2014 17:44:35 +0200
Subject: Call: TRELA 2015

Date: Fri, 04 Jul 2014 09:06:57 +0200
From: Kubler Natalie

Cher-e-s Collègues, Dear Colleagues,

Veuillez trouver ci-dessous l'appel à communication du colloque TRELA 2015: Terrains de Recherche en Linguistique Appliquée (Université Paris Diderot du 8 au 10 juillet 2015).
Please find below the CFP for the TRELA 2015 Conference: Areas of Research in Applied Linguistics (Paris Diderot University, Paris, France, 8-10 July 2015):

Aujourd'hui, au vingt et unième siècle, la linguistique appliquée est une discipline riche qui se décline en de multiples domaines, par ailleurs liés aux traditions scientifiques des divers pays dans lesquels elle s'est développée: acquisition / apprentissage, bi- / plurilinguisme, didactique des langues, lexicographie, linguistique de corpus, terminologie, traductologie, traitement automatique des langues, variation linguistique. Pourtant, les terrains de recherche de ces disciplines ou domaines sont souvent communs. On le voit dans ce que recouvrent les dénominations dans d'autres langues, comme /applied linguistics/ en anglais, /angewandte Linguistik/ en allemand ou /Lingüística aplicada/ en espagnol.

Le vingt et unième siècle est celui de la transdisciplinarité et de la pluridisciplinarité, comme le montre l'émergence de nombreux domaines pluri- ou transdisciplinaires, combinant sciences humaines et sciences exactes par exemple. La linguistique appliquée, par les terrains de recherche qu'elle parcourt, accompagne le développement incontournable de cette pluridisciplinarité. Ce faisant, elle démontre qu'elle ne consiste pas en la seule application de connaissances théoriques mais qu'elle permet l'émergence de nouveaux champs d'investigation qui viennent alimenter la réflexion en sciences du langage.

L'objectif du colloque international /TRELA/, qui s'inscrit dans la suite de la réflexion engagée lors du colloque CRELA à Nancy en 2013, est de permettre aux chercheurs et autres acteurs des différents domaines de la linguistique appliquée de se rencontrer sur des enjeux de recherche partagés, amenant ainsi à offrir un éclairage pluridisciplinaire ou transdisciplinaire sur des problématiques croisées. Dans cette perspective sont attendues des communications portant sur l'un des cinq axes suivants :
- Notion de «
terrain » en linguistique appliquée (analyse multicritères de terrains spécifiques, analyse comparative de terrains)
- Linguistique appliquée et sciences du langage: où situer le terrain ? Les terrains se croisent-ils ?
- Modélisation et approches théoriques multiples (combiner plusieurs approches pour étudier le même terrain)
- Ressources, outils et méthodologies d'approche du terrain
- Linguistique appliquée et linguistique théorique : comment construire la complémentarité ? Interrogations épistémologiques face à une complémentarité évidente ?
- Croisements entre linguistique appliquée et traductologie: une approche hybride?

Echéance : 15 octobre 2014
Notification : 20 janvier 2015

======================English CFP=======================

*Paris 8-10 July 2015*

Applied Linguistics in the 21st Century is a rich and varied discipline, with many sub-domains. Each of these has its own research tradition, often associated with the particular countries in which it developed: language acquisition / learning, bi- and multi-lingualism, didactics, lexicography, corpus linguistics, terminology, translation studies, computational linguistics, variation, etc. However, these disciplines often share similar research fields. A prime example of this can be seen in the areas covered by the term /Applied Linguistics/ and its equivalents /linguistique appliquée/, /angewandte Linguistik/, /Lingüística aplicada/ in their respective languages.

Transdisciplinarity and multidisciplinarity are trademarks of the 21st Century, as can be seen in the emergence of so many multi- or transdisciplinary fields, including examples which combine 'humanities' and 'pure sciences'. Applied Linguistics, because of the variety of fields which it is involved in, has followed the inexorable development of this process of hybridization.
Furthermore, the practice of Applied Linguistics has come to involve not only the application of theoretical knowledge, but also the emergence of new fields of investigation, which then feed back into current debates within the language sciences.

The aim of the international conference TRELA is to follow up on the CRELA conference in Nancy 2013, and to allow researchers and other practitioners in the different fields of Applied Linguistics to discuss and debate issues relating to common areas of research, pooling ideas on these topics from a multidisciplinary or transdisciplinary perspective. To this end, we invite submissions on any of the following areas:
- The notion of 'area' or 'field' in Applied Linguistics (multi-criteria analysis in specific areas, comparative analysis of different fields, etc.)
- Applied Linguistics and Linguistics: are they two distinct 'areas' or a single 'field', or can they not be divided?
- Models and multiple-theory approaches (combining several approaches to explore the same area)
- Resources, tools and methodologies to explore an 'area' (or conduct 'field' work).
- Applied Linguistics and Theoretical Linguistics: building a complementary approach? And what are the epistemological issues raised by obvious complementarity?

Deadline for proposals: 15 October 2014
Notification: 20 January 2015

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:21:31 +0200
Subject: Call: TRELA, website

Date: Wed, 09 Jul 2014 09:26:35 +0200
From: Kubler Natalie

Dear colleagues,

You will find the details and procedures of the call for papers for the TRELA conference on the following site:
contact: trela-local at

Best regards,
Natalie Kübler

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:42:24 +0200
Subject: Job: R&D Engineer, Speech to text, OWI Technologies

Date: Thu, 10 Jul 2014 15:37:49 +0200
From: "Xiaolu CHEN"

"Research & Development Engineer, Speech to Text"

OWI - publisher of semantic analysis and customer-interaction processing solutions (email, web, social networks, mobile, surveys, etc.). The innovative Natural Language Processing (NLP) technique that OWI uses has received numerous awards (winner of the SFR Jeunes Talents competition, TECHINNOV Prix du Créateur Innovant, Entreprise Innovante of the Pôle Cap Digital, Microsoft - Finance Innovation competition, Scientipôle Initiative label, etc.). Our clients are large and mid-sized accounts in various business sectors such as Canal +, Ikea, EDF, Bouygues Telecom, MGEN, BPCE Assurance, etc.

Context:
Buoyed by its commercial successes in written-text processing, OWI now intends to extend its business to voice. To that end, we are looking for a "Research & Development Engineer" specialised in speech to text to strengthen our R&D team and work on "voice" projects in order to meet our clients' expectations.

Within the Research and Development team, you will be in charge of the following voice-related activities:
- Experimenting with speech recognition technologies
- Developing algorithms to optimise these technologies, using the elements provided by the OWI engine
- Enabling OWI solutions to be fed from voice recordings or real-time conversations
- Contributing to the design of "customer experience" and "quality monitoring" dashboards
- Experimenting with real-time assistance for call-centre agents ("help with answering and conducting the conversation")

Candidate profile:
- PhD or engineering degree, in computer science or NLP
- Good knowledge of speech recognition technologies is mandatory
- Proficiency in C++, Java and SQL
- Previous experience in a similar position is strongly desired
- Command of another foreign language would be a plus
- Your potential and personality will make the difference: motivation, commitment, rigour, ability to get involved in collaborative projects.

Terms:
- Position to be filled quickly, in Bourg-la-Reine (92)
- Contract type: CDI (permanent contract)
- Salary: 38 to 50 k€ depending on experience
- CV and cover letter to
recrut at

Thank you in advance for your help.
Best regards,

Xiaolu CHEN
Service Marketing
Tél : 01 78 16 12 10 | Email : xiaolu.chen at
OWI Technologies | 31, Av du Général Leclerc, 92340 Bourg-la-Reine
Follow our news:

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:43:13 +0200
Subject: Job: Post-doc IRIT/MELODI, Toulouse

Date: Fri, 11 Jul 2014 16:32:15 +0200
From: philippe muller

As part of the ANR project Asfalda, we are offering a post-doctoral contract to work with the project's IRIT partner, at Université Paul Sabatier (Toulouse).

The Asfalda project aims to develop a semantically annotated corpus, together with automatic processing tools for semantic analysis that use the corpora built during the project. The semantic annotations rely on the FrameNet standard. FrameNet defines a set of semantic frames: prototypical situations and the characterisation of their participants. The frames are linked to one another in a hierarchical structure enriched with semantic links of various kinds.

The goal of the post-doc is to enrich this structure, which is currently rather sparse. The frame-to-frame relations created in this way are useful for discourse analysis, and for completing semantic annotations of frame structures that are only partially connected in context.
This will be done in two stages: first, by focusing on typical links between types of lemmas involved in the frames under consideration, using unsupervised corpus-based approaches; second, by disambiguating the linked lemmas in order to identify simultaneously the related frames in the contexts where they occur, as well as the links between their participants. This step will rely on the annotation tools developed in the project's other tasks, and on the annotated data collected.

We are looking for candidates with skills in Natural Language Processing and machine learning, and ideally expertise in the project's topics.

Keywords: semantic role labelling, discourse analysis, lexical semantics
Project coordinator: Marie Candito, Alpage, Univ Paris Diderot & INRIA

Applications will be accepted until 31 August 2014.
Duration: 1 year, starting 1 October 2014.
Gross salary: 3413 €/month (~2730 net)

Contact: farah.benamara, philippe.muller, adding the domain
IRIT, MELODI team
CNRS/Université Paul Sabatier

-------------------------------------------------------------------------

From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:22:34 +0200
Subject: Programme: Master in information monitoring and analysis, registration deadline extended

Date: Wed, 09 Jul 2014 13:04:13 +0200
From: Stéphane Chaudiron

*Master 2 : Spécialité «
Sciences de l'Information Documentation »*
*PRISME track: Strategic information monitoring and analysis*
*Academic year 2014-2015*

*Important: The registration deadline for the PRISME Master has been postponed to 20 July 2014.*

Training objectives
The PRISME Master is a Bac+5 level programme that trains professionals in strategic information monitoring (veille) and analysis. On completing the programme, students are able to analyse the needs of organisations (companies, public bodies, local authorities...) in terms of strategic monitoring (access, collection, processing and communication of information). They can thus audit a monitoring system, design and set up an automated monitoring system, produce electronic information products and services, and lead a community of monitoring analysts.

The programme gives students the conceptual, methodological, technical and practical skills needed to take responsibility for information monitoring and analysis projects in various areas: sales and marketing monitoring, regulatory monitoring, documentary monitoring, image and e-reputation monitoring, competitive monitoring...

Partner companies
The programme relies on a network of partners from the information industry who take part in the teaching: software vendors (AMI Software, KB Crawl, TEMIS, Web Site Watcher...), press and content aggregators (Europresse), specialists in monitoring and /knowledge management/ (Histen Riller, the CCIR Nord Pas de Calais, Kurt Salmon, OTO Research...), the GFII (Groupement français de l'industrie de l'information), and a set of organisations hosting interns (Cofidis, CCIR Nord Pas de Calais, Pas de Calais Habitat, SNCF, LVMH, Norauto, Decathlon, Carrefour, Lesaffre, Pierre Fabre, BNP Paribas...).
Modes of study

The programme is open as initial training, as a work-study programme (contrat de professionnalisation) and through VAE (Validation des Acquis d'Expérience, validation of prior experience).

Contact & registration

Academic lead: Stéphane Chaudiron, Professor of Information and Communication Sciences, stephane.chaudiron at
Administrative office: Mme Delerue, beatrice.delerue at
Website :

From hamon at LIMSI.FR Sat Jul 12 09:26:38 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:26:38 +0200
Subject: These: Anais Lefeuvre, Semantique des temps du francais, une formalisation compositionnelle
Message-ID:

Date: Wed, 9 Jul 2014 18:56:11 +0200
From: anais lefeuvre
Message-ID:
X-url:
X-url:

Dear colleagues,

I am pleased to announce that I defended my thesis, entitled "Sémantique des temps du français: une formalisation compositionnelle" (Semantics of French tenses: a compositional formalization), on 23 June at LaBRI. The manuscript is available at the following address:

Abstract:
This thesis is part of the Région Aquitaine - INRIA project ITIPY. The project aims to extract itineraries automatically from travel narratives set in the Pyrenees and dating from the 19th and early 20th centuries. Our first task was to characterize the corpus as a sample of French, through a contrastive study of quantitative data on the one hand and of the structure of travel narratives on the other. We then turned to the study of time, and more specifically to the automatic analysis of the semantics of French verb tenses.
With a wide-coverage syntactic and semantic parser for French at our disposal, based on categorial grammars and compositional semantics (λ-DRT), our task was to take verb tenses into account in order to reconstruct the temporality of events and states, notions grouped under the term eventuality. This thesis focuses on building a semantic lexicon that handles French verb tenses. We propose an extension and adaptation of a system of compositional operators, designed for English verb tenses, to the tenses and aspect of the French verb from the 19th century to the present day. This formalization is de facto operational, since it is defined in terms of λ-calculus operators whose composition and reduction, already implemented, automatically compute the desired semantic representations: many-sorted formulas of higher-order logic. The move from an utterance containing a single eventuality to discourse, whose referential network is complex, is discussed, and we conclude with the perspectives our work opens up for discourse analysis.

Best regards,

Anaïs Lefeuvre
A. T. E. R.
Université François Rabelais de Tours
Bureau 323
Département Informatique
Campus de Blois
3 place Jean Jaurès
41029 Blois
02 54 55 21 48

From hamon at LIMSI.FR Sat Jul 12 09:25:01 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:25:01 +0200
Subject: Seminaire: Towards an Online Rhyming Dictionary for Mexican Spanish, Labex CLRI, Avignon
Message-ID:

Date: Wed, 9 Jul 2014 18:21:02 +0200
From: Nadéra Bureau
Message-ID: <00a201cf9b91$c7324f40$5596edc0$@bureau at>
X-url:
X-url:

Friday 18 July 2014, 2 pm
LIA, chemin des Meinajariès
84911 Avignon cedex 9
(room to be confirmed)
(Labex BLRI)

Alfonso Medina (Research Director at EL COLEGIO DE MEXICO (

Towards an Online Rhyming Dictionary for Mexican Spanish.

Rhyming dictionaries are a kind of reverse dictionary. They group words according to rhyming patterns. Rhymes can share exact sequences of vowel and consonant sounds towards the end of a word (consonant rhyme) or just similar vowel sounds (assonant rhyme). Thus, these dictionaries are based on pronunciation, not on writing patterns. Also, since consonance and assonance depend on the stressed syllable, words which end with a stressed syllable are grouped together, those whose stressed syllable is the next to last appear together, and so on. In addition, word pronunciation may vary with time and across geographical and social dialects. In Spanish, this is particularly clear when loan words (for instance, Anglicisms and Gallicisms) are considered. In fact, they tend to keep their original spelling, at least in the Mexican variant, which is the most widely spoken one. For example, the following loan words, common in Mexican Spanish, rhyme: flash, collage, garage, cottage, squash.
Their last syllable is stressed and they are ordered in reverse according to their sounds and not their letters: (respectively, /fl??/, /ko.l??/, /ga.r??/, /ko.t??/ and /

The project described takes the current nomenclature of the Diccionario del español de México ( to generate a rhyming dictionary automatically. Also, since the results of an online query to such a dictionary can be quite large, a procedure was developed to rank them semantically. The idea is to measure the similarity of the query definition to each of the definitions of the rhyming words. These words are then ordered from highest to lowest similarity to the query.

From hamon at LIMSI.FR Sat Jul 12 09:39:28 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:39:28 +0200
Subject: Job: Linguiste, Semantia Marseille (septembre 2014)
Message-ID:

Date: Thu, 10 Jul 2014 12:27:15 +0200
From: Lucile Paroz
Message-Id: <5E07C45A-CCEF-47F2-8F5C-5A95938429FD at>

Job description - Linguist

Semantia is a French company, a spin-off of university research, specializing in the automatic processing of written natural language. Semantia offers a full range of products dedicated to specific markets:
- classified ads with Semantia Classifieds
- e-commerce with Semantia eCommerce
- travel with Semantia Travel
- employment with Semantia Job

These services rely on domain-specific reference data and on robust, high-performance, specialized linguistic and semantic models: telecommunications, banking, insurance, energy, IT, hi-fi, video, textiles, tourism, household appliances, DIY, sport, employment, dating, furniture, food, photography, medical, real estate, automotive,...

Today, the services deployed at Semantia's customers process nearly 5,000,000 Internet user questions and nearly 30,000,000 records per month. Semantia has offices in the Marseille area (headquarters), in Paris and in San Francisco.

Responsibilities

As a member of the language processing team, your main responsibilities will be:

1. for ongoing projects:
- develop and maintain the knowledge bases of existing customers
- track, classify and respond to customer requests
- propose new topics
- produce monthly reports for customers
- communicate and exchange with customers
- update documentation
- share good practices with the other members of the team

2. for the development of new projects:
- determine the best linguistic strategy according to the customer's domain, the languages to be developed and the project concerned
- determine the best configuration strategy for the linguistic engine
- develop the knowledge bases
- prepare or write the necessary documentation

3. for research and development:
- take an active part in the company's R&D activities

4. in support of marketing and sales:
- contribute experience and expertise to the design of new products and services

Required knowledge
- Regular expressions and symbolic systems
- Good knowledge of the syntax and morphology of the world's languages
- Ability to adapt to the languages being worked on
- Basic knowledge of web programming languages: PHP, SQL, HTML
- Proficiency with office software suites

Desirable knowledge
- Agile management methods, the Eclipse development environment, Mac OS X and Linux systems, graphics and mind-mapping software.

Qualities sought
- Adaptability and efficiency, teamwork, logic, rigour and motivation, creativity.

Profile
Bac+5 degree (five years of higher education) in computational linguistics

Contract offered
Permanent contract (contrat à durée indéterminée)
Salary: according to experience
Location: Gémenos (13)
Date: position to be filled from 1 September 2014
Contacts
e-mail : drh at
Tel.: +33 4 42 36 80 91

Message-ID:
Date: Wed, 09 Jul 2014 21:25:45 +0200
From: pap
Message-ID: <53BD9739.9040401 at>
X-url:

Dear colleagues and other recipients,

A new call for papers, this time not only to take stock of the many guises of terminology, that neighbouring and sister discipline of translation, but also to pay tribute to our colleague John Humbley, who is one of its leading figures: *Quo Vadis, Terminologia ?*, hence

*International colloquium in honour of John Humbley*

Website:
Dates: 18-20 February 2015
Deadline for submitting proposals: 15 September 2014
Notification: 20 December 2014

Since its birth, notably under the impetus of Eugen Wüster, terminology has undergone many developments, both technical and conceptual. It has been applied to a multitude of uses, from language policy to web indexing, which have led it in turn to confront, or to hybridize with, a wide variety of disciplines: lexicography, translation, technical writing... More strikingly still, it has given rise to high-level scientific controversies, for example over the true legacy of its founder, over which source to favour when searching for information, over its descriptive or normative character, or over its relationship with ontology. It seems to us in particular to be close to translation, in that both are, like the author in Flaubert, "present everywhere and visible nowhere". Like translation, its history is intimately bound up with that of the nation, since the survival or emergence of a language depends on that language's ability to name the totality of the objects of reality (Michel Serres). Everyone does terminology and uses terminology, but often with a form of ignorance that ranges from outright naivety to outright denial. The same could be said of corpus linguistics, to which terminology has become closely tied over the past two decades in a contextual approach, dear to the British contextualists, and no longer only a conceptual one. From the shoebox of yesteryear to today's corpora and computer databases, which will be interconnected tomorrow, we no longer classify or order the data of specialized knowledge (that is, concepts and terms, together with the relations that link them) as we did in the past.

These are all reasons to organize an international, multidisciplinary colloquium in order to raise, with specialists from these different fields, the question of the unity of terminology or terminologies, to take stock of these various branches and hybridizations, and to discern their prospects for development. And if one had to look for a point of contact, a unifying element, a compass in this terminological ocean, perhaps one should turn, beyond the scientific aspects, to a person whose name is likely to be a reference in each of these fields. One name stands out here: John Humbley. Through the extreme richness of his work in the many fields mentioned above, the diversity of his professional and institutional experience, the contacts he has forged and maintains worldwide, his breadth of vision and his availability on all these matters, and his participation in a multitude of review committees, our colleague John Humbley is and has been a leading actor in, and witness to, these upheavals. It is in tribute to him and to his work that we have decided to organize this major international colloquium, whose outcomes will, after peer review, be the subject of a thematic issue of a prestigious international journal of translation studies and terminology.

Papers on the following subjects that engage with the work of John Humbley are welcome:
* Wüster's legacy today
* the contribution of corpus linguistics to the evolution of terminology
* the contribution of phraseology to the evolution of terminology studies
* terminology and specialized translation
* terminology and technical writing
* neology in specialized languages
* terminological variation
* borrowings, metaphors and other matrices at work in terminological dynamics, terminological resources
* terminology planning

Various search methods are available there: by keyword, by author, by year or by full text.

Best wishes,
Florian Boudin

Theme: Data-to-Text Generation
Funded by the ITEA ModelWriter Project
Main topic: Natural Language Generation for the Semantic Web

Description: There is a growing need in the semantic web (SW) community for technologies that give humans easy access to the machine-oriented Web of data. Because it maps data to text, Natural Language Generation (NLG) provides a natural means of presenting this data in an organized, coherent and accessible way. Conversely, the representation languages used by the semantic web (e.g., OWL ontologies and RDF data) are a natural starting ground for NLG systems.

The aim of the PhD thesis will be to explore the interaction between the semantic web, the textual web and Natural Language Generation (NLG). More precisely, the goal will be to develop generic weakly supervised methods for generating text from semantic web data, in particular content selection and verbalisation methods.

The project will build on an ongoing collaboration between LORIA (Nancy, France), the KRDB group at (Bolzano, Italy) and Stanford Research International (USA), bringing together high-level academic partners with internationally recognised expertise in both NLG (LORIA) and knowledge processing (KRDB, SRI).

Profile: We are looking for outstanding young research scientists with a good honours degree in Computational Linguistics or Computer Science, with programming skills and with a strong interest in Natural Language Processing.

Required skills:
- Master's degree in Computational Linguistics or Computer Science
- experience in Natural Language Processing
- good command of the English language

Desirable skills:
- experience in natural language generation

Supervisor:
- Claire Gardent,

Research Environment: LORIA is a computer science research unit which conducts most of its scientific activities in partnership with the Inria Nancy - Grand-Est Centre, the French National Centre for Scientific Research (CNRS) and the University of Lorraine.
We also maintain close ties with research institutes and universities from the wider region, notably in Saarbrücken and Luxembourg. With around 500 staff and 27 research teams, it is one of the biggest research units in Lorraine. It conducts research in Algorithms, Computation, Image & Geometry; Formal methods; Networks, Systems and Services; Knowledge & Natural Language Processing; and Complex Systems & Artificial Intelligence.

The PhD will be funded by the ITEA3 ModelWriter project ( for a period of 36 months. Including industrial and academic partners from France, Belgium and Turkey, this project targets the development of an integrated authoring environment combining a semantic parser, a data-to-text generator and Knowledge Capture Tools.

The PhD candidate will work in collaboration with the members of the Synalp team (, a Research Group in Computational Linguistics. Synalp research focuses on hybrid, symbolic and statistical approaches to natural language processing and applications built thereon, including NLP for Man-Machine Dialog, for language learning and for Data Verbalisation.

Location: Nancy ( is a high-tech city located at the heart of the Lorraine Region, in outstanding scientific and natural surroundings. It is 1h30 by train from Paris, Germany and Luxembourg and 1h from Paris Roissy international airport.

To apply, please send a CV, Transcripts of Records for the Master's degree and either reference letters or referee names to:
Claire Gardent, claire.gardent at

JOB REQUIREMENTS
- Ph.D. in Computer Science, Natural Language Processing, Information Retrieval, Information Extraction
- Solid programming skills in Java environment
- Strong publication record
- Participation in an international evaluation campaign such as TREC or KBP is a plus
- English and French speaking

Please send your resume to:
Eric Charton eric.charton at

From hamon at LIMSI.FR Sat Jul 12 09:35:39 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sat, 12 Jul 2014 11:35:39 +0200
Subject: Appel: Special Session: Environmental and geo-spatial data analytics (EnGeoData) - DSAA'2014
Message-ID:

Date: Thu, 10 Jul 2014 09:32:00 +0200
From: Mathieu Roche
Message-ID: <6f398480685853b8b5519f758d7cb42d at>
X-url:

Call for Papers

Special Session: Environmental and geo-spatial data analytics (EnGeoData)

DSAA'2014 - IEEE International Conference on Data Science and Advanced Analytics
with ACM SIGKDD and technically co-sponsored by IEEE Computational Intelligence Society

30 October - 1 November, 2014, Shanghai, China

Contact: engeodata at
Web:

AIM AND SCOPE
Environmental and, more generally, geo-spatial information is now provided by crowdsourcing but also by public administrations in the context of open data policies. Analyses of such data are still challenging.
This is firstly because of their heterogeneity (structural, semantic, spatial and temporal), and secondly because of the difficulty in choosing the "best" knowledge discovery process to apply, according to the needs of the experts in the field. This special session aims at discussing and assessing some of these strategies covering all or part of the issues mentioned above, from a theoretical or experimental point of view.

TOPICS
- Pre and Post Data processing
- Data Quality, Result Evaluation
- Data Mining or Data Warehousing Applications
- Text-Mining
- Visual Analytics
- KDD real use-cases dedicated to environmental and geo-spatial Data

PAPER SUBMISSION
- Papers should be submitted via the DSAA submission site, choosing the Special Session on "Environmental and geo-spatial data analytics (EnGeoData)", before 22nd July 2014 (PST).
- Conference paper submissions should be limited to a maximum of seven (7) pages, in the IEEE 2-column format (see the IEEE Proceedings Author Guidelines:
- All submissions will be blind reviewed by the Program Committee on the basis of technical quality, relevance to conference topics of interest, originality, significance, and clarity. Author names and affiliations must not appear in the submissions, and bibliographic references must be adjusted to preserve author anonymity.
- Accepted conference papers will be published in the conference proceedings by IEEE, included in the IEEE Xplore Digital Library, and submitted for EI indexing through INSPEC by IEEE.

WEB SITE AND SUBMISSION

CHAIRS
- Maguelonne Teisseire (Irstea, TETIS, France)
- Mathieu Roche (Cirad, TETIS, France)

[Apologies for any cross-postings.]

CALL FOR PARTICIPATION

WoLLIC 2014
21st Workshop on Logic, Language, Information and Computation
September 1st to 4th, 2014
Valparaiso, Chile
(Co-located with ISR 2014 - 7th International School on Rewriting)

SCIENTIFIC SPONSORSHIP
Interest Group in Pure and Applied Logics (IGPL)
The Association for Logic, Language and Information (FoLLI)
Association for Symbolic Logic (ASL)
European Association for Theoretical Computer Science (EATCS)
European Association for Computer Science Logic (EACSL)
Sociedade Brasileira de Computação (SBC)
Sociedade Brasileira de Lógica (SBL)

ORGANISATION
Department of Computer Science, Universidad de Chile, Chile
Department of Computer Science, Pontificia Universidad Católica de Chile, Chile
Centro de Informática, Universidade Federal de Pernambuco, Brazil

HOSTED BY
Department of Informatics, Universidad Técnica Federico Santa María, Chile

CALL FOR PARTICIPATION
WoLLIC is an annual international forum on inter-disciplinary research involving formal logic, computing and programming theory, and natural language and reasoning. Each meeting includes invited talks and tutorials as well as contributed papers. The twenty-first WoLLIC will be held at the Universidad Técnica Federico Santa María, from September 1st to 4th, 2014. It is sponsored by the Association for Symbolic Logic (ASL), the Interest Group in Pure and Applied Logics (IGPL), the Association for Logic, Language and Information (FoLLI), the European Association for Theoretical Computer Science (EATCS), the European Association for Computer Science Logic (EACSL), the Sociedade Brasileira de Computação (SBC), and the Sociedade Brasileira de Lógica (SBL).
INVITED TALKS
Verónica Becher (Universidad de Buenos Aires): *On Normal Numbers*
Juha Kontinen (University of Helsinki): *Dependence Logic*
Aarne Ranta (University of Gothenburg): *Syntax and Semantics for Translation*
Kazushige Terui (Kyoto University): *Intersection Types for Normalization and Verification*
Luca Vigano (Università di Verona): *Modal and Temporal Deduction Systems for Quantum State Transformations*
Thomas Wilke (Christian-Albrechts-Universität zu Kiel): *Backward Deterministic Büchi Automata*

TUTORIAL LECTURES
Aarne Ranta (University of Gothenburg)
Luca Vigano (Università di Verona)

EARLY REGISTRATION (UNTIL AUGUST 20TH)
General: US$ 300
Latin American students: US$ 150

LATE REGISTRATION
General: US$ 350
Latin American students: US$ 200

PROGRAMME COMMITTEE
Ulrich Kohlenbach (Technische Universität Darmstadt) - Chair
Natasha Alechina (University of Nottingham)
Eric Allender (Rutgers University)
Marcelo Arenas (Pontificia Universidad Católica de Chile)
Steve Awodey (Carnegie Mellon University)
Stefano Berardi (Università di Torino)
Julian Bradfield (University of Edinburgh)
Xavier Caicedo (Universidad de los Andes de Chile)
Olivier Danvy (University of Aarhus)
Hans van Ditmarsch (LORIA)
Marcus Kracht (University of Bielefeld)
Michiel van Lambalgen (University of Amsterdam)
Klaus Meer (Technische Universität Cottbus)
George Metcalfe (University of Bern)
Dale Miller (INRIA/LIX)
Russell Miller (City University of New York)
Sara Negri (University of Helsinki)
Grigory Olkhovikov (Urals State University)
Nicole Schweikardt (Goethe-University Frankfurt am Main)
Sebastiaan Terwijn (Radboud University Nijmegen)

STEERING COMMITTEE
Samson Abramsky, Johan van Benthem, Anuj Dawar, Joe Halpern, Wilfrid Hodges, Daniel Leivant, Leonid Libkin, Angus Macintyre, Grigori Mints (in memoriam), Luke Ong, Hiroakira Ono, Ruy de Queiroz.

ORGANISING COMMITTEE
Pablo Barceló (Universidad de Chile) (Local chair)
Anjolina G. de Oliveira (U Fed Pernambuco)
Ruy de Queiroz (U Fed Pernambuco) (co-chair)
Juan Reutter (Pontificia Universidad Católica de Chile)
Cristián Riveros (Pontificia Universidad Católica de Chile)

FURTHER INFORMATION
Contact one of the Co-Chairs of the Organising Committee.

WEB PAGE

Message-Id: <1954B34C-1F37-4DBD-A461-37EACE44A1B7 at>
X-url:

Apologies for cross-posting
Please forward this message to colleagues in the areas of interest

EXTENDED DEADLINE: July 25, 2014

Second International Workshop on Definitions in Ontologies (DO 2014)
at the International Conference on Biomedical Ontologies (ICBO 2014)
October 6-7, 2014
Houston, USA
Website:

This workshop is a follow-up to the workshop on Definitions in Ontologies (DO 2013) held last year in Montreal in conjunction with ICBO 2013. The focus of this second workshop is on definition practices in either human or machine-assisted ontology development.

PRESENTATION
A current problem in ontology development is constructing the needed definitions of terms, either logical or in natural language. For example, ontologies built using OBO Foundry principles are advised to include both logical and natural language definitions, but ontology developers too often focus on only one of these, or they pay insufficient attention to whether they are equivalent.

Explicit definitions of terms in ontologies serve a number of purposes. Logical definitions allow reasoners to create inferred hierarchies, lessening the burden of asserting and checking the validity of subsumptions. Natural language definitions help to ameliorate the pervasive problem of low inter-annotator agreement. In specialized domains, experts will know their own field well, but may only have limited knowledge of adjacent disciplines. Good definitions make it possible for non-experts to understand unfamiliar terms and thereby make it possible for more confident reuse of terms by external ontologies, which in turn facilitates data integration.

The goal of this workshop is to bring together interested researchers and developers to explore these issues by presenting case studies in a biomedical domain, discussing the difficulties that arise when constructing definitions, with a view to sharing strategies in the future.
Even in the seemingly narrow domain of definition construction, cross-fertilization from related disciplines should yield benefits in quality and help to identify novel approaches. Papers submitted should include one or more case studies and raise specific questions related to definitions with a link to a biomedical domain. Reports on successful or unsuccessful methods are both appropriate.

TOPICS
- experiences in formulating definitions
- tools that assist in definition editing, including collaborative systems
- coordination of logical and textual definitions
- validation and quality control of definitions, e.g., checking that definitions comply with the all/some form
- methods for constructing definitions from multiple sources
- use of controlled languages such as Rabbit or ACE for more user-friendly logical definition creation
- use of templates to systematize definition creation

FORMAT AND OUTCOMES
This will be a half-day workshop with a selected mix of presentations based on accepted papers. In order to promote discussion, each presentation will be followed by a short response by a participant of the workshop, to be arranged in advance of the workshop. This workshop will document findings on the workshop's website ( We expect accepted papers to be published in the Journal of Biomedical Semantics (JBS).

INTENDED AUDIENCE
- ontologists, tool developers, and domain experts whose work encounters issues regarding definitions
- tool developers building definition- or ontology-authoring tools
- philosophers and logicians
- biomedical researchers working on definitions in nomenclatures such as SNOMED
- computer scientists addressing these issues in languages like OWL
- NLP researchers working on definition extraction, generation, or checking
- NLP/IR researchers reusing definitions produced for ontologies

SUBMISSIONS
All papers should include one or more case studies and raise specific questions related to definitions with a link to a biomedical domain.
Papers should be between 5 and 10 pages long (rendered), excluding references, formatted using the JBS templates at, and submitted via EasyChair (

IMPORTANT DATES
Workshop paper submission EXTENDED DEADLINE: July 25, 2014
Notification of paper acceptance: August 15, 2014
Camera-ready copies for the proceedings: September 15, 2014
Workshops: October 6-7, 2014

ORGANIZING COMMITTEE
Selja Seppälä

PROGRAM COMMITTEE
Nathalie Aussenac-Gilles (National Center for Scientific Research (CNRS), France)
Mélanie Courtot (MBB Department Simon Fraser University and BC Public Health Microbiology & Reference Laboratory, Canada)
Natalia Grabar (Université de Lille 3, France)
Janna Hastings (European Bioinformatics Institute, Cambridge, UK)
James Malone (European Bioinformatics Institute, Cambridge, UK)
Alexis Nasr (Aix Marseille Université, France)
Richard Power (The Open University, UK)
Allan Third (The Open University, UK)

SUPPORTED BY
The Swiss National Science Foundation (SNSF)
The State University of New York at Buffalo

It is an international forum for researchers and practitioners interested in the advances of computer systems and their applications. The 11th edition of AICCSA will be organized in Doha, Qatar by the Department of Computer Science and Engineering (CSE), College of Engineering at Qatar University on November 10-13, 2014.

We are pleased to invite you to submit your papers to AICCSA'14. Any theoretical, conceptual or applicative paper, or a survey of the state of the art, is welcome. Topics of interest include (but are not limited to) the following areas:
* Cloud and Distributed Computing
* Networking, Sensor Networks, Mobile Computing
* High Performance Computing
* Multimedia, Computer Vision and Image Processing
* Big Data, Business Intelligence, Analytics
* IR, Data and Knowledge Management
* BPM, Web Services, SOA
* Natural Language Processing
* Interoperability, Semantic Web and Future Internet Technologies
* Social Computing
* Security and Privacy
* E-Learning, M-Learning

Proceedings
Papers selected for presentation will appear in the Conference Proceedings, which will be published by the IEEE Computer Society and be submitted to IEEE Xplore for inclusion.

Regular Papers
Papers must be submitted electronically by the deadline below to

All papers will be reviewed and judged on merits including originality, significance, interest, correctness, clarity, and relevance to the broader community. The papers must present original work and must not be under review at, or submitted to, another forum during the review process. Authors should submit full papers or posters electronically following the instructions from the conference web site. Authors of accepted papers/posters are expected to present their work at the conference. If you have difficulties with electronic submission, please contact the PC Chairs.
Submitted papers that are deemed of good quality but cannot be accepted as regular papers will be accepted as short papers.

Posters and Doctoral Symposium
Research still in its early stages may be submitted as an extended abstract that must not exceed 750 words. Accepted abstracts will be included in a special poster session. A specific Doctoral Symposium will be held on the first day of the conference. PhD students are encouraged to send their research proposals/plans and attend this rewarding symposium. Beyond valuable feedback on their work, they will have a unique opportunity to be part of an international PhD network.

Tutorial Proposals
Proposals for tutorials and panels should be submitted to the tutorial chair with a copy to the program chairs.

Conference Awards
Awards will be given to the best Conference Paper and the best Doctoral Symposium presentation.

Selection for Journals
The best papers of the conference will be selected and proposed for publication in indexed international journals such as Cluster Computing Journal, International Journal of Secure Software Engineering, International Journal of Product Lifecycle Management, International Journal Engineering Applications of Artificial Intelligence (EAAI), and International Journal of Computer Vision and Image Processing (IJCVIP).

Industrial Sessions
A half-day will be dedicated to industrial sessions and panels, managed by the Industrial Advisory Board of the conference.

Research Collaboration and Networking
Take a unique opportunity to attend the specific panel on collaboration possibilities with Qatar research teams and get the latest news on currently running projects and on funding possibilities through the Qatar National Research Fund (QNRF) (NPRP, exceptional projects, etc.).

Rich Social Program
Discover Qatari culture through organized visits to famous local museums, the picturesque Corniche, Katara Cultural Village, and the fantastic Souk Waqif!
Experience an unforgettable complimentary 4x4 Desert Safari and Bedouin campsite in the heart of the wonderful dunes of Qatar!

Important Dates
Research Paper Submissions: July 1, 2014 July 21, 2014
Notification of Acceptance: September 1, 2014
Camera Ready: September 8, 2014
Author Registration: September 8, 2014
PhD Symposium Submissions: September 8, 2014
Tutorial Proposals: July 1, 2014

For more information about the conference, please visit:
For any inquiries, please use this email: aiccsa2014 at
Please kindly redistribute this CFP to all relevant research venues.

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list
Information, subscription:
English version:
Archives:
The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership:
ATALA declines all responsibility for the content of messages distributed on the LN list
-------------------------------------------------------------------------

From hamon at LIMSI.FR Tue Jul 15 19:56:11 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 15 Jul 2014 21:56:11 +0200
Subject: Appel: Deadline extension, ToRPorEsp (at Propor 2014)
Message-ID:

Date: Tue, 15 Jul 2014 09:07:36 -0300
From: Muntsa Padró
Message-ID:
X-url:

----------------------------------
Extended deadline for abstract submission: 24th of July
----------------------------------

The workshop on Tools and Resources for Automatically Processing Portuguese and Spanish aims to be a forum for the presentation and discussion of language-specific developments for Portuguese and Spanish. We expect to bring together researchers and developers with a focus on the creation of tools and linguistic resources for these two languages. A special interest of the workshop is to facilitate access to technologies and resources that are specific to Portuguese and Spanish.
We intend the workshop to contribute to making tools and resources easily available to the local community. To that aim, we encourage submissions that are aimed at simplifying the integration of a given tool or resource to address specific needs in these regions. Detailed descriptions, motivation of utility with specific scenarios, and even tutorial-like approaches will be highly appreciated. All workshop-related materials will be readily available from the workshop webpage, to promote adoption.

Important Dates:
- NEW!

The JéTou 2015 organizing committee

The corpus contains a total of 28 hours of audio speech that were manually transcribed by several trained annotators. The corpus comprises technical university lectures. For more information, see:

*ELRA-S0367 CORAL Corpus*
The CORAL Corpus is a collection of spoken dialogues in European Portuguese. It consists of 56 dialogues about a predetermined subject: maps. One of the participants (the giver) has a map with some landmarks and a route drawn between them; the other (the follower) also has landmarks, but no route, and consequently must reconstruct it. Only orthographic transcription was done for the whole corpus. A pilot recording was annotated at several levels. For more information, see:

*ELRA-S0370 MoveOn Speech and Noise Corpus*
The MoveOn Speech and Noise Corpus is a corpus recorded under the extreme conditions of the motorcycle environment within the MoveOn project. The speech utterances are in British English and address command-and-control and template-driven dialog systems, with a focus on, but not limited to, the police domain. The major part of the corpus comprises noisy speech and environmental noise recorded on a motorcycle. Several clean speech recording sessions with the same recording setup (including the motorcycle helmet) in an office environment complete the corpus.

For more information on the catalogue, please contact
Valérie Mapelli
mapelli at

Visit our On-line Catalogue:
Visit the Universal Catalogue:
Archives of ELRA Language Resources Catalogue Updates: Please send it to interested colleagues and students. Thanks! CALL FOR EXTENDED ABSTRACTS, PAPERS, WORKSHOPS and TUTORIALS! ************************************************************************ International Conference on Information Society (i-Society 2014) Technical Co-Sponsored by IEEE UK/RI Computer Chapter 10-12 November, 2014 Venue: London Heathrow Marriott Hotel London, UK ************************************************************************ The i-Society 2014 is Technical Co-Sponsored by UK/RI Computer Chapter. The i-Society is a global knowledge-enriched collaborative effort that has its roots from both academia and industry. The conference covers a wide spectrum of topics that relate to information society, which includes technical and non-technical research areas. The mission of i-Society 2014 conference is to provide opportunities for collaboration of professionals and researchers to share existing and generate new knowledge in the field of information society. The conference encapsulates the concept of interdisciplinary science that studies the societal and technological dimensions of knowledge evolution in digital society. The i-Society bridges the gap between academia and industry with regards to research collaboration and awareness of current development in secure information management in the digital society. 
The topics in i-Society 2014 include but are not confined to the following areas: *New enabling technologies - Internet technologies - Wireless applications - Mobile Applications - Multimedia Applications - Protocols and Standards - Ubiquitous Computing - Virtual Reality - Human Computer Interaction - Geographic information systems - e-Manufacturing *Intelligent data management - Intelligent Agents - Intelligent Systems - Intelligent Organisations - Content Development - Data Mining - e-Publishing and Digital Libraries - Information Search and Retrieval - Knowledge Management - e-Intelligence - Knowledge networks *Secure Technologies - Internet security - Web services and performance - Secure transactions - Cryptography - Payment systems - Secure Protocols - e-Privacy - e-Trust - e-Risk - Cyber law - Forensics - Information assurance - Mobile social networks - Peer-to-peer social networks - Sensor networks and social sensing *e-Learning - Collaborative Learning - Curriculum Content Design and Development - Delivery Systems and Environments - Educational Systems Design - e-Learning Organisational Issues - Evaluation and Assessment - Virtual Learning Environments and Issues - Web-based Learning Communities - e-Learning Tools - e-Education *e-Society - Global Trends - Social Inclusion - Intellectual Property Rights - Social Infonomics - Computer-Mediated Communication - Social and Organisational Aspects - Globalisation and developmental IT - Social Software *e-Health - Data Security Issues - e-Health Policy and Practice - e-Healthcare Strategies and Provision - Medical Research Ethics - Patient Privacy and Confidentiality - e-Medicine *e-Governance - Democracy and the Citizen - e-Administration - Policy Issues - Virtual Communities *e-Business - Digital Economies - Knowledge economy - eProcurement - National and International Economies - e-Business Ontologies and Models - Digital Goods and Services - e-Commerce Application Fields - e-Commerce Economics - e-Commerce 
Services - Electronic Service Delivery - e-Marketing - Online Auctions and Technologies - Virtual Organisations - Teleworking - Applied e-Business - Electronic Data Interchange (EDI) *e-Art - Legal Issues - Patents - Enabling technologies and tools *e-Science - Natural sciences in digital society - Biometrics - Bioinformatics - Collaborative research *Industrial developments - Trends in learning - Applied research - Cutting-edge technologies *Research in progress - Ongoing research from undergraduates, graduates/postgraduates and professionals

Important Dates:
*Extended Abstract (Work in Progress) Submission Date: August 20, 2014
*Notification of Extended Abstract (Work in Progress) Acceptance/Rejection: August 31, 2014
*Research Paper, Student Paper, Case Study, Report Submission Date: August 31, 2014
*Notification of Research Paper, Student Paper, Case Study, Report Acceptance/Rejection: September 15, 2014
*Camera Ready Paper Due: October 10, 2014
*Proposal for Workshops: September 01, 2014
*Notification of Workshop Acceptance/Rejection: September 10, 2014
*Poster/Demo Proposal Submission: August 31, 2014
*Notification of Poster/Demo Acceptance: September 10, 2014
*Participant(s) Registration (Open): May 01, 2014
*Early Bird Registration Deadline: September 30, 2014
*Late Bird Registration Deadline (Authors only): October 01 to October 15, 2014
*Late Bird Registration Deadline (Participants only): October 01 to November 03, 2014
*Conference Dates: November 10-12, 2014

For more details, please visit

From hamon at LIMSI.FR Tue Jul 15 20:05:06 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 15 Jul 2014 22:05:06 +0200
Subject: Appel: SSST-8, 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (EMNLP 2014)
Message-ID:

Date: Tue, 15 Jul 2014 16:06:09 +0100
From: Eva Maria Vecchi
Message-Id: <8AEBA765-3C7B-4998-AEF8-E82AB7F38E08 at>
X-url:

Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) EMNLP 2014 /
SIGMT / SIGLEX Workshop Oct 2014, Doha, Qatar *** Special theme: Compositional Distributional Semantics and Machine Translation *** The Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) seeks to bring together a large number of researchers working on diverse aspects of structure, semantics and representation in relation to statistical machine translation. Since its first edition in 2006, its program each year has comprised high-quality papers discussing current work spanning topics including: new grammatical models of translation; new learning methods for syntax- and semantics-based models; formal properties of synchronous/transduction grammars (hereafter S/TGs); discriminative training of models incorporating linguistic features; using S/TGs for semantics and generation; and syntax- and semantics-based evaluation of machine translation. We invite two types of submissions this year: 1. Extended abstracts for poster or hands-on presentations on the special theme 2. Full papers spanning all areas of interest for SSST =========================== Special Theme Extended Abstracts =========================== This year, the special theme of semantics of the past three editions of SSST takes a new step with a "working workshop" bringing together researchers interested in compositional distributional semantics, distributed representations, and continuous vector space models in MT, with tutorials bridging both directions, as well as discussions and hands-on work on relevant tasks with real data. Such models have proven beneficial for a number of NLP tasks, for example phrasal similarity, lexical entailment, modeling semantic deviance, detecting order restrictions in recursive structures, or improving NP bracketing in parsing. However, they have not received as much attention in MT. 
Extended abstracts of at most two (2) pages should describe poster or hands-on presentations that will stimulate discussions on the special theme of compositional distributional semantics and machine translation, including position papers, recent work, pilot studies, negative results. We encourage the presentation of relevant work that has been published or submitted elsewhere, as well as new work in progress. ========= Full Papers ========= The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is now wide consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and similar notational equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages. Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations. 
At the same time, SMT has seen a movement toward semantics over the past few years, which has been reflected at recent SSST workshops, including the last three editions which had semantics for SMT as a special theme. The issues of deep syntax and shallow semantics are closely linked and SSST-8 continues to encourage submissions on semantics for MT in a number of directions, including semantic role labeling, sense disambiguation, and compositional distributional semantics for translation and evaluation.

We invite papers on:
- syntax-based / semantics-based / tree-structured SMT
- machine learning techniques for inducing structured translation models
- algorithms for training, decoding, and scoring with semantic representation structure
- empirical studies on adequacy and efficiency of formalisms
- creation and usefulness of syntactic/semantic resources for MT
- formal properties of synchronous/transduction grammars
- learning semantic information from monolingual, parallel or comparable corpora
- unsupervised and semi-supervised word sense induction and disambiguation methods for MT
- lexical substitution, word sense induction and disambiguation, semantic role labeling, textual entailment, paraphrase and other semantic tasks for MT
- semantic features for MT models (word alignment, translation lexicons, language models, etc.)
- evaluation of syntactic/semantic components within MT (task-based evaluation)
- scalability of structured translation methods to small or large data
- applications of S/TGs to related areas including:
  - speech translation
  - formal semantics and semantic parsing
  - paraphrases and textual entailment
  - information retrieval and extraction
- syntactically- and semantically-motivated evaluation of MT
- compositional distributional semantics in MT
- distributed representations and continuous vector space models in MT

=========
Organizers
=========
Dekai WU, Hong Kong University of Science and Technology (HKUST)
Marine CARPUAT, National Research Council (NRC) Canada
Xavier CARRERAS, Universitat Politècnica de Catalunya (UPC)
Eva Maria VECCHI, Cambridge University

=============
Important Dates
=============
Submission deadline for papers and extended abstracts: 26 Jul 2014
Notification to authors: 26 Aug 2014
Camera copy deadline: 15 Sep 2014

For more information

From hamon at LIMSI.FR Tue Jul 15 20:15:08 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 15 Jul 2014 22:15:08 +0200
Subject: Appel: TSD 2014, Call for Demonstrations and Participation, 8-12 September 2014, Brno, Czech Republic
Message-ID:

Date: Tue, 15 Jul 2014 22:06:24 +0200
From: TSD 2014
Message-Id:
X-url:

*********************************************************
TSD 2014 - CALL FOR DEMONSTRATIONS AND PARTICIPATION
*********************************************************

Seventeenth International Conference on TEXT, SPEECH and DIALOGUE (TSD 2014)
Brno, Czech Republic, 8-12 September 2014

SUBMISSION OF DEMONSTRATION ABSTRACTS
Authors are invited to present actual projects, developed software and hardware, or interesting material relevant to the topics of the conference. The authors of the demonstrations should provide an abstract not exceeding one page as plain text. The submission must be made using the online form available on the conference web pages. The accepted demonstrations will be presented during a special Demonstration Session (see the Demo Instructions at ). Demonstrators can present their contribution on their own notebook, with an Internet connection provided by the organisers, or the organisers can prepare a PC with multimedia support for them.

IMPORTANT DATES
August 3 2014 ............ Submission of demonstration abstracts
August 10 2014 ............ Notification of acceptance for demonstrations sent to the authors
September 8-12 2014 ........ Conference dates

The demonstration abstracts will not appear in the Proceedings of TSD 2014 but they will be published electronically at the conference website.
KEYNOTE SPEAKERS Ralph Grishman, New York University, USA Active Learning for Information Extraction Bernardo Magnini, FBK - Fondazione Bruno Kessler, Italy Entailment graphs for text analytics Salim Roukos, IBM, USA Recent Progress in Statistical Machine Translation: Algorithms and Applications The conference is organized by the Faculty of Informatics, Masaryk University, Brno, and the Faculty of Applied Sciences, University of West Bohemia, Pilsen. The conference is supported by International Speech Communication Association. Venue: Brno, Czech Republic TSD SERIES TSD series evolved as a prime forum for interaction between researchers in both spoken and written language processing from all over the world. Proceedings of TSD form a book published by Springer-Verlag in their Lecture Notes in Artificial Intelligence (LNAI) series. TSD Proceedings are regularly indexed by Thomson Reuters Conference Proceedings Citation Index. Moreover, LNAI series are listed in all major citation databases such as DBLP, SCOPUS, EI, INSPEC or COMPENDEX. OFFICIAL LANGUAGE
The official language of the conference is English.

ACCOMMODATION
The organizing committee will arrange discounts on accommodation in the 4-star hotel at the conference venue. The current prices of the accommodation are available at the conference website.

ADDRESS
All correspondence regarding the conference should be addressed to
Ales Horak, TSD 2014
Faculty of Informatics, Masaryk University
Botanicka 68a, 602 00 Brno, Czech Republic
phone: +420-5-49 49 18 63
fax: +420-5-49 49 18 20
email: tsd2014 at

The official TSD 2014 homepage is:

PROGRAM COMMITTEE
Hynek Hermansky, USA (general chair)
Eneko Agirre, Spain
Genevieve Baudoin, France
Paul Cook, Australia
Jan Cernocky, Czech Republic
Simon Dobrisek, Slovenia
Karina Evgrafova, Russia
Darja Fiser, Slovenia
Radovan Garabik, Slovakia
Alexander Gelbukh, Mexico
Louise Guthrie, GB
Jan Hajic, Czech Republic
Eva Hajicova, Czech Republic
Yannis Haralambous, France
Ludwig Hitzenberger, Germany
Jaroslava Hlavacova, Czech Republic
Ales Horak, Czech Republic
Eduard Hovy, USA
Maria Khokhlova, Russia
Daniil Kocharov, Russia
Ivan Kopecek, Czech Republic
Valia Kordoni, Germany
Steven Krauwer, The Netherlands
Siegfried Kunzmann, Germany
Natalija Loukachevitch, Russia
Vaclav Matousek, Czech Republic
Diana McCarthy, United Kingdom
France Mihelic, Slovenia
Hermann Ney, Germany
Elmar Noeth, Germany
Karel Oliva, Czech Republic
Karel Pala, Czech Republic
Nikola Pavesic, Slovenia
Fabio Pianesi, Italy
Maciej Piasecki, Poland
Adam Przepiorkowski, Poland
Josef Psutka, Czech Republic
James Pustejovsky, USA
German Rigau, Spain
Leon Rothkrantz, The Netherlands
Anna Rumshisky, USA
Milan Rusko, Slovakia
Mykola Sazhok, Ukraine
Pavel Skrelin, Russia
Pavel Smrz, Czech Republic
Petr Sojka, Czech Republic
Stefan Steidl, Germany
Georg Stemmer, Germany
Marko Tadic, Croatia
Tamas Varadi, Hungary
Zygmunt Vetulani, Poland
Pascal Wiggers, The Netherlands
Yorick Wilks, GB
Marcin Wolinski, Poland
Victor Zakharov, Russia

FORMAT OF THE CONFERENCE
The conference program will include presentation of invited papers, oral presentations, and poster/demonstration sessions. Papers will be presented in plenary or topic-oriented sessions. Social events, including a trip in the vicinity of Brno, will allow for additional informal interactions.

LOCATION
Brno is the second largest city in the Czech Republic, with a population of almost 400,000, and is the country's judiciary and trade-fair center. Brno is the capital of South Moravia, which is located in the south-east part of the Czech Republic and is known for a wide range of cultural, natural, and technical sights. South Moravia is a traditional wine region. Brno has been a Royal City since 1347, and with its six universities it forms a cultural center of the region.

Brno can be reached easily by direct flights from London, Moscow, and Eindhoven, and by trains or buses from Prague (200 km) or Vienna (130 km).

For participants with some extra time, nearby places may also be of interest. Local ones include: Brno Castle, now called Spilberk; Veveri Castle; the Old and New City Halls; the Augustine Monastery with St. Thomas Church and the crypt of the Moravian Margraves; the Church of St. James; the Cathedral of St. Peter & Paul; the Carthusian Monastery in Kralovo Pole; and the famous Villa Tugendhat designed by Mies van der Rohe, along with other important buildings of interwar Czech architecture.

For those willing to venture out of Brno, the following are all within easy reach: the Moravian Karst with Macocha Chasm and Punkva caves; the battlefield of the Battle of the Three Emperors (Napoleon, Russian Alexander and Austrian Franz; the Battle of Austerlitz); the Chateau of Slavkov (Austerlitz); Pernstejn Castle; Buchlov Castle; Lednice Chateau; Buchlovice Chateau; Letovice Chateau; Mikulov, with one of the largest Jewish cemeteries in Central Europe; Telc, a town on the UNESCO heritage list; and many others.
From hamon at LIMSI.FR Sun Jul 20 20:25:34 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Sun, 20 Jul 2014 22:25:34 +0200 Subject: Appel: SSST-8, 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (EMNLP 2014) Message-ID: Date: Wed, 16 Jul 2014 13:31:00 +0100 From: Eva Maria Vecchi Message-Id: <6809CE3F-E50D-448E-97B1-8EE2B6F7682A at> X-url: Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) EMNLP 2014 / SIGMT / SIGLEX Workshop Oct 2014, Doha, Qatar * Special theme: Compositional Distributional Semantics and Machine Translation * The Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) seeks to bring together a large number of researchers working on diverse aspects of structure, semantics and representation in relation to statistical machine translation. Since its first edition in 2006, its program each year has comprised high-quality papers discussing current work spanning topics including: new grammatical models of translation; new learning methods for syntax- and semantics-based models; formal properties of synchronous/transduction grammars (hereafter S/TGs); discriminative training of models incorporating linguistic features; using S/TGs for semantics and generation; and syntax- and semantics-based evaluation of machine translation. We invite two types of submissions this year: 1. Extended abstracts for poster or hands-on presentations on the special theme 2. 
Full papers spanning all areas of interest for SSST =========================== Special Theme Extended Abstracts =========================== This year, the special theme of semantics of the past three editions of SSST takes a new step with a "working workshop" bringing together researchers interested in compositional distributional semantics, distributed representations, and continuous vector space models in MT, with tutorials bridging both directions, as well as discussions and hands-on work on relevant tasks with real data. Such models have proven beneficial for a number of NLP tasks, for example phrasal similarity, lexical entailment, modeling semantic deviance, detecting order restrictions in recursive structures, or improving NP bracketing in parsing. However, they have not received as much attention in MT. Extended abstracts of at most two (2) pages should describe poster or hands-on presentations that will stimulate discussions on the special theme of compositional distributional semantics and machine translation, including position papers, recent work, pilot studies, negative results. We encourage the presentation of relevant work that has been published or submitted elsewhere, as well as new work in progress. ========= Full Papers ========= The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is now wide consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and similar notational equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages. Within this area of research there is a rich diversity of approaches. 
There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations. At the same time, SMT has seen a movement toward semantics over the past few years, which has been reflected at recent SSST workshops, including the last three editions which had semantics for SMT as a special theme. The issues of deep syntax and shallow semantics are closely linked and SSST-8 continues to encourage submissions on semantics for MT in a number of directions, including semantic role labeling, sense disambiguation, and compositional distributional semantics for translation and evaluation. ========= 
Organizers
=========
Dekai WU, Hong Kong University of Science and Technology (HKUST)
Marine CARPUAT, National Research Council (NRC) Canada
Xavier CARRERAS, Universitat Politècnica de Catalunya (UPC)
Eva Maria VECCHI, Cambridge University

=============
Important Dates
=============
Submission deadline for papers and extended abstracts: 26 Jul 2014
Notification to authors: 26 Aug 2014
Camera copy deadline: 15 Sep 2014

For more information The workshop continues the successful series of workshops held previously in cooperation with ACM SIGSPATIAL and in conjunction with SIGIR and CIKM conferences. The purpose of the workshop is to bring together members of the vibrant and growing community of researchers and practitioners working in the field of geographic information retrieval to discuss current research activity and potential future research directions. The subject and format of the workshop ----------------------------------------------------- There is a vast quantity of information in text documents and other media that is referenced to geographic space. The discipline of Geographical Information Retrieval (GIR) is concerned with developing methods to gain access to this geographical information, with a particular focus on the content of web documents and social media. Because much of the information is in the form of unstructured or semi-structured text, there is a challenge to develop methods that can automatically recognise and interpret the geographical terminology and spatial or spatio-temporal concepts that people use when recording and querying the information. GIR falls at the intersection of Information Retrieval (IR) and Geographical Information Science (GIScience) resulting in research and systems development that benefits from the fusion of text-based methods for information extraction, natural language processing, indexing and search with GIS methods for spatial data management, analysis and visualization. 
The workshop invites contributions on the following topics, and other research related to GIR: - Detection, disambiguation and geocoding of geographical references in text; - User needs for geographic search; - Classification of web documents and social media with regard to their geographic foci; - Interpretation of spatial natural language in documents and queries; - Extraction of geographically-specific facts and events from text documents and social media; - Spatial and spatio-temporal indexing of documents and other media objects; - Modelling, construction and integration of ontologies, gazetteers and geographic thesauri; - Reasoning with geo-spatial facts for purposes of information retrieval; - Geographical query interfaces for search on the web; - Geographic question / answering systems; - Geographic search engine architectures; - Relevance ranking of geographical information; - Evaluation methods for geographic search. We invite both long papers (8 pages) and short papers (2 pages). Long papers are expected to report on relatively mature research results, while short papers may also cover more speculative or early stage research that may stimulate discussion at the workshop. All submissions will be reviewed by three members of the programme committee and all accepted papers will be published in the ACM Digital Library. Please note that we welcome contributions both from academic researchers and from practitioners working in industry and in public agencies engaged in GIR-related activities. The workshop programme will ensure opportunity for discussion of the presented papers and of the broader agenda for research in GIR. Submission procedure ------------------------------ You should prepare your paper in accordance with the ACM camera-ready instructions ( and submit it using the EasyChair system ( by 29th August 2014. Decisions on acceptance will be announced by 15th September 2014. 
Camera-ready versions of accepted papers are to be submitted by 29th September 2014. At least one author of each accepted paper will be required to register for the workshop before the paper is published in the ACM Digital Library, and to present the paper at the workshop. Please note that attendance at the workshop also requires registration for the main ACM SIGSPATIAL GIS conference ( in addition to registering for the workshop. Further details of the workshop can be found at

From hamon at LIMSI.FR Sun Jul 20 20:21:02 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sun, 20 Jul 2014 22:21:02 +0200
Subject: Appel: Special Session, Environmental and geo-spatial data analytics (EnGeoData), DSAA'2014

Date: Wed, 16 Jul 2014 06:01:35 +0200
From: Mathieu Roche
Message-ID: <5b38f2179a7ac5d1034ad52f3221bb1b at>

2nd Call for Papers
Special Session: Environmental and geo-spatial data analytics (EnGeoData)
DSAA'2014 - IEEE International Conference on Data Science and Advanced Analytics
with ACM SIGKDD and technically co-sponsored by IEEE Computational Intelligence Society

30 October - 1 November, 2014, Shanghai, China

Contact: engeodata at
Web:

Deadline: 22nd July 2014

AIM AND SCOPE

Environmental and, more generally, geo-spatial information is now provided by crowdsourcing and also by public administrations in the context of open data policies. Analysing such data is still challenging: firstly because of its heterogeneity (structural, semantic, spatial and temporal), and secondly because of the difficulty of choosing the "best" knowledge discovery process to apply, according to the needs of the experts in the field. This special session aims at discussing and assessing some of these strategies covering all or part of the issues mentioned above, from a theoretical or experimental point of view.
TOPICS
- Pre and Post Data processing
- Data Quality, Result Evaluation
- Data Mining or Data Warehousing Applications
- Text Mining
- Visual Analytics
- KDD real use-cases dedicated to environmental and geo-spatial Data

PAPER SUBMISSION
- Papers should be submitted via the DSAA submission site, choosing Special Session on "Environmental and geo-spatial data analytics (EnGeoData)", before 22nd July 2014 (PST).
- Conference paper submissions should be limited to a maximum of seven (7) pages, in the IEEE 2-column format (see the IEEE Proceedings Author Guidelines: ).
- All submissions will be blind reviewed by the Program Committee on the basis of technical quality, relevance to conference topics of interest, originality, significance, and clarity. Author names and affiliations must not appear in the submissions, and bibliographic references must be adjusted to preserve author anonymity.
- Accepted conference papers will be published in the conference proceedings by IEEE, included in the IEEE Xplore Digital Library, and submitted for EI indexing through INSPEC by IEEE.

WEB SITE AND SUBMISSION CHAIRS
- Maguelonne Teisseire (Irstea, TETIS, France)
- Mathieu Roche (Cirad, TETIS, France)

PROGRAM COMMITTEE (to be completed)
- Gloria Bordogna, CNR Milan, Italy
- Mete Celik, Erciyes University, Turkey
- Pierre Gançarski, University of Strasbourg, France
- Diana Inkpen, University of Ottawa, Canada
- Eric Kergosien, University Lille 3, France
- Florence Le Ber, ENGEES, France
- Corrado Loglisci, University of Bari, Italy
- Donato Malerba, University of Bari, Italy
- Stan Matwin, Dalhousie University, Canada
- Jordi Nin, Polytechnic University of Catalonia, Spain
- François Petitjean, Monash University, Australia
- Julien Velcin, University Lyon 2, France
- Osmar R.
Zaïane, University of Alberta, Canada
-------------------------------------------------------------------------
Message distributed by the Langage Naturel Informations list, subscription:
English version:
Archives:
The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership:
ATALA declines all responsibility for the content of messages distributed on the LN list
-------------------------------------------------------------------------

From hamon at LIMSI.FR Sun Jul 20 20:23:09 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sun, 20 Jul 2014 22:23:09 +0200
Subject: Ressources: Corpus 88milSMS

Date: Wed, 16 Jul 2014 06:13:20 +0200
From: Mathieu Roche
Message-Id: <4FF5E81F-2559-4FE1-B2E9-3511DC42FE68 at>

Hello everyone,

We are very pleased to make the following announcement (see the end of this email). The sud4science project ( started in January 2011 and is part of a large international project, sms4science (, initiated by Belgian researchers (Cental, UCL) in 2004.

The "88milSMS" corpus is being distributed as of 26 June 2014. It is a large corpus of authentic, anonymised text messages in French. It was produced by Université Paul-Valéry Montpellier 3 and the CNRS, in collaboration with the Université catholique de Louvain, and is funded with the support of the MSH-M and the Ministère de la Culture (Délégation générale à la langue française et aux langues de France), with the participation of Praxiling, Lirmm, Lidilem, Tetis and Viseo. We have obtained permission to make it available on the Huma-Num service grid. The terms of use and downloads are available here:

It is a great day for all the members of the project.
We take this opportunity to thank our public research institutions, our companies, our legal departments, our research laboratories, our partners and our 8 student interns, who have worked with us throughout these past years.

We would like to end this message with very special thanks to the legal department of Université Paul-Valéry, the SAJI, headed by Stéphanie Delaunay. If the sud4science project was able to succeed on the legal front, and if we can make the "88milSMS" corpus available today, it is thanks to the enormous investment in the project by the whole department and, in particular, by our data protection correspondent (correspondant Informatique et libertés, CIL), Nicolas Hvoinsky. Our legal adviser and CIL has been very active since the start of the project in 2011: taking part in our scientific seminars to understand what was at stake in the project, drafting a great many legal documents, exchanging hundreds of emails, advising on the anonymisation of the text messages, answering our incessant questions, etc. The time and energy devoted to the project, and the unfailing patience of Nicolas Hvoinsky, contributed very substantially to its success.

As stated above, the "88milSMS" corpus is being distributed as of 26 June 2014, and we are delighted and proud to be able to make it available to everyone.

Best regards,
Rachel Panckhurst, Catherine Détrie, Cédric Lopez, Claudine Moïse, Mathieu Roche, Bertrand Verine.

---------- Announcement: ----------

The French-language SMS corpus 88milSMS is available!
Terms of use, downloads:

Panckhurst R., Détrie C., Lopez C., Moïse C., Roche M., Verine B. (2014) "88milSMS. A corpus of authentic text messages in French", produced by Université Paul-Valéry Montpellier 3 and the CNRS, in collaboration with the Université catholique de Louvain, funded with the support of the MSH-M and the Ministère de la Culture (Délégation générale à
la langue française et aux langues de France), with the participation of Praxiling, Lirmm, Lidilem, Tetis and Viseo.

-------------------------------------------------------------------------

From hamon at LIMSI.FR Sun Jul 20 20:34:44 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sun, 20 Jul 2014 22:34:44 +0200
Subject: Appel: Call for Posters, Translating and the Computer 36

Date: Thu, 17 Jul 2014 11:33:30 +0000
From: "Stajner, Sanja"
Message-ID: <8D7E2A326D0D3549A39B4B714C54FFAA9A8D667C at>

---------------------------------------
Call for Posters
Translating and the Computer 36
London, 27 and 28 November 2014

The Translating and the Computer conference ( encourages submissions for poster presentations to supplement the regular presentations of the conference. Posters are expected to present ongoing and not necessarily completed research, teaching or training activity, practical work, software programs, projects or developments in general related to translation, interpretation and terminology, and to the related industries.

The Translating and the Computer conference is a unique forum for researchers, developers and users.
It brings together academics involved in language technology research and in teaching translation and terminology with those who develop and market tools for language transformation, and both of these groups with users: translators, terminologists, interpreters, and voice-over specialists, whether freelancers or working in translation departments of large organisations such as those of the European Parliament, European courts and the European Patent Office, the United Nations family, international companies and other organisations, and Language Services Providers (LSPs), large and small.

In its 36th session Translating and the Computer has moved from ASLIB to ASLING. The conference, often referred to as the "ASLIB Conference", is now the ASLING Translating and the Computer Conference. One of the new developments is the launch of a poster session in addition to the regular presentation slots.

Poster proposals in the form of poster abstracts not exceeding 500 words (the final versions of the accepted posters can be up to 1,500 words) must be submitted using the START system at the following address:, adding the text "Poster:" at the start of the "Title of Submission:" field in the online submission form. Accepted poster papers will be included (and will have the same status as regular papers) in the conference proceedings only after the registration fee for at least one presenter of the paper has been paid.

Important dates
Deadline for poster submissions: 8 August 2014
Notification of acceptance or rejection: 22 August 2014
Camera-ready poster papers due: 3 October
Conference: 29 and 30 November 2014

Chairs
* Juliet Macan, Arancho Doc srl.

Conference Manager:
* Nicole Adamides

Association internationale pour la promotion des technologies linguistiques
International Association for Advancement in Language Technology
Bologna, Genève, London, Wien, Wolverhampton

This project aims to describe the accentual characteristics of French in order to bring to light the underlying phonological structure of the language. The question is addressed through the bipolar pattern /AI-AF/ (Accent Initial - Accent Final, i.e. initial accent - final accent), considered as the basic metrical structure of French. We propose to apply the same analytical grid to a series of corpora ranging from laboratory speech to semi-controlled speech and spontaneous dialogic interaction. Production experiments on various speech styles will allow us to refine the acoustic-phonetic characterisation of AI and AF in order to improve systems for the automatic detection of prosodic events in large corpora. For more information, see the project website: .

*Job description*
The post-doctoral researcher will mainly be involved in data processing. He/she will take part in the acoustic analyses and will then carry out the statistical processing planned in the project.

*Prerequisites*
A PhD in language sciences (experimental phonetics/prosody) or in natural language processing, together with solid experience in statistics and data processing, is expected. Knowledge of the processing and analysis of spoken corpora is also welcome.

*Procedure*
Candidates should send a detailed CV with a list of publications, together with a short letter stating their scientific interests and specifying the nature of their experience in data processing. Please send the documents to: roxane.bertrand at (Roxane Bertrand, Scientific coordinator LPL, Aix-en-Provence, France).
Application deadline: 30 September 2014
Expected start date: November 2014 (but flexible)
Contract duration: 12 months
Salary: approximately 2000 EUR/month

------------------

*POST-DOCTORAL POSITION FOR THE PROJECT PhonIACog (LPL - AIX-EN-PROVENCE, FRANCE)*

We invite applications for a one-year Post-Doctoral position at the Laboratoire Parole et Langage (LPL, Aix-Marseille Université, CNRS, UMR 7309, France), to work on the project PhonIACog ("The role of the Initial Accent in prosodic structuring in French: From phonology to speech processing"; main coordinator: Corine Astésano, Université de Toulouse 2).

*Description*
The PhonIACog project is funded by the French National Research Agency (ANR). The project aims at describing the characteristics of the French accentual system in order to bring to light the underlying phonological structure of this language. It addresses the status of the bipolar pattern /IA FA/ (initial accent - final accent), considered as the basic metric pattern in French. We propose to apply the same analyses to different corpora, from laboratory speech to semi-controlled speech and dialogic spontaneous interaction. The production studies will allow us to refine the acoustic-phonetic characterization of IA and FA, with potential application to automatic detection of prosodic cues on large, spontaneous corpora. More information is available at the project website: .

*Job description*
The post-doctoral fellow will be mainly involved in data processing. He/she will participate in the acoustic analyses and will then have to implement the statistical analyses planned in the project.

*Qualifications*
A Ph.D. in linguistics (experimental phonetics/prosody) or in computer science and solid competence/experience in statistics and data analysis are required. Experience in processing and analysis of large speech databases is also welcome.
*Application procedure*
Candidates should send a detailed CV with a list of publications, and a cover letter with a statement of research interests and details of their experience in data analysis. Please e-mail documents to: roxane.bertrand at (Roxane Bertrand, Scientific coordinator LPL, Aix-en-Provence, France).

Deadline for submission: September 30, 2014
Expected start date: November 2014 (with some flexibility)
Length of contract: 12 months
Salary: about EUR 2000/month including health care

-------------------------------------------------------------------------

From hamon at LIMSI.FR Sun Jul 20 20:42:24 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Sun, 20 Jul 2014 22:42:24 +0200
Subject: Appel: Ireland International Conference on Education (IICE-2014)

Date: Fri, 18 Jul 2014 17:49:04 +0100
From: "Ireland International Conference on Education"
Message-ID: <6424377868576222042271 at PC-11>

Apologies for cross-postings. Kindly email this call for papers to your colleagues, faculty members and postgraduate students.

Call for Papers, Extended Abstracts, Posters, Tutorials and Workshops!

*********************************************************************************************
Ireland International Conference on Education (IICE-2014)
October 27-29
Dublin, Ireland
*********************************************************************************************

The Ireland International Conference on Education (IICE-2014) is an international refereed conference dedicated to the advancement of the theory and practices in education.
The IICE promotes collaborative excellence between academicians and professionals from Education. The aim of IICE is to provide an opportunity for academicians and professionals from various educational fields with cross-disciplinary interests to bridge the knowledge gap, promote research esteem and the evolution of pedagogy. The IICE 2014 invites research papers that encompass conceptual analysis, design implementation and performance evaluation. All the accepted papers will appear in the proceedings and modified version of selected papers will be published in special issues peer reviewed journals. Topics: The topics in IICE-2014 include but are not confined to the following areas: * Academic Advising and Counselling * Art Education * Adult Education * APD/Listening and Acoustics in Education Environment * Business Education * Counsellor Education * Curriculum, Research and Development * Competitive Skills * Continuing Education * Distance Education * Early Childhood Education * Education for Sustainable Development * Educational Administration * Educational Foundations * Educational Psychology * Educational Technology * Education Policy and Leadership * Elementary Education * E-Learning * E-Manufacturing * ESL/TESL * E-Society * Geographical Education * Geographic information systems * Health Education * Higher Education * History * Home Education * Human Computer Interaction * Human Resource Development * Inclusive Education * Indigenous Education * ICT Education * Internet technologies * Imaginative Education * Kinesiology and Leisure Science * K12 * Language Education * Mathematics Education * Mobile Applications * Multi-Virtual Environment * Music Education * Pedagogy * Physical Education (PE) * Reading Education * Writing Education * Religion and Education Studies * Research Assessment Exercise(RAE) * Rural Education * Science Education * Secondary Education * Second life Educators * Social Studies Education * Special Education * Student Affairs * Teacher 
Education * Cross-disciplinary areas of Education * Ubiquitous Computing * Virtual Reality * Wireless applications * Other Areas of Education Submission: - You can submit your research paper at http:// or email it to papers-2014october at Important Dates: * Extended Abstract (Work in Progress) Submission Date: August 20, 2014 * Notification of Extended Abstract (Work in Progress) Acceptance/Rejection: August 31, 2014 * Research Paper, Student Paper, Case Study, Report Submission Date: August 25, 2014 * Notification of Research Paper, Student Paper, Case Study, Report Acceptance/Rejection: September 05, 2014 * Proposal for Workshops Submission Date: July 25 2014 * Notification of Workshop Acceptance/Rejection: July 30, 2014 * Posters Proposal Submission Date: August 01, 2014 * Notification of Posters Acceptance/Rejection: August 10, 2014 * Camera Ready Paper Due: September 20, 2014 * Early Bird Registration Deadline (Authors and Participants): May 31, 2014 - September 10, 2014 * Late Bird Registration Deadline (Authors only): September 11, 2014 - October 10, 2014 * Late Bird Registration Deadline (Participants only): August 31, 2014 - October 20, 2014 * Conference Dates: October 27- 29, 2014 For further information please visit IICE-2014 at From hamon at LIMSI.FR Sun Jul 20 20:46:11 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Sun, 20 Jul 2014 22:46:11 +0200 Subject: Ecole: BigDat 2015, 23 July registration deadline Message-ID: Date: Fri, 18 Jul 2014 22:55:33 +0200 From: "GRLMC - URV" Message-ID: <001301cfa2ca$a3821ae0$6400a8c0 at GRLMC.local> X-url: ***************************************************** INTERNATIONAL WINTER SCHOOL ON BIG DATA BigDat 2015 Tarragona, Spain January 26-30, 2015 Organized by Rovira i Virgili University ***************************************************** --- 2nd registration deadline: July 23, 2014 --- ***************************************************** AIM: BigDat 2015 is a research training event for graduates and 
postgraduates in the first steps of their academic career. It aims at updating them about the most recent developments in the fast developing area of big data, which covers a large spectrum of current exciting research, development and innovation with an extraordinary potential for a huge impact on scientific discoveries, medicine, engineering, business models, and society itself. Renowned academics and industry pioneers will lecture and share their views with the audience. All big data subareas will be displayed, namely: foundations, infrastructure, management, search and mining, security and privacy, and applications. Main challenges of analytics, management and storage of big data will be identified through 4 keynote lectures and 24 six-hour courses, which will tackle the most lively and promising topics. The organizers believe outstanding speakers will attract the brightest and most motivated students. Interaction will be a main component of the event. ADDRESSED TO: Graduate and postgraduates from around the world. There are no formal pre-requisites in terms of academic degrees. However, since there will be differences in the course levels, specific knowledge background may be required for some of them. BigDat 2015 is also appropriate for more senior people who want to keep themselves updated on recent developments and future trends. They will surely find it fruitful to listen and discuss with major researchers, industry leaders and innovators. REGIME: In addition to keynotes, 3 courses will run in parallel during the whole event. Participants will be able to freely choose the courses they will be willing to attend as well as to move from one to another. VENUE: BigDat 2015 will take place in Tarragona, located 90 kms. to the south of Barcelona. The venue will be: Campus Catalunya Universitat Rovira i Virgili Av. Catalunya, 35 43002 Tarragona KEYNOTE SPEAKERS: Ian Foster (Argonne National Laboratory), tba Geoffrey C. 
Fox (Indiana University, Bloomington), Mapping Big Data Applications to Clouds and HPC C. Lee Giles (Pennsylvania State University, University Park), Scholarly Big Data: Information Extraction and Data Mining William D. Gropp (University of Illinois, Urbana-Champaign), tba COURSES AND PROFESSORS: Hendrik Blockeel (Katholieke Universiteit Leuven), [intermediate] Decision Trees for Big Data Analytics Diego Calvanese (Free University of Bozen-Bolzano), [introductory/intermediate] End-User Access to Big Data Using Ontologies Jiannong Cao (Hong Kong Polytechnic University), [introductory/intermediate] Programming with Big Data Edward Y. Chang (HTC Corporation, New Taipei City), [introductory/advanced] From Design of Distributed and Online Algorithms to Hands-on Code Lab Practice on Real Datasets Ernesto Damiani (University of Milan), [introductory/intermediate] Process Discovery and Predictive Decision Making from Big Data Sets and Streams Gautam Das (University of Texas, Arlington), [intermediate/advanced] Mining Deep Web Repositories Maarten de Rijke (University of Amsterdam), tba Geoffrey C. Fox (Indiana University, Bloomington), [intermediate] Using Software Defined Systems to Address Big Data Problems Minos Garofalakis (Technical University of Crete, Chania) [intermediate/advanced], Querying Continuous Data Streams Vasant G. Honavar (Pennsylvania State University, University Park) [introductory/intermediate], Learning Predictive Models from Big Data Mounia Lalmas (Yahoo! 
Research Labs, London), [introductory] Measuring User Engagement
Tao Li (Florida International University, Miami), [introductory/intermediate] Data Mining Techniques to Understand Textual Data
Kwan-Liu Ma (University of California, Davis), [intermediate] Big Data Visualization
Christoph Meinel (Hasso Plattner Institute, Potsdam), [introductory/intermediate] New Computing Power by In-Memory and Multicore to Tackle Big Data
David Padua (University of Illinois, Urbana-Champaign), [intermediate] Data Parallel Programming
Manish Parashar (Rutgers University, Piscataway), [intermediate] Big Data in Simulation-based Science
Srinivasan Parthasarathy (Ohio State University, Columbus), [intermediate] Scalable Data Analysis
Evaggelia Pitoura (University of Ioannina), [intermediate] Online Social Networks
Vijay V. Raghavan (University of Louisiana, Lafayette), [introductory/intermediate] Visual Analytics of Time-evolving Large-scale Graphs
Pierangela Samarati (University of Milan), [intermediate] Data Security and Privacy in the Cloud
Peter Sanders (Karlsruhe Institute of Technology), [introductory/intermediate] Algorithm Engineering for Large Data Sets
Johan Suykens (Katholieke Universiteit Leuven), [introductory/intermediate] Fixed-size Kernel Models for Big Data
Domenico Talia (University of Calabria, Rende), [intermediate] Scalable Data Mining on Parallel, Distributed and Cloud Computing Systems
Jieping Ye (Arizona State University, Tempe), [introductory/advanced] Large-Scale Sparse Learning and Low Rank Modeling

ORGANIZING COMMITTEE:
Adrian Horia Dediu (Tarragona)
Carlos Martín-Vide (Tarragona, chair)
Florentina Lilica Voicu (Tarragona)

REGISTRATION:
It has to be done at

The selection of up to 8 courses requested in the registration template is only tentative and non-binding. For the sake of organization, it will be helpful to have an approximation of the respective demand for each course.
Since the capacity of the venue is limited, registration requests will be processed on a first come, first served basis. The registration period will be closed and the on-line registration facility disabled once the venue has reached capacity. Registering prior to the event is strongly recommended.

FEES:
As far as possible, participants are expected to stay full-time. Fees are a flat rate covering attendance at all courses during the week. There are several early registration deadlines. Fees depend on the registration deadline.

ACCOMMODATION:
Suggestions for accommodation will be provided in due time.

CERTIFICATE:
Participants will receive a certificate of attendance.

QUESTIONS AND FURTHER INFORMATION:
florentinalilica.voicu at

POSTAL ADDRESS:
BigDat 2015
Lilica Voicu
Rovira i Virgili University
Av. Catalunya, 35
43002 Tarragona, Spain
Phone: +34 977 559 543
Fax: +34 977 558 386

ACKNOWLEDGEMENTS:
Universitat Rovira i Virgili

Benoist, G. Col, T. Poibeau; and the other in its regular series (Vol. 12, no. 1: ), whose table of contents I invite you to discover:

Joasha Boutault
Enough et too : expression de la suffisance et de l'excès dans les constructions « tough » en anglais

Gilles Corminboeuf and Christophe Benzitoun
Evaluation critique des modèles graduels et non graduels de l'intégration syntaxique

Lise Hamelin
Vers une analyse des marqueurs yet et still : There is still much to say

Gilbert Ghio
Temporalité et aspectualité en anglais : opérations, représentations, cognition

Sonia Benamsil
Les stéréotypes de la femme dans la caricature de Dilem Ali

Rudy Loock and Cyril Auran
Magnitude Estimation: can it do something for your pragmatics?

Tchaa Pali
L'item « y? » du miyobé (Togo/Bénin) : verbe plein, auxiliaire ou auxiliant ?

We wish you happy reading.

Best regards,
Gilles Col
Editorial Director

For more details, the description of the position in question is available at the following link:

The company offers a wide range of varied projects and good long-term career prospects. If you are interested and would like to know more, please do not hesitate to send me your up-to-date CV and your availability for an initial telephone conversation. Otherwise, you are welcome to pass this offer on to anyone who might be interested.

Best regards,

Ali RIAD
Business Consultant
Experis IT Luxembourg

WE HAVE MOVED !
Rue de l'industrie 11
Bâtiment SOLARWIND
L-8399 Windhof
T: +352 27 40 16 20 - 22
M: +352 661 209 107
ali.riad at

APPLICATIONS*
*RESEARCH ENGINEER (ingénieur de recherche, IR)*
*or STUDY ENGINEER (ingénieur d'études, IE) POSITION*
*NATURAL LANGUAGE PROCESSING*

UMR MoDyCo (UMR 7114 -- Modèles Dynamiques Corpus)
Contract duration: 6 months
Start date: 01/10/2014
Status: CNRS-type research engineer or study engineer on a fixed-term contract (CDD)

CANDIDATE PROFILE:
The required skills are those of a computer science student specialising in natural language processing (NLP).

SCIENTIFIC CONTEXT:
This work is part of the institutional and research context of the Lyrics ( and Labex PdP-CommNum (online: ) projects. It concerns the automatic classification of messages on social networks (Facebook and Twitter).

JOB DESCRIPTION:
The work programme will focus on the automatic classification of tweets using classifiers. A first experiment has been carried out (Naive Bayes and SVM classifiers). The first results lead us, on the one hand, to examine in more depth the type of linguistic features to consider and, on the other hand, to develop a methodology for defining the classes according to the communication actions to be observed.

EXPECTED SKILLS:
- Command of machine learning techniques and software (Weka, scikit-learn);
- Knowledge of social network practices;
- Good writing skills in French (writing a summary report);
- Ability to work independently and to report on one's activity.

ADMISSION REQUIREMENTS:
For an IR profile, hold a doctorate in computer science or NLP.
For an IE profile, hold a master's degree in computer science or NLP.

LOCATION:
Position based at UMR MoDyCo, Université Paris Ouest Nanterre La Défense, 200 avenue de la République, 92200 Nanterre.

HOW TO APPLY
Before 10 September 2014, send an application consisting of:
- Curriculum vitae
- Cover letter
- Copy of the master's degree or doctorate certificate
- A letter of recommendation or a contact
The application must be sent by email to the following address: myriam.djedi at

IR SALARY: 2000 EUR
IE SALARY: 1700 EUR

FOR MORE INFORMATION:
Brigitte Juanals, by email: brigitte.juanals at
Jean-Luc Minel: jean-luc.minel at

The term SENTIRE comes from the Latin for "feel" and is the root of words such as sentiment and sensation. SENTIRE aims to provide an international forum for researchers in the field of opinion mining and sentiment analysis to share information on their latest investigations in social information retrieval and their applications both in academic research areas and industrial sectors. The broader context of the workshop comprehends Web mining, AI, Semantic Web, information retrieval and natural language processing. The workshop is going to be held in Shenzhen on 14th December 2014. For more information, please visit:

RATIONALE

Memory and data capacities double approximately every two years and, apparently, the Web is following the same rule. User-generated contents, in particular, are an ever-growing source of opinion and sentiments which are continuously spread worldwide through blogs, wikis, fora, chats and social networks. The distillation of knowledge from such sources is a key factor for applications in fields such as commerce, tourism, education and health, but the quantity and the nature of the contents they generate make it a very difficult task. Due to such challenging research problems and the wide variety of practical applications, opinion mining and sentiment analysis have become very active research areas in the last decade. Our understanding and knowledge of the problem and its solution are still limited, as natural language understanding techniques are still pretty weak. Most current research in sentiment analysis, in fact, merely relies on machine learning algorithms. Such algorithms, despite most of them being very effective, produce no human-understandable results, so we know little about how and why output values are obtained. All such approaches, moreover, rely on the syntactic structure of text, which is far from the way the human mind processes natural language.
Next-generation opinion mining systems should employ techniques capable of better grasping the conceptual rules that govern sentiment and the clues that can convey these concepts from realization to verbalization in the human mind.

TOPICS
Topics of interest include but are not limited to:
- Sentiment identification & classification
- Opinion and sentiment summarization & visualization
- Explicit & latent semantic analysis for sentiment mining
- Concept-level opinion and sentiment analysis
- Sentic computing
- Opinion and sentiment search & retrieval
- Time evolving opinion & sentiment analysis
- Semantic multidimensional scaling for sentiment analysis
- Multidomain & cross-domain evaluation
- Domain adaptation for sentiment classification
- Multimodal sentiment analysis
- Multimodal fusion for continuous interpretation of semantics
- Multilingual sentiment analysis & re-use of knowledge bases
- Knowledge base construction & integration with opinion analysis
- Transfer learning of opinion & sentiment with knowledge bases
- Sentiment corpora & annotation
- Affective knowledge acquisition for sentiment analysis
- Biologically inspired opinion mining
- Sentiment topic detection & trend discovery
- Big social data analysis
- Social ranking
- Social network analysis
- Social media marketing
- Comparative opinion analysis
- Opinion spam detection

TIMEFRAME
- August 1st, 2014: Submission deadline
- September 26th, 2014: Notification of acceptance
- October 20th, 2014: Final manuscripts due
- December 14th, 2014: Workshop date

SUBMISSIONS AND PROCEEDINGS
Authors are
required to follow IEEE Computer Society Press Proceedings Author Guidelines. The paper length is limited to 10 pages, including references, diagrams, and appendices, if any. Manuscripts are to be submitted through EasyChair. Each submitted paper will be evaluated by three PC members with respect to its novelty, significance, technical soundness, presentation, and experiments. Accepted papers will be published in IEEE ICDM proceedings. Selected, expanded versions of papers presented at the workshop will be invited to a forthcoming Special Issue of Cognitive Computation on opinion mining and sentiment analysis. ORGANIZERS - Erik Cambria, Nanyang Technological University (Singapore) - Bing Liu, University of Illinois at Chicago (USA) - Yunqing Xia, Tsinghua University (China) - Yongzheng Zhang, LinkedIn Inc. (USA) From hamon at LIMSI.FR Fri Jul 25 19:51:25 2014 From: hamon at LIMSI.FR (Thierry Hamon) Date: Fri, 25 Jul 2014 21:51:25 +0200 Subject: Appel: Deadline extension for SSST-8 (EMNLP 2014) Message-ID: Date: Wed, 23 Jul 2014 13:56:57 -0400 From: "Carpuat, Marine" Message-ID: X-url: Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) EMNLP 2014 / SIGMT / SIGLEX Workshop Oct 2014, Doha, Qatar * New submission deadline for papers and abstracts: August 1st, 2014 * * Special theme: Compositional Distributional Semantics and Machine Translation * The Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8) seeks to bring together a large number of researchers working on diverse aspects of structure, semantics and representation in relation to statistical machine translation. 
Since its first edition in 2006, its program each year has comprised high-quality papers discussing current work spanning topics including: new grammatical models of translation; new learning methods for syntax- and semantics-based models; formal properties of synchronous/transduction grammars (hereafter S/TGs); discriminative training of models incorporating linguistic features; using S/TGs for semantics and generation; and syntax- and semantics-based evaluation of machine translation. We invite two types of submissions this year: 1. Extended abstracts for poster or hands-on presentations on the special theme 2. Full papers spanning all areas of interest for SSST =========================== Special Theme Extended Abstracts =========================== This year, the special theme of semantics of the past three editions of SSST takes a new step with a "working workshop" bringing together researchers interested in compositional distributional semantics, distributed representations, and continuous vector space models in MT, with tutorials bridging both directions, as well as discussions and hands-on work on relevant tasks with real data. Such models have proven beneficial for a number of NLP tasks, for example phrasal similarity, lexical entailment, modeling semantic deviance, detecting order restrictions in recursive structures, or improving NP bracketing in parsing. However, they have not received as much attention in MT. Extended abstracts of at most two (2) pages should describe poster or hands-on presentations that will stimulate discussions on the special theme of compositional distributional semantics and machine translation, including position papers, recent work, pilot studies, negative results. We encourage the presentation of relevant work that has been published or submitted elsewhere, as well as new work in progress. 
========= Full Papers ========= The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is now wide consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and similar notational equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages. Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations. At the same time, SMT has seen a movement toward semantics over the past few years, which has been reflected at recent SSST workshops, including the last three editions which had semantics for SMT as a special theme. The issues of deep syntax and shallow semantics are closely linked and SSST-8 continues to encourage submissions on semantics for MT in a number of directions, including semantic role labeling, sense disambiguation, and compositional distributional semantics for translation and evaluation. 
We invite papers on:
- syntax-based / semantics-based / tree-structured SMT
- machine learning techniques for inducing structured translation models
- algorithms for training, decoding, and scoring with semantic representation structure
- empirical studies on adequacy and efficiency of formalisms
- creation and usefulness of syntactic/semantic resources for MT
- formal properties of synchronous/transduction grammars
- learning semantic information from monolingual, parallel or comparable corpora
- unsupervised and semi-supervised word sense induction and disambiguation methods for MT
- lexical substitution, word sense induction and disambiguation, semantic role labeling, textual entailment, paraphrase and other semantic tasks for MT
- semantic features for MT models (word alignment, translation lexicons, language models, etc.)
- evaluation of syntactic/semantic components within MT (task-based evaluation)
- scalability of structured translation methods to small or large data
- applications of S/TGs to related areas including:
  - speech translation
  - formal semantics and semantic parsing
  - paraphrases and textual entailment
  - information retrieval and extraction
- syntactically- and semantically-motivated evaluation of MT
- compositional distributional semantics in MT
- distributed representations and continuous vector space models in MT

=========
Organizers
=========
Dekai WU, Hong Kong University of Science and Technology (HKUST)
Marine CARPUAT, National Research Council (NRC) Canada
Xavier CARRERAS, Universitat Politècnica de Catalunya (UPC)
Eva Maria VECCHI, Cambridge University

=============
Important Dates
=============
Submission deadline for papers and extended abstracts: 1 Aug 2014
Notification to authors: 26 Aug 2014
Camera copy deadline: 15 Sep 2014

For more information

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list
Information, subscription:
English version:
Archives:

The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership:

ATALA declines all responsibility for the content of messages distributed on the LN list
-------------------------------------------------------------------------

From hamon at LIMSI.FR Fri Jul 25 19:56:13 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Fri, 25 Jul 2014 21:56:13 +0200
Subject: Sujet de these: PhD thesis offer in France/Learning from Post-Edition in Machine Translation / LIFL (Lille) and LIG (Grenoble)
Message-ID:

Date: Wed, 23 Jul 2014 23:26:34 +0200
From: Laurent Besacier
Message-Id: <75FD7EE1-DFCB-4AC7-A8D4-4B3AA30BCA57 at>

PhD thesis offer in France / Learning from Post-Edition in Machine Translation / LIFL (Lille) and LIG (Grenoble)

Contacts:
Olivier Pietquin: olivier.pietquin at
Laurent Besacier: laurent.besacier at

Problem
Statistical Machine Translation (SMT) is the process by which texts are automatically translated from a source language to a target language by a machine that has been trained on corpora in both languages. Thanks to progress in the training of SMT engines, machine translation has become good enough that it is now advantageous for translators to post-edit machine outputs rather than translate from scratch. However, current methods for improving SMT systems from human post-edition (PE) are rather basic: the post-edited output is added to the training corpus and the translation model and language model are re-trained, with no clear view of how much has been improved and how much is left to be improved. Moreover, the final PE result is the only feedback used: available technologies do not take advantage of logged sequences of post-edition actions, which shed light on the cognitive processes of the post-editor. The proposed thesis aims at using the post-edition process as a demonstration of how an expert translator modifies the SMT result to produce a perfect translation.
Learning from demonstration is an emerging field in machine learning, mostly applied to robotics [1], which will thus be explored further in the particular framework of SMT.

Topic of research
A novel approach to SMT training will be adopted in this thesis, i.e. considering the post-edition process as a sequential decision making process performed by human experts who should be imitated. This thesis's first fundamental contribution to SMT will be to reformulate the problem of post-edition in SMT as a sequential decision making problem [4]. Indeed, the hypothesis selection and ranking process occurring in an SMT system can be seen as an action selection strategy, choosing after each post-edition step amongst a large number of actions (all possible hypotheses and rankings). This strategy has to be modified according to post-edition results that arise sequentially and are influenced by previous actions (hypothesis selection) of the system. From this, SMT will be cast as an imitation learning problem, that is, learning from demonstrations made by an expert: post-edition results can be seen as examples of what the system should do, again in a sequential decision making process and not in a static one such as supervised learning. Indeed, SMT decoding, whether it is based on phrases or chunks, can be seen as a sequential decision making process. The sequences of decisions taken by an expert during the post-edition process can be seen as a target for the system, which will try to imitate them in similar situations. To do so, we will extend the work described in [2], which modelled semantic parsing as an Inverse Reinforcement Learning (IRL) [3] problem. In addition, the question of automatically selecting the sentences that should be used for post-edition and further learning will be addressed. In particular, this will be studied under the active learning paradigm.
Large and diversified amounts of post-edited data, collected in an industrial setting, will be made available for the research project.

Profile
Applicants must hold an Engineering or Master's degree in Computational Linguistics or computer science, preferably with experience in the fields of statistical machine learning and/or natural language processing. A good background in programming will also be required. He/she will also be involved in a research project, funded by the French National Agency for Research, involving 2 research labs (LIFL in Lille and LIG in Grenoble) and a company (Lingua & Machina). For this reason a good level of English is required (a good command of French being a plus). Finally, effective communication skills in English, both written and verbal, are mandatory.

Context
The candidate will be hired by University Lille 1 in the framework of a national research project. S/he will mainly be hosted in the SequeL (Sequential Learning) team of the Laboratoire d'Informatique Fondamentale de Lille (LIFL). SequeL is also a common team-project with INRIA (national institute for research in computer science and mathematics) and especially the INRIA Lille - Nord Europe Center. The group involves around 25 researchers working on sequential learning and is internationally recognized. Lille is the largest city of the north of France, a metropolis with 1 million inhabitants, with excellent train connections to Brussels (30 min), Paris (1h) and London (1h30). This thesis will be supervised in strong collaboration with the GETALP team of Laboratoire d'Informatique de Grenoble (LIG), widely renowned for its research on natural language and speech processing. Grenoble is a high-tech city with 4 universities. It is located at the heart of the Alps, in outstanding scientific and natural surroundings. It is 3h by train from Paris; 2h from Geneva; 1h from Lyon; 2h from Torino and is less than 1h from Lyon international airport.
The PhD thesis will be co-supervised by Olivier Pietquin in Lille and Laurent Besacier in Grenoble.

Contacts
Interviews will be held in Sept 2014. Meetings during Interspeech 2014 in Singapore can also be organized. For further info, please contact:
Olivier Pietquin: olivier.pietquin at
Laurent Besacier: laurent.besacier at

References
[1] Brenna D. Argall, Sonia Chernova, Manuela Veloso, and Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5):469-483, May 2009.
[2] Gergely Neu and Csaba Szepesvári. Training parsers by inverse reinforcement learning. Machine Learning, 77(2-3):303-337, 2009.
[3] Andrew Y. Ng and Stuart J. Russell. Algorithms for inverse reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning, ICML '00, pages 663-670, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.
[4] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. The MIT Press, 3rd edition, March 1998.

-------------------------------------------------------------------------

From hamon at LIMSI.FR Fri Jul 25 20:01:52 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Fri, 25 Jul 2014 22:01:52 +0200
Subject: Appel: Deadline Extension and Paper Length Change for Arabic Natural Language Processing Workshop at EMNLP 2014
Message-ID:

Date: Thu, 24 Jul 2014 10:06:49 -0700
From: Wajdi Zaghouani
Message-ID: <1406221609.28486.YahooMailNeo at>
X-url:

Dear all,

In response to several requests, we have decided the following:
(1) The submission deadline is extended to July 28 11:59pm (UTC/GMT -11 hours). This applies to both the main workshop papers and the shared task descriptions.
(2) The page limit for main workshop papers is extended to 9 pages plus any number of reference pages. The shared task descriptions are still limited to 4 pages with 2 additional pages of references.

Regards,
Workshop Organizers

=======================================================
Last Call for Papers and Participation
EMNLP Workshop on Arabic Natural Language Processing
Including Shared Task on Automatic Arabic Error Correction
Apologies for multiple postings
Please distribute to colleagues
=======================================================
Last Call for Papers and Participation
Arabic Natural Language Processing Workshop
co-located with EMNLP 2014, Doha, Qatar
Workshop date: Saturday October 25, 2014
Paper submission deadline: July 26, 2014
Workshop Website:
Shared Task Website:
=======================================================

WORKSHOP DESCRIPTION
There has been a lot of progress in the last 15 years in the area of Arabic Natural Language Processing (NLP). Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences.
This workshop follows in the footsteps of previous efforts to provide a forum for researchers to share and discuss their ongoing work. We invite submissions on topics that include, but are not limited to, the following: * Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, sentiment analysis, Arabic dialect modeling, etc. * Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media, etc. * Resources: dictionaries, annotated data, specialized databases etc. Submissions may include work in progress as well as finished work. Submissions must have a clear focus on specific issues pertaining to the Arabic language whether it is standard Arabic, dialectal, or mixed. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work. Submissions are expected to be 8 pages long plus 2 pages for references. Associated with the workshop will be a shared task on Arabic text error correction (see link to Shared Task Website above). 
IMPORTANT DATES
Paper submission deadline: July 26, 2014 => July 28 11:59pm (UTC/GMT -11 hours)
Author notification: August 26, 2014
Camera Ready: September 15, 2014
Workshop: October 25, 2014

ORGANIZERS
Program Co-chairs
Nizar Habash, Columbia University
Stephan Vogel, Qatar Computing Research Institute

Publication Co-chairs
Nadi Tomeh, Paris 13 University
Houda Bouamor, Carnegie Mellon University Qatar

Website Committee
Kareem Darwish, Qatar Computing Research Institute
Noura Farra, Columbia University

Shared Task Committee
Behrang Mohit (co-chair), Carnegie Mellon University Qatar
Alla Rozovskaya (co-chair), Columbia University
Wajdi Zaghouani, Carnegie Mellon University Qatar
Ossama Obeid, Carnegie Mellon University Qatar
Nizar Habash (advisor), Columbia University

Program Committee Members
Abdelmajid Ben-Hamadou, University of Sfax, Tunisia
Abdelhadi Soudi, Ecole Nationale de l'Industrie Minérale, Morocco
Abdelsalam Nwesri, University of Tripoli, Libya
Achraf Chalabi, Microsoft Research, Egypt
Ahmed Ali, Qatar Computing Research Institute, Qatar
Ahmed Rafea, The American University in Cairo, Egypt
Alexis Nasr, University of Marseille, France
Ali Farghaly, Monterey Peninsula College, USA
Almoataz B. Al-Said, Cairo University, Egypt
Alon Lavie, Carnegie Mellon University, USA
Aly Fahmy, Cairo University, Egypt
Azadeh Shakery, University of Tehran, Iran
Azzeddine Mazroui, University Mohamed I, Morocco
Bassam Haddad, University of Petra, Jordan
Bayan Abu Shawar, Arab Open University, Jordan
Behrang Mohit, Carnegie Mellon University Qatar, Qatar
Eric Atwell, University of Leeds, UK
Farhad Oroumchian, University of Wollongong, Australia
Ghassan Mourad, Université Libanaise, Lebanon
Hassan Sawaf, eBay Inc., USA
Hazem Hajj, American University of Beirut, Lebanon
Hend Alkhalifa, King Saud University, Saudi Arabia
Houda Bouamor, Carnegie Mellon University Qatar, Qatar
Imed Zitouni, Microsoft Research, USA
Joseph Dichy, Université
Lyon 2, France
Karim Bouzoubaa, Mohammad V University, Morocco
Karine Megerdoomian, The MITRE Corporation, USA
Katrin Kirchhoff, University of Washington, USA
Kemal Oflazer, Carnegie Mellon University Qatar, Qatar
Khaled Shaalan, The British University in Dubai, UAE
Khaled Shaban, Qatar University, Qatar
Khalil Sima'an, Universiteit van Amsterdam, Netherlands
Lamia Hadrich Belguith, University of Sfax, Tunisia
Michael Rosner, University of Malta, Malta
Mohamed Elmahdy, Qatar University, Qatar
Mohsen Rashwan, Cairo University, Egypt
Mona Diab, George Washington University, USA
Mustafa Jarrar, Bir Zeit University, Palestine
Nada Ghneim, Higher Institute for Applied Sciences and Technology, Syria
Nadi Tomeh, University Paris 13, France
Ossama Emam, IBM, USA
Otakar Smrž, Džám-e Džam Language Institute, Czech Republic
Owen Rambow, Columbia University, USA
Preslav Nakov, Qatar Computing Research Institute, Qatar
Ramzi Abbes, TECHLIMED, France
Salwa Hamada, Cairo University, Egypt
Shahram Khadivi, Tehran Polytechnic, Iran
Sherri Condon, The MITRE Corporation, USA
Taha Zerrouki, University of Bouira, Algeria
Violetta Cavalli-Sforza, Al Akhawayn University, Morocco

-------------------------------------------------------------------------

From hamon at LIMSI.FR Fri Jul 25 20:08:40 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Fri, 25 Jul 2014 22:08:40 +0200
Subject: Appel: MICAI 2014, keynotes Vapnik (SVM), Sowa (conceptual graphs), Liu (opinion mining), Castillo (fuzzy logic)
Message-ID:

Date: Fri, 25 Jul 2014 07:24:31 -0500
From: "MICAI 2014"
Message-ID: <004a01cfa803$68b9a980$3a2cfc80$>

MICAI-2014: 13th Mexican International Conference on ARTIFICIAL INTELLIGENCE

One-week draft abstracts deadline reminder
EXCELLENT KEYNOTES: Vapnik, Sowa, Liu, Castillo
Excellent touristic program

Workshops:
- 7th Intelligent Learning Environments
- 1st Recognizing Textual Entailment and Question Answering
- 7th Hybrid Intelligent Systems

Tutorials:
- John Sowa: Conceptual graphs and knowledge representation
- Bing Liu: Opinion mining and sentiment analysis
- Oscar Castillo: Fuzzy logic and more

November 16 to 22, 2014 - Tuxtla Gutiérrez, Chiapas, Mexico

Publication: Springer LNAI (EI, ISI), IEEE CPS, journals
Submission: July 31 draft abstract / Aug 6 full text (late submissions can be considered)
Topics: all areas of Artificial Intelligence, research or applications. Workshops. Tutorials. Doctoral Consortium. Best paper awards.

KEYNOTES:
- Vladimir Vapnik (NEC Lab): inventor of SVM
- John Sowa (VivoMind Research; IBM): inventor of conceptual graphs
- Bing Liu (U. of Illinois): opinion mining
- Oscar Castillo (Tijuana IT): fuzzy logic

Proceedings: Springer LNAI (EI, ISI); IEEE CPS, special issues of journals (including ISI JCR).
Venue: Tuxtla Gutiérrez, Chiapas, Mexico.
Cultural program and tours: Sumidero canyon; El Chiflón waterfalls; Tenam Puente ancient pyramids; San Cristobal de las Casas colonial city (anticipated).
Dates (late submissions are possible):
July 31: draft abstract -- just a general idea of what your paper will be about (takes 1 minute; you can change the text later);
Aug 6: full text for blind review.

See complete CFP at

PLEASE CIRCULATE this CFP among your colleagues and students. We apologize if you receive multiple copies. Please reply to this message if you were contacted by error.

The JéTou 2015 organizing committee

-------------------------------------------------------------------------

From hamon at LIMSI.FR Tue Jul 29 20:15:31 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 29 Jul 2014 22:15:31 +0200
Subject: Ressources: Page dediee aux formats de transcriptions et metadonnees corpus oraux, IRCOM
Message-ID:

Date: Mon, 28 Jul 2014 11:12:58 +0200
From: Christophe Benzitoun
Message-ID: <53D6141A.6050008 at>
X-url:

Dear colleagues,

Following the round table held in Paris on 23 June, we are pleased to announce the creation of the following page, which summarizes a number of the presentations:

The main aim of this round table was to address questions of transcription formats and metadata for spoken corpora. The goal was to take stock of completed and ongoing projects (mainly in France and in French-speaking countries). In particular, the day resulted in a summary document of needs and existing data, which can be consulted at the address above.

Best regards,
Christophe Benzitoun, on behalf of the round table organizing committee (Olivier Baude, Carole Etienne, Christophe Parisse)
With the support of IRCOM and ORTOLANG

-------------------------------------------------------------------------

From hamon at LIMSI.FR Tue Jul 29 20:13:29 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 29 Jul 2014 22:13:29 +0200
Subject: Revue: Inaugural Issue, International Journal of Big Data Intelligence, Free-of-Charge
Message-ID:

Date: Sun, 27 Jul 2014 06:03:03 +0800
From: cfp at
Message-Id: <201407262203.s6QM33rD021510 at>
X-url:
X-url:

=== Apologies if you receive multiple copies of this message ===

Dear Colleagues,

We are happy to announce that the International Journal of Big Data Intelligence (IJBDI) has now published its first issue, and we would like to invite you to read the articles listed below.

IJBDI 2014 Vol. 1 No. 1/2

*Big data (lost) in the cloud
Beniamino Di Martino; Rocco Aversa; Giuseppina Cretella; Antonio Esposito; Joanna Kołodziej
*Designing and implementing a cloud-hosted SaaS for data movement and sharing with SlapOS
Walid Saad; Heithem Abbes; Mohamed Jemni; Christophe Cérin
*Multi-source streaming-based data accesses for MapReduce systems
Jiadong Wu; Bo Hong
*A new approach for accurate distributed cluster analysis for Big Data: competitive K-Means
Rui Máximo Esteves; Thomas Hacker; Chunming Rong
*Peculiarities of numerical algorithms parallel implementation for exa-flops multicomputers
Victor E.
Malyshkin
*Towards quality-of-service driven consistency for Big Data management
Álvaro García-Recuero; Sérgio Esteves; Luís Veiga
*D-CEP4CMA: a dynamic architecture for cloud performance monitoring and analysis via complex event processing
Afef Mdhaffar; Riadh Ben Halima; Mohamed Jmaiel; Bernd Freisleben
*An extended analytical study of Arabic sentiments
Nawaf A. Abdulla; Mahmoud Al-Ayyoub; Mohammed Naji Al-Kabi
*Health big data analytics: current perspectives, challenges and potential solutions
Mu-Hsing Kuo; Tony Sahama; Andre W. Kushniruk; Elizabeth M. Borycki; Daniel K. Grunwell

IJBDI is a peer-reviewed journal. It provides a vehicle to help professionals, academics, researchers, scientists, engineers, educators, and policy makers working in the field of data science and management to demonstrate and explore current advances in all aspects of big data. IJBDI aims to be a leading journal in the interdisciplinary field of big data intelligence. It encourages and publishes high-quality submissions of articles on the following subjects in this field: big data science and foundations, big data infrastructure, big data management, big data intelligence, big data privacy/security and big data applications.

We are writing to invite you to submit an article to IJBDI, which provides a rapid forum for the dissemination of original research articles. The IJBDI has a distinguished Editorial Board with extensive academic qualifications, ensuring that the journal maintains high scientific standards and has a broad international coverage. All published articles will be arranged for abstracting and indexing services.

Manuscripts should be submitted to the journal online at

Once a manuscript has been accepted for publication, it will undergo language copyediting, typesetting, and reference validation in order to provide the highest publication quality possible. Please do not hesitate to contact us if you have any questions about the journal.
We look forward to reading and publishing your work! Kind regards,
Robert Hsu, Editor-in-Chief
International Journal of Big Data Intelligence

To subscribe other emails or see information of this mailing list, please go to
To unsubscribe, please click at
For other questions, please send email to cfp-admin at

In addition, the student will be financially supported to attend conferences and schools, and to spend a period of 6-12 months abroad. The PhD programme lasts three years. The official language of the programme and of the faculty is English. Research in the Faculty of Computer Science is divided among three research centres (see below). Candidates are strongly advised to get in contact with the desired research centre before applying.

Bolzano is the city's Italian name while Bozen is its German name: it is the capital of the multilingual province of Alto Adige / Südtirol. Near Italy's northern border with Austria, the city is a gateway to the Dolomites, the majestic white mountain peaks that are part of the Alps. Bolzano is an Italian city with Austrian flair. Its two lifestyles, one Northern European and the other more Mediterranean, combine to make the perfect union, which can be clearly seen in the historic and artistic treasures of this city. Bolzano is consistently among the top-ranked cities in Italy when it comes to quality of life. It has one of Europe's lowest unemployment rates, excellent services and a wonderful landscape. The landscape of the surrounding mountain sides is characterised by old wine hamlets and villages nestling amid vineyards watched over by 200 castles, stately country houses and ruins (see the wonderful video).

Instructions for application (pre-enrollment):
The application deadline is 29th August 2014.
RESEARCH CENTRES:

KRDB RESEARCH CENTRE FOR KNOWLEDGE AND DATA
- web page:
- contact person: Enrico Franconi franconi at
- Logic based languages for knowledge representation
- Intelligent access to databases
- Semantic technologies
- Visual and verbal paradigms for information exploration
- Temporal aspects of data and knowledge
- Extending database technologies
- Inter-operation, verification, and composition of business processes

The research topics in knowledge representation are focused on foundational and practical aspects of knowledge representation technologies applied to information systems. The whole life cycle ranging from the design to the deployment of such technologies is covered: the conceptual modelling of various types of knowledge, the linguistic and logical aspects of knowledge, the integration of heterogeneous knowledge sources, including information coming from the Internet, the usage of knowledge to support the intelligent retrieval of information, and the usage of knowledge to create virtual services on the net.

INFORMATION AND DATABASE SYSTEMS ENGINEERING
- Spatial and temporal databases
- Approximation Techniques in databases
- Query optimisation in databases
- Cooperative interfaces for information access and filtering
- Data mining techniques for preference elicitation and recommendation
- Cloud computing and big data
- Agile development & human aspects of software engineering
- Software startups and open science
- Design based Hardware engineering
- Technology enhanced learning

The research activities in the area of database and information systems focus on key aspects of applied computer science, including data warehousing and data mining, the integration of heterogeneous and distributed databases, time-varying information, data models, and query processing. The research approach is primarily constructive in its outset, and it includes substantial experimental and analytical elements.
The development activities cover the design of data models and structures, and the development of algorithms, data structures, languages, and systems. The experimental activities verify real world artifacts with the help of prototypes and simulations. The analytic activities include the analysis of the algorithmic complexity and the evaluation of languages. The main goal is theoretically sound results that solve real world problems.

SOFTWARE ENGINEERING
- Agile methods, lean management, and open source
- Measurement and study of software quality, reliability, evolution and reuse
- Distributed computing and service-oriented architectures (mobile and distributed services)
- IT and business alignment
- Software reuse and component based development
- Interoperability in collaborative systems
- IT for automation
- Energy-aware systems

The research topics in software engineering are focused on the empirical and quantitative study of innovative models for software development. The target analysis techniques include both traditional statistics and new approaches, such as computational intelligence, Bayesian models, and meta-analytical systems. The innovative software development techniques include (a) methods based on lean management, such as agile methods, with a specific interest in benchmarking and identification of defects, and (b) open source development models.

Following the tradition of the diverse PhD training events in the field developed at Rovira i Virgili University in Tarragona since 2002, LATA 2015 will reserve significant room for young scholars at the beginning of their career. It will aim at attracting contributions from classical theory fields as well as application areas.

VENUE:

LATA 2015 will take place in Nice, the second largest French city on the Mediterranean coast. The venue will be the University Castle at Parc Valrose.
SCOPE:

Topics of either theoretical or applied interest include, but are not limited to:

algebraic language theory
algorithms for semi-structured data mining
algorithms on automata and words
automata and logic
automata for system analysis and programme verification
automata networks
automata, concurrency and Petri nets
automatic structures
cellular automata
codes
combinatorics on words
computational complexity
data and image compression
descriptional complexity
digital libraries and document engineering
foundations of finite state technology
foundations of XML
fuzzy and rough languages
grammars (Chomsky hierarchy, contextual, unification, categorial, etc.)
grammatical inference and algorithmic learning
graphs and graph transformation
language varieties and semigroups
language-based cryptography
parallel and regulated rewriting
parsing
patterns
power series
string and combinatorial issues in bioinformatics
string processing algorithms
symbolic dynamics
term rewriting
transducers
trees, tree languages and tree automata
unconventional models of computation
weighted automata

STRUCTURE:

LATA 2015 will consist of:

invited talks
invited tutorials
peer-reviewed contributions

INVITED SPEAKERS:

to be announced

PROGRAMME COMMITTEE:

Andrew Adamatzky (West of England, Bristol, UK)
Andris Ambainis (Latvia, Riga, LV)
Franz Baader (Dresden Tech, DE)
Rajesh Bhatt (Massachusetts, Amherst, US)
José-Manuel Colom (Zaragoza, ES)
Bruno Courcelle (Bordeaux, FR)
Erzsébet Csuhaj-Varjú (Eötvös Loránd, Budapest, HU)
Aldo de Luca (Naples Federico II, IT)
Susanna Donatelli (Turin, IT)
Paola Flocchini (Ottawa, CA)
Enrico Formenti (Nice, FR)
Tero Harju (Turku, FI)
Monika Heiner (Brandenburg Tech, Cottbus, DE)
Yiguang Hong (Chinese Academy, Beijing, CN)
Kazuo Iwama (Kyoto, JP)
Sanjay Jain (National Singapore, SG)
Maciej Koutny (Newcastle, UK)
Antonín Kučera (Masaryk, Brno, CZ)
Thierry Lecroq (Rouen, FR)
Salvador Lucas (Valencia Tech, ES)
Veli Mäkinen (Helsinki, FI)
Carlos Martín-Vide (Rovira i Virgili, Tarragona, ES, chair)
Filippo Mignosi (L'Aquila, IT)
Victor Mitrana (Madrid Tech, ES)
Ilan Newman (Haifa, IL)
Joachim Niehren (INRIA, Lille, FR)
Enno Ohlebusch (Ulm, DE)
Arlindo Oliveira (Lisbon, PT)
Joël Ouaknine (Oxford, UK)
Wojciech Penczek (Polish Academy, Warsaw, PL)
Dominique Perrin (ESIEE, Paris, FR)
Alberto Policriti (Udine, IT)
Sanguthevar Rajasekaran (Connecticut, Storrs, US)
Jörg Rothe (Düsseldorf, DE)
Frank Ruskey (Victoria, CA)
Helmut Seidl (Munich Tech, DE)
Ayumi Shinohara (Tohoku, Sendai, JP)
Bernhard Steffen (Dortmund, DE)
Frank Stephan (National Singapore, SG)
Paul Tarau (North Texas, Denton, US)
Andrzej Tarlecki (Warsaw, PL)
Jacobo Torán (Ulm, DE)
Frits Vaandrager (Nijmegen, NL)
Jaco van de Pol (Twente, Enschede, NL)
Pierre Wolper (Liège, BE)
Zhilin Wu (Chinese Academy, Beijing, CN)
Slawomir Zadrozny (Polish Academy, Warsaw, PL)
Hans Zantema (Eindhoven Tech, NL)

ORGANIZING COMMITTEE:

Sébastien Autran (Nice)
Adrian Horia Dediu (Tarragona)
Enrico Formenti (Nice, co-chair)
Sandrine Julia (Nice)
Carlos Martín-Vide (Tarragona, co-chair)
Christophe Papazian (Nice)
Julien Provillard (Nice)
Pierre-Alain Scribot (Nice)
Bianca Truthe (Giessen)
Florentina Lilica Voicu (Tarragona)

SUBMISSIONS:

Authors are invited to submit non-anonymized papers in English presenting original and unpublished research. Papers should not exceed 12 single-spaced pages (including any appendices, references, etc.) and should be prepared according to the standard format for Springer Verlag's LNCS series.

Submissions have to be uploaded to:

PUBLICATIONS:

A volume of proceedings published by Springer in the LNCS series will be available by the time of the conference.

A special issue of a major journal will be published later, containing peer-reviewed, substantially extended versions of some of the papers contributed to the conference. Submissions to it will be by invitation.

REGISTRATION:

The period for registration is open from July 21, 2014 to March 2, 2015. The registration form can be found at:

DEADLINES:

Paper submission: October 10, 2014 (23:59 CET)
Notification of paper acceptance or rejection: November 18, 2014
Early registration: November 25, 2014
Final version of the paper for the LNCS proceedings: November 26, 2014
Late registration: February 16, 2015
Submission to the journal special issue: June 6, 2015

QUESTIONS AND FURTHER INFORMATION:
florentinalilica.voicu at

POSTAL ADDRESS:
LATA 2015
Research Group on Mathematical Linguistics (GRLMC)
Rovira i Virgili University
Av. Catalunya, 35
43002 Tarragona, Spain
Phone: +34 977 559 543
Fax: +34 977 558 386

ACKNOWLEDGEMENTS:
Nice Sophia Antipolis University
Rovira i Virgili University

Big Data is transforming science, engineering, medicine, healthcare, finance, business, and ultimately society itself. IEEE Big Data has established itself as the top-tier research conference in Big Data. The first conference, IEEE Big Data 2013, was held in Santa Clara, CA from Oct 6-7, 2013, and received 259 paper submissions for the main conference and 32 paper submissions for the industry and government program. Of those, 44 regular papers and 53 short papers were accepted, which translates into a selectivity that is on par with top-tier conferences. There were also 14 workshops associated with IEEE Big Data 2013, covering important topics related to various aspects of Big Data research, development and applications, and more than 400 participants from 40 countries attended the 4-day event.

The IEEE International Conference on Big Data 2014 (IEEE BigData 2014) continues the success of IEEE BigData 2013. We expect to have an exciting program: IEEE Big Data 2014 has received 271 paper submissions for the main conference and 37 paper submissions for the industry and government program. There are also 21 associated workshops covering many emerging research areas. If you miss the deadline to submit a paper to the conference, you are encouraged to submit your research work to one of the workshops or to the poster program.

I. 21 Workshops (most of the workshop paper submission deadlines are in late August)

1. Scholarly Big Data: Challenges & Issues
2. The 2nd Workshop on Scalable Machine Learning: Theory and Applications
3. 1st International Workshop on High Performance Big Graph Data Management, Analysis, and Mining
4. Big Data in Motion and Big Data at Rest
5. Workshop on Enterprise Big Data Semantic and Analytics Modeling
6. The Second Workshop on Distributed Storage Systems and Coding for Big Data
7. First IEEE International Workshop on Big Data Security and Privacy (BDSP 2014)
8. The 2nd International Workshop of BigData in Bioinformatics and Healthcare Informatics
9. Solar Astronomy Big Data (SABiD) - 1st Workshop on Management, Search and Mining of Massive Repositories of Solar Astronomy Data
10. Using Big Data to Understand Spatial Connectivity
11. CASK-14: 1st International Workshop on Collaborative methodologies to Accelerate Scientific Knowledge discovery in big data
12. Rapid Response Cyber Forensics Workshop
13. First Hands-On Workshop on Leveraging High Performance Computing Resources for Managing Large Datasets
14. Workshop on BigData and Service Discovery
15. Workshop on Advances in Software and Hardware for Big Data to Knowledge Discovery (ASH)
16. IEEE Big Data Workshop on Semantics for Big Data on the Internet of Things (SemBIoT 2014)
17. Big Data in Computational Epidemiology
18. Large Scale Data Analytics in Transportation and Railway Infrastructure
19. 2nd Workshop on Scalable Cloud Data Management
20. Big Humanities Data
21. Complexity for Big Data

II. Poster (Submission deadline: Sept 27, 2012)

Poster abstracts are limited to one page, must be camera-ready, and must follow the same formatting requirements as the main papers.

Online Submission:

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list
Information, subscription:
English version:
Archives:

The LN list is sponsored by ATALA (Association pour le Traitement Automatique des Langues)
Information and membership:

ATALA declines all responsibility for the content of messages distributed on the LN list
-------------------------------------------------------------------------

From hamon at LIMSI.FR Tue Jul 29 21:19:28 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 29 Jul 2014 23:19:28 +0200
Subject: Conf: Coling 2014, Programme, 25-29 August 2014, Dublin, Ireland

Date: Tue, 29 Jul 2014 14:34:38 +0000
From: COLING 2014 - Registration

Less than 1 month left to register!

Details are available for the Main Conference, 1 or 2 day Workshops, half day Tutorials and Demos. Not forgetting the Excursion Day!

Conference Programme

Monday - August 25th
09:00 - 10:15 Invited Speaker: Mary Harper, IARPA: Learning from 26 languages: Program Management and Science in the Babel Program
10:45 - 12:25 Modeling of Discourse and Dialogue I
10:45 - 12:25 Sentiment Analysis, Opinion Mining and Social Media I
10:45 - 12:25 Information Retrieval and Question Answering
10:45 - 12:25 Machine Learning for CL and NLP
15:45 - 17:25 Modeling of Discourse and Dialogue II
15:45 - 17:25 Sentiment Analysis, Opinion Mining and Social Media III
15:45 - 17:25 Semantic Processing, Distributional Semantics and Compositional Semantics I
15:45 - 17:25 Software, Tools

Tuesday - August 26th
09:00 - 10:15 Invited Speaker: Ted Gibson, MIT: Language for communication: Language as rational inference
10:45 - 12:25 Syntax, grammar induction, syntactic and semantic parsing I
10:45 - 12:25 Sentiment Analysis, Opinion Mining and Social Media III
10:45 - 12:25 Applications I
10:45 - 12:25 Modeling of Discourse and Dialogue III
15:45 - 17:25 Syntax, grammar induction, syntactic and semantic parsing II
15:45 - 17:25 Semantic Processing, Distributional Semantics and Compositional Semantics II
15:45 - 17:25 Applications II
15:45 - 17:25 Language Resources

Wednesday - August 27th
09:00 - 10:15 Excursion Day

Thursday - August 28th
09:00 - 10:15 Invited Speaker: Qun Liu, CNGL/DCU: Annotation Adaptation and Language Adaptation in NLP
10:45 - 12:25 IE/database linking I
10:45 - 12:25 Lexical Semantics and Ontologies I
10:45 - 12:25 Natural Language Generation and Summarization I
10:45 - 12:25 Modeling of Discourse and Dialogue IV and Multimodal Processing
14:00 - 15:15 Semantic Processing, Distributional Semantics and Compositional Semantics III
14:00 - 15:15 Morphology, word segmentation, tagging and chunking I
14:00 - 15:15 Speech Recognition, Text-To-Speech, Spoken Language Understanding
14:00 - 15:15 Lesser Resourced Languages
15:45 - 17:25 Syntax, grammar induction, syntactic and semantic parsing III
15:45 - 17:25 Machine Translation I
15:45 - 17:25 Linguistic and Cognitive Issues in CL and NLP I
15:45 - 17:25 Natural Language Generation and Summarization II and Paraphrasing

Friday - August 29th
09:00 - 10:15 Invited Speaker: Martin Kay, XEROX: Does a Computational Linguist have to be a Linguist?
10:45 - 12:25 Machine Translation II
10:45 - 12:25 IE/database linking II
10:45 - 12:25 Linguistic and Cognitive Issues in CL and NLP II
10:45 - 12:25 Lexical Semantics and Ontologies II
14:00 - 15:15 Machine Translation III
14:00 - 15:15 Lexical Semantics and Ontologies III
14:00 - 15:15 IE/database linking III
14:00 - 15:15 Morphology, word segmentation, tagging and chunking II
15:45 - 17:25 Best Paper Talk and Closing

The conference committee and organisers take no responsibility for changes or inaccuracies to the conference programme. The above programme is subject to change.

Accommodation

Don't forget to book your accommodation when registering. Rooms are limited on campus and early booking is advisable! Just book on the registration form at the same time as your registration.

Coling 2014 | coling2014reg at

From hamon at LIMSI.FR Tue Jul 29 20:28:22 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 29 Jul 2014 22:28:22 +0200
Subject: Job: Offre d'emploi annotateur

Date: Mon, 28 Jul 2014 11:42:52 +0200
From: Rachid Ammari

=====================
Fixed-term contract (CDD): Linguist/Annotator, American English
=====================

Subject
=====
Linguistic validation of English word clusters (American English)

Description
===========
The goal is to build, validate and refine word clusters by topic. You will work in collaboration with the R&D and marketing departments of Weborama.

Required skills
====================
- Native speaker of American English
- Knowledge of linguistics (lexicology, semantics)

Type of contract
===============
The assignment consists first of building the clusters (about 1 month, part-time), then of occasional work on updates. The work can therefore be carried out under a part-time fixed-term contract, and remotely if necessary.

Start of contract
================
August/September 2014

Location
=====================
75019 Paris
Remote work possible

Compensation
=========
Depending on profile

How to apply
================
Please send your application to the following address: rachid at

From hamon at LIMSI.FR Tue Jul 29 20:39:13 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 29 Jul 2014 22:39:13 +0200
Subject: Appel: First Call for Participation, SemEval - Task 4, TimeLine: Cross-Document Event Ordering (pilot task)

Date: Mon, 28 Jul 2014 13:32:44 +0000
From: "Erp, M.G.J. van"

SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering (pilot task)
First Call for Participation

Website:
Google Group: !forum/semeval-task4-timeline
Evaluation period: November 15 - 30, 2014
Paper submission: January 2015

*Introduction*

In any domain, professionals need to have access to knowledge in order to take well-informed decisions. An insightful way of presenting information in an easily updatable and complete manner is to present it on a timeline that is continuously updated with new information.

The aim of the task is to build timelines from written news in English. More specifically, the goal is to order on a timeline all the events in which a target entity is involved. We focus mainly on cross-document event coreference resolution and cross-document temporal relation extraction.

Temporal relation extraction has been the topic of the three past TempEval tasks as part of SemEval:
- TempEval-1 (2007): Temporal Relation Identification
- TempEval-2 (2010): Evaluating Events, Time Expressions, and Temporal Relations
- TempEval-3 (2013): Temporal Annotation

In addition, temporal relation extraction has been the focus of the 6th i2b2 NLP Challenge for clinical records, but the cross-document aspect has rarely been explored. At RANLP 2009 there was a cross-document temporal relation extraction task, in which the goal was to link pre-defined events involving the same centroid entities (i.e. entities frequently participating in events) on a timeline.
Nominal coreference resolution was the topic of the SemEval 2010 task on Coreference Resolution in Multiple Languages. Partially motivated by the work in the NewsReader project, TimeLine goes beyond these tasks by addressing coreference resolution for events and temporal relation identification across documents.

*Task Description*

Given a set of documents and a target entity, the task is to build an event TimeLine related to that entity, i.e. to detect, anchor in time and order the events involving the target entity.

As input data, we provide a set of documents and a set of target entities (people, organization, product or financial entity); only entities of interest will be selected as target entities, i.e. entities involved in many events across different documents and for which it is relevant to build a timeline.

There are two tracks in this task, based on the data used as input. For Track A only raw text is provided to the participants, while for Track B gold-standard event mentions are also given. For both tracks the expected output is one TimeLine for each target entity. Each TimeLine consists of an ordered list of events in which each event is associated with a time anchor. For both tracks a sub-track is proposed in which the events are not associated with a time anchor. Participants can choose to participate in any track and sub-track, and can submit up to two runs for each track/sub-track.

*Data*

The trial data consists of a set of 30 documents collected from Wikinews about Apple Inc. A set of target entities (input) and the corresponding ordered list of events (the output timeline) is provided with the set of documents. The trial data have been annotated with the extents of event mentions and are available online, together with the evaluation tool.

The evaluation data will consist of 3 sets of documents annotated with event mentions and a set of target entities.
Each set will contain around 30 documents from Wikinews, totalling around 30,000 tokens. For each set of documents, one file is provided containing the list of target entities. No training corpus will be provided for this task.

*Evaluation Methodology*

Participants will submit the TimeLines produced by their system for all target entities.

*Organisers*
- Anne-Lyse Minard, Fondazione Bruno Kessler, Italy
- Eneko Agirre, The University of the Basque Country, Spain
- Itziar Aldabe, The University of the Basque Country, Spain
- Marieke van Erp, VU University Amsterdam, Netherlands
- Bernardo Magnini, Fondazione Bruno Kessler, Italy
- German Rigau, The University of the Basque Country, Spain
- Manuela Speranza, Fondazione Bruno Kessler, Italy
- Rubén Urizar, The University of the Basque Country, Spain

Since its first edition in 2006, its program each year has comprised high-quality papers discussing current work spanning topics including: new grammatical models of translation; new learning methods for syntax- and semantics-based models; formal properties of synchronous/transduction grammars (hereafter S/TGs); discriminative training of models incorporating linguistic features; using S/TGs for semantics and generation; and syntax- and semantics-based evaluation of machine translation.

We invite two types of submissions this year:
1. Extended abstracts for poster or hands-on presentations on the special theme
2. Full papers spanning all areas of interest for SSST

===========================
Special Theme Extended Abstracts
===========================

This year, the special theme of semantics of the past three editions of SSST takes a new step with a "working workshop" bringing together researchers interested in compositional distributional semantics, distributed representations, and continuous vector space models in MT, with tutorials bridging both directions, as well as discussions and hands-on work on relevant tasks with real data. Such models have proven beneficial for a number of NLP tasks, for example phrasal similarity, lexical entailment, modeling semantic deviance, detecting order restrictions in recursive structures, or improving NP bracketing in parsing. However, they have not received as much attention in MT.

Extended abstracts of at most two (2) pages should describe poster or hands-on presentations that will stimulate discussions on the special theme of compositional distributional semantics and machine translation, including position papers, recent work, pilot studies, and negative results. We encourage the presentation of relevant work that has been published or submitted elsewhere, as well as new work in progress.
=========
Full Papers
=========

The need for structural mappings between languages is widely recognized in the fields of statistical machine translation and spoken language translation, and there is now wide consensus that these mappings are appropriately represented using a family of formalisms that includes synchronous/transduction grammars and similar notational equivalents. To date, flat-structured models, such as the word-based IBM models of the early 1990s or the more recent phrase-based models, remain widely used. But tree-structured mappings arguably offer a much greater potential for learning valid generalizations about relationships between languages.

Within this area of research there is a rich diversity of approaches. There is active research ranging from formal properties of S/TGs to large-scale end-to-end systems. There are approaches that make heavy use of linguistic theory, and approaches that use little or none. There is theoretical work characterizing the expressiveness and complexity of particular formalisms, as well as empirical work assessing their modeling accuracy and descriptive adequacy across various language pairs. There is work being done to invent better translation models, and work to design better algorithms. Recent years have seen significant progress on all these fronts. In particular, systems based on these formalisms are now top contenders in MT evaluations.

At the same time, SMT has seen a movement toward semantics over the past few years, which has been reflected at recent SSST workshops, including the last three editions, which had semantics for SMT as a special theme. The issues of deep syntax and shallow semantics are closely linked, and SSST-8 continues to encourage submissions on semantics for MT in a number of directions, including semantic role labeling, sense disambiguation, and compositional distributional semantics for translation and evaluation.
We invite papers on:

- syntax-based / semantics-based / tree-structured SMT
- machine learning techniques for inducing structured translation models
- algorithms for training, decoding, and scoring with semantic representation structure
- empirical studies on adequacy and efficiency of formalisms
- creation and usefulness of syntactic/semantic resources for MT
- formal properties of synchronous/transduction grammars
- learning semantic information from monolingual, parallel or comparable corpora
- unsupervised and semi-supervised word sense induction and disambiguation methods for MT
- lexical substitution, word sense induction and disambiguation, semantic role labeling, textual entailment, paraphrase and other semantic tasks for MT
- semantic features for MT models (word alignment, translation lexicons, language models, etc.)
- evaluation of syntactic/semantic components within MT (task-based evaluation)
- scalability of structured translation methods to small or large data
- applications of S/TGs to related areas including:
  - speech translation
  - formal semantics and semantic parsing
  - paraphrases and textual entailment
  - information retrieval and extraction
- syntactically- and semantically-motivated evaluation of MT
- compositional distributional semantics in MT
- distributed representations and continuous vector space models in MT

=========
Organizers
=========

Dekai WU, Hong Kong University of Science and Technology (HKUST)
Marine CARPUAT, National Research Council (NRC) Canada
Xavier CARRERAS, Universitat Politècnica de Catalunya (UPC)
Eva Maria VECCHI, Cambridge University

=============
Important Dates
=============

Submission deadline for papers and extended abstracts: 1 August 2014
Notification to authors: 26 Aug 2014
Camera copy deadline: 15 Sep 2014

From hamon at LIMSI.FR Tue Jul 29 21:27:27 2014
From: hamon at LIMSI.FR (Thierry Hamon)
Date: Tue, 29 Jul 2014 23:27:27 +0200
Subject: Appel: 2nd International Workshop on Definitions in Ontologies (IWOOD, ex-DO 2014) - EXTENDED DEADLINE: August 15, 2014

Date: Tue, 29 Jul 2014 15:05:46 -0400
From: seljamar
Message-ID: <2042ab4f504696d102700e43f5591adb at>

Apologies for cross-posting
Please forward this message to colleagues in the areas of interest

NEW EXTENDED DEADLINE: August 15, 2014

Second International Workshop on Definitions in Ontologies (IWOOD 2014)
at the International Conference on Biomedical Ontologies (ICBO 2014)
October 6-7, 2014
Houston, USA

Website:

This workshop is a follow-up to the workshop on Definitions in Ontologies (DO 2013) held last year in Montreal in conjunction with ICBO 2013. The focus of this second workshop is on definition practices in either human or machine-assisted ontology development.

PRESENTATION

A current problem in ontology development is constructing the needed definitions of terms, either logical or in natural language. For example, ontologies built using OBO Foundry principles are advised to include both logical and natural language definitions, but ontology developers too often focus on only one of these, or they pay insufficient attention to whether they are equivalent.

Explicit definitions of terms in ontologies serve a number of purposes. Logical definitions allow reasoners to create inferred hierarchies, lessening the burden of asserting and checking the validity of subsumptions. Natural language definitions help to ameliorate the pervasive problem of low inter-annotator agreement.
In specialized domains, experts will know their own field well, but may only have limited knowledge of adjacent disciplines. Good definitions make it possible for non-experts to understand unfamiliar terms and thereby enable more confident reuse of terms by external ontologies, which in turn facilitates data integration.

The goal of this workshop is to bring together interested researchers and developers to explore these issues by presenting case studies in a biomedical domain and discussing the difficulties that arise when constructing definitions, with a view to sharing strategies in the future. Even in the seemingly narrow domain of definition construction, cross-fertilization from related disciplines should yield benefits in quality and help to identify novel approaches.

Papers submitted should include one or more case studies and raise specific questions related to definitions with a link to a biomedical domain. Reports on successful or unsuccessful methods are both appropriate.

TOPICS
- experiences in formulating definitions
- tools that assist in definition editing, including collaborative systems
- coordination of logical and textual definitions
- validation and quality control of definitions, e.g., checking that definitions comply with the all/some form
- methods for constructing definitions from multiple sources
- use of controlled languages such as Rabbit or ACE for more user-friendly logical definition creation
- use of templates to systematize definition creation

FORMAT AND OUTCOMES

This will be a half-day workshop with a selected mix of presentations based on accepted papers. In order to promote discussion, each presentation will be followed by a short response from a workshop participant, to be arranged in advance of the workshop. This workshop will document findings on the workshop's website. We expect accepted papers to be published in the Journal of Biomedical Semantics (JBS).
INTENDED AUDIENCE
- ontologists, tool developers, and domain experts whose work encounters issues regarding definitions
- tool developers building definition- or ontology-authoring tools
- philosophers and logicians
- biomedical researchers working on definitions in nomenclatures such as SNOMED
- computer scientists addressing these issues in languages like OWL
- NLP researchers working on definition extraction, generation, or checking
- NLP/IR researchers reusing definitions produced for ontologies

SUBMISSIONS

All papers should include one or more case studies and raise specific questions related to definitions with a link to a biomedical domain. Papers should be between 5 and 10 pages long (rendered), excluding references, formatted using the JBS templates, and submitted via EasyChair.

IMPORTANT DATES

Workshop paper submission: August 15, 2014
Notification of paper acceptance: September 1, 2014
Camera-ready copies for the proceedings: September 15, 2014
Workshops: October 6-7, 2014

ORGANIZING COMMITTEE
Selja Seppälä

PROGRAM COMMITTEE
Nathalie Aussenac-Gilles (National Center for Scientific Research (CNRS), France)
Mélanie Courtot (MBB Department Simon Fraser University and BC Public Health Microbiology & Reference Laboratory, Canada)
Natalia Grabar (Université de Lille 3, France)
Janna Hastings (European Bioinformatics Institute, Cambridge, UK)
James Malone (European Bioinformatics Institute, Cambridge, UK)
Alexis Nasr (Aix Marseille Université, France)
Richard Power (The Open University, UK)
Allan Third (The Open University, UK)

SUPPORTED BY
The Swiss National Science Foundation (SNSF)
The State University of New York at Buffalo