27.1010, Calls: Portuguese, Comp Ling, Text/Corpus Ling, Translation/Portugal
The LINGUIST List via LINGUIST
linguist at listserv.linguistlist.org
Fri Feb 26 18:24:20 UTC 2016
LINGUIST List: Vol-27-1010. Fri Feb 26 2016. ISSN: 1069 - 4875.
Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry, Sara Couture)
Homepage: http://linguistlist.org
***************** LINGUIST List Support *****************
25 years of LINGUIST List!
Please support the LL editors and operation with a donation at:
http://funddrive.linguistlist.org/donate/
Editor for this issue: Anna White <awhite at linguistlist.org>
================================================================
Date: Fri, 26 Feb 2016 13:24:10
From: António Branco [Antonio.Branco at di.fc.ul.pt]
Subject: Workshop on Corpora and Tools for Processing Corpora
Full Title: Workshop on Corpora and Tools for Processing Corpora
Short Title: WCTPC'2016
Date: 12-Jul-2016 - 12-Jul-2016
Location: Tomar, Portugal
Contact Person: Hilário Fontes
Meeting Email: hilario.fontes at ec.europa.eu
Web Site: http://propor2016.di.fc.ul.pt/?page_id=383
Linguistic Field(s): Computational Linguistics; Text/Corpus Linguistics; Translation
Subject Language(s): Portuguese (por)
Call Deadline: 15-Apr-2016
Meeting Description:
Workshop on Corpora and Tools for Processing Corpora
http://propor2016.di.fc.ul.pt/?page_id=383
July 12, 2016 — Tomar, Portugal
Co-located with PROPOR 2016
http://propor2016.di.fc.ul.pt/
Motivation:
A great deal of the popularity of statistical machine translation solutions is
due to the availability of software packages that make it increasingly easy
and fast to train a working machine translation system. For such a system to
be deployed, these packages have been seen as requiring only that they be fed
a sufficiently large volume of data, including some form of parallel corpora
of raw text.
While advances in ever more sophisticated aspects of language technology have
made this increasingly feasible, the fact that the data needed to feed these
systems still require considerable preparation has been left in the shadow.
Given the volume of appropriate corpora needed, this preparation is only
practical if, on the one hand, suitable datasets are available and, on the
other hand, it is supported by a number of shallow processing tools, such as
boilerplate removers, tokenisers, orthographic normalisers, hyphenators,
foreign word detectors, inflectional analysers, etc.
While the construction of this type of tool is no longer a hot topic for
cutting-edge research in language technology, finding and using such tools
may in many cases turn out to be harder than finding and using the much more
sophisticated modules needed to deploy the machine translation systems. The
situation is especially acute for the vast majority of languages, which are
comparatively less resourced than English in terms of language technology,
and for tools that perform at the state-of-the-art level and, furthermore,
are openly available for reuse.
It goes without saying that these negative circumstances go hand in hand with,
and are aggravated by, the fact that suitable parallel texts are not available
or not easy to obtain. Interestingly, such tools and datasets often exist, yet
their development has never been documented in a publication or their
availability has never been publicised.
Aims:
The present workshop seeks to improve this state of affairs by helping to map
both the available parallel datasets suitable for feeding statistical machine
translation systems and the available language processing tools useful for
their preparation.
While pursuing this goal, the workshop also seeks to promote the exchange of
ideas and to disseminate best practices that help foster the ELRC and CEF.AT
(http://www.lr-coordination.eu) initiatives.
Call for Papers:
We thus invite submissions reporting on language resources suitable to support
statistical machine translation from/into Portuguese and on processing tools
for their preparation. Submissions may be presented as an oral presentation,
as a demonstration, or as both. While the workshop seeks to attract and
promote papers on language resources and tools not yet documented in previous
publications, for the sake of broad coverage, updated papers on tools and
resources that have already been documented are also welcome.
Dates:
February 25: First call for papers
March 21: Final call for papers
April 15: Deadline for submissions
May 16: Notification sent to authors
June 1: Camera-ready papers due
July 12, 2016: Workshop takes place
Organization Committee:
Hilário Leal Fontes, DGT — European Commission (chair)
Paulo Batista, DGT — European Commission
António Branco, University of Lisbon
Programme Committee:
Hilário Leal Fontes, European Commission (co-chair)
António Branco, University of Lisbon (co-chair)
Alexandru Ceausu, AMPLEXOR Luxembourg
Aline Villavicencio, Universidade Federal do Rio Grande do Sul
Amália Mendes, Centro de Linguística da Universidade de Lisboa
Belinda Maia, Universidade do Porto
Francis Tyers, Universitetet i Tromsø
Gabriel Lopes, Faculdade de Ciências e Tecnologia, UNL
Gorka Labaka, University of the Basque Country
Jorge Baptista, CECL/U. Algarve and L2F-Spoken Language Lab/INESC ID Lisboa
José Ramom Pichel Campos, imaxin|software
Luís Trigo, LIAAD-INESC Porto L.A.
Luísa Coheur, IST/INESC-ID Lisboa
M.T. Carrasco Benitez, European Commission
Maria José Machado, European Commission
Michael Jellinghaus, European Commission
Mikel Forcada, DLSI — Universitat d’Alacant
Paulo Quaresma, Universidade de Évora
Paulo Correia, European Commission
Thiago Pardo, Universidade de São Paulo
Xavier Gómez Guinovart, Universidade de Vigo
Contact:
Hilário Leal Fontes, hilario.fontes at ec.europa.eu