[Corpora-List] 3rd CALL FOR PAPERS: Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12 WS and Shared Task) at COLING 2012

Tsuyoshi Okita tsuyoshi.okita at gmail.com
Tue Sep 4 15:38:20 UTC 2012


-----Apologies for multiple postings-----
***THIRD CALL FOR PAPERS***

Second Workshop on Applying Machine Learning Techniques to Optimise
the Division of Labour in Hybrid MT (ML4HMT-12 WS and Shared Task) at
COLING 2012

Mumbai (India), 9th December, 2012
URL: http://www.dfki.de/ml4hmt/

The workshop and associated shared task are an effort to trigger a
systematic investigation on improving state-of-the-art hybrid machine
translation, making use of advanced machine-learning (ML)
methodologies. It follows the ML4HMT-11 workshop which took place last
November in Barcelona. The first workshop also road-tested a shared
task (and associated data set) and laid the basis for a broader reach
in 2012.

Regular Papers ML4HMT-12

We are soliciting original papers on hybrid MT, including (but not
limited to):
* use of machine learning methods in hybrid MT;
* system combination: parallel in multi-engine MT (MEMT) or sequential
  in statistical post-editing (SPMT);
* combining phrases and translation units from different types of MT;
* syntactic pre-/re-ordering;
* using richer linguistic information in phrase-based or in hierarchical
  SMT;
* learning resources (e.g., transfer rules, transduction grammars) for
  probabilistic rule-based MT.

Full papers should be anonymous and follow the COLING full paper
format (http://www.coling2012-iitb.org/call_for_papers.php). To submit
contributions, please follow the instructions at the Workshop
management system submission website:
https://www.softconf.com/coling2012/ML4HMT12/. The contributions will
undergo a double-blind review by members of the programme committee.


Shared Task ML4HMT-12

The main focus of the Shared Task is to address the question:

-Can Hybrid MT and System Combination techniques benefit from extra
 information (linguistically motivated, decoding, runtime, confidence
 scores, or other meta-data) from the systems involved?

Participants are invited to build hybrid MT systems and/or system
combinations by using the output of several MT systems of different
types, as provided by the organisers.  While participants are
encouraged to use machine learning techniques to explore the
additional meta-data information sources, other general improvements
in hybrid and combination-based MT are also welcome in the
challenge.  For systems that exploit additional meta-data information,
the challenge is that this meta-data is highly heterogeneous and
specific to each individual system.


Data: The ML4HMT-12 Shared Task involves (ES-EN) and (ZH-EN) data
sets, in each case translating into EN.


* (ES-EN): Participants are given a bilingual tuning set aligned
  at a sentence level. Each "bilingual sentence" contains: 1) the
  source sentence, 2) the target (reference) sentence and 3) the
  corresponding multiple output translations from four systems, based
  on different MT approaches (Apertium, Ramirez-Sanchez, 2006; Lucy,
  Alonso and Thurmair, 2003; Moses, Koehn et al., 2007). The output
  has been annotated with system-internal meta-data information
  derived from the translation process of each of the systems.

* (ZH-EN): A corresponding data set for ZH-EN with output translations
  from three systems (Moses, Koehn et al., 2007; ICT_Chiero, Mi
  et al., 2009; and Huajian RBMT) will be provided. (Participants
  are required to fill out a shared task evaluation agreement form
  and obtain the ZH-EN data from LDC.)

Participants are challenged to build an MT mechanism that, where
possible, makes effective use of the system-specific MT meta-data
output. They can provide solutions based on open-source systems, or
develop their
own mechanisms. The tuning set can be used for tuning the systems or
for training the systems. Final submissions have to include
translation output on a test set, which will be made available one
week after training data release. Data will be provided to build
language/reordering models, possibly re-using existing resources from
MT research.
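
As a rough illustration of the tuning data described above, the sketch
below models one "bilingual sentence" (source, reference, and one output
per MT system with its meta-data) and a trivial selection baseline that
picks the output with the highest score. All class, field and key names
(SystemOutput, TuningEntry, "confidence") are hypothetical, as are the
example sentences and scores; the actual corpus format and meta-data
fields are not specified in this call.

```python
from dataclasses import dataclass, field


@dataclass
class SystemOutput:
    system: str                                   # name of the MT system
    translation: str                              # its English output
    metadata: dict = field(default_factory=dict)  # system-specific annotations


@dataclass
class TuningEntry:
    source: str      # source sentence (ES or ZH)
    reference: str   # reference translation (EN)
    outputs: list    # one SystemOutput per participating system


def select_by_score(entry, key="confidence"):
    """Return the output whose metadata[key] is highest.

    Outputs lacking the key rank last, since meta-data differs per system.
    """
    return max(entry.outputs,
               key=lambda out: out.metadata.get(key, float("-inf")))


# Invented example data; sentences and scores are illustrative only.
entry = TuningEntry(
    source="El gato duerme.",
    reference="The cat is sleeping.",
    outputs=[
        SystemOutput("Apertium", "The cat sleeps.", {"confidence": 0.61}),
        SystemOutput("Lucy", "The cat is asleep.", {}),
        SystemOutput("Moses", "The cat is sleeping.", {"confidence": 0.87}),
    ],
)
best = select_by_score(entry)
print(best.system, "->", best.translation)
```

A learned combination would replace select_by_score with a model trained
on the tuning set; the missing-key fallback hints at why heterogeneous
meta-data makes that harder than it looks.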

Participants can also make use of additional tools (linguistic analysis,
confidence estimation, etc.) if their systems require them, but
they have to explicitly declare this upon submission, so that they are
judged as "unconstrained" systems. This will allow for a better
comparison between participating systems.

System output will be judged via peer-based human evaluation as well
as automatic evaluation. During the evaluation phase, participants
will be requested to rank system outputs of other participants through
a web-based interface (Appraise, Federmann 2010). Automatic metrics
include BLEU (Papineni et al., 2002), TER (Snover et al., 2006) and
METEOR (Lavie, 2005).
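
For readers unfamiliar with the automatic metrics, the following is a
minimal sketch of corpus-level BLEU (modified n-gram precision up to
4-grams with a brevity penalty). It assumes whitespace tokenisation, a
single reference per sentence and no smoothing, and it is not the
scoring script used by the organisers, which is not specified here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per hypothesis, no smoothing."""
    matches = [0] * max_n   # clipped n-gram matches per order
    totals = [0] * max_n    # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its reference count.
            matches[n - 1] += sum(min(c, r_counts[g])
                                  for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # unsmoothed BLEU is zero if any precision is zero
    log_precision = sum(math.log(m / t)
                        for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_precision)


print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on the mat"]))
```

Because the score is computed over pooled counts, it should be applied
to a whole test set rather than averaged over sentences.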

Shared task participants will be invited to submit system description
papers (7 pages; not blind, following the COLING format,
http://www.coling2012-iitb.org/call_for_papers.php).

For submissions, please follow the instructions at the Workshop
management system submission website:
https://www.softconf.com/coling2012/ML4HMT12/


Important Dates 2012


15th August: Shared task Tuning data release (updated ML4HMT corpus)
23rd August: Shared task Test data release
15th September: Shared task Translation results submission deadline
21st September: Shared task Evaluation results release
30th September: Workshop full paper and Shared task system description
paper submission deadline
31st October: Workshop paper accept/reject notification
15th November: Workshop and Shared task Camera ready paper due
9th December: ML4HMT-12 Workshop


Organizers


-Prof. Josef van Genabith, Dublin City University (DCU) and Centre for
 Next Generation Localisation (CNGL)
-Prof. Toni Badia, Universitat Pompeu Fabra and Barcelona Media (BM)
-Christian Federmann, German Research Center for Artificial Intelligence
 (DFKI), contact person: cfedermann at dfki.de
-Dr. Maite Melero, Barcelona Media (BM)
-Dr. Marta R. Costa-jussa, Barcelona Media (BM)
-Dr. Tsuyoshi Okita, Dublin City University (DCU)


Program committee


- Eleftherios Avramidis (German Research Center for Artificial
Intelligence, Germany)
- Prof. Sivaji Bandyopadhyay (Jadavpur University, India)
- Dr. Rafael Banchs (Institute for Infocomm Research - I2R, Singapore)
- Prof. Loic Barrault (LIUM - University of Le Mans, France)
- Prof. Antal van den Bosch (Centre for Language Studies, Radboud
University Nijmegen, Netherlands)
- Dr. Grzegorz Chrupala (Saarland University, Saarbrucken, Germany)
- Prof. Jinhua Du (Xi'an University of Technology (XAUT), China)
- Dr. Andreas Eisele (Directorate-General for Translation (DGT), Luxembourg)
- Dr. Cristina Espana-Bonet (Technical University of Catalonia, TALP,
Barcelona)
- Dr. Declan Groves (Center for Next Generation Localisation, Dublin City
University, Ireland)
- Prof. Jan Hajic (Institute of Formal and Applied Linguistics, Charles
University in Prague)
- Prof. Timo Honkela (Aalto University, Finland)
- Dr. Patrick Lambert (LIUM - University of Le Mans, France)
- Prof. Qun Liu (Institute of Computing Technology, Chinese Academy of
Sciences, China)
- Dr. Maite Melero (Barcelona Media Innovation Center, Spain)
- Dr. Tsuyoshi Okita (Dublin City University, Ireland)
- Prof. Pavel Pecina (Institute of Formal and Applied Linguistics, Charles
University in Prague)
- Dr. Marta R. Costa-jussa (Barcelona Media Innovation Center, Spain)
- Dr. Felipe Sanchez Martinez (Escuela Politecnica Superior, Universidad de
Alicante, Spain)
- Dr. Nicolas Stroppa (Google, Zurich, Switzerland)
- Prof. Hans Uszkoreit (German Research Center for Artificial Intelligence,
Germany)
- Dr. David Vilar (German Research Center for Artificial Intelligence,
Germany)


The ML4HMT workshop is supported by the META-NET T4ME project
(http://www.meta-net.eu/), funded by the DG INFSO of the European
Commission through the Seventh Framework Programme, grant agreement
no.: 249119.