<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">Dear Corpora Colleagues</font></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"></font></span> </div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">Happy New Year!</font></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"></font></span> </div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">Some time ago I posted a query, "comparable corpora and computer-aided translation", asking about progress in applying comparable corpora to computer-aided translation and about possible readings. Here is a belated summary of the replies. I would like to thank all of the colleagues below for their contributions.</font></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"></font></span> </div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt">
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">All the best</font></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"></font></span> </div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">Xiaotian Guo</font></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">SOAS & New Vision Language Centre</font></span></div></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"></font></span> </div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">-----------------------------------------------------</font></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"></font></span> </div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">1. <b style="mso-bidi-font-weight: normal">Gill Philip</b> recommends an article of hers: Gill Philip (2009). Arriving at equivalence: Making a case for comparable general reference corpora in translation studies. In Allison Beeby, Patricia Rodríguez Inés & Pilar Sánchez-Gijón (eds), Corpus Use and Translating: Corpus use for learning to translate and learning corpus use to translate, pp. 59-73. Amsterdam / Philadelphia: John Benjamins.</font></span></div>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">2. <b style="mso-bidi-font-weight: normal">Paul Rayson</b> replies as follows:</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">You should have a look at the output from the ASSIST project involving Lancaster and Leeds. Papers are available from: </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"><a href="http://ucrel.lancs.ac.uk/projects/assist/">http://ucrel.lancs.ac.uk/projects/assist/</a></font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"><a href="http://www.comp.leeds.ac.uk/ssharoff/">http://www.comp.leeds.ac.uk/ssharoff/</a> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"><span style="mso-spacerun: yes"> </span>3. <b style="mso-bidi-font-weight: normal">Dominic Widdows</b> stresses the usefulness of comparable corpora and points to a paper, as follows:</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">One paper on finding translations without parallel corpora is:</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">Learning Bilingual Lexicons from Monolingual Corpora. Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick and Dan Klein. ACL 2008.</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"><a href="http://www.eecs.berkeley.edu/~aria42/pubs/acl2008-unsup-bilexicon.pdf">http://www.eecs.berkeley.edu/~aria42/pubs/acl2008-unsup-bilexicon.pdf</a></font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">In general I think there has been a lot of good work that uses language models for the target language built from large monolingual corpora. E.g., you can use a smaller parallel French-English corpus to translate into English, and a large English-only corpus to help "clean up" your translation to make sure your English translation is "reasonable English", as such. At least, that's my cartoon view of the general idea; I'm sure there are many experts out there who can enrich or correct this summary.</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"><span style="mso-spacerun: yes"> </span>4. <b style="mso-bidi-font-weight: normal">Nitin Madnani</b> enriches the list of readings as follows:</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">You may also look at the following papers/resources on leveraging comparable data for SMT:</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">(a) Language and Translation Model Adaptation using Comparable Corpora. Matthew Snover, Bonnie J. Dorr, and Richard Schwartz. EMNLP 2008.</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">(b) Dragos Stefan Munteanu and Daniel Marcu. 2005. Improving machine translation performance by exploiting non-parallel corpora. Computational Linguistics, 31(4):477–504.</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">(c) The proceedings for the workshop on Building and Using Comparable Corpora (<a href="http://comparable2009.ust.hk/">http://comparable2009.ust.hk/</a>). There have been two so far, I believe.</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"><span style="mso-spacerun: yes"> </span>5. <b style="mso-bidi-font-weight: normal">Yannick Versley</b> recommends a paper from the perspective of computational linguistics:</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">This is also a bit on the computational side (rather than applied corpus linguistics), but it may be interesting: Pekar V., Mitkov R., Blagoev D., and Mulloni A. (2007). Finding Translations for Low-Frequency Words in Comparable Corpora. In Proceedings of the CONTEXT-07 Workshop on "Contextual Information in Semantic Space Models" (CoSMo-2007). Roskilde, Denmark. pp. 17-25. <a href="http://home.wlv.ac.uk/~in8113/papers/cosmo07_pekar_et_al.pdf">http://home.wlv.ac.uk/~in8113/papers/cosmo07_pekar_et_al.pdf</a></font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"><span style="mso-spacerun: yes"> </span>6. <b style="mso-bidi-font-weight: normal">Stella Tagnin</b> mentions two papers (one written in Portuguese) as follows:</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">British vs. American English, Brazilian vs. European Portuguese: how close or how far apart? - a corpus-driven study (Frankfurt am Main: Lodz Studies in Language 9, 2004, p. 193-208)</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">Stella E. O. Tagnin & Elisa Duarte Teixeira (<a href="http://www.fflch.usp.br/dlm/comet/artigos/BRITISH%20VS.%20AMERICAN%20ENGLISH.pdf">http://www.fflch.usp.br/dlm/comet/artigos/BRITISH%20VS.%20AMERICAN%20ENGLISH.pdf</a>)</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"> </font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">A identificação de equivalentes tradutórios em corpora comparáveis (Anais do I Congresso Internacional da ABRAPUI: Belo Horizonte, 3 a 6 de junho de 2007)</font></span></p>
<p style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">Stella E. O. Tagnin</font></span></p>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">(<a href="http://www.fflch.usp.br/dlm/comet/Novo/Stella_Abrapui%202007_artigo.pdf">http://www.fflch.usp.br/dlm/comet/Novo/Stella_Abrapui%202007_artigo.pdf</a>)</font></span></div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri"></font></span> </div>
<div style="TEXT-ALIGN: left; MARGIN: 0cm 0cm 0pt" class="MsoNormal" align="left"><span style="FONT-SIZE: 12pt"><font face="Calibri">------------------------------------------------</font></span></div>