[Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA

Inaki San Vicente inaki at elhuyar.com
Mon Feb 16 15:04:34 UTC 2009


Hi,

  Have a look at the workshop "Building and Using Comparable
Corpora", held at LREC 2008. The proceedings are here:

http://www.lrec-conf.org/proceedings/lrec2008/workshops/W12_Proceedings.pdf

  The second edition of this workshop is planned for August, at
ACL-IJCNLP 2009: http://comparable2009.ust.hk/



-- 
.......................................................


Iñaki San Vicente
Hizkuntza Zerbitzuak
Elhuyar Fundazioa
Zelai Haundi, 3
Osinalde industrialdea
20170 Usurbil
tel.: 943363040
www.elhuyar.org 
<http://www.elhuyar.org/hizkuntza-zerbitzuak/EU/I-G-B-unitatea>



Xiao, Zhonghua wrote:
> Here is a recent chapter -
>  
> McEnery, Tony and Richard Xiao (2007) Parallel and comparable corpora: The state of play. In Y. Kawaguchi, T. Takagaki, N. Tomimori and Y. Tsuruga (eds.) Corpus-Based Perspectives in Linguistics. Amsterdam: John Benjamins. 131-145.
>
> ________________________________
>
> From: corpora-bounces at uib.no on behalf of Helena Blancafort
> Sent: Mon 16/02/2009 12:46
> To: CORPORA at UIB.NO
> Subject: Re: [Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA
>
>
>
>
>   
>> I would be grateful for any up-to-date information about state of the
>> art in comparable corpus.
>>     
>
> Hi,
>
> Here are some articles about comparable corpora that should be useful for a
> state-of-the-art survey.
>
> Helena Blancafort
> ------------------
> Syllabs
> www.syllabs.com
>
>
>
>
> Déjean, H., Gaussier, E. (2002). "Une nouvelle approche à l'extraction de
> lexiques bilingues à partir de corpus comparables". Lexicometrica,
> Alignement lexical dans les corpus multilingues, pp. 1-21.
>
> Sadat, F., Yoshikawa, M. and Uemura, S. (2003). "Learning Bilingual
> Translations from Comparable Corpora to Cross-Language Information
> Retrieval: Hybrid Statistics-based and Linguistics-based Approach".
> Proceedings of the sixth international workshop on Information retrieval
> with Asian languages - Volume 11, pages 57-64.
>
> E. Morin and B. Daille (2004). Extraction de terminologies bilingues à partir
> de corpus comparables d'un domaine spécialisé.  Traitement Automatique des
> Langues (TAL) , 45:3, Hermès Lavoisier Sciences Publications, 2004. ISSN
> 1248-9433.
>
> E. Morin, B. Daille, K. Takeuchi, and K. Kageura (2007). Bilingual
> Terminology Mining -- Using Brain, not brawn comparable corpora.  In
> Proceedings of the 45th Annual Meeting of the Association for Computational
> Linguistics (ACL'07) p. 664-671, Prague, Czech Republic, 2007.  On line
> Proceedings
>
> B. Daille and E. Morin (2005). French-English terminology extraction from
> comparable corpora. In Proceedings IJCNLP 2005: Second International Joint
> Conference, Lecture Notes in Computer Science, vol. 3651/2005, p. 707-719,
> Springer-Verlag, 2005. ISBN 3-540-29172.
>
> Yun-Chuang Chiao and Pierre Zweigenbaum. 2002.
> Looking for candidate translational equivalents in specialized,
> comparable corpora. In Proceedings of the
> 19th International Conference on Computational Linguistics
> (COLING'02), pages 1208-1212, Taipei, Taiwan.
>
>
> Pascale Fung. 1998. A Statistical View on Bilingual
> Lexicon Extraction: From Parallel Corpora to Non-parallel
> Corpora. In David Farwell, Laurie Gerber,
> and Eduard Hovy, editors, Proceedings of the 3rd Conference
> of the Association for Machine Translation in
> the Americas (AMTA'98), pages 1-16, Langhorne, PA, USA. Springer.
>
> Carol Peters and Eugenio Picchi. 1998. Cross-language
> information retrieval: A system for comparable corpus
> querying. In Gregory Grefenstette, editor, Cross-language
> information retrieval, chapter 7, pages 81-90. Kluwer.
>
> Reinhard Rapp. 1999. Automatic Identification of Word
> Translations from Unrelated English and German Corpora.
> In Proceedings of the 37th Annual Meeting of the
> Association for Computational Linguistics (ACL'99), pages 519-526, College
> Park, Maryland, USA.
>
> Gamallo P. and J.R. Pichel (2008) "Learning Spanish-Galician Translation
> Equivalents Using a Comparable Corpus and a Bilingual Dictionary", Lecture
> Notes in Computer Science, vol. 4919, Springer-Verlag, pp. 423-433. ISSN:
> 0302-9743.
>
> Gamallo P. (2008) "Evaluating two different methods for the task of
> extracting bilingual lexicons from comparable corpora", In Proceedings of
> LREC 2008 Workshop on Comparable Corpora, Marrakech, Morocco, pp. 19-26.
> ISBN: 2-9517408-4-0.
>
> Gamallo P. (2007) "Learning Bilingual Lexicons from Comparable English and
> Spanish Corpora",  In Proceedings of Machine Translation Summit XI,
> Copenhagen, Denmark, pp. 191-198.
>
> Gamallo P. and J.R. Pichel (2007) "Un método de extracción de equivalentes
> de traducción a partir de un corpus comparable castellano-gallego",
> Procesamiento del Lenguaje Natural, 39, pp. 241-248.
>
>   
>> -----Original Message-----
>> From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
>> Eric Atwell
>> Sent: Monday, 16 February 2009 09:54
>> To: J.L. DeLucca
>> Cc: CORPORA at uib.no
>> Subject: Re: [Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA
>>
>> For large comparable corpora for English/Russian,
>> and links to other large comparable corpora and relevant publications, see
>>
>> http://corpus.leeds.ac.uk/
>>
>> I hope this is helpful
>>
>> Eric Atwell,  Leeds University
>>
>>
>>
>> On Sun, 15 Feb 2009, J.L. DeLucca wrote:
>>
>>     
>>> Dear All,
>>>
>>> I would be grateful for any up-to-date information  about state of the
>>> art in comparable corpus.
>>>
>>> Best regards.
>>> J. L. De Lucca
>>> Universidad Politécnica de Valencia
>>> Departamento de Linguistica Aplicada
>>>
>>>
>>>
>>>
>>>       
>> --
>> Eric Atwell,
>>   Senior Lecturer, Language research group, School of Computing,
>>   Faculty of Engineering, UNIVERSITY OF LEEDS, Leeds LS2 9JT, England
>>   TEL: 0113-3435430  FAX: 0113-3435468  WWW/email: google Eric Atwell
>>     
>
>
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
>
>
>   






