[Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA
Xiao, Zhonghua
z.xiao at lancaster.ac.uk
Mon Feb 16 14:14:30 UTC 2009
Here is a recent chapter -
McEnery, Tony and Richard Xiao (2007) Parallel and comparable corpora: The state of play. In Y. Kawaguchi, T. Takagaki, N. Tomimori and Y. Tsuruga (eds.) Corpus-Based Perspectives in Linguistics. Amsterdam: John Benjamins. 131-145.
________________________________
From: corpora-bounces at uib.no on behalf of Helena Blancafort
Sent: Mon 16/02/2009 12:46
To: CORPORA at UIB.NO
Subject: Re: [Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA
> I would be grateful for any up-to-date information about state of the
> art in comparable corpus.
Hi,
Here are some articles about comparable corpora that should be useful for a
state-of-the-art survey.
Helena Blancafort
------------------
Syllabs
www.syllabs.com
Déjean, H., Gaussier, E. (2002). "Une nouvelle approche à l'extraction de
lexiques bilingues à partir de corpus comparables". Lexicometrica,
Alignement lexical dans les corpus multilingues, pp. 1-21.
Sadat, F., Yoshikawa, M. and Uemura, S. (2003). "Learning Bilingual
Translations from Comparable Corpora to Cross-Language Information
Retrieval: Hybrid Statistics-based and Linguistics-based Approach".
Proceedings of the sixth international workshop on Information retrieval
with Asian languages - Volume 11, pages 57-64.
E. MORIN and B. DAILLE (2004). Extraction de terminologies bilingues à partir
de corpus comparables d'un domaine spécialisé. Traitement Automatique des
Langues (TAL) , 45:3, Hermès Lavoisier Sciences Publications, 2004. ISSN
1248-9433.
E. Morin, B. Daille, K. Takeuchi, and K. Kageura (2007). Bilingual
Terminology Mining -- Using Brain, not brawn comparable corpora. In
Proceedings of the 45th Annual Meeting of the Association for Computational
Linguistics (ACL'07) p. 664-671, Prague, Czech Republic, 2007. On line
Proceedings
B. DAILLE and E. MORIN. (2005) French-English terminology extraction from
comparable corpora. In Proceedings IJCNLP 2005: Second International Joint
Conference, Lecture Notes in Computer Sciences, vol. 3651/2005, p. 707-719,
Springer-Verlag, 2005. ISBN 3-540-29172.
Yun-Chuang Chiao and Pierre Zweigenbaum. 2002.
Looking for candidate translational equivalents in specialized,
comparable corpora. In Proceedings of the 19th International
Conference on Computational Linguistics (COLING'02), pages 1208-1212,
Taipei, Taiwan.
Pascale Fung. 1998. A Statistical View on Bilingual Lexicon
Extraction: From Parallel Corpora to Non-parallel Corpora. In David
Farwell, Laurie Gerber, and Eduard Hovy, editors, Proceedings of the
3rd Conference of the Association for Machine Translation in the
Americas (AMTA'98), pages 1-16, Langhorne, PA, USA. Springer.
Carol Peters and Eugenio Picchi. 1998. Cross-language information
retrieval: A system for comparable corpus querying. In Gregory
Grefenstette, editor, Cross-language information retrieval,
chapter 7, pages 81-90. Kluwer.
Reinhard Rapp. 1999. Automatic Identification of Word Translations
from Unrelated English and German Corpora. In Proceedings of the 37th
Annual Meeting of the Association for Computational Linguistics
(ACL'99), pages 519-526, College Park, Maryland, USA.
Gamallo P., and J-R. Pichel (2008) "Learning Spanish-Galician Translation
Equivalents Using a Comparable Corpus and a Bilingual Dictionary", Lecture
Notes in Computer Science, vol. 4919, Springer-Verlag, pp. 423-433. ISSN:
0302-9743.
Gamallo P. (2008) "Evaluating two different methods for the task of
extracting bilingual lexicons from comparable corpora", In Proceedings of
LREC 2008 Workshop on Comparable Corpora, Marrakech, Morocco, pp. 19-26.
ISBN: 2-9517408-4-0.
Gamallo P. (2007) "Learning Bilingual Lexicons from Comparable English and
Spanish Corpora", In Proceedings of Machine Translation Summit XI,
Copenhagen, Denmark, pp. 191-198.
Gamallo P. and J.R. Pichel (2007) "Un método de extracción de equivalentes
de traducción a partir de un corpus comparable castellano-gallego",
Procesamiento del Lenguaje Natural, 39, pp. 241-248.
> -----Original Message-----
> From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
> Eric Atwell
> Sent: Monday, 16 February 2009 09:54
> To: J.L. DeLucca
> Cc: CORPORA at uib.no
> Subject: Re: [Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA
>
> For large comparable corpora for English/Russian,
> and links to other large comparable corpora and relevant publications, see
>
> http://corpus.leeds.ac.uk/
>
> I hope this is helpful
>
> Eric Atwell, Leeds University
>
>
>
> On Sun, 15 Feb 2009, J.L. DeLucca wrote:
>
> > Dear All,
> >
> > I would be grateful for any up-to-date information about state of the
> > art in comparable corpus.
> >
> > Best regards.
> > J. L. De Lucca
> > Universidad Politécnica de Valencia
> > Departamento de Linguistica Aplicada
> >
> >
> >
> >
>
> --
> Eric Atwell,
> Senior Lecturer, Language research group, School of Computing,
> Faculty of Engineering, UNIVERSITY OF LEEDS, Leeds LS2 9JT, England
> TEL: 0113-3435430 FAX: 0113-3435468 WWW/email: google Eric Atwell
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora