[Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA

Xiao, Zhonghua z.xiao at lancaster.ac.uk
Mon Feb 16 14:14:30 UTC 2009


Here is a recent chapter -
 
McEnery, Tony and Richard Xiao (2007) Parallel and comparable corpora: The state of play. In Y. Kawaguchi, T. Takagaki, N. Tomimori and Y. Tsuruga (eds.) Corpus-Based Perspectives in Linguistics. Amsterdam: John Benjamins. 131-145.

________________________________

From: corpora-bounces at uib.no on behalf of Helena Blancafort
Sent: Mon 16/02/2009 12:46
To: CORPORA at UIB.NO
Subject: Re: [Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA




> I would be grateful for any up-to-date information about state of the
> art in comparable corpus.

Hi,

Here are some articles about comparable corpora that should be useful for a
state-of-the-art survey.

Helena Blancafort
------------------
Syllabs
www.syllabs.com




Déjean, H., Gaussier, E. (2002). "Une nouvelle approche à l'extraction de
lexiques bilingues à partir de corpus comparables". Lexicometrica,
Alignement lexical dans les corpus multilingues, pp. 1-21.

Sadat, F., Yoshikawa, M. and Uemura, S. (2003). "Learning Bilingual
Translations from Comparable Corpora to Cross-Language Information
Retrieval: Hybrid Statistics-based and Linguistics-based Approach".
Proceedings of the sixth international workshop on Information retrieval
with Asian languages - Volume 11, pages 57-64.

E. MORIN and B. DAILLE (2004). Extraction de terminologies bilingues à partir
de corpus comparables d'un domaine spécialisé. Traitement Automatique des
Langues (TAL), 45:3, Hermès Lavoisier Sciences Publications. ISSN
1248-9433.

E. Morin, B. Daille, K. Takeuchi, and K. Kageura (2007). Bilingual
Terminology Mining -- Using Brain, not Brawn Comparable Corpora. In
Proceedings of the 45th Annual Meeting of the Association for Computational
Linguistics (ACL'07), pp. 664-671, Prague, Czech Republic.

B. DAILLE and E. MORIN (2005). French-English terminology extraction from
comparable corpora. In Proceedings of IJCNLP 2005: Second International Joint
Conference, Lecture Notes in Computer Science, vol. 3651, pp. 707-719,
Springer-Verlag. ISBN 3-540-29172.

Yun-Chuang Chiao and Pierre Zweigenbaum. 2002. Looking for candidate
translational equivalents in specialized, comparable corpora. In Proceedings
of the 19th International Conference on Computational Linguistics
(COLING'02), pages 1208-1212, Taipei, Taiwan.


Pascale Fung. 1998. A Statistical View on Bilingual Lexicon Extraction: From
Parallel Corpora to Non-parallel Corpora. In David Farwell, Laurie Gerber,
and Eduard Hovy, editors, Proceedings of the 3rd Conference of the
Association for Machine Translation in the Americas (AMTA'98), pages 1-16,
Langhorne, PA, USA. Springer.

Carol Peters and Eugenio Picchi. 1998. Cross-language information retrieval:
A system for comparable corpus querying. In Gregory Grefenstette, editor,
Cross-language information retrieval, chapter 7, pages 81-90. Kluwer.

Reinhard Rapp. 1999. Automatic Identification of Word Translations from
Unrelated English and German Corpora. In Proceedings of the 37th Annual
Meeting of the Association for Computational Linguistics (ACL'99), pages
519-526, College Park, Maryland, USA.

Gamallo P. and J.R. Pichel (2008) "Learning Spanish-Galician Translation
Equivalents Using a Comparable Corpus and a Bilingual Dictionary", Lecture
Notes in Computer Science, vol. 4919, Springer-Verlag, pp. 423-433. ISSN:
0302-9743.

Gamallo P. (2008) "Evaluating two different methods for the task of
extracting bilingual lexicons from comparable corpora", In Proceedings of
LREC 2008 Workshop on Comparable Corpora, Marrakech, Morocco, pp. 19-26.
ISBN: 2-9517408-4-0.

Gamallo P. (2007) "Learning Bilingual Lexicons from Comparable English and
Spanish Corpora",  In Proceedings of Machine Translation Summit XI,
Copenhagen, Denmark, pp. 191-198.

Gamallo P. and J.R. Pichel (2007) "Un método de extracción de equivalentes
de traducción a partir de un corpus comparable castellano-gallego",
Procesamiento del Lenguaje Natural, 39, pp. 241-248.

> -----Original Message-----
> From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On behalf of
> Eric Atwell
> Sent: Monday 16 February 2009 09:54
> To: J.L. DeLucca
> Cc: CORPORA at uib.no
> Subject: Re: [Corpora-List] STATE OF THE ART IN COMPARABLE CORPORA
>
> For large comparable corpora for English/Russian,
> and links to other large comparable corpora and relevant publications, see
>
> http://corpus.leeds.ac.uk/
>
> I hope this is helpful
>
> Eric Atwell,  Leeds University
>
>
>
> On Sun, 15 Feb 2009, J.L. DeLucca wrote:
>
> > Dear All,
> >
> > I would be grateful for any up-to-date information  about state of the
> > art in comparable corpus.
> >
> > Best regards.
> > J. L. De Lucca
> > Universidad Politécnica de Valencia
> > Departamento de Linguistica Aplicada
> >
> >
> >
> >
>
> --
> Eric Atwell,
>   Senior Lecturer, Language research group, School of Computing,
>   Faculty of Engineering, UNIVERSITY OF LEEDS, Leeds LS2 9JT, England
>   TEL: 0113-3435430  FAX: 0113-3435468  WWW/email: google Eric Atwell


               


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora





