Parallel Spanish-Chinese corpus of Cervantes' Don Quijote (CSCHDQ)

Carlos Subirats carlos.subirats at GMAIL.COM
Mon Sep 11 17:08:50 UTC 2006


--------------------      INFOLING        --------------------------
Spanish linguistics mailing list (ISSN: 1576-3404)
http://elies.rediris.es/infoling/
Submissions: infoling-request at listserv.rediris.es
EDITORS:
Carlos Subirats Rüggeberg, UAB <carlos.subirats at uab.es>
Mar Cruz Piñol, U. Barcelona <mcruz at ub.edu>
Eulalia de Bobes Soler, U. Abat Oliba-CEU <debobes1 at uao.es>
Editorial team: http://elies.rediris.es/infoling/editores.html
Estudios de Lingüística del Español (ELiEs): http://elies.rediris.es
is a thematic network on Spanish linguistics associated with INFOLING.
---------------------------------------------------------------------

© Infoling Barcelona (Spain), 2006. All rights reserved

-------------------------------------------------------------------------------------------
Parallel Spanish-Chinese corpus of Cervantes' Don Quijote (CSCHDQ)
Samples of parallel texts from CSCHDQ are available upon request
From: Meng Ji, Imperial College London, Humanities College, United
Kingdom <m.ji at imperial.ac.uk>
Information edited by Infoling
-------------------------------------------------------------------------------------------

Parallel Spanish-Chinese corpus of Cervantes' Don Quijote (CSCHDQ)

The parallel Spanish-Mandarin Chinese corpus of Cervantes' Don
Quijote, known as CSCHDQ, has been constructed for the developer's
Ph.D. research on Cervantes' Don Quijote and two of its Mandarin
Chinese versions (1978 and 1995):

Don Quijote (Tang Ji He De), 1995, Liu Jingsheng, Li Jiang Publisher, Gui Lin
Don Quijote (Tang Ji He De), 1978, Yang Jiang, People's Literature
Publisher, Beijing

It is believed to be the first parallel corpus of Spanish and Mandarin
Chinese, and thus of considerable value for the computer-assisted
comparative study of Spanish and Chinese linguistics and literature.

An important feature of the current version of CSCHDQ (part I) is
that its alignment information is highly precise, since much of the
alignment of the Spanish and Chinese texts has been done manually in
the absence of adequate linguistic tools (in fact, most existing
aligners cannot handle the alignment of Chinese and Spanish texts).
More sophisticated linguistic information, e.g., syntactic-semantic
tagging, is continually being added to the current version of CSCHDQ
(I) to facilitate complex text mining with relevant corpus tools.

I am seeking institutions or potential collaborators who might be
interested in using my project for their own studies on related
subjects, e.g., the translations of Don Quijote into other languages.


Samples of parallel texts from CSCHDQ are available upon request

Samples of the parallel texts from CSCHDQ are available by email on
request. The aligned corpus is in text format. I would prefer to share
samples only with people who are genuinely interested in my project,
in the sense of providing technical or financial support to improve
the current versions of CSCHDQ.

No specific operating system is required for corpus mining with
CSCHDQ, so long as the software supports simplified Chinese. At the
moment only the first part of Don Quijote has been aligned; the second
part is expected to follow as early as December this year.

Potential collaborators in developing CSCHDQ, whether individuals or
institutions, are especially welcome to contact Meng Ji
<m.ji at imperial.ac.uk>

----------------------------------------------------------------------

The Oficina del Español en la Sociedad de la Información (OESI) and the Portal del Hispanismo plagiarize Infoling's information without citing its source.

Both organizations arrogantly violate intellectual property law with impunity, because they have public money at their disposal to face possible plagiarism lawsuits.

----------------------------------------------------------------------


