[Corpora-List] Sentence alignment between Traditional Chinese and English

liling tan alvations at gmail.com
Fri Jul 11 05:44:09 UTC 2014


Dear Richard,

From our previous experiments with GaChalign, it seems there is no way to
tweak Gale-Church to fit Japanese. No matter how we varied the parameters,
we could not get good alignments; accuracy capped out at a little over 70%
(see https://db.tt/LLrul4zP and http://code.google.com/p/gachalign/ for
details).
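
For anyone unfamiliar with the parameters in question, here is a minimal
sketch of the length-based match cost that Gale-Church relies on. It is not
GaChalign's code: the function name is made up for illustration, and the
defaults (mean length ratio 1.0, variance 6.8) are the character-length
values from the original Gale & Church paper, which is an assumption that
may not hold for Chinese or Japanese paired with English.

```python
import math

def length_cost(len_src, len_tgt, mean_ratio=1.0, variance=6.8):
    """Cost (negative log-probability) of matching two segments by length.

    len_src, len_tgt: segment lengths (e.g. in characters).
    mean_ratio: expected target length per unit of source length.
    variance: variance of that ratio per unit of source length.
    Both parameters are illustrative defaults, not tuned values.
    """
    if len_src == 0 and len_tgt == 0:
        return 0.0
    # Normalised deviation of the observed target length from the expected one.
    delta = (len_tgt - len_src * mean_ratio) / math.sqrt(max(len_src, 1) * variance)
    # Two-tailed probability of a deviation at least this large under N(0, 1).
    prob = math.erfc(abs(delta) / math.sqrt(2))
    # Guard against log(0) for extreme length mismatches.
    return -math.log(max(prob, 1e-300))
```

A full aligner adds a prior for each match type (1-1, 1-2, 2-1, and so on)
and picks the cheapest path by dynamic programming; mean_ratio and variance
are the knobs we kept varying.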

Other experiments have reported some success with length-based
alignment for English-Chinese, such as
http://ieeexplore.ieee.org/xpl/abstractSimilar.jsp?arnumber=6121503 (I don't
have access to the paper, so you will have to check it yourself for the
details).

I think you will have to go with lexicon-based sentence alignment methods,
such as the Champollion aligner (http://champollion.sourceforge.net/). Its
authors reported precision above 95%.
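
To illustrate what "lexicon-based" means here, below is a rough sketch of
scoring a candidate sentence pair with a bilingual dictionary. This is not
Champollion's actual scoring, just the general idea; the function name and
the toy lexicon are hypothetical. Chinese is matched by substring so that
no word segmenter is needed.

```python
def lexical_overlap(en_sentence, zh_sentence, lexicon):
    """Fraction of dictionary-covered English words with a translation
    that appears somewhere in the Chinese sentence.

    lexicon: dict mapping a lowercase English word to a list of Chinese
    translations (a toy stand-in for a real bilingual dictionary).
    """
    words = en_sentence.lower().split()
    # Only words the lexicon knows about can provide evidence either way.
    known = [w for w in words if w in lexicon]
    if not known:
        return 0.0
    # Substring test avoids having to segment the Chinese text.
    hits = sum(1 for w in known if any(t in zh_sentence for t in lexicon[w]))
    return hits / len(known)


toy_lexicon = {"cat": ["貓"], "eats": ["吃"], "fish": ["魚"]}
score_match = lexical_overlap("The cat eats fish", "貓吃魚", toy_lexicon)
score_mismatch = lexical_overlap("The cat eats fish", "狗喝水", toy_lexicon)
```

A score like this does not depend on sentence length at all, which is why
such methods hold up better for language pairs where length correlates
poorly.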


Regards,
Liling


On Fri, Jul 11, 2014 at 2:06 AM, Francis Bond <bond at ieee.org> wrote:

> ---------- Forwarded message ----------
> From: Sutcliffe, Richard F E <rsutcl at essex.ac.uk>
> Date: Fri, Jul 11, 2014 at 7:46 AM
> Subject: [Corpora-List] Sentence alignment between Traditional Chinese
> and English
> To: "CORPORA at UIB.NO" <CORPORA at uib.no>
>
>
> Does anyone know of tools to perform sentence alignment between
> Traditional Chinese and English parallel texts? As I recall, the Gale
> & Church algorithm based on sentence length is not suitable for
> Chinese.
>
> Many thanks for any information.
>
> richard
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
> --
> Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
> Division of Linguistics and Multilingual Studies
> Nanyang Technological University
>