[Corpora-List] starting a machine translation project

Francis Bond fcbond at gmail.com
Thu Sep 14 00:58:54 UTC 2006


G'day,

There was some work on Indonesian MT in the CICC project
<http://www.cicc.or.jp/english/kyoudou/mt.html>, which produced a
fairly useful Indonesian-English lexicon on CD-ROM.  If you email CICC,
you should be able to get a copy.  There are two relevant
CDs: the Indonesian one and the terminological one, which includes a
technical lexicon for English, Malay, Thai, Indonesian, Chinese and
Japanese.

A lexicon is essential for rule-based MT and is also a useful tool for
alignment and back-off in statistical MT.

-- 
Francis Bond  <www.kecl.ntt.co.jp/icl/mtg/members/bond/>
NTT Communication Science Laboratories | Natural Language Research Group

P.S. Here is a sample entry:

@2110IVMT
&2111mengabadikan
&2112135
&2113630
#2100to preserve; to keep alive; to immortalize
#2101IVABS
#2110to perpetuate
#2111IVABS
#2120to memorialize
#2121IVABS
#2130to capture (on canvas, etc.)
#2131IVABS
#2140to take a picture of; to photograph
#2141IVABS
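
A minimal sketch of how an entry like this might be read, assuming each
line is one marker character (@, & or #), then a four-digit field code,
then the value.  That layout is inferred from the sample above only, not
from any CICC documentation, and the names Field and parse_entry are
hypothetical.  The meaning of the individual codes is not interpreted.

```python
import re
from typing import List, NamedTuple


class Field(NamedTuple):
    marker: str  # leading symbol: '@', '&' or '#'
    code: str    # four-digit field code, e.g. '2111'
    value: str   # rest of the line, uninterpreted


# Assumed layout: marker character, four digits, then the value.
FIELD_RE = re.compile(r"^([@&#])(\d{4})(.*)$")


def parse_entry(text: str) -> List[Field]:
    """Split one lexicon entry into (marker, code, value) fields."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between or around entries
        match = FIELD_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        fields.append(Field(*match.groups()))
    return fields


SAMPLE = """\
@2110IVMT
&2111mengabadikan
&2112135
&2113630
#2100to preserve; to keep alive; to immortalize
#2101IVABS
#2110to perpetuate
#2111IVABS
#2120to memorialize
#2121IVABS
#2130to capture (on canvas, etc.)
#2131IVABS
#2140to take a picture of; to photograph
#2141IVABS
"""

if __name__ == "__main__":
    for field in parse_entry(SAMPLE):
        print(field.marker, field.code, field.value)
```

Keeping the marker and code as plain strings avoids guessing at what
they mean; a real loader would map them to named fields once the CD's
own format description is at hand.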


