[Corpora-List] starting a machine translation project
Francis Bond
fcbond at gmail.com
Thu Sep 14 00:58:54 UTC 2006
G'day,
There was some work on Indonesian MT in the CICC project
<http://www.cicc.or.jp/english/kyoudou/mt.html>, which ended up with a
fairly useful Indonesian-English lexicon on CD-ROM. If you email CICC
then you should be able to get a copy. There are two relevant
CDs, the Indonesian one, and the terminological one, which includes a
technical lexicon for English, Malay, Thai, Indonesian, Chinese and
Japanese.
A lexicon is essential for rule-based MT and also a useful tool for
aligning and backing-off in statistical MT.
--
Francis Bond <www.kecl.ntt.co.jp/icl/mtg/members/bond/>
NTT Communication Science Laboratories | Natural Language Research Group
P.S. Here is a sample entry:
@2110IVMT
&2111mengabadikan
&2112135
&2113630
#2100to preserve; to keep alive; to immortalize
#2101IVABS
#2110to perpetuate
#2111IVABS
#2120to memorialize
#2121IVABS
#2130to capture (on canvas, etc.)
#2131IVABS
#2140to take a picture of; to photograph
#2141IVABS
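
For anyone wanting to load entries like the one above, here is a minimal
Python sketch. It rests on assumptions read off this single sample, not on
any CICC documentation: each line is taken to be one marker character
(@, & or #), a four-digit field code, and then the value. The names Field,
parse_entry and SAMPLE are hypothetical, and the meaning of the individual
codes (for example, that &2111 holds the headword, or that # codes ending
in 0 hold English glosses) is a guess from the sample only.

```python
import re
from typing import List, NamedTuple


class Field(NamedTuple):
    marker: str  # '@', '&' or '#'
    code: str    # four-digit field code (width assumed from the sample)
    value: str   # everything after the code


# Assumption: one marker character, exactly four digits, then the value.
FIELD_RE = re.compile(r"^([@&#])(\d{4})(.*)$")


def parse_entry(text: str) -> List[Field]:
    """Split one lexicon entry into (marker, code, value) fields."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        fields.append(Field(*match.groups()))
    return fields


SAMPLE = """\
@2110IVMT
&2111mengabadikan
&2112135
&2113630
#2100to preserve; to keep alive; to immortalize
#2101IVABS
#2110to perpetuate
#2111IVABS
#2120to memorialize
#2121IVABS
#2130to capture (on canvas, etc.)
#2131IVABS
#2140to take a picture of; to photograph
#2141IVABS
"""

fields = parse_entry(SAMPLE)

# Guess from the sample: the &2111 field carries the Indonesian headword.
headword = next(f.value for f in fields if f.marker == "&" and f.code == "2111")

# Guess from the sample: '#' fields whose code ends in 0 carry English glosses.
glosses = [f.value for f in fields if f.marker == "#" and f.code.endswith("0")]

print(headword)
for gloss in glosses:
    print("  ", gloss)
```

Keeping the raw (marker, code, value) triples means nothing is thrown away
if the guesses about individual codes turn out to be wrong; the headword and
gloss lookups are just filters on top.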