[Corpora-List] corpus alignment competition
Piao, Songlin
s.piao at lancaster.ac.uk
Mon Jun 7 10:11:41 UTC 2004
Hi,
We are planning a project at Lancaster, UK, to expand and align the EMILLE multilingual parallel corpora. If funded, the project will result in a parallel corpus of eleven languages (Arabic, Bangla, Chinese, English, Gujarati, Hindi, Panjabi, Polish, Somali, Urdu and Vietnamese). Once we have expanded the corpus, we want to hold an alignment competition on this data. Because the corpus covers a wide range of typologically distant languages, it should present a tough challenge to current alignment algorithms and tools, and hence an excellent opportunity to test how well they perform across such languages.
At this stage we are asking for expressions of interest in taking part in the competition. We need them now because we intend to include a small amount of money in the project budget for each competing team, so that they can hire native speakers etc. to help tune their algorithms in advance of the competition proper. If you are interested in taking part, please contact Dr. Scott Piao (s.piao at lancaster.ac.uk) in the first instance.
Thank you,
Paul Baker, Tony McEnery & Scott Piao
-------------------------------------
Dept. of Linguistics and MEL
Lancaster University
Lancaster LA1 4YT
United Kingdom