[Corpora-List] corpus alignment competition

Chris Brew cbrew at acm.org
Mon Jun 7 13:57:05 UTC 2004


Dear Scott

I'm interested in participating. We'll need to work out
how. There are lots of Comp Ling grad students here, and
this seems like something they could sensibly do.


Employing our grad students on RA-ships is expensive (health
insurance, fees, etc.), so I don't expect we would be able to
do that, but I think it pretty certain that someone will find
this engaging. Are there tasks that we could allocate that would
take a smaller percentage of somebody's time?

Columbus has a large Somali community, and many of
our grad students are Mandarin speakers. We could
probably find some speakers of languages from the
Indian subcontinent as well, but that may be
more work than we can undertake with the resources we
are able to deploy.

Chris


On Mon, Jun 07, 2004 at 11:11:41AM +0100, Piao, Songlin wrote:
> Hi,
>
> We are planning a project in Lancaster, UK, aiming to expand and align the EMILLE multilingual parallel corpora. If funded, the project will result in a parallel corpus of eleven languages (Arabic, Bangla, Chinese, English, Gujarati, Hindi, Panjabi, Polish, Somali, Urdu and Vietnamese). When we have expanded the corpus, we want to hold an alignment competition on this data. Because the languages involved include a wide range of typologically different/distant languages, the corpus should present a tough challenge to current alignment algorithms, and hence provide an excellent opportunity to test the ability of current alignment algorithms/tools on a wide range of languages.
>
> At this stage we are asking for expressions of interest in taking part in the competition. We need these expressions of interest at this stage because we intend to include a small amount of money in the project budget for each competing team so that they can hire native speakers etc. to help tune their algorithms in advance of the competition proper. If you are interested in taking part in this alignment competition, please let us know by contacting Dr. Scott Piao (s.piao at lancaster.ac.uk) in the first instance.
>
> Thank you,
>
> Paul Baker, Tony McEnery & Scott Piao
> -------------------------------------
> Dept. of Linguistics and MEL
> Lancaster University
> Lancaster LA1 4YT
> United Kingdom
>

--
==================================================================
Dr. Chris Brew,  Assistant Professor of Computational Linguistics
Department of Linguistics, The Ohio State University
1712 Neil Avenue, Columbus OH 43210
Tel:  +614 292 5420 Fax: +614 292 8833
Web:http://www.ling.ohio-state.edu/~cbrew
Email:c-b-r-e-w at acm.org (delete hyphens)
Other appointments: Cognitive Science, Computer Science
==================================================================


