Corpora: Easter present

Mcenery, Tony eiaamme at exchange.lancs.ac.uk
Mon Apr 1 11:16:51 UTC 2002


Dear All,

As some of you know, I am currently working on a project (EMILLE) which aims to
build a number of corpora covering the languages of South Asia. The EMILLE
project team is Paul Baker, Andrew Hardie and Tony McEnery (Lancaster) and
Hamish Cunningham and Rob Gaizauskas (Sheffield). I thought some of you might be
interested to know that the EMILLE team has recently entered into a
collaboration with the Central Institute of Indian Languages, in Mysore, India,
which will allow us to expand the number of languages covered by the EMILLE
corpora from seven (Bangla, Gujarati, Hindi, Panjabi, Singhalese, Tamil and
Urdu) to 14 (Assamese, Bangla, Gujarati, Hindi, Kannada, Kashmiri, Malayalam,
Marathi, Oriya, Panjabi, Singhalese, Tamil, Telugu and Urdu). The CIIL/EMILLE
corpus should be ready by early 2004.

Now for the Easter present - we have updated the EMILLE website to include a
number of project reports and the proceedings of a workshop held in Mumbai
(Bombay) earlier this year. The workshop brought together UK and South Asian
researchers with an interest in corpus linguistics. The workshop was jointly
supported by the UK Engineering and Physical Sciences Research Council and the
National Centre for Software Technology, Mumbai. The papers and reports can be
found at:

http://www.emille.lancs.ac.uk/papers.htm

Happy reading! Best,

Tony


