Corpora: language boundaries + code switching

D C Souter cs at scs.leeds.ac.uk
Tue Jul 4 11:24:33 UTC 2000


Dear all,

I'm looking for details of projects on automatic boundary identification
in bilingual/multilingual texts, and corpus material containing such texts.
I would prefer it if one of the languages were English, and the texts were
ASCII. I suppose one such source would be a corpus showing code switching.
Does anyone know of such material or projects?

(I know we could create such material artificially, but I was hoping to
find naturally occurring material.)

Clive

 ========================================================================
Clive Souter                                        Tel: +44 113 233 5460
Lecturer & Senior Admissions Tutor                  Fax: +44 113 233 5468
School of Computer Studies
University of Leeds
Leeds LS2 9JT
UK                                              Email: cs at scs.leeds.ac.uk
 ========================================================================


