[Corpora-List] Corpora for language identification training?
Adam Funk
a.funk at dcs.shef.ac.uk
Thu Apr 19 13:15:17 UTC 2007
[19/04/07 13:35] Dean Jones wrote:
> Sorry, I wasn't clear. Personally I'm interested in language ID for
> "written" texts - specifically, email, although others on the list may
> be interested in spoken language ID, so I wouldn't want to discourage
> responses about that.
Here's a tool you might be interested in:
http://www.let.rug.nl/~vannoord/TextCat/
along with a list of others:
http://www.let.rug.nl/~vannoord/TextCat/competitors.html