[Corpora-List] language sort
Daniel Zeman
zeman at ufal.mff.cuni.cz
Wed Jan 10 21:07:50 UTC 2007
Maria,
why does the file-by-file approach not work for you? Does that mean that
you potentially have more than one language within one file?
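If the only obstacle is that the available tools take one file at a time, a small wrapper loop may be enough. Below is a rough sketch in Python, not a tested solution for your data. The stopword lists, the guess_language and sort_by_language functions, and the directory layout are all made up for illustration, and the toy stopword counter is only a stand-in for whichever real language identifier you prefer.

```python
import shutil
from pathlib import Path

# Hypothetical, deliberately tiny stopword lists (illustration only).
# A real language identifier would replace guess_language() entirely.
STOPWORDS = {
    "en": {"the", "and", "of", "is", "with", "that", "this", "for"},
    "es": {"el", "los", "las", "y", "es", "con", "una", "pero"},
    "fr": {"le", "les", "et", "est", "avec", "une", "dans", "pour"},
    "pt": {"o", "os", "e", "é", "com", "uma", "não", "para"},
}


def guess_language(text):
    """Return the code of the language whose stopwords occur most often.

    Returns "unknown" when no stopword from any list is found.
    """
    words = text.lower().split()
    counts = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


def sort_by_language(src_dir, dst_dir):
    """Copy every *.txt file in src_dir into dst_dir/<language code>/.

    Returns a dict mapping each file name to the guessed language code.
    """
    result = {}
    for path in sorted(Path(src_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        lang = guess_language(text)
        target = Path(dst_dir) / lang
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)  # copy, so the originals stay put
        result[path.name] = lang
    return result
```

Calling sort_by_language("texts", "sorted") would leave one subdirectory per guessed language under "sorted". This only helps if each file is in a single language; mixed-language files would need to be split into paragraphs or sentences first.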
Dan
Maria Esteva wrote:
> Dear all,
>
> I am wondering if somebody knows of a program that will recognize and
> sort large sets of files according to language. For my text mining
> project, I need to sort sets of files that contain electronic texts
> mostly in Spanish and English (although there is some French and some
> Portuguese as well). There are many free language recognition
> programmes out there, but they work on a file-by-file basis. Let me
> know if you have some advice.
>
> Thanks,
>
> Maria Esteva
> PhD Candidate
> School of Information
> University of Texas at Austin