[Corpora-List] determining the correct character encoding

Alexander Schutz goalscoringsuperstarhero at gmail.com
Wed Oct 12 10:27:23 UTC 2005


Dear List,

here is a short summary of the contributions to my Java
charset-detection trouble:

Peter Adolphs suggested having a look at
http://glaforge.free.fr/wiki/index.php?wiki=GuessEncoding

David Evans proposed using jchardet, the Java port of the
Mozilla charset detection, to be found at
http://jchardet.sourceforge.net/index.html#4
which I found to be more customizable than the first one.
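
For readers who want to see the basic idea without either library, here
is a minimal sketch that uses only the Java standard library. It is not
the method of GuessEncoding or jchardet: it simply tries to decode the
bytes strictly under each candidate charset and keeps the ones that
raise no error. The class name CharsetGuess and the method
plausibleCharsets are made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of jchardet or GuessEncoding.
class CharsetGuess {

    // Returns the candidates under which the bytes decode without any
    // malformed or unmappable input, in the order they were given.
    static List<Charset> plausibleCharsets(byte[] data, List<Charset> candidates) {
        List<Charset> plausible = new ArrayList<>();
        for (Charset cs : candidates) {
            try {
                cs.newDecoder()
                  .onMalformedInput(CodingErrorAction.REPORT)
                  .onUnmappableCharacter(CodingErrorAction.REPORT)
                  .decode(ByteBuffer.wrap(data));
                plausible.add(cs);
            } catch (CharacterCodingException e) {
                // The bytes are not valid in this charset; rule it out.
            }
        }
        return plausible;
    }

    public static void main(String[] args) {
        // "Schütz" encoded as ISO-8859-1; the byte 0xFC is invalid in
        // both US-ASCII and UTF-8.
        byte[] sample = "Sch\u00fctz".getBytes(StandardCharsets.ISO_8859_1);
        List<Charset> candidates = Arrays.asList(
                StandardCharsets.US_ASCII,
                StandardCharsets.UTF_8,
                StandardCharsets.ISO_8859_1);
        System.out.println(plausibleCharsets(sample, candidates));
    }
}
```

Note that this can only rule charsets out. A single-byte charset such as
ISO-8859-1 accepts every byte sequence, so it always survives the check;
telling such encodings apart needs a statistical detector like the ones
mentioned above.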

Thank you very much for contributing; it has already been of
great help :-)

Alex
--
Alexander Schutz
Student of Computational Linguistics
University of Saarland, Germany