[Corpora-List] Character encoding headaches

Guy De Pauw guy.depauw at ua.ac.be
Tue Aug 4 10:17:43 UTC 2009


Encoding is a huge headache when working with African languages. The
first thing I do when data comes in is transcode it to UTF-8 right away.
I tend not to bother with iconv, as I find it very unreliable. I am not
ashamed to admit that I have found Microsoft Word to be a pretty good
transcoder, and much better than OpenOffice Writer in any case. In the
text-only realm, Notepad++ does an extremely good job (better than
(x)emacs), and I recently discovered the excellent UTFCAST app as well.
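
For anyone who would rather script the "transcode to UTF-8 on arrival"
step than use the GUI tools mentioned above, here is a minimal sketch in
Python. It is not something the post itself describes: the function
names transcode_to_utf8 and transcode_file are hypothetical, and it
assumes you already know the source encoding (it makes no attempt to
guess it). Decoding is strict, so bytes that are invalid in the declared
encoding raise an error instead of being silently mangled.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw bytes from a known source encoding and re-encode as UTF-8.

    Hypothetical helper for illustration; source_encoding must be a codec
    name Python recognises (e.g. "latin-1", "cp1252").
    """
    # errors="strict" raises UnicodeDecodeError on bytes that are not valid
    # in the declared encoding, rather than replacing or dropping them.
    text = raw.decode(source_encoding, errors="strict")
    return text.encode("utf-8")


def transcode_file(src_path: str, dst_path: str, source_encoding: str) -> None:
    """Read src_path as bytes, transcode, and write UTF-8 bytes to dst_path."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(transcode_to_utf8(raw, source_encoding))
```

Working on bytes rather than opening the file in text mode keeps the
decode step explicit, so a wrong guess about the source encoding tends
to fail loudly (at least for encodings such as UTF-8 that reject invalid
byte sequences) rather than produce quietly corrupted text.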

g





