[Corpora-List] Character encoding headaches
Dale Gerdemann
dg at sfs.uni-tuebingen.de
Mon Aug 3 09:37:04 UTC 2009
No matter what ready-made tools you use, there will be errors and
corruptions. There is no substitute for learning about character
encodings and writing the fix-up programs yourself. Start by reading the
Wikipedia article on UTF-8.
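
As an illustration of the kind of fix-up program meant here, below is a
minimal sketch in Python. It assumes one specific and common corruption:
UTF-8 bytes that were mistakenly decoded as Latin-1, so that "ü" shows up
as "Ã¼". The function name fix_mojibake and its wrong_encoding parameter
are hypothetical, chosen only for this example. Real corpora usually mix
several kinds of damage, so treat this as a starting point rather than a
general solution.

```python
def fix_mojibake(text: str, wrong_encoding: str = "latin-1") -> str:
    """Undo one round of UTF-8 bytes having been decoded as wrong_encoding.

    Assumption: the damage is exactly one bad decode step. If the text
    does not fit that pattern, it is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of corruption (or already clean): leave it alone.
        return text


if __name__ == "__main__":
    damaged = "T\u00c3\u00bcbingen"  # "Tübingen" after a wrong Latin-1 decode
    print(fix_mojibake(damaged))
```

Returning the input unchanged on failure is a deliberate choice: correctly
encoded non-ASCII text will normally fail the round trip and pass through
untouched, although a few clean strings that happen to form valid UTF-8
byte sequences could still be altered, so check the results on a sample.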
Dale Gerdemann