[Corpora-List] "Multi-encoded" corpora
Martin Wynne
martin.wynne at oucs.ox.ac.uk
Wed Oct 8 11:44:27 UTC 2008
Albretch Mueller wrote:
> ~
> I was browsing around the BAWE corpus info previously posted here and
> when I noticed all texts are in PDF format (!), it made me wonder...
Oh no, they're not! The corpus is composed of text files, with a choice of
text encodings. None of it is in PDF files. There is some prose
documentation in PDF files to accompany the corpus in the package of
files which can be downloaded from the OTA.
Martin
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora