[Corpora-List] corpus of textbooks; "Just download the PDF's and convert to text"
Laurence Anthony
anthony0122 at gmail.com
Fri Oct 12 13:21:45 UTC 2012
I've just started working on a simple PDF-to-text converter. It's
basically a wrapper around the Python PDFMiner module. To use it, just
drag and drop in any PDF files (or use the file menu) and hit "Start".
I plan to extend it shortly to convert .doc(x) files and other file
types to plain text.
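
For anyone curious what a wrapper of this kind involves, here is a
minimal sketch. It is not AntConverter's actual code. It assumes the
pdfminer.six fork of PDFMiner and its pdfminer.high_level.extract_text
function; the helper names (pdfminer_extract, output_path_for,
convert_all) and the "same folder, .txt extension" naming rule are
made up for illustration.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def pdfminer_extract(pdf_path: Path) -> str:
    """Extract text with pdfminer.six (assumed installed separately)."""
    # Imported lazily so the other helpers work without pdfminer present.
    from pdfminer.high_level import extract_text
    return extract_text(str(pdf_path))


def output_path_for(pdf_path: Path) -> Path:
    """Hypothetical naming rule: report.pdf -> report.txt, same folder."""
    return pdf_path.with_suffix(".txt")


def convert_all(pdf_paths: Iterable[Path],
                extract: Callable[[Path], str] = pdfminer_extract) -> List[Path]:
    """Convert each PDF to a UTF-8 text file; return the files written."""
    written = []
    for pdf_path in pdf_paths:
        pdf_path = Path(pdf_path)
        if pdf_path.suffix.lower() != ".pdf":
            continue  # skip anything that is not a PDF
        text = extract(pdf_path)
        out_path = output_path_for(pdf_path)
        out_path.write_text(text, encoding="utf-8")
        written.append(out_path)
    return written
```

Calling convert_all(Path("textbooks").glob("*.pdf")) would write one
.txt file next to each PDF in a (hypothetical) textbooks folder. The
extractor is passed in as a parameter so that other converters, such
as one for .doc(x) files, could be swapped in later.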
You can download the alpha version (0.0.2) here:
http://www.antlab.sci.waseda.ac.jp/software/antconverter002/AntConverter.exe
I'll make an official release shortly that you'll be able to download
from the regular software page of my website:
http://www.antlab.sci.waseda.ac.jp/software.html
If anyone would like to see a Mac or Linux version developed, please
let me know.
Regards,
Laurence.