Corpora: Conversion of PDF files

Simon G. J. Smith smithsgj at eee.bham.ac.uk
Thu May 24 10:40:17 UTC 2001


MS Word: www.adobe.com will do free conversions FROM Word (the results get emailed back to you, and you can only do about 5 per email address), but I don't know about the other way round.

To extract text:

From Acrobat (mine is 4.0), choose the text select tool (a capital T with a little box), then copy and paste the text you want. This works one page at a time.

From ghostview (if it can read your particular PDF; it sometimes doesn't work for me), do the whole thing at once with Edit|Text Extract. It's described in the gsview help.
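
If you would rather script the whole-document extraction than use the menu, something like the following might work. It is only a sketch under assumptions that are not part of the setup described above: a Ghostscript release recent enough to include the txtwrite output device, and an executable called "gs" on the PATH (on Windows it is usually gswin32c or gswin64c). The function names are made up for illustration.

```python
import shutil
import subprocess


def text_extract_command(pdf_path, txt_path, gs="gs"):
    """Build a Ghostscript command line that writes a PDF's text to a file.

    Assumes a Ghostscript build that provides the txtwrite device.
    """
    return [
        gs,
        "-q",                        # suppress the startup banner
        "-dNOPAUSE",                 # do not pause after each page
        "-dBATCH",                   # exit once the input has been processed
        "-sDEVICE=txtwrite",         # text-extraction output device
        "-sOutputFile=" + txt_path,  # where the extracted text goes
        pdf_path,
    ]


def extract_text(pdf_path, txt_path, gs="gs"):
    """Run the extraction; raises if Ghostscript is missing or exits with an error."""
    if shutil.which(gs) is None:
        raise FileNotFoundError(gs + " not found on PATH")
    subprocess.run(text_extract_command(pdf_path, txt_path, gs), check=True)
```

Calling extract_text("paper.pdf", "paper.txt") would then do all pages in one go. As with the menu route, it only helps if Ghostscript can read the PDF in the first place.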

You can convert whole pages to bitmaps with gsview, and I think in Acrobat you can select graphics from the PDF file (the Acrobat help says to use the graphics select tool, but I can't find this tool). The bitmap file can then be viewed from Word.
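
The page-to-bitmap step can also be driven from Ghostscript directly rather than through gsview. The sketch below is a guess at how one might do it, not what gsview itself runs: it assumes a "gs" executable on the PATH and a build with the png16m device, and the function names, output file pattern and 150 dpi default are arbitrary choices for illustration.

```python
import shutil
import subprocess


def page_bitmap_command(pdf_path, out_pattern="page-%03d.png", dpi=150, gs="gs"):
    """Build a Ghostscript command line that renders each PDF page to a PNG.

    The %03d in out_pattern is expanded by Ghostscript to the page number.
    """
    return [
        gs,
        "-q",                           # suppress the startup banner
        "-dNOPAUSE",                    # do not pause after each page
        "-dBATCH",                      # exit once the input has been processed
        "-sDEVICE=png16m",              # 24-bit colour PNG output device
        "-r" + str(dpi),                # rendering resolution in dots per inch
        "-sOutputFile=" + out_pattern,  # one output file per page
        pdf_path,
    ]


def render_pages(pdf_path, out_pattern="page-%03d.png", dpi=150, gs="gs"):
    """Run the rendering; raises if Ghostscript is missing or exits with an error."""
    if shutil.which(gs) is None:
        raise FileNotFoundError(gs + " not found on PATH")
    subprocess.run(page_bitmap_command(pdf_path, out_pattern, dpi, gs), check=True)
```

Each resulting PNG is an ordinary image file, one per page.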


