<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hello!<br>
<br>
On GNU/Linux systems there are, among many others, easy-to-use programs such
as 'pdftotext' and 'pdftohtml'. They have served me well, but everything I
work with is in Latin script.<br>
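<br>
A minimal sketch of how 'pdftotext' could be called from Python and the result
checked for Arabic letters. It assumes the Poppler version of 'pdftotext' is
installed and on the PATH; the function names (build_command, pdf_to_text,
arabic_ratio) are made up for illustration, and the check itself is only a
rough heuristic.<br>
<pre>
```python
import subprocess
import unicodedata


def build_command(pdf_path):
    # "-enc UTF-8" asks pdftotext for UTF-8 output; "-" sends the text to stdout.
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def pdf_to_text(pdf_path):
    # Assumption: the pdftotext binary (Poppler) is installed and on the PATH.
    result = subprocess.run(build_command(pdf_path), capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")


def arabic_ratio(text):
    # Share of alphabetic characters whose Unicode name starts with "ARABIC".
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = [ch for ch in letters if unicodedata.name(ch, "").startswith("ARABIC")]
    return len(arabic) / len(letters)
```
</pre>
For example, arabic_ratio(pdf_to_text("paper.pdf")), where "paper.pdf" is a
placeholder file name, gives a number between 0 and 1. A value close to 0 for
a document known to be Arabic might suggest that the fonts in the PDF are not
mapped to Unicode properly, rather than a problem with the output encoding.<br>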
<br>
What have the complications been with the files converted with Adobe
Acrobat Professional? Problems with encoding or problems with fonts?<br>
<br>
Best wishes<br>
Kristian K<br>
<br>
<br>
<div class="moz-cite-prefix">On 11.09.2014 13:45, Eric Atwell wrote:<br>
</div>
<blockquote
cite="mid:alpine.LRH.2.11.1409111136340.32330@cslin-gps.csunix.comp.leeds.ac.uk"
type="cite">Can anyone recommend PDF-to-txt (or PDF-to-xml) tools
for Arabic?
<br>
I have had enquiries from several Arabic corpus linguistics
researchers,
<br>
example below from Anastasiya Andrusenko in Valencia
<br>
<br>
thanks - Eric Atwell, Leeds University
<br>
WWW: <a class="moz-txt-link-freetext" href="http://www.comp.leeds.ac.uk/eric">http://www.comp.leeds.ac.uk/eric</a>
<br>
<a class="moz-txt-link-freetext" href="http://www.comp.leeds.ac.uk/arabic">http://www.comp.leeds.ac.uk/arabic</a>
<br>
<br>
---------- Forwarded message ----------
<br>
Date: Thu, 11 Sep 2014 10:50:36 +0100
<br>
From: Anastasiya Andrusenko <a class="moz-txt-link-rfc2396E" href="mailto:anisika2002@gmail.com">&lt;anisika2002@gmail.com&gt;</a>
<br>
To: Eric Atwell <a class="moz-txt-link-rfc2396E" href="mailto:E.S.Atwell@leeds.ac.uk">&lt;E.S.Atwell@leeds.ac.uk&gt;</a>
<br>
Subject: Converting PDFs in Arabic to txt. for further corpus
analysis
<br>
<br>
<br>
Hi,
<br>
<br>
I saw your profile on the internet and thought maybe you could help me.
<br>
My name is Anastasiia Andrusenko; I am currently doing research on
<br>
metadiscourse features in Arabic Research Articles (Analysis of
Arabic corpus)
<br>
at the Department of Applied Linguistics of the Universitat
Politècnica de
<br>
València.
<br>
I have PDF files in Arabic. I need them to be in txt format. But
the problem
<br>
is that when I convert them with Adobe Acrobat Professional, the txt files
are not
<br>
readable.
<br>
<br>
Could you please advise any solution to this problem, or maybe you
know of any
<br>
tool for text analysis for Arabic?
<br>
Thank you in advance
<br>
<br>
Regards,
<br>
<br>
Anastasiia
<br>
<br>
</blockquote>
<br>
</body>
</html>