[Corpora-List] Convertor XML to TXT

Greg Peterson peterson at notredame.ac.jp
Fri May 23 13:51:37 UTC 2008


At Fri, 23 May 2008 13:32:10 +0200, souhir hajji wrote:

> Does anyone know about any FREE convertor system that could be used
> to convert XML files to TXT files?

If you have a free XSLT processor, such as Saxon, you can make a
simple XSLT stylesheet, "xml2txt.xsl", with this content:

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
</xsl:stylesheet>

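That will not format it nicely, but it should work with any XML file:
with no templates of its own, the stylesheet falls back on XSLT's
built-in rules, which output the text content of every element and
drop the tags and attributes.  You can add XSLT templates for your
XML application to control the plain-text formatting precisely.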
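
For example, here is a minimal sketch of such templates.  The element
names "title" and "p" are only assumptions for illustration (use the
elements of your own XML application), and the patterns assume those
elements are not in a namespace:

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <!-- Drop whitespace-only text nodes between elements -->
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical element: print each title, then a blank line -->
  <xsl:template match="title">
    <xsl:value-of select="normalize-space(.)"/>
    <xsl:text>&#10;&#10;</xsl:text>
  </xsl:template>

  <!-- Hypothetical element: one paragraph per line, whitespace collapsed -->
  <xsl:template match="p">
    <xsl:value-of select="normalize-space(.)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>

Elements without a template of their own still fall through to the
built-in rules, so their text is not lost.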

Saxon is a free XSLT and XQuery processor that requires Java
(http://saxon.sourceforge.net/).  It will work on many systems,
including Microsoft Windows.  You can also use "xsltproc", which is
part of GNOME libxslt (built on libxml2).  Most GNU/Linux systems
should already have both libraries installed (free).

In a command-line environment (terminal) on FreeBSD, GNU/Linux, 
Mac OS X, Solaris, etc., convert XML to text like this:

    saxon -novw FILENAME.xml xml2txt.xsl > FILENAME.txt

or:

    xsltproc xml2txt.xsl FILENAME.xml > FILENAME.txt

See each processor's documentation for the other available options.

Another way that produces nicer output is to first transform your
XML to XHTML.  Then with a browser you can save the XHTML file as
plain text.  You can also add a CSS stylesheet for nice printing.
Of course, that requires an XSLT stylesheet for XML -> XHTML.
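
As a rough sketch of what such a stylesheet might look like, assuming
a hypothetical source format with "title" and "p" elements in no
namespace (adapt the match patterns to your own XML application):

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml" version="1.0">
  <xsl:output method="xml"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>

  <!-- Wrap the whole document in an XHTML skeleton -->
  <xsl:template match="/">
    <html>
      <head><title><xsl:value-of select="(//title)[1]"/></title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>

  <!-- Hypothetical source elements mapped to XHTML equivalents -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>
  <xsl:template match="p">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>

It can be run with the same commands as above, redirecting the output
to FILENAME.html instead of FILENAME.txt.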

Best wishes,

Greg Peterson <peterson at notredame.ac.jp>
Kyoto Notre Dame University

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


