[Corpora-List] Convertor XML to TXT
Emiliano Guevara
emiliano.guevara at unibo.it
Fri May 23 12:13:34 UTC 2008
Hi Souhir,
there are many ways to do this, but you will have to be more specific
about what you mean by "convert".
- If you are NOT interested at all in keeping the structured
information in your XML, then even a simple regex could do the job
(basically deleting all the tags and keeping the raw text).
- If you DO want to keep the structured information (or a part of it),
then you will need to parse the XML, find the elements that you want
to keep and print them out as you wish. You can do this by writing a
script in any programming language with good XML libraries, but I think
that Perl could give you a good start. Some examples/tutorials:
http://www.perlmonks.org/?node_id=46517
http://www.ibm.com/developerworks/xml/library/x-domprl/
http://articles.techrepublic.com.com/5100-10878_11-1044612.html
http://articles.techrepublic.com.com/5100-10878_11-5363190.html?tag=rbxccnbtr1
- Alternatively, you could apply a simple "transformation" with
XSL/XSLT (some XML editors let you do this directly; I have used
Oxygen for this: http://www.oxygenxml.com/ ).
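
To make the first option concrete, here is a minimal sketch of the
"delete all the tags" approach. It is written in Python rather than Perl
purely for illustration; the function name strip_tags and the sample
document are made up for this example. It assumes reasonably simple XML:
a ">" inside an attribute value or a CDATA section would confuse the
pattern, and replacing tags with spaces can split a word that has inline
markup in the middle of it.

```python
import html
import re


def strip_tags(xml_text):
    """Return the raw text of an XML string with all markup removed.

    Hypothetical helper: a crude regex approach, not a real XML parser.
    """
    # Remove comments first, since they may contain '>' characters.
    text = re.sub(r"<!--.*?-->", " ", xml_text, flags=re.DOTALL)
    # Remove tags, the XML declaration and processing instructions.
    text = re.sub(r"<[^>]+>", " ", text)
    # Decode entities such as &amp; and &lt; (html.unescape also accepts
    # HTML-only entity names, which plain XML does not define).
    text = html.unescape(text)
    # Collapse the whitespace left behind by the removed markup.
    return re.sub(r"\s+", " ", text).strip()


sample = (
    '<?xml version="1.0"?><doc><s id="1">Hello &amp; welcome</s>'
    '<!-- note --><s id="2">Bye</s></doc>'
)
print(strip_tags(sample))  # prints: Hello & welcome Bye
```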
All of these involve some programming/coding; I don't know of any
pre-cooked tools for the task.
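
As a rough sketch of the second option (parse the XML and print only the
elements you want), something like the following could work. Again this
is Python, not Perl, and only an illustration: the function name
extract_text and the element names corpus, doc, title, s and w are
assumptions standing in for whatever your own files contain.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_text, tag):
    """Return the text of every <tag> element, one string per element.

    Hypothetical helper; 'tag' is whichever element you want to keep.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter(tag):
        # itertext() also yields the text of nested child elements;
        # split/join normalises the whitespace.
        text = " ".join("".join(elem.itertext()).split())
        if text:
            lines.append(text)
    return lines


# Made-up sample document for illustration only.
sample = """<corpus>
  <doc id="d1">
    <title>First text</title>
    <s>One <w pos="N">sentence</w>.</s>
    <s>Another one.</s>
  </doc>
</corpus>"""

for line in extract_text(sample, "s"):
    print(line)
```

With the sample above this prints the two sentences on separate lines and
skips the title. Unlike the regex approach, a real parser also copes with
comments, CDATA and entities, and it will complain if the file is not
well-formed XML.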
Good luck,
E.
On May 23, 2008, at 13:32 PM, souhir hajji wrote:
> Dear all,
> Does anyone know about any FREE convertor system that could be used
> to convert XML files to TXT files?
>
> Many thanks for your help.
>
> Souhir Hajji
> Master student
> MIRACL Laboratory
> Sfax, TUNISIA
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
****************************************
Emiliano R. Guevara
Facoltà di Lingue e Lett. Straniere
Dipart. di Lingue e Lett. Straniere
Università di Bologna
Via Cartoleria 5 (40124) Bologna, Italia
http://morbo.lingue.unibo.it/
emiliano.guevara at unibo.it
emiguevara at gmail.com
****************************************