[Corpora-List] DTD for HTML documents?

Peter Adolphs peter.adolphs at student.hu-berlin.de
Fri Jun 13 11:32:16 UTC 2003


wassim souayah wrote:
> I'm attempting to convert HTML documents to XML.
>
> Could someone help me find a DTD for HTML documents,
> if one exists?

Why do you need a DTD to convert HTML to XML?

You could use HTML Tidy to convert your HTML files to XHTML (which is an
XML format). If you want to process those files further, you could use XSLT.

See
http://www.w3.org/People/Raggett/tidy/
http://www.w3.org/TR/xslt
http://www.w3.org/MarkUp/ (XHTML and HTML)
http://xml.apache.org/xalan-j/index.html (an XSLT processor)
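
A rough sketch of the two steps, in Python, in case it helps. It assumes a
tidy executable on your PATH that accepts the -asxhtml, -numeric, -quiet and
-output options. The function names and the SAMPLE document are made up for
illustration. The second step uses Python's built-in ElementTree parser as a
stand-in for an XSLT processor, only to show that Tidy's output can be read
by any ordinary XML tool.

```python
import subprocess
import xml.etree.ElementTree as ET

# Namespace that XHTML elements live in.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical stand-in for a file produced by Tidy.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head>"
    "<body><p>First <b>bold</b> paragraph.</p><p>Second.</p></body>"
    "</html>"
)


def tidy_command(html_path, xhtml_path):
    """Build a Tidy command line that writes XHTML to xhtml_path."""
    # -asxhtml: emit XHTML instead of HTML
    # -numeric: write numeric entities, which a plain XML parser accepts
    # -quiet:   suppress non-essential output
    return ["tidy", "-quiet", "-asxhtml", "-numeric",
            "-output", xhtml_path, html_path]


def convert(html_path, xhtml_path):
    """Run Tidy; exit status 1 means warnings only, 2 means errors."""
    result = subprocess.run(tidy_command(html_path, xhtml_path))
    if result.returncode >= 2:
        raise RuntimeError(f"tidy reported errors for {html_path}")


def paragraph_texts(xhtml_source):
    """Return the text of every <p> element in an XHTML string."""
    root = ET.fromstring(xhtml_source)
    return ["".join(p.itertext()).strip()
            for p in root.iter(f"{{{XHTML_NS}}}p")]


if __name__ == "__main__":
    print(paragraph_texts(SAMPLE))
```

Calling convert("page.html", "page.xhtml") would run the first step on a real
file (the file names are placeholders). No DTD is needed for either step.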

Best regards,
Peter Adolphs.


