[Corpora-List] XML parsers vs regex

Adam Lopez alopez at cs.jhu.edu
Mon Jun 30 12:37:26 UTC 2014


I'm just going to leave this here.
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

More seriously, it depends on what you want to do.
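
As a minimal sketch of where the choice starts to matter: the toy markup below and the names SAMPLE, tokens_with_regex and tokens_with_parser are assumptions made up for illustration, not taken from any real corpus. The regex hard-codes one attribute order and one quoting style, while the standard-library parser (xml.etree.ElementTree here, though any XML parser would do) reads every <w> element and decodes entities.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy corpus: element and attribute names are invented for this example.
SAMPLE = """<corpus>
  <s id="1"><w pos="DT" lemma="the">The</w> <w lemma="cat" pos="NN">cat</w></s>
  <s id="2"><w pos='VB' lemma='sleep'>sleeps</w> <w pos="CC" lemma="&amp;">&amp;</w></s>
</corpus>"""


def tokens_with_regex(text):
    """Extract (form, pos, lemma) with a regex.

    Assumes double-quoted attributes in the order pos, lemma, and does not
    decode entities, so it silently skips or mangles anything else.
    """
    pattern = re.compile(r'<w pos="([^"]*)" lemma="([^"]*)">([^<]*)</w>')
    return [(form, pos, lemma) for pos, lemma, form in pattern.findall(text)]


def tokens_with_parser(text):
    """Extract (form, pos, lemma) with an XML parser.

    Attribute order, quoting style and entity references are handled by the parser.
    """
    root = ET.fromstring(text)
    return [(w.text, w.get("pos"), w.get("lemma")) for w in root.iter("w")]


if __name__ == "__main__":
    print(tokens_with_regex(SAMPLE))
    print(tokens_with_parser(SAMPLE))
```

On this sample the regex version finds two of the four tokens and leaves "&amp;" undecoded, while the parser version returns all four. If the markup is flat and completely regular, a regex may well be enough; the differences show up once attribute order, quoting, entities or nesting vary.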


On Mon, Jun 30, 2014 at 7:55 AM, Matías Guzmán Naranjo <mortem.dei at gmail.com> wrote:

> Dear all,
>
> When working with XML-tagged corpora I have always used regex to extract
> the information I need; I have never used XML parsers like NLTK's or any
> other. Is there an advantage to using parsers vs. using regex? Which? What
> do you personally use?
>
> Best,
>
> Matías
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>