[Corpora-List] XML parsers vs regex

Kilian Evang maschinenraum at texttheater.net
Mon Jun 30 20:40:26 UTC 2014


On 06/30/2014 06:02 PM, Milos Jakubicek wrote:
> Exactly. Though XML-aware tools (like XPath) look like "the right
> thing", you should avoid them as far as you possibly can. A regexp
> will always be faster, simpler, and easier for others to understand.

Easier to understand? I'd say once you have a basic understanding of
XPath, it is way more readable than regexes. For example:

Regex: <word pos="([^"]+)
XPath: //word/@pos

Plus, of course, the regex will break without you noticing and when you
least expect it, e.g. as soon as another attribute precedes pos or the
value is written in single quotes.
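
To make the silent breakage concrete, here is a minimal sketch in Python.
The sample data and the function names are made up for illustration. Note
that the standard library's ElementTree only supports a subset of XPath
and cannot select attributes with //word/@pos directly, so the sketch
walks the <word> elements and reads pos from each one, which should be
equivalent for this purpose.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: three equally valid ways of writing a <word> element.
SAMPLE = """<s>
<word pos="DT">the</word>
<word lemma="cat" pos="NN">cat</word>
<word pos='VBZ'>sleeps</word>
</s>"""


def pos_by_regex(xml_text):
    """Collect pos values with the regex from the message above."""
    return re.findall(r'<word pos="([^"]+)', xml_text)


def pos_by_parser(xml_text):
    """Collect pos values with an XML parser (rough equivalent of //word/@pos)."""
    root = ET.fromstring(xml_text)
    return [w.get("pos") for w in root.iter("word") if w.get("pos") is not None]


regex_tags = pos_by_regex(SAMPLE)
parser_tags = pos_by_parser(SAMPLE)

print("regex: ", regex_tags)
print("parser:", parser_tags)
```

On this sample the regex only finds the first tag: it misses the element
where lemma comes before pos and the one that uses single quotes, and it
raises no error while doing so. The parser returns all three.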

Kilian
