[Corpora-List] Methods for detecting lists?

Paul Kalmar pkalmar at gmail.com
Tue Dec 16 23:29:13 UTC 2008


I am working on a problem that requires detecting and parsing lists in
plain text.  Is there any program, algorithm, research, or heuristic for
detecting lists in raw (and structured) text?  I have tried simplistic
things like looking for commas and "and", but would like a more robust
approach that could also detect lists in other forms, possibly with
extraneous text inside them.  I would be grateful for any pointers you
could give me.
Thanks,
Paul Kalmar
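
For illustration, the simplistic comma-and-"and" heuristic mentioned above
might look something like the following minimal Python sketch.  The function
name find_simple_lists and the regular expression are hypothetical, not code
from the original post, and the sketch assumes single-word items separated by
", " with a final "and" or "or".

```python
import re

# Hypothetical baseline: spans shaped like "a, b, and c" or "a, b or c".
# Assumes each item is a single word and at least one comma is present.
SIMPLE_LIST = re.compile(r"\b(\w+(?:, \w+)+),? (?:and|or) (\w+)")


def find_simple_lists(text):
    """Return one list of items per comma-and-coordinator span found."""
    results = []
    for match in SIMPLE_LIST.finditer(text):
        # Items before the coordinator, plus the item after it.
        items = match.group(1).split(", ") + [match.group(2)]
        results.append(items)
    return results


print(find_simple_lists("We sell apples, pears, and plums."))
# [['apples', 'pears', 'plums']]
```

As the post suggests, this breaks down quickly: it misses multi-word items,
two-item lists such as "salt and pepper", bulleted or numbered lists, and
lists interrupted by extraneous text.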