[Corpora-List] sentence detector and phrase chunker returning absolute positions in text

Wiebke Wagner wagner at lifebiosystems.com
Mon Jul 19 05:58:12 UTC 2010


Dear all,

I am looking for a tool that performs sentence detection, part-of-speech
tagging and phrase chunking. My problem is that most of these tools
return annotated text. What I need, however, are the absolute character
positions in the text of the sentence boundaries and of the chunks. For
example, consider the following sentences:

"This is a sentence. And here is another one."

I would need the information that the 19th and the 44th characters
in the text are sentence boundaries. For the chunks, the
position and the length of each chunk would be ideal.
I have checked OpenNLP, Gate, LingPipe and MontyLingua but did not find
any information about such an output (at least not for sentences AND
chunks).
Is anyone aware of such a tool? 
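
To make the desired output concrete, here is a minimal sketch in Python.
It is only an illustration and is not taken from any of the tools above:
the function name sentence_spans and the naive rule that a sentence ends
at '.', '!' or '?' are assumptions, and it does no tagging or chunking.

```python
import re

def sentence_spans(text):
    """Return (start, end) character offsets for each sentence.

    Offsets are 0-based with an exclusive end, so `end` is also the
    1-based position of the sentence-final character.
    Assumption: a sentence ends at the first '.', '!' or '?'.
    This naive rule breaks on abbreviations, decimals, etc.
    """
    pattern = re.compile(r"[^.!?\s][^.!?]*[.!?]")
    return [(m.start(), m.end()) for m in pattern.finditer(text)]

text = "This is a sentence. And here is another one."
spans = sentence_spans(text)

# 1-based positions of the sentence boundaries
boundaries = [end for _, end in spans]

# (position, length) pairs, the format that would also suit chunks
pos_len = [(start, end - start) for start, end in spans]

print(boundaries)
print(pos_len)
```

For the example text this gives the boundaries 19 and 44. What I am
looking for is the same kind of offset output from a real sentence
detector and chunker rather than from a hand-written rule.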

Best,
Wiebke Wagner



_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


