[Corpora-List] Sentence boundary detection

Eric Garbin egarbin at thetdgroup.com
Mon Aug 27 21:07:25 UTC 2007


I'm curious about the errors that can occur in sentence boundary
detection in Arabic and in Mandarin. How do these errors vary with
training and testing on particular corpora (LDC, Web-derived, or other)?
Pointers to relevant reading would be appreciated, as would any other
suggestions from anyone who has trained a detector on those languages.

Thank you,

Eric Garbin
Computational Linguist
The Technology Development Group
www.thetdgroup.com
571-262-2693

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora