[Corpora-List] English on-line sentence and word tokenization

WHITELOCK, Pete pete.whitelock at oup.com
Thu Apr 10 09:29:26 UTC 2014


I tried a search for "incremental tokenization" and found this:

http://www.english-linguistics.de/fr/teaching/ws09-10/i2cl/slides/lecture10.pdf

I think it's relevant - maybe you can find more detail in Frank Richter's papers.

Pete Whitelock, PhD
Principal Language Engineer, Technology
Academic Dictionaries
Oxford University Press
Gt. Clarendon St.
OX2 6DP
United Kingdom

From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Phil Gooch
Sent: 10 April 2014 10:15
To: Hugo Mougard
Cc: corpora at uib.no
Subject: Re: [Corpora-List] English on-line sentence and word tokenization

I think Clinithink does something along these lines, but it is a commercial product:

http://clinithink.com/

Phil

On Thu, Apr 10, 2014 at 9:59 AM, Hugo Mougard <mog at crydee.eu> wrote:
Dear all,

I'm looking for pointers to any work on on-line tokenization (especially at the sentence level, but the word level also interests me). By on-line I mean "while the text is being typed". My searches so far have turned up nothing useful, probably because "on-line" is mostly used in a different sense from the one above (e.g. being on the internet).
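
To make that definition concrete, here is a minimal sketch of what on-line sentence splitting could look like: text arrives in small chunks as it is typed, and a sentence is emitted only once its boundary can be confirmed. Everything in it is an assumption for illustration and not taken from any existing tool: the class name IncrementalSentenceSplitter, the feed/flush methods, and the naive boundary rule (terminal punctuation, then whitespace, then an uppercase ASCII letter).

```python
import re

# Assumed boundary rule: ., ! or ? followed by whitespace and then an
# uppercase ASCII letter. The lookahead means a boundary is only confirmed
# once the first character of the next sentence has been typed.
_BOUNDARY = re.compile(r"[.!?]+\s+(?=[A-Z])")


class IncrementalSentenceSplitter:
    """Hypothetical on-line splitter: feed it text as it is typed."""

    def __init__(self):
        # Text received so far that is not yet part of an emitted sentence.
        self._buffer = ""

    def feed(self, chunk):
        """Add newly typed text; return the sentences that are now complete."""
        self._buffer += chunk
        sentences = []
        start = 0
        for match in _BOUNDARY.finditer(self._buffer):
            sentences.append(self._buffer[start:match.end()].strip())
            start = match.end()
        # Keep only the unfinished tail for the next call.
        self._buffer = self._buffer[start:]
        return sentences

    def flush(self):
        """Call at end of input; return whatever is left as a final sentence."""
        rest = self._buffer.strip()
        self._buffer = ""
        return [rest] if rest else []


if __name__ == "__main__":
    splitter = IncrementalSentenceSplitter()
    for ch in "Hello there. How are you? Fine.":  # simulate typing
        for sentence in splitter.feed(ch):
            print(sentence)
    for sentence in splitter.flush():
        print(sentence)
```

Typed one character at a time, the sample input yields "Hello there." when the "H" of "How" arrives and "How are you?" when the "F" of "Fine" arrives; "Fine." only comes out of flush(). A rule this simple will wrongly split after abbreviations such as "Dr. Smith", which is the sort of case a real on-line tokenizer would have to handle, possibly by revising earlier decisions.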

Best,
Hugo

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

