[Corpora-List] A Lemmatizer is Required

True Friend true.friend2004 at gmail.com
Tue Mar 4 03:35:46 UTC 2008


Thanks.
There is a script called Tokenize.pl. I used it to tokenize the text first
and then tagged the result.
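
For readers of the archive, here is a minimal Python sketch of the "one token per line" format that the basic TreeTagger program expects. It is not the Tokenize.pl script and not TreeTagger's own tokenizer. The regular expression and the function names (tokenize, to_one_per_line) are assumptions made purely for illustration, and real tokenizers also handle abbreviations, clitics and language-specific rules.

```python
import re

# Assumed, simplified token pattern (not TreeTagger's rules):
# a word, optionally with internal hyphens/apostrophes,
# or a single punctuation character.
TOKEN_RE = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")


def tokenize(text):
    """Split raw text into a list of word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def to_one_per_line(text):
    """Return the tokens joined by newlines, one token per line."""
    return "\n".join(tokenize(text))


if __name__ == "__main__":
    # Prints each token of the sentence on its own line.
    print(to_one_per_line("The cats weren't sleeping."))
```

The output of to_one_per_line could be written to a file and passed to the tagger in place of the raw text.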

On Tue, Mar 4, 2008 at 8:24 AM, Ciarán Ó Duibhín
<ciaran at oduibhin.freeserve.co.uk> wrote:

>  > I have used TreeTagger but it required tokenized words, i.e. each word
> on a new line
>
> This is true of the basic TreeTagger program, but the Windows distribution
> contains tools to tokenize the input.
>
> If you want to run TreeTagger from the Windows command line, use the
> supplied batch file, which calls a Perl script to tokenize the input.
> If you want to run TreeTagger from the Windows graphical interface, just
> tick the checkbox labelled "built-in tokenization".
>
> Ciarán Ó Duibhín.
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>


-- 
محمد شاکر عزیز