[Corpora-List] A Lemmatizer is Required
True Friend
true.friend2004 at gmail.com
Tue Mar 4 03:35:46 UTC 2008
Thanks.
There is a script called Tokenize.pl. I used it to tokenize the text first
and then tagged the result.
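
For anyone without the script to hand, here is a rough sketch of the kind of preprocessing involved: turning running text into one token per line, which is the input format the basic TreeTagger program expects. This is not the logic of Tokenize.pl or of the tokenizer shipped with TreeTagger. The regular expression and the function names tokenize and to_one_per_line are assumptions made up for illustration, and real tokenizers also handle abbreviations, clitics and language-specific cases.

```python
import re

# Assumed, simplified token pattern (not taken from Tokenize.pl):
# a run of word characters, optionally followed by an apostrophe part
# such as "'ve", or else a single non-space punctuation character.
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[^\w\s]")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


def to_one_per_line(text):
    """Return the tokens of text joined with newlines, one token per line."""
    return "\n".join(tokenize(text))
```

Calling to_one_per_line("I've used it, then tagged it.") gives the tokens I've, used, it, a comma, then, tagged, it and a full stop, each on its own line. That output can be written to a file and passed to the tagger.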
On Tue, Mar 4, 2008 at 8:24 AM, Ciarán Ó Duibhín <
ciaran at oduibhin.freeserve.co.uk> wrote:
> > I have used TreeTagger but it required tokenized words, i.e. each word
> > on a new line
>
> This is true of the basic TreeTagger program, but the Windows distribution
> contains tools to tokenize the input.
>
> If you want to run TreeTagger from the Windows command-line, do so using
> the supplied batch file, which calls a perl script to tokenize the input.
> If you want to run TreeTagger from the Windows graphic interface, just tick
> the checkbox labelled "built-in tokenization".
>
> Ciarán Ó Duibhín.
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
--
محمد شاکر عزیز