[Corpora-List] spanish tokenizer
Jorge Civera Saiz
jorcisai at iti.upv.es
Mon Oct 16 14:07:35 UTC 2006
Hi Maria,
Take a look at FreeLing:
FreeLing 1.5: An Open Source Suite of Language Analyzers
Here you can find information about FreeLing, an open source language analysis
tool suite, released under the GNU Lesser General Public License (LGPL) of the
Free Software Foundation.
These tools have been developed at the TALP Research Center of the Universitat
Politècnica de Catalunya. The Spanish and Catalan morphological dictionaries
and grammars were initially developed by the Centre de Llenguatge i Computació
at the Universitat de Barcelona, and have since been improved and extended to
other languages thanks to many contributions.
www: http://garraf.epsevg.upc.es/freeling/
Best regards,
Jorge
Quoting Maria Esteva <mesteva at mail.utexas.edu>:
> Dear all,
>
> I am a PhD student in the School of Information, University of Texas
> at Austin. For my dissertation, I will text mine a large set of
> corporate electronic records in Spanish. For this, I need to find an
> open source Spanish tokenizer, if possible in C++, although other
> languages would be fine as well. I am familiar with the Lucene tool
> set, so if you know of another source where I can find such a tool, I
> would appreciate your help.
>
> Thanks in advance,
>
> Maria Esteva
>
>