[Corpora-List] Unitex 2.1

Sébastien Paumier sebastien.paumier at univ-mlv.fr
Thu Feb 10 10:06:46 UTC 2011


Dear colleagues,
We are pleased to announce the release of Unitex 2.1 at:

http://igm.univ-mlv.fr/~unitex

Unitex is an open source corpus processor that uses linguistic resources
such as electronic dictionaries and local grammars. It runs on Windows,
Mac OS and Linux. At the URL above, you will find the software, the
up-to-date user manual and a brief overview of the history of Unitex.

Major improvements since 2.0:
- introduction of LocateTfst that performs locate operations on the text automaton
- introduction of a statistical tagger that can trim text automata to make them 
linear
- introduction of the transducer cascade system CasSys
- introduction of output variables that can be used to catch outputs emitted by 
grammars
- introduction of operators to test and compare variables
- advanced search options
- support for Semitic inflection, including Arabic typographical variations
- logging of error messages emitted by programs
- Unitex is now released entirely under the LGPL
- Unitex can be compiled as a dynamic library (.dll or .so)
- the whole code is thread-safe
- introduction of UnitexToolLogger, which can be used to create and replay
logs of Unitex program executions

Best regards,
Sébastien Paumier


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora
