[Corpora-List] NooJ v2.0 release

Max Silberztein max.silberztein at univ-fcomte.fr
Thu Dec 20 15:48:55 UTC 2007


Dear colleagues,

We are pleased to announce the release of NooJ v2.0. NooJ is a linguistic
engineering development platform that allows linguists and NLP developers to
formalize various levels of linguistic phenomena and to build various NLP
applications. See http://www.nooj4nlp.net to freely download the software,
its manual and its linguistic resources.

Besides a number of enhancements to the interface (syntax coloring,
linguistic resource management, etc.) and to its linguistic resources, v2.0
contains:

-- A new corpus processor that applies a typical NooJ linguistic query to a
corpus made of 10,000+ texts in a few minutes.

-- A more robust dictionary compiler. For instance, it compiles the
Hungarian dictionary, which describes the equivalent of a list of 120+
million word forms, in a few hours (compiling the English dictionary takes a
few minutes).

-- A new linguistic engine that better integrates the morphological and
syntactic levels of analysis via new operations on variables. Its most
visible enhancements are:

-- its new constraints, which perform unification checks on lexical
properties (e.g. <$DET$Nb=$N$Nb> to make sure a noun agrees in number with a
determiner; "<N> $N$Vsup" to look for a noun followed by its support verb)

-- its transformation engine, which performs local machine translation;
partial translations can be accumulated

-- noojapply can parse texts in which text units are delimited with XML tags
(such as <p> or <s>).
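
To give a rough idea of what a number-agreement check such as
<$DET$Nb=$N$Nb> amounts to, here is a minimal Python sketch. It is only an
illustration and not NooJ's engine, grammar format or API: the toy lexicon,
the property values ("s", "p") and the function names are all hypothetical,
and the sketch assumes that a property left unspecified in the lexicon is
compatible with any value.

```python
# Toy illustration only: this is NOT NooJ's engine, grammar format or API.
# The lexicon, the property values and the function names are hypothetical.

LEXICON = {
    "this":  {"cat": "DET", "Nb": "s"},
    "these": {"cat": "DET", "Nb": "p"},
    "the":   {"cat": "DET"},             # number left unspecified
    "cat":   {"cat": "N", "Nb": "s"},
    "cats":  {"cat": "N", "Nb": "p"},
}


def agree(a, b, prop="Nb"):
    """Return True if the two lexical entries do not clash on `prop`.

    Assumption of this sketch: an unspecified property is compatible
    with any value.
    """
    va, vb = a.get(prop), b.get(prop)
    return va is None or vb is None or va == vb


def det_noun_pairs(words):
    """Yield adjacent (determiner, noun) pairs that agree in number."""
    for left, right in zip(words, words[1:]):
        d, n = LEXICON.get(left), LEXICON.get(right)
        if d and n and d["cat"] == "DET" and n["cat"] == "N" and agree(d, n):
            yield left, right


if __name__ == "__main__":
    print(list(det_noun_pairs("these cats saw this cat".split())))
```

With this toy lexicon, "these cats" and "the cats" pass the check, whereas
"this cats" is rejected because the two number values clash.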

Enjoy,
--Max Silberztein


