[Corpora-List] New Releases of SMULTRON and the TreeAligner
Torsten Marek
marek at ifi.uzh.ch
Tue Jun 2 09:34:21 UTC 2009
Dear all,
The Parallel Treebank Group at the Institute of Computational
Linguistics at the University of Zürich is proud to announce new
releases of SMULTRON, an aligned parallel treebank,
and the TreeAligner, a tool for annotating, browsing and querying
parallel treebanks.
SMULTRON v1.1
=============
SMULTRON (Stockholm MULtilingual TReebank) is a parallel treebank which
contains around 1000 sentences in English, German and Swedish. The
sentences have been PoS-tagged and annotated with phrase structure
trees. The trees have been aligned across languages at the sentence,
phrase and word level. Additionally, the German and Swedish monolingual
treebanks contain lemma information. The SMULTRON corpus is freely
available for research purposes; please see the registration page[0].
New in version 1.1:
* new German-Swedish alignments
* various annotation errors fixed
* compatibility updates for the new TreeAligner
TreeAligner v1.1
================
The TreeAligner is a graphical tool for creating aligned parallel
treebanks by drawing alignment links between phrases. The monolingual
treebanks must currently be encoded in TIGER-XML.
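
For readers unfamiliar with TIGER-XML, the following is a minimal sketch
of how such a file could be read in Python with the standard library. The
sample sentence, its IDs, tags and edge labels are made up for
illustration and are not taken from SMULTRON; the function name
read_sentences is likewise hypothetical and not part of the TreeAligner.
Real files typically also carry a <head> section with feature
declarations, which is omitted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-sentence fragment in the general TIGER-XML layout.
# All IDs, words, tags and edge labels are invented for illustration.
SAMPLE = """\
<corpus id="demo">
  <body>
    <s id="s1">
      <graph root="s1_501">
        <terminals>
          <t id="s1_1" word="The" pos="DT"/>
          <t id="s1_2" word="garden" pos="NN"/>
          <t id="s1_3" word="blooms" pos="VBZ"/>
        </terminals>
        <nonterminals>
          <nt id="s1_500" cat="NP">
            <edge label="--" idref="s1_1"/>
            <edge label="HD" idref="s1_2"/>
          </nt>
          <nt id="s1_501" cat="S">
            <edge label="SBJ" idref="s1_500"/>
            <edge label="HD" idref="s1_3"/>
          </nt>
        </nonterminals>
      </graph>
    </s>
  </body>
</corpus>
"""


def read_sentences(xml_text):
    """Return one dict per <s> element with its root, tokens and phrases."""
    corpus = ET.fromstring(xml_text)
    sentences = []
    for s in corpus.iter("s"):
        graph = s.find("graph")
        # Terminals are the tokens: (node id, word form, PoS tag).
        tokens = [
            (t.get("id"), t.get("word"), t.get("pos"))
            for t in graph.iter("t")
        ]
        # Nonterminals are the phrases: node id -> (category, child ids).
        phrases = {
            nt.get("id"): (
                nt.get("cat"),
                [edge.get("idref") for edge in nt.findall("edge")],
            )
            for nt in graph.iter("nt")
        }
        sentences.append(
            {
                "id": s.get("id"),
                "root": graph.get("root"),
                "tokens": tokens,
                "phrases": phrases,
            }
        )
    return sentences


if __name__ == "__main__":
    for sentence in read_sentences(SAMPLE):
        print(sentence["id"], [word for _, word, _ in sentence["tokens"]])
        for node_id, (cat, children) in sentence["phrases"].items():
            print(" ", node_id, cat, children)
```

The node IDs collected this way (for example s1_500 for the NP above) are
the kind of identifiers that alignment links between two treebanks would
refer to.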
The TreeAligner also allows querying the aligned treebanks, using an
extended version of the TIGER corpus query language.
Easy installers for the TreeAligner are available for Windows and
Ubuntu[1]. The source code is available as a package or from public
repositories[2].
All code is licensed under the GPLv2.
New in version 1.1:
* improved annotation workflow & tree interactivity
* corpus information display (feature values etc.)
* automatic alignment suggestions (experimental feature)
* monolingual queries for parallel treebanks
* much faster query evaluation engine
* query language extensions for restricted universal quantification[3]
* improved tree layout algorithm
* sampler from the SMULTRON corpus included in the installers
If you are interested, please join us on the TreeAligner mailing
list[4]!
With best regards,
Torsten
[0] http://www.cl.uzh.ch/kitt/smultron/
[1] http://www.cl.uzh.ch/kitt/treealigner/wiki/TreeAlignerDownload
[2] http://www.cl.uzh.ch/kitt/hg/sta/branch-1.1/
[3] http://www.ifi.uzh.ch/arvo/cl/volk/papers/Marek_Lundborg_Volk__KONVENS_2008.pdf
[4] https://lists.ifi.uzh.ch/listinfo/treealigner
--
.: Torsten Marek
.: University of Zurich
.: Institute of Computational Linguistics
.: http://www.cl.uzh.ch/en/tmarek.html