[Corpora-List] BulTreeBank data release: Dependency Part and Morphologically Annotated Part

Kiril Simov kivs at bultreebank.org
Mon Oct 23 14:34:38 UTC 2006


Dear Colleagues,

I would like to inform you that we have released the dependency part and the
morphologically annotated part of our treebank.

The dependency part of the treebank contains more than 196,000 tokens
(13,200 sentences).
It was used for the CoNLL-X shared task this year
(http://nextens.uvt.nl/~conll/).

The morphologically annotated part of the treebank contains more than
214,000 tokens (15,000 sentences).
It was used to train the TreeTagger for Bulgarian
(http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/).

Both datasets are available from our web page:

http://www.bultreebank.org/Resources.html

With best regards,

Kiril Simov

-----------------------------------------------------------------
Kiril Simov
BulTreeBank Project
Linguistic Modelling Laboratory, IPP,
Bulgarian Academy of Sciences
Acad. G.Bonchev St. 25A
1113 Sofia, Bulgaria
E-mail: kivs at bultreebank.org
Web: http://www.bultreebank.org/
-----------------------------------------------------------------


