[Corpora-List] Croatian National Corpus reached 100 million tokens

Marko Tadic mtadic at ffzg.hr
Sat Feb 4 07:50:05 UTC 2006


Dear colleagues,
it is our privilege to inform you that the Croatian National Corpus (HNK v
2.0) reached 100 million tokens (101.3 million, in fact) at the end of
December 2005.
The corpus is freely and publicly accessible using the free Bonito client
program (http://www.textforge.cz/download).
All necessary details are available at the HNK web site:
http://www.hnk.ffzg.hr.
The new version, HNK v 2.5 (scheduled for spring 2006), will also support
search by lemma and MSD (morphosyntactic description). Meanwhile, only a
small test subcorpus (cw2000) offers this kind of search.
Anyway, I hope that you will find HNK useful as a source of primary
linguistic data for Croatian.
Any comments, suggestions, or criticism are more than welcome.
All the best
Marko Tadic
-----------------------------------------------------------------------
Marko Tadic, Associate Professor
Head of the Department of Linguistics
Faculty of Philosophy, University of Zagreb
Ivana Lucica 3, HR-10000 Zagreb, Croatia
tel. +385 1 6120-142, 6120-045
fax. +385 1 6156-879
personal homepage: www.hnk.ffzg.hr/mt/

*** Visit the pages of Croatian National Corpus: www.hnk.ffzg.hr ***
*** Visit the Croatian Morphological Lexicon: hml.ffzg.hr ***
*** Visit the Croatian Language Technologies portal: www.hnk.ffzg.hr/jthj/ ***
*** Visit the pages of Croatian Language Technologies Society: www.hdjt.hr ***


