[Corpora-List] Re: Looking for super large Russian corpus

Sergey Protasov svp at zuzino.net.ru
Thu Nov 4 06:03:22 UTC 2004


Dear all,

Thank you very much for your links!

I found exactly what I was looking for.

Special thanks to
Philip Resnik (USA) for a 2 GB URL list of Russian sites at
http://umiacs.umd.edu/~resnik/strand/,
Jonathan Young (USA) for good collection at http://www.wordtheque.com,
Roman Yangarber (USA) for the link to the Moshkov library (more than 3 GB) at
http://lib.ru,
Viktor Zaharov (Russia) for the link to the special search engine for the
Moshkov library at http://www.aot.ru/search1.html,
Stefan Bordag for the idea of buying some special CDs in Russia (this
method works!),

Vladimir Rykov (Russia, whose postgraduate student I am) for posting my
question to the Corpora List.


Now I am trying to extract sentences from HTML and plain-text files.

I need a large text split into sentences to train my Russian Link Grammar,
http://sz.ru/parser/.

I will report my results later.
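
For illustration only, here is a minimal sketch of the sentence-extraction
step, assuming Python and nothing beyond its standard library. It is not the
method actually used for this work. The names TextExtractor, html_to_text and
split_sentences are hypothetical, and the boundary rule (end punctuation, then
whitespace, then an uppercase Cyrillic or Latin letter) is a naive heuristic
that will mis-split abbreviations and initials.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping what is inside <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Simplification: joining with a space can split a word that is
    # broken up by inline tags.
    return " ".join(parser.parts)


# Assumed boundary rule: ., !, ? or the ellipsis character, followed by
# whitespace and an uppercase Cyrillic or Latin letter.
BOUNDARY = re.compile(r"(?<=[.!?…])\s+(?=[A-ZА-ЯЁ])")


def split_sentences(text):
    """Normalize whitespace, then split on the naive boundary rule."""
    text = re.sub(r"\s+", " ", text).strip()
    return [s for s in BOUNDARY.split(text) if s]


if __name__ == "__main__":
    page = "<p>Привет, мир! Это тест.</p><script>var x = 1;</script>"
    for sentence in split_sentences(html_to_text(page)):
        print(sentence)
```

Plain-text files can be passed to split_sentences directly. For real training
data the boundary rule would need an abbreviation list at the very least.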


--
Sergey Protasov
PhD student in Computational Linguistics,
Moscow Institute of Physics and Technology


