Corpora: massive text corporisation

Rykov V.V. rykov at narod.ru
Fri Jun 1 14:04:50 UTC 2001


Hello!

Maybe somebody remembers that I mentioned before that there is an enormous collection of Russian texts here, collected by Sergey Lesnikov at the Komi Republic University.

It contains thousands of texts, about 4 GB in total.

He now thinks that his task is to begin converting them into a corpus or corpora. I think that a corpus is something totally different, a different kind of unit. Maybe I am wrong.

Maybe there are people who would be kind enough to find the time to give him some good advice?

I am not sure I am a guru or No. 1 in the philosophy of corpus linguistics.


-- 
Vladimir Rykov, PhD in Comp Linguistics, Linguistic Institute RAS, MOSCOW


