[Corpora-List] Concordancing Russian text

Kiril Simov kivs at bultreebank.org
Tue Jun 24 14:28:26 UTC 2003


Dear Yvonne Breyer,

For standalone concordancing you can use the CLaRK system.
It is an XML-based system for corpus development that supports
searching XML-annotated corpora as well as plain text files with
simple XML mark-up. Search queries can be formulated as regular
expressions.
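
To illustrate what a regular-expression concordance search amounts to, here is a minimal sketch in Python. It is not CLaRK's own query syntax; the function name `kwic`, the context width, and the sample sentence are assumptions made purely for illustration.

```python
import re

def kwic(text, pattern, width=20):
    """Return one keyword-in-context line per regex match in `text`.

    `width` is the number of characters of context kept on each side
    (an illustrative default, not a CLaRK setting).
    """
    lines = []
    # For str patterns Python matches Unicode by default, so \w and
    # IGNORECASE also work for Cyrillic letters.
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

# Hypothetical sample text; any Unicode string would do.
sample = "Мама мыла раму. Потом мама читала книгу."

for line in kwic(sample, r"\bмам\w*"):
    print(line)
```

Each hit is printed on its own line with the matched word in brackets, which is roughly the display a classroom concordancer gives.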

The system is based on Unicode, but it can also read several
Cyrillic encodings, including KOI8.
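
If a tool expects Unicode input, a legacy Cyrillic file can also be converted beforehand. The following is only a sketch using Python's standard codecs. It assumes the source file is KOI8-R (the Russian variant of KOI8); the function names and file paths are hypothetical, and this is not how CLaRK itself reads such files.

```python
from pathlib import Path

def transcode(raw: bytes, src_encoding: str = "koi8_r") -> bytes:
    """Decode bytes from a legacy Cyrillic encoding and re-encode as UTF-8.

    `src_encoding` is an assumption about the input; use "cp1251" for
    Windows-1251 files instead.
    """
    return raw.decode(src_encoding).encode("utf-8")

def convert_file(src: str, dst: str, src_encoding: str = "koi8_r") -> None:
    """Read `src` in the given encoding and write it to `dst` as UTF-8."""
    Path(dst).write_bytes(transcode(Path(src).read_bytes(), src_encoding))

# Round trip on an in-memory example, so no real files are needed.
text = "Привет, мир"
koi8_bytes = text.encode("koi8_r")
print(transcode(koi8_bytes).decode("utf-8"))
```

A call such as `convert_file("corpus_koi8.txt", "corpus_utf8.txt")` (hypothetical file names) would convert a whole file in the same way.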

You can download the system from our website:

http://www.bultreebank.org/clark/index.html

With best regards,

Kiril

-----------------------------------------------------------------
Kiril Simov
BulTreeBank Project
Linguistic Modelling Laboratory, CLPP,
Bulgarian Academy of Sciences
Acad. G.Bonchev St. 25A
1113 Sofia, Bulgaria
E-mail: kivs at bultreebank.org
Web: http://www.bultreebank.org/
-----------------------------------------------------------------
----- Original Message -----
From: "Webmaster CL" <webmaster at corpus-linguistics.de>
To: <CORPORA at UIB.NO>
Sent: Tuesday, June 24, 2003 3:56 PM
Subject: [Corpora-List] Concordancing Russian text


Dear members of the Corpora-Mailing List,



I have been trying to put together a small corpus in Russian (Cyrillic
encoding) for classroom concordancing, i.e. for use with concordancers
such as MonoConc and the like. I am aware that the Uppsala corpus is
available online at http://www.sfb441.uni-tuebingen.de/b1/korpora.html,
yet for presentation purposes at a workshop I was hoping to demonstrate
offline concordancing with a standalone program. Currently, my problem
lies not so much on the linguistic/theoretical side of things but rather
in the technical realisation of working with Cyrillic text. So far I have
not succeeded in putting together a *.txt file in Russian that MonoConcPro
will read, although I thought it was capable of doing so. Could somebody
point me in the right direction? Any help would be much appreciated.

Best regards,

Yvonne Breyer



----------------------------------------------

Webmaster @ http://www.corpus-linguistics.info

University of Essen, Germany


