[Corpora-List] Romanian language corpora

Vlad V. Gojol gojol at rnc.ro
Thu Dec 6 07:50:00 UTC 2007


   Dear Mr. Frumuselu, 

   We have a corpus of newspaper texts: 56 million words 
with diacritics (Unicode .txt format), parsed with 
GojolParser (two formats: dependency maps and trees) 
with an accuracy comparable to that of manual parsing: a 
tagging error rate of about 0.1%. We can also add its 
translation into English, sentence by sentence, produced with 
GojolTranslator (GoTra), which is intelligible enough (in any 
case better than the Systran translator's output for 
English-French, for instance). 
The corpus is not downloadable, but can be delivered on DVD/CD. 
   Regards, 
            Vlad Gojol 

---------------------------
LANGOS S.R.L.
Phone: +40 21 778 5315
Fax: +40 21 413 7420
E-mail: gojol at rnc.ro


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


