[Corpora-List] lexicographic tools for parallel/comparable corpora
mickel.gronroos at masterin.com
Tue Feb 6 18:03:27 UTC 2007
Joerg Tiedemann <tiedeman at let.rug.nl> wrote:
> I'm looking for information about tools for the lexicographic use of
> parallel and comparable corpora.
The Finnish translation technology company Masterin has a bilingual term extractor that builds a raw bilingual translation lexicon from translation memory databases (which are, naturally, a form of parallel corpus). (Shameless plug: The term extraction module will be available in the forthcoming Masterin 2007 translation tool.)
Masterin's solution is language-aware and supports English, Swedish and Finnish (any pair and direction). This enables the use of both rule-based and more traditional statistical approaches, which in turn leads to impressive results. The tool is currently being used to extract domain-specific translation lexica, and very efficiently, I might add.
I'll be glad to do a test run for you, should you have any parallel data in the languages covered by Masterin. (Maybe some English-Swedish stuff?) Please feel free to contact me directly.
Best regards,
Mickel Grönroos
--
Mickel Grönroos
Chief Language Officer, Masterin
Tekniikantie 14, FIN-02150 Espoo, Finland, www.masterin.com