[Corpora-List] filter parallel corpus

Saeed Farzi saeedfarzi at gmail.com
Thu Jan 16 15:43:25 UTC 2014


Dear all,

I am working on a translation task with a very large parallel corpus.
Because of the computational cost of training on such a large corpus,
I would like to filter it with respect to the test set (of course, the
evaluation must still be fair after filtering).

I am looking for a solution or a tool for filtering the sentence pairs of a
parallel corpus.

Note that I do not need to filter the phrase table. I know that the
filter_ moses tool already reduces the phrase table size.
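
To make the request concrete, here is a rough sketch of the kind of
sentence-level filtering I have in mind. It is only an illustration under
stated assumptions, not an existing tool: the function names, the n-gram
order and the overlap threshold are all hypothetical, and whitespace
tokenisation stands in for a real tokeniser. It keeps a training pair only
if enough of the n-grams on its source side also occur on the source side
of the test set.

```python
from typing import Iterable, List, Set, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], max_n: int) -> Set[NGram]:
    """Return the set of all n-grams of order 1..max_n in a token list."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def filter_parallel(
    pairs: Iterable[Tuple[str, str]],
    test_source: Iterable[str],
    max_n: int = 3,            # hypothetical default, tune as needed
    min_overlap: float = 0.5,  # hypothetical default, tune as needed
) -> List[Tuple[str, str]]:
    """Keep (source, target) pairs whose source side overlaps the test source.

    Only the source side of the test set is consulted; the reference
    translations are never read.
    """
    # Collect every n-gram that occurs in the test source sentences.
    test_ngrams: Set[NGram] = set()
    for sentence in test_source:
        test_ngrams |= ngrams(sentence.lower().split(), max_n)

    kept = []
    for src, tgt in pairs:
        grams = ngrams(src.lower().split(), max_n)
        if not grams:
            continue  # skip pairs with an empty source side
        # Fraction of this sentence's n-grams that also appear in the test source.
        overlap = len(grams & test_ngrams) / len(grams)
        if overlap >= min_overlap:
            kept.append((src, tgt))
    return kept
```

The sketch reads only the test source sentences, which is the assumption
I am relying on to keep the evaluation fair. Raising min_overlap would
shrink the filtered corpus further.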

Cheers,
-- 
           S.Farzi, Ph.D. Student
    Natural Language Processing Lab,
  School of Electrical and Computer Eng.,
               Tehran University
             Tel: +9821-6111-9719
