[Corpora-List] Cambridge SMT Toolkit - Open Source Release
Bill Byrne
bill.byrne at eng.cam.ac.uk
Fri Jun 6 17:30:02 UTC 2014
Statistical machine translation tools developed at Cambridge University are now available at http://ucam-smt.github.io/ .
This is an initial release, featuring:
- HiFST -- Hierarchical phrase-based statistical machine translation based on the Google OpenFst Toolkit (http://openfst.org)
- Direct production of translation lattices as Weighted Finite State Automata
- Efficient WFSA rescoring procedures
- OpenFst wrappers for direct inclusion of KenLM and ARPA language models as WFSAs
- Lattice Minimum Error Rate Training
- Lattice Minimum Bayes Risk decoding
- Recursive Transition Networks and Pushdown Automata
- Client/Server mode
- WFSA true-casing
- and much more
A tutorial (http://ucam-smt.github.io/tutorial) based on the Cambridge 2013 WMT Russian-English system is also included.
To get the toolkit, either download the archive or clone the repository:
- Download: https://github.com/ucam-smt/ucam-smt/archive/master.zip
- Clone: git clone https://github.com/ucam-smt/ucam-smt.git
--
Bill Byrne
University of Cambridge
http://mi.eng.cam.ac.uk/~wjb31