[Corpora-List] English-German Word Alignments

Hieu Hoang Hieu.Hoang at ed.ac.uk
Wed Feb 5 09:34:30 UTC 2014


You can get the Europarl de-en corpus together with its word alignment as part
of the pre-built models we release with Moses:
   http://www.statmt.org/moses/RELEASE-2.1/models/de-en/

You would probably want the cleaned, tokenized files
   corpus/europarl.clean.1.*
and the alignment file
   model/aligned.1.grow-diag-final-and
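
In case it helps, here is a minimal Python sketch of how the three files could
be read together. It assumes the usual Moses alignment format, where each line
holds space-separated "i-j" links with 0-based token positions (i on the source
side, j on the target side), and that line N of the alignment file corresponds
to line N of both corpus files. The function names are made up for
illustration, and which language counts as the source side is an assumption
you should check against a few sentences.

```python
from typing import Iterator, List, Tuple


def parse_alignment(line: str) -> List[Tuple[int, int]]:
    """Parse one line of 'i-j' links into (source_index, target_index) pairs."""
    links = []
    for link in line.split():
        src, tgt = link.split("-")
        links.append((int(src), int(tgt)))
    return links


def aligned_word_pairs(
    src_tokens: List[str], tgt_tokens: List[str], links: List[Tuple[int, int]]
) -> List[Tuple[str, str]]:
    """Turn index links into (source_word, target_word) pairs."""
    return [(src_tokens[i], tgt_tokens[j]) for i, j in links]


def read_aligned_corpus(
    src_path: str, tgt_path: str, align_path: str
) -> Iterator[List[Tuple[str, str]]]:
    """Yield the aligned word pairs for each sentence pair, line by line.

    Assumes plain-text, UTF-8, whitespace-tokenized files with one sentence
    per line (hypothetical layout; decompress first if the files are gzipped).
    """
    with open(src_path, encoding="utf-8") as src, \
         open(tgt_path, encoding="utf-8") as tgt, \
         open(align_path, encoding="utf-8") as aln:
        for s, t, a in zip(src, tgt, aln):
            yield aligned_word_pairs(s.split(), t.split(), parse_alignment(a))


# Toy example (invented sentence pair, not taken from Europarl):
example = aligned_word_pairs(
    "das ist ein Haus".split(),
    "this is a house".split(),
    parse_alignment("0-0 1-1 2-2 3-3"),
)
```

Unaligned words simply do not appear in any link, and one word can appear in
several links, so the pair list is not a one-to-one mapping.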


FYI, the word alignment was done with MGIZA.


On 5 February 2014 07:06, Sebastian Sulger <sebastian.sulger at uni-konstanz.de> wrote:

> Dear All,
>
> I would like to do some experiments using a parallel English-German
> corpus, ideally word or phrase aligned. I know about the Europarl corpora,
> and I was wondering whether there are word or phrase alignments for those
> available somewhere.
>
> I am not familiar with word alignment tools (GIZA++ and the like), so
> before I get into that, readily usable alignments would be preferred.
>
> I'm thankful for any pointers!
>
> Best,
> Sebastian
>
> --
> Sebastian Sulger
> FB Sprachwissenschaft
> Universität Konstanz
> http://ling.uni-konstanz.de/pages/home/sulger
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>


-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu