[Corpora-List] List Parallel Corpora with Chronological data
Paulo Malvar
paulomal at gmail.com
Tue Jul 15 17:33:36 UTC 2008
The Software Localization English-Galician Parallel Corpus was compiled
during summer 2007 and released under the GPL in January 2008. It consists
of translation units from open source software localization (Linux
distributions and applications) and currently contains 5,163,524
part-of-speech-tagged and lemmatized tokens (2,535,405 English tokens and
2,628,119 Galician tokens). It can be found at:
http://paulomalvar.dyndns.org:8080/Paulo_Malvar_personal_webpage/Resources.html
Best regards,
Paulo Malvar Fernández
--
Paulo Malvar Fernández
Research Assistant of the SDSU Computational Linguistics Lab
http://paulomalvar.homeunix.org