[Corpora-List] List Parallel Corpora with Chronological data

Paulo Malvar paulomal at gmail.com
Tue Aug 12 01:16:51 UTC 2008


The Software Localization English-Galician Parallel Corpus was compiled
during summer 2007 and released under the GPL in January 2008. It consists
of translation units from the localization of open-source software (Linux
distributions and applications) and currently contains 5,163,524
part-of-speech-tagged and lemmatized tokens (2,535,405 English tokens and
2,628,119 Galician tokens). It can be found at:
http://d108.dinaserver.com/hosting/paulomalvar.com/Paulo_Malvar_personal_webpage/Resources.html

Best regards,

Paulo Malvar Fernández


-- 
Paulo Malvar Fernández

M.A. in Computational Linguistics

http://d108.dinaserver.com/hosting/paulomalvar.com
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora