[Corpora-List] List Parallel Corpora with Chronological data
Paulo Malvar
paulomal at gmail.com
Tue Aug 12 01:16:51 UTC 2008
The Software Localization English-Galician Parallel Corpus was compiled
during the summer of 2007 and released under the GPL in January 2008. It is
composed of translation units from the localization of Open Source software
(Linux distributions and applications) and currently contains 5,163,524
part-of-speech tagged and lemmatized tokens (2,535,405 English tokens and
2,628,119 Galician tokens). It can be found at:
http://d108.dinaserver.com/hosting/paulomalvar.com/Paulo_Malvar_personal_webpage/Resources.html
Best regards,
Paulo Malvar Fernández
--
Paulo Malvar Fernández
M.A. in Computational Linguistics
http://d108.dinaserver.com/hosting/paulomalvar.com