[Corpora-List] Special-domain corpora

Carlos Rodriguez crodriguezp at gmail.com
Sat Mar 26 22:12:49 UTC 2005


Hi,

I was wondering if anyone could point me to domain corpora with the
following characteristics:

1.- Written texts (ASCII, XML, txt, PDF; they need not be tagged) from
specialized or technical domains.
2.- Open source, or reasonably priced, and downloadable for local
processing (web access only through proprietary interfaces won't cut it).
3.- If possible, with machine-readable or electronic lexicons or
dictionaries available for the domain represented by the corpora.

I am thinking about experimenting with techniques for lexical acquisition.

Thanks and best to all,


Carlos Rodríguez


