[Corpora-List] Spanish reference corpus

Mario Crespo Miguel mario.crespo at uca.es
Thu Feb 1 13:17:17 UTC 2007


Thank you very much for your help, but it would be more
convenient for me if the word frequencies of this open-domain /
general corpus could be obtained. Does anybody know whether
such information is available in some way? Best,

Mario



On 30 Jan 2007 at 16:10, Serge Sharoff <s.sharoff at leeds.ac.uk>
wrote:

> one answer is the Spanish Internet corpus with the interface from
> http://corpus.leeds.ac.uk/internet.html
> and the URL list 
> http://corpus.leeds.ac.uk/internet/final-url-es.gz
> 
> This is a random snapshot of the Spanish Internet of about 120 million
> words, see
> Sharoff, S (2006) Creating general-purpose corpora using automated
> search engine queries. In Marco Baroni and Silvia Bernardini, editors,
> WaCky! Working papers on the Web as Corpus. Gedit, Bologna.
> http://wackybook.sslmit.unibo.it/
> 
> S
> 
> On Tue, 2007-01-30 at 15:54 +0100, Mario Crespo Miguel wrote:
>> Dear everybody,
>> 
>> Thank you again for all the help I always get from this
>> mailing list. This time I would like to ask whether there is
>> a reference / open-domain corpus for Spanish that is freely
>> available and can be downloaded. Thank you in advance. Best
>> wishes,
>> 
>> Mario Crespo Miguel
>> 
>> 
> 
> 


