[Corpora-List] General Italian wordlist

Emiliano Guevara emiliano.guevara at unibo.it
Wed Nov 14 23:07:52 UTC 2007


Dear Jane,

unfortunately, there is still no "general" corpus of Italian
comparable to the BNC that is both freely available and freely
manipulable (I suppose what you mean is a reference corpus: balanced
according to genre and medium, large enough to be representative of
the whole language, etc.).

I guess the best you can get is wordlists generated either from web
corpora or from large unbalanced corpora such as the "La Repubblica"
corpus (see
http://dev.sslmit.unibo.it/corpora/corpus.php?path=&name=Repubblica).

The good news is: you can get all of this right at Bologna University!

I'll be happy to help you with any of these alternatives, and
possibly also to find a better way to do the keyword search than
what WSTools has to offer (once you start working with several
million words, WSTools just chokes...).
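
As a rough sketch of what a keyword search outside WSTools could look
like: the snippet below scores each word of a study corpus against a
reference wordlist using the log-likelihood keyness statistic, a common
choice for keyword lists. It assumes both wordlists are already loaded
as word-to-frequency mappings; the names log_likelihood, keywords and
min_freq are only illustrative, and this is not meant to reproduce
exactly what WSTools computes.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness of one word.

    a: frequency in the study corpus, b: frequency in the reference corpus,
    c: total tokens in the study corpus, d: total tokens in the reference corpus.
    """
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def keywords(study, reference, min_freq=2):
    """Return (word, score) pairs for positive keywords, highest score first."""
    c = sum(study.values())
    d = sum(reference.values())
    scored = []
    for word, a in study.items():
        if a < min_freq:
            continue
        b = reference.get(word, 0)
        # Keep only words that are relatively more frequent in the study corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy frequencies, invented purely for illustration.
    novels = Counter({"mare": 10, "il": 50})
    general = Counter({"mare": 1, "il": 500, "casa": 49})
    for word, score in keywords(novels, general):
        print(word, round(score, 2))
```

Since only two frequency tables are kept in memory, this approach
should cope with reference lists built from several million words.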

Cheers,

Emiliano



On 14 Nov 2007, at 16:12, jane.johnson at libero.it wrote:

> Similar to the BNC_World.lst for use with the Keyword tool of the  
> WordSmith suite, I am looking for a wordlist generated from a  
> general corpus of contemporary Italian to create a Keyword list  
> for a selection of Italian novels. Can anyone point me in the right  
> direction? thanks
> Jane Johnson
> University of Bologna
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora

****************************************
Emiliano R. Guevara
Facoltà di Lingue e Lett. Straniere
Dip. di Lingue e Lett. Straniere
Università di Bologna
Via Cartoleria 5 (40124) Bologna, Italia

Homepage: http://morbo.lingue.unibo.it/

E-mail:   emiliano.guevara at unibo.it
           emiguevara at gmail.com
****************************************

