[Corpora-List] General Italian wordlist
Emiliano Guevara
emiliano.guevara at unibo.it
Wed Nov 14 23:07:52 UTC 2007
Dear Jane,
Unfortunately, there is still no freely available, freely manipulable
"general" corpus of Italian comparable to the BNC (I suppose what you
mean is a reference corpus, balanced by genre and medium, large enough
to be representative of the whole language, etc.).
I guess the best you can get is wordlists generated either from web
corpora or from large unbalanced corpora such as the "La Repubblica"
corpus (see
http://dev.sslmit.unibo.it/corpora/corpus.php?path=&name=Repubblica).
The good news is: you can get all of this right at Bologna University!
I'll be happy to help you with any of these alternatives, and
possibly also to find a better way to do the keyword search than
what WSTools has to offer (once you start working with several
million words, WSTools just chokes...).
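
To give an idea of what a keyword search outside WSTools could look
like, here is a minimal sketch in Python. It assumes the two wordlists
(the novels and the reference corpus) have already been loaded as
word-to-frequency mappings; how to read them from a particular file
format is left out. The function names and the min_freq threshold are
hypothetical, and log-likelihood is used here as one common keyness
statistic, not necessarily the one WSTools applies.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.

    a = frequency in the study corpus, b = frequency in the reference
    corpus, c = size of the study corpus, d = size of the reference corpus.
    """
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study, reference, min_freq=3):
    """Return (word, score) pairs for positive keywords, highest score first."""
    c = sum(study.values())
    d = sum(reference.values())
    rows = []
    for word, a in study.items():
        if a < min_freq:
            continue  # skip words too rare in the study corpus
        b = reference.get(word, 0)
        # Keep only words relatively more frequent in the study corpus.
        if a / c > b / d:
            rows.append((word, log_likelihood(a, b, c, d)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    # Toy frequencies, invented purely for illustration.
    novels = Counter({"il": 100, "mare": 50})
    general = Counter({"il": 1000, "mare": 5, "casa": 495})
    for word, score in keywords(novels, general):
        print(word, round(score, 2))
```

Since this only does arithmetic over two frequency tables, the size of
the underlying corpora is not a problem once the wordlists themselves
have been built.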
Cheers,
Emiliano
On 14 Nov 2007, at 16:12, jane.johnson@libero.it wrote:
> Similar to the BNC_World.lst for use with the Keyword tool of the
> WordSmith suite, I am looking for a wordlist generated from a
> general corpus of contemporary Italian to create a Keyword list
> for a selection of Italian novels. Can anyone point me in the right
> direction? thanks
> Jane Johnson
> University of Bologna
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
****************************************
Emiliano R. Guevara
Facoltà di Lingue e Lett. Straniere
Dip. di Lingue e Lett. Straniere
Università di Bologna
Via Cartoleria 5 (40124) Bologna, Italia
Homepage: http://morbo.lingue.unibo.it/
E-mail: emiliano.guevara at unibo.it
emiguevara at gmail.com
****************************************