[Corpora-List] Looking for genre-specific corpora

Marina Santini marinamailinglists at gmail.com
Tue Apr 26 12:06:11 UTC 2011


Dear CorporaList members,

I am doing some research on concept extraction from different types of
texts or genres.

I am looking for free research corpora belonging to the following genres:

1) FAQs (I have already downloaded some small collections, but I would
like to have a more comprehensive range of topics).
2) Chat log transcripts (I have already downloaded the NPS
Collection, 3 Codiac datasets and several smallish Many Eyes datasets)
3) Telephone conversation transcripts (missing)
4) Emails (I have already downloaded the Enron dataset and a couple
of junk mail collections)
5) Twitter post corpora (missing; apparently the Edinburgh
Twitter corpus is no longer available)
6) Corporate weblog corpora (missing)

I will be glad to share all the links and related documentation once
I have collected all the genres on the list.

Thanks in advance for your suggestions.

Best Regards

-- 
Marina Santini
Researcher at Artificial Solutions

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


