[Corpora-List] Searching for an email corpus - SUMMARY
Ute Römer
ute.roemer at engsem.uni-hannover.de
Wed Apr 11 18:39:56 UTC 2007
Dear All,
Here is a quick summary of the messages I got in response to my recent query
on email corpora. I'd like to thank the following list members for helpful
pointers:
Stefan Bordag
Chris Jordan
Sabine Bartsch
Ramesh Krishnamurthy
Stefan Bordag mentioned the (huge) USENET corpus, which contains not emails
but texts of a similar type (from an internet discussion forum):
http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html
Chris Jordan suggested the SpamAssassin Corpus
(http://spamassassin.apache.org/).
Sabine Bartsch and Ramesh Krishnamurthy sent me a link to the Wolverhampton
junk email corpus (http://clg.wlv.ac.uk/projects/junk-email/); Sabine also
mentioned the email messages corpus from W3C lists
(http://tides.umiacs.umd.edu/webtrec/trecent/parsed_w3c_corpus.html).
I have now got plenty of corpus material to keep my 'Analysing Texts'
students busy... Thanks!
Very best wishes... Ute
************************************************************
Dr. Ute Römer
English Department
Leibniz University of Hanover
Königsworther Platz 1
30167 Hannover
Germany
Phone: +49 (0)511 762 2997
Fax: +49 (0)511 762 2996
Please note NEW e-mail address: ute.roemer at engsem.uni-hannover.de
http://www.uteroemer.com
http://www.engsem.uni-hannover.de/angli/