22.1852, Qs: Genre-Specific Corpora
linguist at LINGUISTLIST.ORG
Tue Apr 26 16:50:14 UTC 2011
LINGUIST List: Vol-22-1852. Tue Apr 26 2011. ISSN: 1068 - 4875.
Subject: 22.1852, Qs: Genre-Specific Corpora
Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Veronika Drake, U of Wisconsin-Madison
Monica Macaulay, U of Wisconsin-Madison
Rajiv Rao, U of Wisconsin-Madison
Joseph Salmons, U of Wisconsin-Madison
Anja Wanner, U of Wisconsin-Madison
<reviews at linguistlist.org>
Homepage: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.
Editor for this issue: Danielle St. Jean <danielle at linguistlist.org>
================================================================
We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.
In addition to posting a summary, we'd like to remind people that it
is usually a good idea to personally thank those individuals who have
taken the trouble to respond to the query.
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.cfm.
===========================Directory==============================
1)
Date: 26-Apr-2011
From: Marina Santini [MarinaSantini.MS at gmail.com]
Subject: Genre-Specific Corpora
-------------------------Message 1 ----------------------------------
Date: Tue, 26 Apr 2011 12:48:12
From: Marina Santini [MarinaSantini.MS at gmail.com]
Subject: Genre-Specific Corpora
Hi,
I am doing some research on concept extraction from different types of
texts or genres.
I am looking for free research corpora (in English and in any other
language) belonging to the following genres:
1) FAQs (I have already downloaded some small collections, but I
would like to have a more comprehensive range of topics).
2) Chat log transcripts (I have already downloaded the NPS
Collection, 3 Codiac datasets and several smallish Many Eyes
datasets)
3) Telephone conversation transcripts (missing)
4) Emails (I have already downloaded the Enron dataset and a couple
of junk mail collections)
5) Twitter post corpora (missing; apparently the Edinburgh Twitter
corpus is no longer available)
6) Corporate weblog corpora (missing)
I will be glad to share all the links and related documentation once I have
collected all the genres on the list.
Thanks in advance for your suggestions.
Best Regards,
Marina Santini
Researcher at Artificial Solutions
Linguistic Field(s): Computational Linguistics
Text/Corpus Linguistics