Date: Mon, 26 Feb 2018 17:17:21
From: Maite Taboada [mtaboada at sfu.ca]
Subject: Announcing the SFU Opinion and Comments Corpus

The Discourse Processing Lab at Simon Fraser University
(http://www.sfu.ca/discourse-lab) is pleased to announce the release of the
SFU Opinion and Comments Corpus.

The SFU Opinion and Comments Corpus (SOCC) is a corpus for the analysis of
online news comments. Our corpus contains comments and the articles from which
the comments originated. The articles are all opinion articles, not hard news
articles. The corpus is larger than any other currently available comments
corpus and has been collected with attention to preserving reply structures
and other metadata. In addition to the raw corpus, we also present annotations
for four different phenomena: constructiveness, toxicity, negation and its
scope, and appraisal.

Full details and a download link are available from our GitHub project page:

For more information about this work, please see our papers.

Kolhatkar, V., H. Wu, L. Cavasso, E. Francis, K. Shukla and M. Taboada (2018)
The SFU Opinion and Comments Corpus: A corpus for the analysis of online news
comments. Journal paper under review.

Kolhatkar, V. and M. Taboada (2017) Using New York Times Picks to identify
constructive comments. Proceedings of the Workshop Natural Language Processing
Meets Journalism, Conference on Empirical Methods in Natural Language
Processing. Copenhagen. September 2017.

Kolhatkar, V. and M. Taboada (2017) Constructive language in news comments.
Proceedings of the 1st Abusive Language Online Workshop, 55th Annual Meeting
of the Association for Computational Linguistics. Vancouver. August 2017, pp.


Varada Kolhatkar (vkolhatk at sfu.ca)
Maite Taboada (mtaboada at sfu.ca)

Linguistic Field(s): Computational Linguistics
                     Discourse Analysis
                     Text/Corpus Linguistics

Subject Language(s): English (eng)



