[Corpora-List] Weblogs Corpus + 2nd CFP for the Int. Conference on Weblogs and Social Media (ICWSM)
Nicolas Nicolov
Nicolas at umbrialistens.com
Fri Sep 15 16:56:09 UTC 2006
=============================================
Int. Conference on Weblogs and Social Media
March 26-28, 2007
Boulder, Colorado, U.S.A.
www.icwsm.org
=============================================
Availability of Data
Continuing the tradition from the WWE'06
workshop, we are once again offering a large
blog dataset to conference participants. The
data release comprises a complete set of
weblog posts collected by Nielsen BuzzMetrics
for May 2006 (consisting of about 14M posts
from 3M weblogs). The data set includes the
full content of the posts plus mark-up and
represents an unprecedented collection for
blog researchers. Our hope is that a communal
dataset, approached from many different
directions, will yield many interesting
results. More information on the dataset,
which is available for immediate download,
can be found at:
http://www.icwsm.org/data.html
Call for Papers
Recent years have seen a flourishing of social
media - the promise of the WWW coming to fruition.
Across the world, individuals can share opinions,
experiences and expertise at the push of a button.
There has been a fundamental shift thanks to
significant advances in the ease of publishing
content. Creating web content was for years the
domain of tech-savvy people; now the barrier has
been torn down.
Perhaps the most visible among the successes of
social media in recent years is the blogosphere.
Tens of thousands of new blogs are created every day;
blog content is becoming ubiquitous, surfacing
in news portals, search results and corporate
public relations. Even those who are unaware of the
blogosphere are still influenced by its content.
Although blogs are currently highly visible, other
forms of conversational spaces continue to flourish,
especially message boards, mailing lists, review
sites and Usenet.
Social media covers all forms of sharing: from
photos, to videos, to recommendations. In the past
few years, many examples of social media have
become hugely successful. Flickr is a premier photo
sharing site; del.icio.us has become a touchstone
for sharing recommendations of websites; Web 2.0
applications in general abound with newcomers in
the social media space.
One of the fascinating aspects of social media
has been the drive from within to study the
ecology as it evolves. People act at once as
creators, observers and influencers of the space
in which they participate. At the same time,
businesses are quickly grasping the potential
benefit of attending to the new space of social
media. Monitoring the aggregate trends and
opinions revealed by social media provides
valuable insight for a number of business
applications, such as marketing intelligence and
competitive intelligence.
The fast-growing blogosphere and social media space
is a fruitful area for investigations across many
disciplines. For example:
* Natural language processing and machine learning
researchers study the extraction of factual
information from text; can blogs be processed in
a robust manner and can knowledge bases be
populated with facts from blogs?
* Social network researchers and graph theory
researchers are concerned with inferring
community structure; analyzing the linkage
patterns among blog entries can provide explicit
community structure; can we infer implicit
communities through the content of the blogs?
* Political scientists are looking at ways of
identifying influencers in a community; who are
the influential bloggers whose voice is echoed
by others?
* Multimedia researchers are attempting to
categorize audio and video content, aggregate
information from diverse sources (textual, audio,
video); can visual & audio social media be stored
in a way that allows search across different
modalities?
* Market analysis researchers are concerned with
what people think of the products and services
of a company; can we process blogs automatically
to find consumer complaints and breaking reports
about product vulnerabilities, and when does
a burst of blogging activity become a trend?
* Social psychologists study the response to
current events, including emotional and
attitudinal dimensions as well as content and
patterns of influence.
Despite the growing relevance of blogs and social
media, existing research has only begun to address
the spectrum of issues that arise in their analysis.
Blogs, for example, are a different kind of document
than the relatively clean text that NLP research is
based on. Such differences in terms of structure,
content and grammaticality will be a challenge,
considering that blogs will likely represent the most
common form of publicly accessible personal expression.
AREAS OF INTEREST
The conference aims to bring together researchers
from different subject areas (e.g., computer science,
linguistics, psychology, statistics, sociology,
multimedia and semantic web technologies) and foster
discussions about ongoing research in the following
areas:
[01] AI methods for ethnographic analysis through
social media.
[02] Blogosphere vs. mediasphere; measuring the
influence of blogs on the media.
[03] Centrality/influence of bloggers/blogs; ranking/
relevance of blogs; web page ranking based on
blogs.
[04] Crawling/spidering and indexing.
[05] Human Computer Interaction; social media tools;
navigation.
[06] Multimedia; audio/visual processing; aggregating
information from different modalities.
[07] Semantic analysis; cross-system and cross-media
name tracking; named relations and fact
extraction; discourse analysis; summarization.
[08] Semantic Web; unstructured knowledge management.
[09] Sentiment analysis; polarity/opinion
identification and extraction.
[10] Social Network Analysis; community
identification; expertise discovery;
collaborative filtering.
[11] Text categorization; gender/age identification;
spam filtering.
[12] Time Series Forecasting; measuring
predictability of phenomena based on social
media.
[13] Trend identification/tracking.
[14] Visualization, aggregation and filtering.
[15] New social media applications, interfaces,
interaction techniques.
IMPORTANT DATES
Submissions: December 8, 2006
Acceptance Notifications: February 2, 2007
Camera ready copies: February 16, 2007
Tutorials: March 25, 2007
Conference: March 26-28, 2007
SUBMISSION
People interested in participating should submit
through the conference website a technical paper
(up to 8 pages), a short paper (up to 4 pages), or
a poster or demo description (up to 2 pages)
by midnight (PST) on Dec 8, 2006. Each submission
should, to the extent possible, indicate a list of
relevant areas from the list above (e.g., 03, 04, 10).
CHAIRS
* Natalie Glance, Nielsen BuzzMetrics.
* Nicolas Nicolov, Umbria Inc.
CO-CHAIRS
* Eytan Adar, Univ. of Washington.
* Matthew Hurst, Nielsen BuzzMetrics.
* Mark Liberman, Univ. of Pennsylvania.
* Franco Salvetti, Univ. of Colorado at Boulder &
Umbria Inc.
LOCAL CHAIR
* James H. Martin, Univ. of Colorado at Boulder.
PROGRAM COMMITTEE
* Paolo Avesani, ITC-irst, Italy
* Bran Boguraev, IBM Research, USA
* Chris Brooks, Univ. of San Francisco, USA
* Claire Cardie, Cornell Univ., USA
* Scott Carter, UC Berkeley, USA
* Steve Cayzer, HP Labs Bristol, UK
* Thierry Declerck, DFKI Language Lab, Germany
* Donghui Feng, ISI, USC, USA
* Tim Finin, UMBC, USA
* Kathy Gill, Univ. of Washington, USA
* Michelle Gumbrecht, Stanford Univ., USA
* John Henderson, MITRE, USA
* Eduard Hovy, ISI, USC, USA
* Jussi Karlgren, SICS, Sweden
* Laura Knudsen, OSC, USA
* Moshe Koppel, Bar-Ilan Univ., Israel
* Cameron Marlow, Yahoo! Research, USA
* Lluis Marquez, Univ. Poli. de Catalunya, Spain
* Rada Mihalcea, Univ. of North Texas, USA
* Gilad Mishne, Univ. of Amsterdam, The Netherlands
* Tomoyuki Nanno, Google, Japan
* Apostol Natsev, IBM Research, USA
* Kamal Nigam, Google, USA
* Peter Norvig, Google, USA
* Jon Oberlander, Univ. of Edinburgh, Scotland
* Peter Pirolli, PARC, USA
* Oana Postolache, Univ. of Saarland, Germany
* John Prager, IBM Research, USA
* Alessandro Provetti, Univ. of Messina, Italy
* Drago Radev, Univ. of Michigan, USA
* Jonathon Read, Univ. of Sussex, UK
* Maarten de Rijke, Univ. of Amsterdam, The Netherlands
* Laura Ripamonti, Univ. of Milan, Italy
* Irina Rish, IBM Watson Research Center, USA
* Dan Roth, Univ. of Illinois at Urbana-Champaign, USA
* James G. Shanahan, Turn Inc., USA
* Emma Shen, OSC, USA
* Suresh Sood, Univ. of Tech. Sydney, Australia
* Savitha Srinivasan, IBM Research, USA
* Carlo Strapparava, ITC-irst, Italy
* V.S. Subrahmanian, Univ. of Maryland, USA
* Belle Tseng, NEC Labs America, USA
* Janyce M. Wiebe, Univ. of Pittsburgh, USA
* Tong Zhang, Yahoo! Research, USA
* Liang Zhou, ISI, USC, USA
* Ethan Zuckerman, Harvard Univ., USA
VENUE
The conference will take place at the Marriott Boulder
(http://marriott.com/property/propertypage/DENBO),
located near downtown Boulder, Colorado.
SPONSORS
ICWSM is proud to be supported by:
* Google, Inc.
* Microsoft Live Labs
* NEC Labs America
* Sphere
and
* Nielsen BuzzMetrics
* Umbria, Inc.
* University of Pennsylvania
* University of Maryland, Baltimore County
ICWSM is an IW3C2-endorsed conference
(http://www.iw3c2.org/).
HISTORY
The International Conference on Weblogs and Social
Media grew out of two events: the annual series of
Workshops on the Weblogging Ecosystem (WWE 2006,
WWE 2005, WWE 2004) held in conjunction with the
International World Wide Web Conference and the
Spring Symposium organized by the American
Association for Artificial Intelligence (AAAI)
on Computational Approaches to Analyzing Weblogs
(CAAW 2006).
CONTACT
info (at) icwsm dot org
Best wishes
Nicolas
---
Dr Nicolas Nicolov
Chief Scientist
Umbria Inc.
1655 Walnut St, Suite 300
Boulder, CO 80302, U.S.A.
Tel: (310) 754-5010