21.1439, Calls: Computational Ling, Text/Corpus Ling/Switzerland

linguist at LINGUISTLIST.ORG
Wed Mar 24 15:53:40 UTC 2010


LINGUIST List: Vol-21-1439. Wed Mar 24 2010. ISSN: 1068 - 4875.

Subject: 21.1439, Calls: Computational Ling, Text/Corpus Ling/Switzerland

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
 
Reviews: Monica Macaulay, U of Wisconsin-Madison  
Eric Raimy, U of Wisconsin-Madison  
Joseph Salmons, U of Wisconsin-Madison  
Anja Wanner, U of Wisconsin-Madison  
       <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Kate Wu <kate at linguistlist.org>
================================================================  

===========================Directory==============================  

1)
Date: 23-Mar-2010
From: Evgeniy Gabrilovich < gabr at yahoo-inc.com >
Subject: ACM SIGIR Conference
 

	
-------------------------Message 1 ---------------------------------- 
Date: Wed, 24 Mar 2010 11:50:44
From: Evgeniy Gabrilovich [gabr at yahoo-inc.com]
Subject: ACM SIGIR Conference

Full Title: ACM SIGIR Conference 
Short Title: SIGIR 

Date: 18-Jul-2010 - 23-Jul-2010
Location: Geneva, Switzerland 
Contact Person: SIGIR 2010 Announce
Meeting Email: announce at sigir2010.org
Web Site: http://www.sigir2010.org 

Linguistic Field(s): Computational Linguistics; Text/Corpus Linguistics 

Call Deadline: 30-May-2010 

Meeting Description:

Feature Generation and Selection for Information Retrieval
Workshop at the 33rd Annual ACM SIGIR Conference (SIGIR 2010)
http://alex.smola.org/workshops/sigir10/
July 23, 2010
Geneva, Switzerland

Submissions Due May 30, 2010

SIGIR is the major international forum for the presentation of new research
results and for the demonstration of new systems and techniques in the broad
field of information retrieval (IR). 

Call for Papers

We solicit submissions for the Workshop on Feature Generation and Selection for
Information Retrieval, to be held on July 23, 2010, in Geneva, Switzerland, in
conjunction with the 33rd Annual International ACM SIGIR Conference on Research
and Development in Information Retrieval (SIGIR 2010). The workshop will bring
together researchers and practitioners from academia and industry to discuss the
latest developments in various aspects of feature generation and selection for
textual information retrieval.

Modern information retrieval systems facilitate information access at
unprecedented scale and level of sophistication. However, in many cases the
underlying representation of text remains quite simple, often limited to using a
weighted bag of words. Over the years, several approaches to automatic feature
generation have been proposed (such as Latent Semantic Indexing, Explicit
Semantic Analysis, Hashing, and Latent Dirichlet Allocation), yet their
application in large-scale systems still remains the exception rather than the
rule. On the other hand, numerous studies in NLP and IR resort to manually
crafting features, which is a laborious and expensive process. Such studies
often focus on one specific problem, so many of the features they define are
task- or domain-dependent. As a result, little knowledge transfer is
possible to other problem domains. This limits our understanding of how to
reliably construct informative features for new tasks.
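
As a rough illustration of the "weighted bag of words" representation and of
hashing mentioned above, here is a minimal sketch in Python. It is not taken
from the workshop materials: the function names, the whitespace tokenizer, the
plain tf * log(N/df) weighting, and the CRC32-based bucketing are all
assumptions chosen for brevity, and the hashing step omits the sign hash that
many formulations of the hashing trick use.

```python
import math
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (a weighted bag of words)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Weight = term frequency * log(N / df); terms found in every
        # document get weight 0 under this particular scheme.
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


def hash_features(weights, n_buckets=16):
    """Fold a sparse {term: weight} dict into a fixed-length dense vector."""
    vec = [0.0] * n_buckets
    for term, w in weights.items():
        # CRC32 is used only because it is deterministic across runs.
        vec[zlib.crc32(term.encode("utf-8")) % n_buckets] += w
    return vec
```

Calling tfidf_vectors(["the cat sat", "the dog barked"]) gives "the" a weight
of zero in both documents, and hash_features then maps each sparse dict to a
vector whose length no longer depends on the vocabulary size, at the cost of
occasional collisions between unrelated terms.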

An area of machine learning concerned with feature generation (or constructive
induction) studies methods that endow computers with the ability to modify or
enhance the representation language. Feature generation techniques search for
new features that describe the target concepts better than the attributes
supplied with the training instances. It is worth noting that traditional
machine learning data sets, such as those available from the UCI data
repository, are only available as feature vectors, so their feature set is
essentially fixed. In fact, feature generation for specific UCI benchmark
datasets is frowned upon. On the other hand, textual data is almost always
available in its raw format (in some cases as structured data with sufficient
side information). Given the importance of text as a data format, it is well
worth designing text-specific feature generation algorithms. Complementary
to feature generation, the issue of feature selection arises. It aims to retain
only the most informative features, e.g., in order to reduce noise and to avoid
overfitting, and is essential when numerous features are automatically
constructed. Automatically generated features are often correlated,
redundant, or uninformative, so we may want to prune them through a
principled selection process.
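
To make the idea of feature selection concrete, the following Python sketch
ranks terms by their mutual information with a class label and keeps the top
k. This is only one common filter-style criterion, not a method prescribed by
the workshop; the function names, the binary presence/absence encoding, and
the whitespace tokenization are illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(feature_present, labels):
    """Mutual information (in bits) between a binary feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature_present, labels))
    px = Counter(feature_present)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Only observed (x, y) pairs appear here, so p_xy is never zero.
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def select_top_k(docs, labels, k):
    """Keep the k terms whose presence is most informative about the label."""
    token_sets = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*token_sets))
    scores = {
        term: mutual_information([term in s for s in token_sets], labels)
        for term in vocab
    }
    # Sort by descending score; ties are broken alphabetically.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]
```

A term that occurs in every document, or equally often in every class, scores
zero and is dropped first, which is the pruning of uninformative features
described above. Note that this score looks at each term in isolation, so it
does not by itself remove features that are redundant with one another.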

We believe that much can be done in the quest for automatic feature generation
for text processing, for example, using large-scale knowledge bases as well as
the sheer amount of textual data easily accessible today. We further believe the
time is ripe to bring together researchers from many related areas (including
information retrieval, machine learning, statistics, and natural language
processing) to address these issues and seek cross-pollination among the
different fields.

Papers from a rich set of empirical, experimental, and theoretical perspectives
are invited. Topics of interest for the workshop include but are not limited to:
- Identifying cases when new features should be constructed
- Knowledge-based methods (including identification of appropriate knowledge
resources)
- Efficiently utilizing human expertise (akin to active learning, assisted 
feature construction)
- (Bayesian) nonparametric distribution models for text (e.g., LDA, hierarchical
Pitman-Yor model)
- Compression and autoencoder algorithms (e.g., information bottleneck, deep
belief networks)
- Feature selection (L1 programming, message passing, dependency measures,
submodularity)
- Cross-language methods for feature generation and selection
- New types of features, e.g., spatial features to support geographical IR
- Applications of feature generation in IR (e.g., constructing new features for
indexing, ranking)

The workshop will include invited talks as well as presentations of accepted
research contributions. The schedule will provide time for both organized and
open discussion. Registration will be open to all SIGIR 2010 attendees.

Submission Instructions
Submissions should report new (unpublished) research results or ongoing
research. Submissions can be up to 8 pages long for full papers, and up to 4
pages long for short papers. Papers should be formatted in double-column ACM SIG
proceedings format (http://www.acm.org/sigs/publications/proceedings-templates;
for LaTeX, use "Option 2"). Papers must be in English and must be submitted as
PDF files.

Papers should be submitted electronically using the EasyChair system at
http://www.easychair.org/conferences/?conf=fgsir10 no later than 23:59 Pacific
Standard Time, Sunday, May 30, 2010.
At least one author of each accepted paper will be expected to attend and
present their findings at the workshop.

Important Dates
Submission deadline: May 30, 2010
Acceptance notification: June 25, 2010
Camera-ready submission: July 5, 2010
Workshop date: July 23, 2010

Invited Speakers
The workshop will feature a keynote talk by Dr. Kenneth Church, Chief Scientist
of the Human Language Technology Center of Excellence at the Johns Hopkins
University. Additional invited speakers are to be announced.

Organizing Committee
- Evgeniy Gabrilovich, Yahoo! Research, USA
- Alex Smola, Australian National University and Yahoo! Research, USA
- Naftali Tishby, Hebrew University of Jerusalem, Israel

Program Committee
- Francis Bach, INRIA, France
- Misha Bilenko, Microsoft Research, USA
- David Blei, Princeton, USA
- Karsten Borgwardt, Max Planck Institute, Germany
- Wray Buntine, NICTA, Australia
- Raman Chandrasekar, Microsoft Research, USA
- Kevyn Collins-Thompson, Microsoft Research, USA
- Silviu Cucerzan, Microsoft Research, USA
- Brian Davison, Lehigh University, USA
- Gideon Dror, Academic College of Tel-Aviv-Yaffo, Israel
- Wai Lam, CUHK, Hong Kong SAR, China
- Tie-Yan Liu, Microsoft Research Asia, China
- Shaul Markovitch, Technion, Israel
- Donald Metzler, Yahoo! Research, USA
- Daichi Mochihashi, NTT, Japan
- Filip Radlinski, Microsoft Research, United Kingdom
- Rajat Raina, Facebook, USA
- Pradeep Ravikumar, University of Texas at Austin, USA
- Mehran Sahami, Stanford, USA
- Le Song, CMU, USA
- Krysta Svore, Microsoft Research, USA
- Volker Tresp, Siemens, Germany
- Kai Yu, NEC, USA
- ChengXiang Zhai, UIUC, USA
- Jerry Zhu, University of Wisconsin, USA
