20.1591, Calls: Computational Linguistics/Thailand

LINGUIST Network linguist at LINGUISTLIST.ORG
Sat Apr 25 03:33:32 UTC 2009


LINGUIST List: Vol-20-1591. Fri Apr 24 2009. ISSN: 1068 - 4875.

Subject: 20.1591, Calls: Computational Linguistics/Thailand

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
 
Reviews: Randall Eggert, U of Utah  
       <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Elyssa Winzeler <elyssa at linguistlist.org>
================================================================  

===========================Directory==============================  

1)
Date: 23-Apr-2009
From: Thepchai Supnithi < thepchai at nectec.or.th >
Subject: Workshop on InterBEST 2009 Thai Word Segmentation
 

	
-------------------------Message 1 ---------------------------------- 
Date: Fri, 24 Apr 2009 23:30:24
From: Thepchai Supnithi [thepchai at nectec.or.th]
Subject: Workshop on InterBEST 2009 Thai Word Segmentation


Full Title: Workshop on InterBEST 2009 - Thai Word Segmentation 
Short Title: InterBEST 2009 

Date: 19-Oct-2009
Location: Bangkok, Thailand 
Contact Person: Thepchai Supnithi
Meeting Email: thepchai at nectec.or.th
Web Site: http://www.hlt.nectec.or.th/best/index.php 

Linguistic Field(s): Computational Linguistics 

Call Deadline: 30-Jun-2009 

Meeting Description:

This workshop is the second event in the BEST series (Benchmark for Enhancing
the Standard of Thai Language Processing), a series of contests on Thai language
processing that is expected to help accelerate progress in this
technology. The topic of the first contest, held in February 2009 as a special
topic in the 11th National Software Contest in Bangkok, Thailand, was Thai word
segmentation. The results of that contest set quite a high standard for
Thai word segmentation algorithms.

This workshop again focuses on Thai word segmentation, as we believe
that it is possible to improve beyond the current level of accuracy. Workshop
participants will have an opportunity to work on the same task: developing a
Thai word segmentation algorithm using the provided training data. In addition
to the 5-million-word training corpus released since the first contest, a
2-million-word corpus in more diverse genres will be released. The test set will
consist of a different set of genres, to evaluate the generality of a word
segmentation algorithm across a variety of text domains and styles. The submitted
algorithms will be evaluated with the same test set and the same scoring
program, thus allowing comparisons among the various word segmentation algorithms.
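
The announcement does not describe the scoring program itself. Purely as an
illustration, the sketch below shows one common way to compare a segmenter's
output against a reference: word-level precision, recall and F1 over character
spans. The "|" delimiter, the function names and the choice of metric are all
assumptions made for this example, not details of the official InterBEST
scorer.

```python
# Illustrative sketch only: NOT the official InterBEST scoring program.
# Assumption: words in a segmented string are separated by "|".

def word_spans(segmented, delimiter="|"):
    """Return the set of (start, end) character spans of each word."""
    spans = set()
    start = 0
    for word in segmented.split(delimiter):
        if not word:  # skip empty pieces from leading/trailing delimiters
            continue
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score_segmentation(reference, hypothesis, delimiter="|"):
    """Return (precision, recall, f1) of hypothesis words against the reference.

    A hypothesis word counts as correct only if both of its boundaries
    match a word in the reference segmentation.
    """
    ref = word_spans(reference, delimiter)
    hyp = word_spans(hypothesis, delimiter)
    correct = len(ref & hyp)
    precision = correct / len(hyp) if hyp else 0.0
    recall = correct / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Placeholder Latin strings stand in for Thai text.
    print(score_segmentation("ab|cd|e", "ab|c|de"))
```

In the usage line only the first word "ab" has matching boundaries in both
segmentations, so one of three hypothesis words and one of three reference
words are counted as correct.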

Another goal of this workshop is to provide a venue for researchers to share
their experience and discuss current obstacles and future directions of Thai
word segmentation and Thai language processing in general. Since the InterBEST
2009 workshop is co-located with SNLP2009, the Eighth International Symposium
on Natural Language Processing, participants will have an opportunity
to submit a paper and present their word segmentation algorithm and its
performance at the workshop. All contest procedures and guidelines will be
provided in English, to reach more researchers in the international
community who may be interested in Thai language processing.

Call for Papers

Step 1
To participate in this workshop, please submit an extended abstract (1,000 words
maximum, references excluded) describing your word segmentation algorithm and
its result on the 100-thousand-word initial test data. Acceptance is based on
both the validity of the algorithm and its performance. Please see the download
page
(http://www.hlt.nectec.or.th/best/index.php?option=com_content&task=viewid=13&Itemid=27)
for further information on how to download the 5-million-word training corpus and
submit your result for evaluation. A corpus description, as well as the
guidelines established for word segmentation criteria, is also available for
download. Registration is required to download training and test data and to
upload test results for evaluation. Please register only one account for each
system submitted, and use this account when submitting both your test result and
your paper. Also note that permission to use the data is granted only for
non-commercial R&D.


Step 2
Authors of accepted abstracts will be notified and provided with an additional
2 million words of training data to improve their systems. After the final test
set is released, participants will have one week to submit their final result for
evaluation, and an additional 3 weeks to submit a full paper describing their
algorithm and result.

Important Dates:
Release of 5-million word training data and 100-thousand word initial test data:
18 Mar 2009

Abstract submission deadline:
30 Jun 2009 

Notification of acceptance and release of additional 2-million word training
data for accepted abstracts:
15 Jul 2009

Release of final test data:
17 Aug 2009

Test result submission deadline:	 
24 Aug 2009

Camera-ready submission deadline:
11 Sep 2009

Workshop day (co-located with SNLP):	 
19 Oct 2009




