IJCNLP 2008 WORKSHOP<br>                              ON<br>         NAMED ENTITY RECOGNITION (NER) FOR SOUTH<br>              AND SOUTH EAST ASIAN LANGUAGES<br><br>            (<a href="http://ltrc.iiit.ac.in/ner-ssea-08">

http://ltrc.iiit.ac.in/ner-ssea-08</a>)<br><br><br>                  CALL FOR PARTICIPATION<br><br>                  Friday, 12 January 2008<br>                      Hyderabad, India<br><br><br><br>Most of the South and South East Asian (SSEA) languages are

<br>scarce in resources and tools, and Named Entity Recognition (NER)<br>systems are no exception. It is very important that good systems<br>for NER be available, because many problems in information<br>extraction and machine translation (among others) are dependent

<br>on accurate NER. However, the issues involved for these languages<br>differ significantly from those for European languages<br>or even East Asian languages. For example, these languages do<br>not have capitalization, which is a major feature for NER systems

<br>for European languages.<br><br>Another similarity among these languages is that many of them use<br>scripts of Brahmi origin. For some languages, there are additional<br>issues such as word segmentation (e.g. for Thai). Large gazetteers

<br>are not available for most of these languages. Lack of<br>standardization and spelling variation add further problems.<br>The number of frequently used common nouns which can also be used<br>as names is very large for many languages, unlike European

<br>languages, where a larger proportion of first names are not<br>used as common words. Lastly, and most importantly, there is<br>a serious lack of labeled data for machine learning.<br><br>The papers to be presented at the workshop include both

<br>regular research papers and papers from the Shared Task.<br>There will also be two invited talks by senior researchers who<br>have worked on the NER problem for South Asian languages.<br><br>The workshop is being held in conjunction with the Third

<br>International Joint Conference on Natural Language Processing<br>(January 7-12, 2008), which is one of the major conferences<br>in NLP/CL. The workshop program is given below.<br><br>FOR WORKSHOP SPECIFIC ENQUIRIES, PLEASE CONTACT:

<br><br>Anil Kumar Singh<br>Language Technologies Research Centre<br>IIIT, Hyderabad, India<br>Email: <a href="mailto:anil@research.iiit.ac.in">anil@research.iiit.ac.in</a><br><br>FOR GENERAL ENQUIRIES (ACCOMMODATION ETC.), PLEASE CONTACT:

<br><br>IJCNLP-08 Secretariat<br>International Institute of Information Technology<br>Gachibowli, Hyderabad 500 032, Andhra Pradesh, India<br>Tel: +91-40-2300 0646; Fax: +91-40-2300 0044<br>Email: <a href="mailto:ijcnlp08@iiit.ac.in">

ijcnlp08@iiit.ac.in</a><br><br>*************************************************************<br><br>WORKSHOP PROGRAM<br><br>Named Entity Recognition for South and South East Asian Languages:<br>Taking Stock<br><br>[Anil Kumar Singh]

<br><br>SESSION 1<br><br>Invited Talk: Named Entity Recognition: Different Approaches<br><br>[Sobha L]<br><br>A Hybrid Approach for Named Entity Recognition in Indian Languages<br><br>[Sujan Kumar Saha, Sanjay Chatterji, Sandipan Dandapat, Sudeshna Sarkar

<br>and Pabitra Mitra]<br><br>SESSION 2<br><br>Invited Talk: Multilingual Named Entity Recognition<br><br>[Sivaji Bandyopadhyay]<br><br>Aggregating Machine Learning and Rule Based Heuristics for Named<br>Entity Recognition

<br><br>[Karthik Gali, Harshit Surana, Ashwini Vaidya, Praneeth Shishtla<br>and Dipti Misra Sharma]<br><br>Language Independent Named Entity Recognition in Indian Languages<br><br>[Asif Ekbal, Rejwanul Haque, Amitava Das, Venkateswarlu Poka and

<br>Sivaji Bandyopadhyay]<br><br>SESSION 3<br><br>Named Entity Recognition for Telugu<br><br>[Srikanth P and Narayana Murthy Kavi]<br><br>POSTER DISPLAY AND DISCUSSION<br><br>An experiment on automatic detection of Named Entity in Bangla

<br><br>[Bidyut Baran Chaudhuri and Suvankar Bhattacharya]<br><br>A Hybrid Named Entity Recognition System for South Asian Languages<br><br>[Praveen P and Ravi Kiran V]<br><br>Named Entity Recognition for South Asian Languages

<br><br>[Amit Goyal]<br><br>Named Entity Recognition for Indian Languages<br><br>[Animesh Nayan, B. Ravi Kiran Rao, Pawandeep Singh, Sudip Sanyal<br>and Ratna Sanyal]<br><br>Experiments in Telugu NER: A Conditional Random Field Approach

<br><br>[Praneeth Shishtla, Prasad Pingali, Vasudeva Varma and Karthik Gali]<br><br>SESSION 4<br><br>Bengali Named Entity Recognition using Support Vector Machine<br><br>[Asif Ekbal and Sivaji Bandyopadhyay]<br><br>Domain focused Named Entity Recognizer for Tamil using Conditional

<br>Random Fields<br><br>[Vijayakrishna R and Sobha L]<br><br>A Character n-gram Based Approach for Improved Recall in Indian<br>Language NER<br><br>[Praneeth Shishtla, Prasad Pingali and Vasudeva Varma]<br><br>CLOSING DISCUSSION

<br><br>