[Corpora-List] Call for Participation: Coling Workshop on Arabic Script Languages [Correction]
Karine Megerdoomian
karinem at inxight.com
Wed Jul 14 17:41:07 UTC 2004
Note corrected workshop date. We apologize for any double-postings.
** Call for Participation **
COLING 2004 WORKSHOP ON
COMPUTATIONAL APPROACHES TO ARABIC SCRIPT-BASED LANGUAGES
Geneva, Switzerland, 28 August 2004
Invited Speaker: Martin Kay (Stanford University)
http://members.cox.net/karinem/COLING2004
WORKSHOP THEME
Recently, there has been a surge of interest in the study of the languages of the Middle East, especially Arabic, Persian (Farsi), Pashto, Kurdish and Urdu. The use of the Arabic script gives rise to certain issues that are common to all these languages despite their belonging to distinct language families. Hence, these languages share properties such as the absence of capitalization, right-to-left direction, lack of clear word boundaries, complex word structure, a high degree of ambiguity due to non-representation of short vowels in the writing system, and related encoding issues. Yet the research on these various languages has rarely been brought together in a single forum, and most development has been the result of initiatives by individual research establishments or industry firms.
The goal of this workshop is to provide a forum for those involved in the development of NLP systems in Arabic script languages to exchange ideas, approaches and implementations of computational systems; to discuss the common challenges faced by all practitioners; and to assess the state of the art in the field. In addition, one of the aims of the workshop is to identify promising areas for future collaborative research in the development of NLP systems for Arabic script languages.
WORKSHOP PROGRAM
I. Opening and Overview
8:30-9:00 Computer Processing of Arabic Script-based Languages: Current State and Future Directions - Ali Farghaly
II. Session 1: Lexicon and Corpora
9:00-9:30 Developing an Arabic Treebank: Methods, Guidelines, Procedures, and Tools - Mohamed Maamouri and Ann Bies
9:30-10:00 Preliminary Lexical Framework for English-Arabic Semantic Resource Construction - Anne R. Diekema
10:00-10:30 The Architecture of a Standard Arabic Lexical Database: Some Figures, Ratios, and Categories from the DIINAR.1 Source Program - Ramzi Abbès, Joseph Dichy and Mohamed Hassoun
10:30-10:45 Break
III. Session 2: Morphology
10:45-11:15 Systematic Verb Stem Generation for Arabic - Jim Yaghi and Sane Yagi
11:15-11:45 Issues in Arabic Orthography and Morphology Analysis - Tim Buckwalter
11:45-12:15 Finite-State Morphological Analysis of Persian - Karine Megerdoomian
12:15-2:00 Lunch & Demo Sessions
IV. Demonstrations
Urdu Localization Project - Sarmad Hussain
FarsiSum: A Persian Text Summarizer - Martin Hassel and Nima Mazdak
Stemming the Qur'an - Naglaa Thabet
Language Weaver Arabic->English MT - Daniel Marcu, Alex Fraser, William Wong and Kevin Knight
V. Invited Speaker
2:00-2:45 Arabic Script-Based Languages Deserve to be Studied Linguistically - Martin Kay
VI. Session 3: Statistical Approaches
2:45-3:15 An Unsupervised Approach for Bootstrapping Arabic Sense Tagging - Mona T. Diab
3:15-3:45 Automatic Arabic Document Categorization Based on the Naive Bayes Algorithm - Mohamed El Kourdi, Amine Bensaid and Tajje-eddine Rachidi
3:45-4:00 Break
VII. Session 4: Speech Processing
4:00-4:30 A Transcription Scheme for Languages Employing the Arabic Script Motivated by Speech Processing Applications - Shadi Ganjavi, Panayiotis G. Georgiou and Shrikanth Narayanan
4:30-5:00 Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition - Dimitra Vergyri and Katrin Kirchhoff
5:00-5:30 Letter-to-Sound Conversion for Urdu Text-to-Speech System - Sarmad Hussain
VIII. Discussion and Closing
5:30-6:00 Ali Farghaly and Karine Megerdoomian
Accepted papers and formal demonstrations will be published in a proceedings volume, which will be made available at the workshop.
WORKSHOP REGISTRATION
For the workshop to take place, the COLING 2004 organizers require at least 20 participants to register for it. Speakers and participants are therefore asked to register as soon as possible via the official COLING 2004 website, http://www.issco.unige.ch/coling2004/.
Workshop fees (in Swiss Francs):
* Student early: CHF 90
* Student late: CHF 120
* Student on-site: CHF 150
* Regular early: CHF 120
* Regular late: CHF 150
* Regular on-site: CHF 180
ORGANIZING COMMITTEE
Ali Farghaly (SYSTRAN Software, Inc.)
Karine Megerdoomian (Inxight Software and University of California, San Diego)
PROGRAM COMMITTEE
Jan W. Amtrup (Bowne Global Solutions)
Tim Buckwalter (Linguistic Data Consortium)
Miriam Butt (Konstanz University, Germany)
Violetta Cavalli-Sforza (Carnegie Mellon University)
Joseph Dichy (Lyon University)
Abdelkadir Fassi Fehri (Mohammed V University-Souissi Rabat, Morocco)
Andrew Freeman (University of Washington)
Nizar Habash (University of Maryland, College Park)
Masayo Iida (Inxight Software, Inc)
Simin Karimi (University of Arizona)
Martin Kay (Stanford University)
Kevin Knight (USC/Information Sciences Institute)
Farhad Oroumchian (University of Wollongong in Dubai)
Ahmed Rafea (The American University in Cairo)
Jean Senellart (SYSTRAN Software)
Bonnie Glover Stalls (University of Southern California)
Rémi Zajac (SYSTRAN Software)