<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>------------------------------------------------------------------------</div><div>Arabic-L: Mon 23 Feb 2010</div><div>Moderator: Dilworth Parkinson &lt;<a href="mailto:dilworth_parkinson@byu.edu">dilworth_parkinson@byu.edu</a>&gt;</div><div>[To post messages to the list, send them to </div><a href="mailto:arabic-l@byu.edu">arabic-l@byu.edu</a><div>]</div><div>[To unsubscribe, send message from same address you subscribed from to</div><br class="Apple-interchange-newline">listserv@byu.edu<div> with first line reading:</div><div> unsubscribe arabic-l ]</div><div><br></div><div>-------------------------Directory------------------------------------</div><div><br></div><div>1) Subject: Workshop on Language Resources for Semitic Languages</div><div><br></div><div>-------------------------Messages-----------------------------------</div><div>1)</div><div>Date: 23 Feb 2010</div><div>From: Owen Rambow &lt;<a href="mailto:rambow@ccls.columbia.edu">rambow@ccls.columbia.edu</a>&gt;</div><div>Subject: Workshop on Language Resources for Semitic Languages</div><div><br></div><div>CALL FOR PAPERS<br>Workshop on Language Resources (LRs) and Human Language Technologies<br>(HLT) for Semitic Languages: Status, Updates, and Prospects<br><br><a href="http://www1.ccls.columbia.edu/~rambow/LREC2010-semitic-ws.html">http://www1.ccls.columbia.edu/~rambow/LREC2010-semitic-ws.html</a><br><br>To be held in conjunction with the 7th International Language<br>Resources and Evaluation Conference (LREC 2010)<br><br>17 May 2010, Mediterranean Conference Centre, Valletta, Malta<br>Deadline for submission: 26 February 2010<br><br>This workshop serves as the 2010 meeting of the ACL SIG on<br>Computational Approaches to Semitic Languages (<a href="http://semitic.tk/">http://semitic.tk</a>).<br><br>Description<br><br>The Semitic family includes languages and dialects spoken by a large<br>number 
of native speakers (around 300 million). Prominent members of<br>this family are Arabic (and its varieties), Hebrew, Amharic, Tigrinya,<br>Aramaic, Maltese and Syriac. Their shared ancestry is apparent through<br>pervasive cognate sharing, a rich and productive pattern-based<br>morphology, and similar syntactic constructions. In addition, there<br>are several languages which are used in the same geographic area, such<br>as Amazigh or Coptic, which, while not Semitic, have common features<br>with Semitic languages, such as borrowed vocabulary.<br><br>The recent surge in computational work for processing Semitic<br>languages, particularly Modern Standard Arabic (MSA) and Modern Hebrew<br>(MH), has brought modest improvements in terms of actual empirical<br>results for various language processing components (e.g.,<br>morphological analyzers, parsers, named entity recognizers, and audio<br>transcription). Apparently, reusing existing approaches<br>developed for English or French for processing Semitic language<br>text/speech (e.g., Arabic parsing) is not as straightforward as<br>initially thought. Apart from the limited availability of suitable<br>language resources, there is increasing evidence that Semitic<br>languages demand modeling approaches and annotations that deviate from<br>those found suitable for English/French. Issues such as the<br>pattern-based morphology, the frequently head-initial syntactic<br>structure, the importance of the interface between morphology and<br>syntax, and the difference between spoken and written forms<br>(especially in Colloquial Arabic(s)) exemplify the kind of challenges<br>that may arise when processing Semitic languages. 
For language<br>technologies, such as information retrieval and machine translation,<br>these challenges are compounded by sparse data and often result in<br>poorer performance than for other languages.<br><br>This Workshop intends to follow up on topics of paramount importance for<br>Semitic-language NLP that were discussed at previous events (LREC,<br>MEDAR/NEMLAR Conferences, the workshops of the ACL Special Interest<br>Group for Semitic languages, etc.) and which are worth revisiting.<br><br>The workshop will bring together people who are actively involved in<br>Semitic language processing in a mono- or cross/multilingual context,<br>and give them an opportunity to update the community through reports<br>on completed or ongoing work as well as on the availability of LRs,<br>evaluation protocols and campaigns, products and core technologies (in<br>particular open source ones). We also invite authors to address other<br>languages spoken in the Semitic language area (languages such as<br>Amazigh, Coptic, etc.). This should enable participants to develop a<br>common view on where we stand and to foster the discussion of the<br>future of this research area. Particular attention will be paid to<br>activities involving technologies such as Machine Translation and<br>Cross-Lingual Information Retrieval/Extraction, Summarization, etc.<br>Evaluation methodologies and resources for evaluation of HLT will<br>also be a main focus.<br><br>We expect to elaborate on the HLT state of the art, identify problems<br>of common interest, and debate a potential roadmap for the Semitic<br>languages. 
Issues related to sharing of resources, tools, standards,<br>sharing and dissemination of information and expertise, adoption of<br>current best practices, setting up joint projects and technology<br>transfer mechanisms will be an important part of the workshop.<br><br>Topics of Interest<br><br>This full-day workshop is not intended to be a mini-conference, but rather<br>a real workshop aiming at concrete results that should clarify the<br>situation of Semitic languages with respect to Language Resources and<br>Evaluation. We expect to launch at least two evaluation campaigns:<br>comparative evaluation of morphological taggers and of named entity<br>recognizers.<br><br>Among the many issues to be addressed, a few suggestions follow:<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Issues in the design, acquisition, creation, management,<br>access, distribution, and use of Language Resources, in particular in a<br>bilingual/multilingual setting (Standard Arabic, Hebrew, Colloquial<br>Arabic, Amazigh, Coptic, Maltese, etc.)<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Impact on LR collections/processing and NLP of the crucial<br>issues related to "code switching" between different dialects and<br>languages<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Specific issues related to the above-mentioned languages such<br>as the role of morphology, named entities, corpus alignment, etc.<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Multilinguality issues including the relationship between<br>Colloquial and Standard Arabic<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Exploitation of LR in different types of applications<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Industrial LR requirements and the community's response<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Benchmarking of systems and products; 
resources for<br>benchmarking and evaluation for written and spoken language<br>processing<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Focus on some key technologies such as MT (all approaches, e.g.,<br>Statistical, Example-Based), Information Retrieval, Speech<br>Recognition, Spoken Document Retrieval, CLIR, Question-Answering,<br>Summarization, etc.<br><br>*<span class="Apple-tab-span" style="white-space: pre; "> </span>Local, regional, and international activities and projects, and<br>the needs, possibilities, forms, and initiatives for regional and<br>international cooperation.<br><br>We invite submissions on computational approaches to processing<br>text/speech in all Semitic and Semitic-area languages. The call is<br>open for all kinds of computational work, e.g., work on computational<br>linguistic processing components (e.g., analyzers, taggers, parsers),<br>on state-of-the-art NLP applications and systems, on leveraging<br>resource and tool creation for the Semitic language family, and on<br>using computational tools to gain new linguistic insight. We<br>especially welcome submissions on work that crosses individual<br>language boundaries, heightens awareness amongst Semitic-language<br>researchers of shared challenges and breakthroughs, and highlights<br>issues and solutions common to any subset of the Semitic language<br>family.<br><br><br>Workshop general chair:<br>Khalid Choukri, ELRA/ELDA, Paris, France<br><br>Workshop co-chairs:<br>Owen Rambow, Columbia University, New York, USA -- <a href="mailto:rambow@ccls.columbia.edu">rambow@ccls.columbia.edu</a><br>Bente Maegaard, University of Copenhagen, Denmark<br>Ibrahim A. 
Al-Kharashi, Computer and Electronics Research Institute,<br>King Abdulaziz City for Science and Technology, Saudi Arabia<br><br><br>Organizing Committee<br>Khalil Sima’an, Language and Computation, University of Amsterdam<br>(The Netherlands).<br>Mona Diab, Center for Computational Learning Systems, Columbia<br>University (USA).<br>Mike Rosner, Dept. Intelligent Computer Systems, University of Malta<br>(Malta).<br>Shuly Wintner, Computer Science Dept., Haifa University (Israel).<br>Christopher Cieri, Linguistic Data Consortium, Philadelphia (USA).<br>Paolo Rosso, Universidad Politécnica Valencia (Spain).<br><br><br>The Program and Scientific Committees will be listed on the web pages.<br><br>Important Dates<br><br>Deadline for abstract submissions:<span class="Apple-tab-span" style="white-space: pre; "> </span>26 February 2010<br>Notification of acceptance:<span class="Apple-tab-span" style="white-space: pre; "> </span><span class="Apple-tab-span" style="white-space: pre; "> </span>15 March 2010<br>Final version of accepted paper:<span class="Apple-tab-span" style="white-space: pre; "> </span>11 April 2010<br>Full-day workshop:<span class="Apple-tab-span" style="white-space: pre; "> </span><span class="Apple-tab-span" style="white-space: pre; "> </span><span class="Apple-tab-span" style="white-space: pre; "> </span>17 May 2010<br><br>Submission Details<br><br>Submissions should comply with LREC standards (including the LREC Map<br>initiative) and must be in English. Abstracts for workshop<br>contributions should not exceed four A4 pages (excluding references).<br>An additional title page should state the title, author(s), and<br>affiliation(s), as well as the contact author's e-mail address, postal<br>address, and telephone and fax numbers.<br><br>Submission will use the LREC START facility. 
The expected deadline is 26<br>February 2010.<br><br>Submitted papers will be judged based on relevance to the workshop<br>aims, as well as the novelty of the idea, technical quality, clarity<br>of presentation, and expected impact on future research within the<br>area of focus.<br><br>Registration for LREC 2010 will be required for participation, so<br>potential participants are invited to refer to the main conference<br>website for all details not covered in the present call<br>(<a href="http://www.lrec-conf.org/lrec2010/">http://www.lrec-conf.org/lrec2010/</a>).<br><br>Formatting instructions for the final full version of papers will be<br>sent to authors after notification of acceptance and will be identical<br>to the LREC main conference instructions.<br><br>When submitting a paper through the START page, authors will be<br>asked to provide relevant information about the resources that have<br>been used for the work described in their paper or that are the<br>outcome of their research. For further information on this new<br>initiative, please refer to<br><a href="http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources">http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources</a>.<br><br><br></div><div>--------------------------------------------------------------------------</div><div>End of Arabic-L: 23 Feb 2010</div><br class="Apple-interchange-newline"><br></body></html>