[Corpora-List] State-of-the-art POS tagging results

Nizar Habash habash at cs.columbia.edu
Tue Nov 18 15:37:53 UTC 2008


Hi

Please also check the results from the CADIM group at Columbia on  
morphological disambiguation (POS tagging) for Arabic:

Roth, Ryan, Owen Rambow, Nizar Habash, Mona Diab, and Cynthia Rudin.  
Arabic Morphological Tagging, Diacritization, and Lemmatization Using  
Lexeme Models and Feature Ranking. In Proceedings of Association for  
Computational Linguistics (ACL), Columbus, Ohio. 2008.

Diab, Mona. Towards an optimal POS tag set for Modern Standard Arabic  
Processing. Recent Advances in Natural Language Processing (RANLP),  
Borovets, Bulgaria, 2007.

Diab, Mona, Kadri Hacioglu and Daniel Jurafsky. Automated Methods for  
Processing Arabic Text: From Tokenization to Base Phrase Chunking.  
Book Chapter. In Arabic Computational Morphology: Knowledge-based and  
Empirical Methods. Editors Antal van den Bosch and Abdelhadi Soudi.  
Kluwer/Springer Publications, 2007.

Habash, Nizar and Owen Rambow. Arabic Diacritization through Full
Morphological Tagging. In Human Language Technologies 2007: The
Conference of the North American Chapter of the Association for
Computational Linguistics (NAACL HLT 2007); Companion Volume, Short
Papers. 2007.

Habash, Nizar and Owen Rambow. Arabic Tokenization, Morphological
Analysis, and Part-of-Speech Tagging in One Fell Swoop. In
Proceedings of the Conference of the Association for
Computational Linguistics (ACL05).

Diab, Mona, Kadri Hacioglu and Daniel Jurafsky. Automatic Tagging of
Arabic Text: From Raw Text to Base Phrase Chunks. In Proceedings of the
Human Language Technology Conference of the North American Chapter of
the Association for Computational Linguistics (HLT-NAACL), 2004.



Nizar



On Nov 14, 2008, at 9:39 AM, Khalil Simaan wrote:

> Hi,
> Hebrew and Arabic may count as "morphologically complex languages".
>
> For Hebrew, have a look at:
>
> Roy Bar-Haim, Khalil Sima'an and Yoad Winter. Part-of-Speech Tagging
> of Modern Hebrew Text. In Journal of Natural Language Engineering
> (J-NLE)
> <http://www.cambridge.org/journals/journal_catalogue.asp?mnemonic=nle>,
> 14(2):223-251, 2008.
>
> This work was extended to Arabic:
>
> Saib Mansour, Khalil Sima'an and Yoad Winter. Smoothing a Lexicon-based
> POS tagger for Arabic and Hebrew. In Proceedings of the ACL 2007 Workshop
> on Computational Approaches to Semitic Languages: Common Issues and
> Resources. Prague, Czech Republic, 2007.
>
> Best regards
> Khalil Sima'an
> University of Amsterdam
>
> Hrafn Loftsson wrote:
>> Hello all.
>>
>> Can anyone point me to papers presenting state-of-the-art POS tagging
>> results for some morphologically complex languages?
>>
>> In his paper "Morphological Tagging: Data vs. Dictionaries" (2000), Jan
>> Hajic presents an evaluation for Czech, Estonian, Hungarian, Romanian,
>> and Slovene, but I wonder if you know of more recent work.
>>
>> --
>> Regards,
>> Hrafn Loftsson, Ph.D. - www.ru.is/faculty/hrafn
>> Assistant Professor
>> School of Computer Science - www.ru.is/cs
>> Reykjavik University - www.ru.is
>>
>>
>> Please note that this e-mail and attachments are intended for the
>> named addressees only and may contain information that is
>> confidential and privileged. Further information:
>> http://www.ru.is/trunadur
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>
>
>
> -- 
> ----
> *P.S. EMAIL ADDRESS CHANGE:
>              k.simaan at uva.nl
> (old email simaan at science.uva.nl will expire soon).*
> ----
>
> Khalil Sima'an
> Institute for Logic, Language and Computation (ILLC)
> Universiteit van Amsterdam
> http://staff.science.uva.nl/~simaan
> Tel 0205256573
> email k.simaan at uva.nl
>
>
