[Corpora-List] Spanish corpora and pos-taggers

Eckhard Bick eckhard.bick at mail.dk
Wed Nov 29 09:37:42 UTC 2006


Dear Mario Poe,

We have a rule-based parser for Spanish (HIS-PALAVRAS), with a 
web interface, at http://beta.visl.sdu.dk, under the menu item "Sentence 
Analysis" -> "Machine Analysis" -> Spanish. It handles part-of-speech 
tagging, syntactic function and full tree structures, and can be used 
on-line as well as by file upload or remote access. I have an evaluation 
paper, if you are interested.

There is also a corpus search interface (http://corp.hum.sdu.dk), where 
we have versions of the Spanish Europarl and Wikipedia corpora, annotated 
with HIS-PALAVRAS.

Regards,
Eckhard Bick
VISL / University of Southern Denmark

Mario Poe wrote:

>Dear all,
>
>For research on lexical issues, I am in search of
>freely available Spanish corpora, either raw or
>(preferably) POS-tagged (with lemmas), representing,
>if possible, written and spoken language varieties.
>Besides, I would appreciate if you could point me to
>relevant text tokenizers and POS-taggers available for
>Spanish, either in the form of downloadable packages
>or as web demos. I have compiled some URIs myself but
>I have found very few.
>
>Thanks for your help!
>
>--Mario Poe
>PhD student


-- 
Eckhard Bick,
cand.med., dr.phil.
University of Southern Denmark
e-mail: eckhard.bick at mail.dk
web: http://beta.visl.sdu.dk


