[Corpora-List] Request for tips to French resources
sajous at univ-tlse2.fr
Sat Mar 5 12:08:33 UTC 2011
Dear Ineta,
You may be interested in an XML version of the French Wiktionary. This
file, converted from the wiki dump, includes lemmatized forms with POS
tags. The resource is a bit noisy, but it contains a significant number
of interesting neologisms.
An archive is available from the REDAC website:
http://redac.univ-tlse2.fr/lexiques/wiktionaryx_en.html
If you are looking for annotated French corpora, a lemmatized and
POS-tagged 260-million-word corpus extracted from the French
Wikipedia is available at:
http://redac.univ-tlse2.fr/corpus/wikipedia_en.html
Regards,
Franck
--
Franck Sajous - CLLE-ERSS
Maison de la Recherche
Bureau B521
Université de Toulouse-Le Mirail
5, allées Antonio Machado
31058 Toulouse Cedex 9
Tel : +33 (0)5 61 50 36 93
Fax : +33 (0)5 61 50 46 77
http://w3.univ-tlse2.fr/erss/membre/fsajous/