[Corpora-List] Portuguese thesaurus/dictionary

Cédrick Fairon fairon at tedm.ucl.ac.be
Wed Mar 17 20:41:53 UTC 2004


Hello,
Have you tried the LabEL resources? See http://label.ist.utl.pt/ (click on
"Recursos Publicos").

LabEL makes its data available for research purposes:

Simple-word dictionaries
-Dictionary of canonical forms (about 120,000 entries)
-Dictionary of inflected forms (about one million)
-Dictionary of initialisms and acronyms (about 4,000)

Compound-word dictionaries
-Dictionary of compound nouns (sample of 10,000 entries)
-Dictionary of compound adverbs (sample of 1,000 entries)
-Dictionary of compound prepositions
-Dictionary of compound conjunctions

Their full resources are also distributed with Unitex, an open-source corpus
processor: http://www-igm.univ-mlv.fr/~unitex/. In that case, however, you
cannot access the raw dictionaries directly, because they are compressed.

Best,

Cédrick

On Wednesday 17 March 2004 15:02, Mark Davies wrote:
> I'm looking for a thesaurus (and perhaps also a dictionary) of Portuguese
> in machine-readable form.  In other words, I don't want an off-the-shelf
> Portuguese electronic dictionary with which I have to use the regular user
> interface.  Rather, I need one where I can access the raw data directly.  I
> know that I can access this type of information via web-based dictionaries,
> but it would be easier to do it with a local resource on my own machine.
>
> Eventually, the thesaurus (and perhaps dictionary as well) will be
> converted to relational database form, so the closer it is to that form
> already, the better.  Also, I'm willing to pay for the resource, though
> hopefully it won't be too much, since this will be strictly for
> non-commercial use.
>
> Thanks in advance.
>
> Mark Davies
>
> =================================================
> Mark Davies
> Assoc. Prof., Linguistics
> Brigham Young University
> (phone) 801-422-9168 / (fax) 801-422-0906
> http://davies-linguistics.byu.edu
>
> ** Corpus design and use // Web-database scripting **
> ** Historical linguistics // Functional-typological grammar **
> ** Spanish and Portuguese historical and dialectal syntax **
> =================================================
--
Cédrick Fairon
Director of CENTAL
Centre de traitement automatique du langage
Université de Louvain
Place Blaise Pascal, 1
1348 Louvain-la-Neuve
Belgique

=======================================
**** JADT 2004 in Louvain-la-Neuve ****
10-12 March 2004
7th International Conference on the statistical analysis of textual data
7th Journées internationales d'analyse statistique des données textuelles
http://www.jadt.org

Visit our web sites:
http://cental.fltr.ucl.ac.be
http://glossa.fltr.ucl.ac.be
=======================================


