[Corpora-List] English-UNL Dictionary

Ronaldo Martins r.martins at undlfoundation.org
Tue Apr 20 12:54:56 UTC 2010


(Please distribute, and apologies for multiple postings)

The UNDL Foundation has released a new version of the English-UNL
dictionary. The English-UNL dictionary is a bidirectional (EN>UNL, UNL>EN)
machine-tractable lexical database comprising more than 200,000 mappings
between English and UNL. It provides extensive information about English
lexical items, including morphological structure, inflectional paradigms and
subcategorization frames, as well as semantic information about UNL entries.
The dictionary is available under an Attribution Share Alike (CC-BY-SA)
Creative Commons license at the UNLarium (http://www.unlweb.net/unlarium). 

==============================
How was the English-UNL dictionary created?
==============================
The English-UNL dictionary was mainly derived from a word list extracted
from the English WordNet 3.0, which was automatically analyzed and manually
revised for lexical categories, lexical structure (roots, affixes), part of
speech, number (singular, plural, singulare tantum, plurale tantum,
invariant), valence, transitivity, inflectional paradigms (for nouns and
verbs) and subcategorization frames (according to X-bar theory). English
entries were mapped onto entries of the UNL dictionary (i.e., UWs) and may
be freely exported in two different formats: generative, containing only
base forms and the corresponding generation (inflectional and composition)
rules; and enumerative, containing word forms and lexical features. A sample
of entries is presented below. 

base form
[foot] {2883} "100284665" (POS=NOU, MOR=STE, LST=WRD, NUM=SNG, INF=M1,
FLX(PLR:="feet";)) <eng,0,0>;

word forms
[foot] {2883} "100284665" (POS=NOU, MOR=WFO, LST=WRD, NUM=SNG, INF=M1)
<eng,0,0>;
[feet] {2883} "100284665" (POS=NOU, MOR=WFO, LST=WRD, NUM=PLR, INF=M1)
<eng,0,0>;
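
As an illustration of how the two export formats relate, the following
Python sketch parses the sample entries above and expands the base form
into its word forms. It is not the UNLarium's own code. The field names
(entry_id, uw, tail) are hypothetical labels, and the expansion rule
(each FLX rule yields one word form whose NUM is the rule's tag, with
MOR set to WFO) is an assumption inferred from this single sample; the
actual tagset is documented at the UNLwiki.

```python
import re

# Pattern for one entry, read off the sample above (group names are hypothetical).
ENTRY_RE = re.compile(
    r'\[(?P<form>[^\]]*)\]\s*'      # [foot]
    r'\{(?P<entry_id>[^}]*)\}\s*'   # {2883}
    r'"(?P<uw>[^"]*)"\s*'           # "100284665"
    r'\((?P<features>.*)\)\s*'      # (POS=NOU, ...)
    r'<(?P<tail>[^>]*)>\s*;'        # <eng,0,0>;
)
FLX_RE = re.compile(r'FLX\((.*?)\)')       # inflection block, e.g. FLX(PLR:="feet";)
RULE_RE = re.compile(r'(\w+):="([^"]*)";')  # one rule inside the block

BASE_ENTRY = '''[foot] {2883} "100284665" (POS=NOU, MOR=STE, LST=WRD, NUM=SNG, INF=M1,
FLX(PLR:="feet";)) <eng,0,0>;'''


def parse_entry(text):
    """Parse a single dictionary entry into a plain dict."""
    line = " ".join(text.split())  # undo the e-mail line wrapping
    match = ENTRY_RE.fullmatch(line)
    if match is None:
        raise ValueError("unrecognized entry: " + line)
    raw = match.group("features")
    flx = FLX_RE.search(raw)
    rules = dict(RULE_RE.findall(flx.group(1))) if flx else {}
    features = {}
    # Drop the FLX block, then read the remaining KEY=VALUE pairs.
    for item in FLX_RE.sub("", raw).split(","):
        key, sep, value = item.strip().partition("=")
        if sep:
            features[key] = value
    return {
        "form": match.group("form"),
        "entry_id": match.group("entry_id"),
        "uw": match.group("uw"),
        "features": features,
        "rules": rules,
        "tail": match.group("tail").split(","),
    }


def expand(entry):
    """Turn a base-form entry into (word form, features) pairs.

    Assumption: the base form is itself a word form, and each FLX rule
    adds one more form whose NUM value is the rule's tag.
    """
    base = dict(entry["features"], MOR="WFO")
    forms = [(entry["form"], base)]
    for num, form in entry["rules"].items():
        forms.append((form, dict(base, NUM=num)))
    return forms


if __name__ == "__main__":
    for form, features in expand(parse_entry(BASE_ENTRY)):
        print(form, features)
```

Under these assumptions, expanding the [foot] base form yields the same
feature sets as the two word-form entries listed above.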


The English-UNL dictionary is generated in real time according to the
specifications and to the tagset described at the UNLwiki
(http://www.unlweb.net/wiki). As an ongoing project and a dynamic database,
the dictionary is subject to continuous augmentation and improvement, and
reports on problems and other contributions are most welcome.

==============================
Further information
==============================
For further information, please contact 

Ronaldo MARTINS (mailto:r.martins at undlfoundation.org)
Language Resources Manager
UNDL Foundation
48, route de Chancy
CH-1213 - Geneva - Switzerland 
+41 22 879 8090

==============================
What is UNL?
==============================
The UNL is an artificial language that has been used for several different
tasks in natural language processing, such as machine translation,
multilingual document generation, summarization, information retrieval and
semantic reasoning. It was originally proposed by the Institute of
Advanced Studies of the United Nations University, in Tokyo, and is
currently promoted by the UNDL Foundation, in Geneva, Switzerland, under a
mandate of the United Nations. [Read more about UNL at
http://www.unlweb.net]

==============================
The UNDL Foundation
==============================
The UNDL Foundation (http://www.undlfoundation.org) is a non-profit
organization based in Geneva, Switzerland, that has received a mandate
from the United Nations to implement the Universal Networking
Language (UNL). The UNL Programme is a collaborative effort to create
natural language resources and technology to reduce language barriers and
strengthen cross-cultural communication in the framework of the United
Nations. Participation in the Programme is free and open to individuals and
institutions, either as researchers or as developers. Special funds are
available for some languages.





_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


