[Corpora-List] XML/TEI Human Rights Corpus
    Pincemin 
    benie at club-internet.fr
       
    Tue Oct 11 07:57:07 UTC 2005
    
    
  
We are happy to announce the release of the Human Rights Corpus / Corpus 
Droits de l'Homme, v.1, available on our web site:
Université de Paris 13 - Laboratoire de Linguistique Informatique
http://www-lli.univ-paris13.fr/ressources
The corpus consists of 28 International Conventions, dating from 1948 
(Universal Declaration of Human Rights) to 2000. The texts were selected 
together with an expert in the field, with the aim of giving a 
representative view of the Human Rights reference texts and of the 
language and vocabulary they use.
Each text is provided in two or three languages: English and French, plus 
Spanish when the Convention is a United Nations convention. The versions 
are aligned at the level of the finest subdivision (the article) by means 
of a suitably designed system of identifiers.
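
To give an idea of how identifier-based alignment can be used, here is a 
minimal Python sketch. It is not taken from the corpus: the element names, 
the "id" attribute and the "<text>-<article>-<lang>" identifier pattern are 
assumptions made for illustration only, and the actual scheme is the one 
documented in the corpus header.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature samples. The markup and the identifier pattern
# ("<text>-<article>-<lang>") are assumptions, not the corpus's real scheme.
SAMPLES = {
    "en": (
        '<body>'
        '<div type="article" id="udhr-art1-en"><p>All human beings are born '
        'free and equal in dignity and rights.</p></div>'
        '<div type="article" id="udhr-art3-en"><p>Everyone has the right to '
        'life, liberty and security of person.</p></div>'
        '</body>'
    ),
    "fr": (
        '<body>'
        '<div type="article" id="udhr-art1-fr"><p>Tous les êtres humains '
        'naissent libres et égaux en dignité et en droits.</p></div>'
        '<div type="article" id="udhr-art3-fr"><p>Tout individu a droit à la '
        'vie, à la liberté et à la sûreté de sa personne.</p></div>'
        '</body>'
    ),
}


def align_articles(samples):
    """Group article texts by the language-independent part of their id."""
    aligned = defaultdict(dict)
    for lang, xml_text in samples.items():
        root = ET.fromstring(xml_text)
        for div in root.iter("div"):
            ident = div.get("id")
            if ident is None:
                continue
            # Split "udhr-art1-en" into the stem "udhr-art1" and suffix "en".
            stem, _, suffix = ident.rpartition("-")
            if suffix != lang:
                continue  # skip ids that do not follow the assumed pattern
            aligned[stem][lang] = "".join(div.itertext()).strip()
    return dict(aligned)


ALIGNED = align_articles(SAMPLES)
for stem, versions in sorted(ALIGNED.items()):
    print(stem, sorted(versions))
```

Because the alignment rests on the identifiers alone, no sentence-level 
alignment tool is needed: articles sharing the same stem are taken to be 
versions of each other.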
The encoding is in XML and follows the TEI guidelines. Special attention 
has been paid to the header; in particular, the "TagUsage" part is fully 
developed, so as to document the encoding choices and the meaning of each 
XML/TEI tag in our context.
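
As an illustration of how such tag documentation might be read 
programmatically, here is a small Python sketch. The header fragment is 
invented for the example (its notes and counts are hypothetical) and it 
assumes un-namespaced TEI markup with tagUsage elements carrying a "gi" 
attribute; the real header should be consulted for the actual content.

```python
import xml.etree.ElementTree as ET

# Hypothetical header fragment: the notes and occurrence counts are made up
# for this example and do not come from the corpus.
HEADER = """<teiHeader>
  <encodingDesc>
    <tagsDecl>
      <tagUsage gi="div" occurs="30">Hypothetical note: one div per article.</tagUsage>
      <tagUsage gi="p" occurs="75">Hypothetical note: one p per paragraph.</tagUsage>
    </tagsDecl>
  </encodingDesc>
</teiHeader>"""


def read_tag_usage(header_xml):
    """Map each documented tag name (gi) to its whitespace-normalised note."""
    root = ET.fromstring(header_xml)
    usage = {}
    for tag_usage in root.iter("tagUsage"):
        note = " ".join("".join(tag_usage.itertext()).split())
        usage[tag_usage.get("gi")] = note
    return usage


TAG_USAGE = read_tag_usage(HEADER)
for name, note in TAG_USAGE.items():
    print(f"<{name}>: {note}")
```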
Please contact us with any questions or comments, or to tell us how you 
plan to use the corpus:
corpus at lli.univ-paris13.fr
Fabrice ISSAC, Computational Linguist
Christine CHODKIEWICZ, Lawyer and Linguist
Bénédicte PINCEMIN, Linguist
    
    