Resources: ELRA - Language Resources Catalogue - Update

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Sun Dec 16 21:20:37 UTC 2012


Date: Wed, 12 Dec 2012 17:22:17 +0100
From: ELRA ELDA Information <info at elda.org>
Message-ID: <50C8AF39.8030207 at elda.org>
X-url: http://catalog.elra.info/product_info.php?products_id=1178
X-url: http://catalog.elra.info/product_info.php?products_id=1179
X-url: http://catalog.elra.info/product_info.php?products_id=1180
X-url: http://catalog.elra.info/product_info.php?products_id=1181

Our apologies if you have received multiple copies of this announcement.

*****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************

ELRA is happy to announce that four new written corpora are now
available in its catalogue.

*ELRA-W0059 LT Corpus*
The LT Corpus is composed of 70 fiction texts by renowned Portuguese
authors. The corpus contains 1,781,083 tokens. The texts date from
before 1940. The corpus is delivered in one file, in two different
formats. The txt version has one sentence per line, an identification
number for each text and no further annotation. The cqpweb version has
one token per line, followed by its POS tag and lemma, and is annotated
for NP chunks.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1178
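
As a rough illustration of the one-token-per-line layout described
above, here is a minimal Python sketch of a reader. The announcement
does not specify the column separator or how NP chunks are marked, so
the tab separator, the treatment of any line with fewer than three
columns as markup to skip, and the sample tags are all assumptions made
for illustration, not the actual file specification.

```python
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    form: str
    pos: str
    lemma: str


def read_tokens(lines: Iterable[str], sep: str = "\t") -> Iterator[Token]:
    """Yield (form, POS tag, lemma) triples from token-per-line text.

    Assumption: columns are separated by `sep` (tab by default).
    Assumption: lines with fewer than three columns (blank lines,
    possible chunk or text markup) carry no token and are skipped.
    """
    for line in lines:
        parts = line.rstrip("\n").split(sep)
        if len(parts) < 3:
            continue
        yield Token(parts[0], parts[1], parts[2])


# Hypothetical sample; the tags and the <np> markers are invented here.
sample = [
    "<np>\n",
    "O\tART\to\n",
    "gato\tN\tgato\n",
    "</np>\n",
    "dorme\tV\tdormir\n",
]
tokens = list(read_tokens(sample))
```

With a real file, the same function could be fed an open file handle
once the actual separator and markup conventions have been checked
against the corpus documentation.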

*ELRA-W0060 PTPARL Corpus*
The PTPARL Corpus contains 1,076 texts consisting of adapted
transcriptions of the Portuguese Parliament sessions. The corpus
contains 1,000,441 tokens. The corpus is delivered in one file, in two
different formats. The txt version has one sentence per line, an
identification number for each text and no further annotation. The
cqpweb version has one token per line, followed by its POS tag and
lemma, and is annotated for NP chunks.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1179

*ELRA-W0061 CINTIL-DependencyBank*
The CINTIL-DependencyBank (Silva and Branco, 2012) is a corpus of
sentences annotated with their syntactic dependency graphs and
grammatical function tags. It is composed of 10,039 sentences and
110,166 tokens taken from different sources and domains: news (8,861
sentences; 101,430 tokens) and novels (399 sentences; 3,082 tokens). In
addition, there are 779 sentences (5,654 tokens) that are used for
regression testing of the computational grammar that supported the
annotation of the corpus.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1180

*ELRA-W0062 CINTIL-DeepBank*
The CINTIL-DeepBank (Branco et al., 2010) is a corpus of sentences
annotated with their full-fledged deep grammatical representations,
composed of 10,039 sentences and 110,166 tokens taken from different
sources and domains: news (8,861 sentences; 101,430 tokens), and novels
(399 sentences; 3,082 tokens). In addition, there are 779 sentences
(5,654 tokens) used for regression testing of the computational grammar
that supported the annotation of the corpus.
For more information, see: 
http://catalog.elra.info/product_info.php?products_id=1181
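
Both CINTIL entries quote the same figures, and the stated totals only
add up if the regression-testing sentences are counted together with
the news and novel sentences. A short Python check of that arithmetic,
using only the numbers given above (the dictionary keys are labels
chosen here for illustration):

```python
# (sentences, tokens) per part, as quoted in the announcement.
subcorpora = {
    "news": (8861, 101430),
    "novels": (399, 3082),
    "regression": (779, 5654),
}

total_sentences = sum(s for s, _ in subcorpora.values())
total_tokens = sum(t for _, t in subcorpora.values())

print(total_sentences, total_tokens)  # 10039 110166
```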


For more information on the catalogue, please contact Valérie Mapelli
(mapelli at elda.org).

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: 
http://www.elra.info/LRs-Announcements.html

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------


