[Corpora-List] Compilation of language resources for French - ELRA

Yannick Versley versley at sfs.uni-tuebingen.de
Tue Apr 12 10:56:49 UTC 2011


Dear Valérie,

The "free" price point is especially interesting for master's students who
want to do actual research but may have to pay for the materials out of
their own pocket. For these people, a price of 200-500 EUR is definitely out
of reach (some would balk at 50 EUR, which I'd understand if you need to
combine data from multiple resources to carry out your research), and
labeling these prices "at media cost" will not change this.

I do think that LDC and ELRA play an important role in the ecosystem around
language resources, but I am also sure that the most effective way to
ensure the widest possible use of a resource in academic research is to
make it available free of cost and under a liberal license, as has been
done with the Lefff. I understand that this is not always possible, but I
applaud the people behind the Lefff (and similar resources) for making it a
possibility.

Best wishes,
Yannick Versley

On Tue, Apr 12, 2011 at 12:08 PM, Valérie Mapelli <mapelli at elda.org> wrote:

>   Dear Corpora readers,
>
> Recently, Ineta Sejane circulated a message listing a number of French
> language resources.
> With the aim to contribute to the enrichment of this list, we identified
> some language resources, with a French component, available in the ELRA
> Catalogue, which are either free or at media cost for research purposes.
> These are distributed as follows:
>
> Written Corpora:
> W0003    CRATER corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=84>
> W0004    ECI/MCI (European Corpus Initiative/Multilingual Corpus I)<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=85>
> W0013    TSNLP (Test Suites for NLP Testing)<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=51>
> W0015    Text corpus of "Le Monde"<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=438>
> W0017    MULTEXT JOC Corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=534>
> W0018    ARCADE/ROMANSEVAL corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=535>
> W0023    MLCC Multilingual and Parallel Corpora<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=764>
> W0025-01    A "scientific" corpus of modern French ("La Recherche" magazine) - Raw data<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=594>
> W0025-02    A "scientific" corpus of modern French ("La Recherche" magazine) - Complete version<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=595>
> W0032    Modern French Corpus including Anaphors Tagging<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=634>
> W0033    CRATER 2 Corpus<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=636>
> W0036-01    "Le Monde Diplomatique" Text corpus in French - archives 1980-1998<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=7>
> W0036-02    "Le Monde Diplomatique" Text corpus in French - archives from 1999<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=9>
>
> Lexicons:
> L0010    MULTEXT Lexicons<http://catalog.elra.info/product_info.php?products_id=29>
> M0020    EuroWordNet French<http://catalog.elra.info/product_info.php?products_id=550>
>
> Speech LRs:
>  S0006    BREF-80<http://catalog.elra.info/product_info.php?products_id=36>
>  S0007    BREF-POLYGLOT<http://catalog.elra.info/product_info.php?products_id=37>
>  S0021    M2VTS Speaker Verification Database<http://catalog.elra.info/product_info.php?products_id=758>
>  S0033    BDBRUIT<http://catalog.elra.info/product_info.php?products_id=80>
>  S0060    MULTEXT Prosodic database<http://catalog.elra.info/product_info.php?products_id=530>
>  S0088    Twin database - TWINDB1<http://catalog.elra.info/product_info.php?products_id=579>
>  S0163    ILPho phonetic lexicon<http://catalog.elra.info/product_info.php?products_id=760>
>  S0238    MIST Multi-lingual Interoperability in Speech Technology database<http://catalog.elra.info/product_info.php?products_id=988>
>  S0241    ESTER Corpus<http://catalog.elra.info/product_info.php?products_id=999>
>  S0305    EPAC Corpus: orthographic transcriptions<http://catalog.elra.info/product_info.php?products_id=1119>
>
> Evaluation Packages:
> E0008    The CLEF Test Suite for the CLEF 2000-2003 Campaigns - Evaluation Package<http://catalog.elra.info/product_info.php?products_id=888>
> E0018    ARCADE II Evaluation Package<http://catalog.elra.info/product_info.php?products_id=992>
> E0019    CESART Evaluation Package<http://catalog.elra.info/product_info.php?products_id=993>
> E0020    CESTA Evaluation Package<http://catalog.elra.info/product_info.php?products_id=994>
> E0021    ESTER Evaluation Package<http://catalog.elra.info/product_info.php?products_id=995>
> E0022    EQueR Evaluation Package<http://catalog.elra.info/product_info.php?products_id=996>
> E0023    EvaSy  Evaluation Package<http://catalog.elra.info/product_info.php?products_id=997>
> E0024    MEDIA Evaluation Package<http://catalog.elra.info/product_info.php?products_id=998>
> E0034    EASy Evaluation Package<http://catalog.elra.info/product_info.php?products_id=1112>
> E0036    CLEF AdHoc-News Test Suites (2004-2008) - Evaluation Package<http://catalog.elra.info/product_info.php?products_id=1127>
> E0038    CLEF Question Answering Test Suites (2003-2008) - Evaluation Package<http://catalog.elra.info/product_info.php?products_id=1129>
> W0029    Amaryllis Corpus - Evaluation Package<http://catalog.elra.info/product_info.php?cPath=42_43&products_id=626>
>
> Other French language resources, and resources for many other languages,
> are available to both the research and commercial communities in our
> catalogue, which you may visit at:
> http://catalogue.elra.info
>
> Best regards,
>
> Valérie Mapelli
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>