Spanish Corpora on the Web

Carlos Subirats Rüggeberg Carlos.Subirats at UAB.ES
Fri Jan 29 16:40:18 UTC 1999


INFOLING  Moderated list on Spanish linguistics
http://listserv.rediris.es/archives/infoling.html
Submissions: INFOLING-request at listserv.rediris.es
Editor: Carlos Subirats Rüggeberg <Carlos.Subirats at uab.es>
Contributors:
Paola Bentivoglio <pbentivo at reacciun.ve>, UCV
Eulalia de Bobes <ebobes at seneca.uab.es>, UAB
Mar Cruz <mcruz at lingua.fil.ub.es>, UB
Emma Martinell <martinell at lingua.fil.ub.es>, UB
____________________________________________________________

                 Spanish Corpora on the Web
Mark Davies, Illinois State University, United States
      Reproduction of Prof. Mark Davies's page:
          http://138.87.135.33/personal/roanoke.htm
____________________________________________________________

     This page was created to list some Spanish texts that
were available via the Web as of March 1996, when I
presented a paper on "Using Large Computer-Based Corpora in
Teaching and Research" at the Colloquium on Spanish
Linguistics at Roanoke College. I have not updated the page
much since then, so please beware.

                    Modern Spanish texts

- Corpora of Spoken and Written Spanish:

    http://elvira.lllf.uam.es/docs_es/corpus/corpus.html

    Download 1,000,000 words of spoken Spanish from Spain,
2,000,000 words of written texts from Argentina, and
1,000,000 words of written texts from Chile.

- European Corpus Initiative Multilingual Corpus:

         http://www.cogsci.ed.ac.uk/elsnet/eci.html

    Includes 1,300,000 words from two newspapers from Spain
[about $50; also includes many texts in other European
languages]

- Spanish News Corpus (from the Linguistic Data Consortium
at U Penn):

  http://www.cis.upenn.edu:80/~ldc/ldc_catalog.html#spanish

    Over 170 million words of text from Spanish newspapers
[$2000]

- ABC Corpus, Phone number: 34-91-322-6566:
    About 4,000,000 words of texts from the Sunday culture
supplement of ABC from Madrid. [11,671 ptas]

- Listings of Spanish newspapers and magazines on the Web:

         http://lanic.utexas.edu:80/la/region/news/
  http://www.sbcc.cc.ca.us/~chavez/spanish.html#Magazines

- MundoLatino: Literatura

            http://www.mundolatino.org/litera.htm

    Listing of Spanish poems, short stories, magazines, and
other media available on the Internet.

- (Future) Real Academia Española: Corpus de Referencia del
Español Actual (CREA)

http://www.issc1.ibm.com/star/country/spain/srae.html
(limited info)

    100 million words of text


                 Historical Spanish texts

- ADMYTE:

http://history.cc.ukans.edu/history/subject_tree/e3/gen/cpet/18.html
(limited info)

    Vol. 0 = texts from the 1200s to approx. 1500 [$300];
c. 5,000,000 words / Vol. 1 = texts from c. 1480-1520 [$600] /
more volumes in progress.

- Gonzalo de Berceo

      http://www.lang.uiuc.edu/LLL/etexts/teofilo.html
  Can be downloaded

- Textos de comedias

          http://listserv.arizona.edu/comedia.html

Can be downloaded

- Proyecto Cervantes

              http://158.122.3.3/servicio.html

Can be downloaded

- (Future) Real Academia: Corpus Diacrónico del Español
(CORDE)

http://www.issc1.ibm.com/star/country/spain/srae.html
(limited info)

   70 million words of text


- General info on corpora and corpus-based research:
Corpus Linguistics (Michael Barlow / Rice University):

        http://www.ruf.rice.edu/%7Ebarlow/corpus.html

    Very comprehensive listing, with links to texts,
bibliography, software, and many other corpus-related pages
throughout the world.

----------------------------------------------------
Guidelines for the proper use of e-mail:
		http://www.rediris.es/mail/estilo.html
----------------------------------------------------




