Spanish Corpora on the Web
Carlos Subirats Rüggeberg
Carlos.Subirats at UAB.ES
Fri Jan 29 16:40:18 UTC 1999
INFOLING: Moderated list on Spanish linguistics
http://listserv.rediris.es/archives/infoling.html
Submissions: INFOLING-request at listserv.rediris.es
Editor: Carlos Subirats Rüggeberg <Carlos.Subirats at uab.es>
Contributors:
Paola Bentivoglio <pbentivo at reacciun.ve>, UCV
Eulalia de Bobes <ebobes at seneca.uab.es>, UAB
Mar Cruz <mcruz at lingua.fil.ub.es>, UB
Emma Martinell <martinell at lingua.fil.ub.es>, UB
____________________________________________________________
Spanish Corpora on the Web
Mark Davies, Illinois State University, United States
Reproduction of Prof. Mark Davies's page:
http://138.87.135.33/personal/roanoke.htm
____________________________________________________________
Spanish Corpora on the Web
This page was created to list some Spanish texts that
were available via the Web as of March 1996, when I
presented a paper on "Using Large Computer-Based Corpora in
Teaching and Research" at the Colloquium on Spanish
Linguistics at Roanoke College. I have not updated the page
much since then, so please beware.
Modern Spanish texts:
- Corpora of Spoken and Written Spanish:
http://elvira.lllf.uam.es/docs_es/corpus/corpus.html
Download 1,000,000 words of spoken Spanish from Spain;
2,000,000 words of written texts from Argentina; and
1,000,000 words of written texts from Chile.
- European Corpus Initiative Multilingual Corpus:
http://www.cogsci.ed.ac.uk/elsnet/eci.html
Includes 1,300,000 words from two newspapers from Spain
[About $50; also includes many texts in other European
languages]
- Spanish News Corpus (from the Linguistic Data Consortium
at U Penn):
http://www.cis.upenn.edu:80/~ldc/ldc_catalog.html#spanish
Over 170 million words of text from Spanish newspapers
[$2000]
- ABC Corpus, Phone number: 34-91-322-6566:
About 4,000,000 words of texts from the Sunday culture
supplement of ABC from Madrid. [11,671 ptas]
- Listings of Spanish newspapers and magazines on the Web:
http://lanic.utexas.edu:80/la/region/news/
http://www.sbcc.cc.ca.us/~chavez/spanish.html#Magazines
- MundoLatino: Literatura
http://www.mundolatino.org/litera.htm
Listing of Spanish poems, short stories, magazines, and
other media available on the Internet
- (Future) Real Academia Española: Corpus de Referencia del
Español Actual (CREA)
http://www.issc1.ibm.com/star/country/spain/srae.html
(limited info)
100 million words of text
Historical Spanish texts:
- ADMYTE:
http://history.cc.ukans.edu/history/subject_tree/e3/gen/cpet/18.html
(limited info)
(Vol 0 = texts from the 1200s to approx. 1500 [$300],
c. 5,000,000 words / Vol 1 = texts from c. 1480-1520 [$600] /
more volumes in progress)
- Gonzalo de Berceo
http://www.lang.uiuc.edu/LLL/etexts/teofilo.html
Can be downloaded
- Textos de comedias
http://listserv.arizona.edu/comedia.html
Can be downloaded
- Proyecto Cervantes
http://158.122.3.3/servicio.html
Can be downloaded
- (Future) Real Academia: Corpus Diacrónico del Español
(CORDE)
http://www.issc1.ibm.com/star/country/spain/srae.html
(limited info)
70 million words of text
General info on corpora and corpus-based research:
- Corpus Linguistics (Michael Barlow / Rice University):
http://www.ruf.rice.edu/%7Ebarlow/corpus.html
Very comprehensive listing, with links to texts,
bibliography, software, and many other corpus-related pages
throughout the world.
----------------------------------------------------
Guidelines for the proper use of e-mail:
http://www.rediris.es/mail/estilo.html
----------------------------------------------------