<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">

<META content="MSHTML 6.00.2900.3020" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff>

<DIV>Many thanks to everyone who responded to my recent query about free online 

corpora. Here is a summary of the responses I have received: </DIV>

<DIV> </DIV>

<DIV>Jenny in Hong Kong directed me to the Hong Kong Polytechnic 

University's Virtual Language Centre <A 

href="http://vlc.polyu.edu.hk/">http://vlc.polyu.edu.hk/</A>, which takes you to 

a concordancer with different corpora.</DIV>

<DIV> </DIV>

<DIV dir=ltr><FONT color=#000000>Lene Petersen highlighted the KEMPE <EM>Korpus 

of Early Modern Playtexts in English</EM> which is available to search free of 

charge via <A 

href="http://corp.hum.sdu.dk/cqp.en.html">http://corp.hum.sdu.dk/cqp.en.html</A>. "The VISL 

site also hosts Wikipedia and chat corpora that are 

password-free."</FONT></DIV>

<DIV><BR>Jörg Tiedemann pointed me to the OPUS collection of parallel corpora 

(including English). There is an on-line search interface at <A 

href="http://logos.uio.no/cgi-bin/opus/opuscqp.pl">http://logos.uio.no/cgi-bin/opus/opuscqp.pl</A>, 

and another (hidden) search interface for Europarl with some more features: 

<A 

href="http://logos.uio.no/opus/EUROPARL/frames-cqp.html">http://logos.uio.no/opus/EUROPARL/frames-cqp.html</A></DIV>

<DIV> </DIV>

<DIV>Elzbieta Dura mentioned <A 

href="http://bergelmir.iki.his.se/culler/">http://bergelmir.iki.his.se/culler/</A>, where there are a number of corpora 

in biomedicine and also an English-Swedish JRC-Acquis parallel corpus. At <A 

href="http://www.nla.se.culler">http://www.nla.se.culler</A> there is a corpus 

of older English. She also noted that comments on the corpus tool Culler 

are welcome.</DIV>

<DIV> </DIV>

<DIV>Michaela Geierhos said: "Perhaps you are already aware of Mark 

Davies's TIME corpus. He provides a web interface to do basic KWIC, collocates, 

n-gram searches, etc. TIME corpus (new May 2007; 100m words; US 1900s) <A 

href="http://view.byu.edu/timemag" 

target=_blank>http://view.byu.edu/timemag</A>. Another quite useful thing is 

GlossaNet. It's <FONT style="COLOR: rgb(0,0,0)" color=#cccccc>a search engine 

that gives you daily access to the online editions of more than 100 newspapers 

in 12 languages. </FONT><SPAN><A href="http://glossa.fltr.ucl.ac.be/" 

target=_blank>http://glossa.fltr.ucl.ac.be/</A>. </SPAN>It requires 

registration for intensive use because it's possible to get the concordances of 

all chosen newspapers daily or weekly etc. by e-mail. You can also take a look 

at the system before registering: <SPAN><A 

href="http://glossa.fltr.ucl.ac.be/scripts/gtoday/gtoday.pl" 

target=_blank>http://glossa.fltr.ucl.ac.be/scripts/gtoday/gtoday.pl</A>. </SPAN>There 

you'll see an overview of all accessible newspapers by language."<BR></DIV>

<DIV> </DIV>

<DIV>Eckhard Bick highlighted the English section of Corpus Eye (at <A 

href="http://corp.hum.sdu.dk">http://corp.hum.sdu.dk</A>), which contains a 

number of further online corpora (all morphologically and syntactically 

annotated and searchable), of which the following are 

password-free: Europarl corpus (25.7 mill. words); Wikipedia corpus 

(115 mill. words); Chat corpus (23.5 mill. words); KEMPE Shakespeare 

corpus (8.9 mill. words); Enron e-mail corpus (75 mill. words).</DIV>

<DIV> </DIV>

<DIV>Ana Frankenberg directed me to the COMPARA corpus, a 3-million-word 

bidirectional parallel corpus of English and Portuguese. "People can use just 

the English (or just the Portuguese) side of the corpus if they wish. The corpus 

is online, free and requires no registration. See <A 

href="http://www.linguateca.pt/COMPARA/Welcome.html">http://www.linguateca.pt/COMPARA/Welcome.html</A>"<BR></DIV>

<DIV> </DIV>

<DIV>Elisa Duarte Teixeira and Stella Tagnin told me that "the English part 

of the CorTec corpus, a Portuguese-English technical comparable corpus, 

which is part of the COMET Project (Multilingual Corpora for Teaching and 

Translation), can be freely searched at this address: (<A 

href="http://www.fflch.usp.br/dlm/comet/consulta_cortec.html">http://www.fflch.usp.br/dlm/comet/consulta_cortec.html</A>). Although 

the English version of the site is not finished, there you'll find the 

documentation that explains the composition of the 5 corpora in English. Soon, 

all the 5 corpora will receive more texts and new areas will be added 

- we'll announce it here, when it's ready."  Stella Tagnin also 

pointed out a monolingual Brazilian Portuguese Corpus - Lácio-Web, at <A 

href="http://www.nilc.icmc.usp.br/lacioweb">www.nilc.icmc.usp.br/lacioweb</A>. 

</DIV>

<DIV> </DIV>

<DIV>Huaqing Hong suggested the SCoRE corpus at: <A 

href="http://score.crpp.nie.edu.sg/">http://score.crpp.nie.edu.sg/</A>. You can 

register online to try the demo version. <BR></DIV>

<DIV>Ilya at the Linguistic Data Consortium directed me to: <A 

class=moz-txt-link-freetext 

href="https://online.ldc.upenn.edu/login.html">https://online.ldc.upenn.edu/login.html</A> to 

sign up for a guest account to LDC Online. "With a guest account, you can search 

a subset of English newstext the LDC has acquired, as well as search and listen 

to English telephone conversations.  The American English Spoken Lexicon is 

also included."<BR></DIV>

<DIV> </DIV>

<DIV>Stefan Bordag suggested I look at corpora.uni-leipzig.de, which contains an 

English corpus among others and is freely accessible online as well as 

downloadable. </DIV>

<DIV> </DIV>

<DIV>Ralf Steinberger highlighted the 55-million-word English part of the multilingual 

parallel corpus JRC-Acquis. "The overall corpus, including all 22 languages, 

consists of over 1 billion words. You cannot search the corpus 

via a web interface, but you can simply download the JRC-Acquis documents from 

the site <A 

href="http://langtech.jrc.it/JRC-Acquis.html">http://langtech.jrc.it/JRC-Acquis.html</A>."<BR></DIV>

<DIV> </DIV>

<DIV>For completeness, here are the corpora I included in my first message: 

</DIV>

<DIV> </DIV>

<DIV>BNC (<A href="http://www.natcorp.ox.ac.uk/">http://www.natcorp.ox.ac.uk/</A>)<BR>VIEW interface to the 

BNC (<A href="http://view.byu.edu/">http://view.byu.edu/</A>)<BR>COBUILD Corpus Concordance 

Sampler (<A 

href="http://www.collins.co.uk/corpus/CorpusSearch.aspx">http://www.collins.co.uk/corpus/CorpusSearch.aspx</A>)<BR>SCOTS 

(<A href="http://www.scottishcorpus.ac.uk">http://www.scottishcorpus.ac.uk</A>)<BR>ELISA (<A 

href="http://www.uni-tuebingen.de/elisa/html/elisa_index.html">http://www.uni-tuebingen.de/elisa/html/elisa_index.html</A>)<BR>Compleat 

Lexical Tutor (access to Brown and BNC sampler among others) (<A 

href="http://www.lextutor.ca/">http://www.lextutor.ca/</A>)<BR>Virtual Language 

Centre Web Concordancer (access to Brown, LOB among others) (<A 

href="http://www.edict.com.hk/default.htm">http://www.edict.com.hk/default.htm</A>)<BR>IViE Corpus (<A 

href="http://www.phon.ox.ac.uk/IViE/">http://www.phon.ox.ac.uk/IViE/</A>)<BR>Speech Accent Archive (<A 

href="http://accent.gmu.edu/">http://accent.gmu.edu/</A>)<BR></DIV>

<DIV> </DIV>

<DIV>Thanks again!</DIV>

<DIV> </DIV>

<DIV>Wendy</DIV>

<DIV>....................<BR>Dr Wendy J Anderson<BR>Scottish Corpus of Texts and 

Speech<BR>Department of English Language<BR>University of Glasgow<BR>12 

University Gardens<BR>Glasgow<BR>G12 8QQ<BR>Scotland, UK</DIV>

<DIV> </DIV>

<DIV>Website: <A 

href="http://www.scottishcorpus.ac.uk">http://www.scottishcorpus.ac.uk</A><BR></DIV></BODY></HTML>