<html dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style id="owaParaStyle">P {
MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px
}
</style>
</head>
<body fPStyle="1" ocsi="0">
<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 8pt;">
<p>As long as others are listing online interfaces to large corpora that do regular expressions / wildcards, I might as well mention the BYU corpora (<a href="http://corpus.byu.edu">http://corpus.byu.edu</a>).</p>
<p> </p>
<p>For example, BYU-BNC (<a href="http://corpus.byu.edu/bnc">http://corpus.byu.edu/bnc</a>) can do "[vh*] [v?n*] [a*] [jj*] [nn*]" in less than four seconds:</p>
<p> </p>
<p><a href="http://corpus.byu.edu/bnc/?c=bnc&q=21313156">http://corpus.byu.edu/bnc/?c=bnc&q=21313156</a></p>
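<p>To make the shape of that query concrete, here is a minimal sketch in Python. It assumes each bracketed slot is a part-of-speech tag pattern in which * stands for any run of characters and ? for exactly one character. The function names, the toy tags, and the sample sentence are hypothetical, and this only illustrates what the pattern means, not how the BYU interface is implemented.</p>
<pre>
```python
from fnmatch import fnmatchcase


def parse_query(query):
    """Split a query such as '[vh*] [v?n*]' into tag patterns ['vh*', 'v?n*']."""
    return [slot.strip("[]") for slot in query.split()]


def find_matches(tagged, query):
    """Return every word sequence whose tags match the query slots in order.

    `tagged` is a list of (word, tag) pairs; the tags are toy values
    chosen for illustration, not output from a real tagger.
    """
    patterns = parse_query(query)
    n = len(patterns)
    hits = []
    for start in range(len(tagged) - n + 1):
        window = tagged[start:start + n]
        # Every slot must match the tag at the same position in the window.
        if all(fnmatchcase(tag, pat) for (_, tag), pat in zip(window, patterns)):
            hits.append(" ".join(word for word, _ in window))
    return hits


# Hypothetical tagged sentence (assumed tags, for illustration only).
sentence = [
    ("it", "pph1"), ("had", "vhd"), ("taken", "vvn"),
    ("a", "at1"), ("long", "jj"), ("time", "nn1"),
]
print(find_matches(sentence, "[vh*] [v?n*] [a*] [jj*] [nn*]"))
# prints ['had taken a long time']
```
</pre>
<p>A linear scan like this would be far too slow on a corpus the size of the BNC; it is only meant to show how the five slots line up against five consecutive tagged words.</p>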
<p> </p>
<p>And of course the interface also allows searches by synonyms, lemma, wildcards, alternates, customized word lists, and any combination of these, etc.</p>
<p> </p>
<div>
<p>MD</p>
<p> </p>
<div style="FONT-FAMILY: Tahoma; FONT-SIZE: 13px">
<p>============================================<br>
Mark Davies<br>
Professor of Linguistics / Brigham Young University<br>
<a href="http://davies-linguistics.byu.edu/">http://davies-linguistics.byu.edu/</a></p>
<p>** Corpus design and use // Linguistic databases **<br>
** Historical linguistics // Language variation **<br>
** English, Spanish, and Portuguese **<br>
============================================<br>
</p>
</div>
</div>
<div style="FONT-FAMILY: Times New Roman; COLOR: #000000; FONT-SIZE: 16px">
<hr tabindex="-1">
<div style="DIRECTION: ltr" id="divRpF859954"><font color="#000000" size="2" face="Tahoma"><b>From:</b> corpora-bounces@uib.no [corpora-bounces@uib.no] on behalf of Gemma Boleda [gemma.boleda@upf.edu]<br>
<b>Sent:</b> Monday, February 25, 2013 2:24 PM<br>
<b>To:</b> Corpora@uib.no<br>
<b>Subject:</b> Re: [Corpora-List] corpora with regular expression engine (syntactic pattern)<br>
</font><br>
</div>
<div></div>
<div>Hi Austina,<br>
<div><br>
</div>
<div>There are also a couple of online interfaces to corpora that allow POS queries written as regular expressions, for example:<br>
</div>
<div><br>
</div>
<div>Serge Sharoff's "Leeds CQP" search interface (English corpora available, and also corpora for other languages):
<a href="http://corpus.leeds.ac.uk/internet.html" target="_blank">http://corpus.leeds.ac.uk/internet.html</a></div>
<div><br>
</div>
<div>UPF's interface to CUCWeb (Catalan corpus): <a href="http://ramsesii.upf.es/cgi-bin/cucweb/search-form.pl?lang=en_US" target="_blank">http://ramsesii.upf.es/cgi-bin/cucweb/search-form.pl?lang=en_US</a></div>
<div><br>
</div>
<div>These two interfaces are based on the IMS Open Corpus Workbench that Marco Baroni mentioned; indeed, this tool provides a module to easily build web interfaces with its core corpus processor as a back-end.</div>
<div><br>
</div>
<div>Best,</div>
<div>Gemma.</div>
<div><br>
</div>
-- <br>
Gemma Boleda<br>
The University of Texas at Austin<br>
<a href="http://gboleda.utcompling.com" target="_blank">http://gboleda.utcompling.com</a><br>
<br>
<br>
</div>
</div>
</div>
</body>
</html>