Keely,<div><br></div><div>Our tool WebBootCaT will do the scraping and deliver a corpus for you. You can point it at the music-review sites Joel mentions (under 'advanced options'), and it will build a corpus from the pages there. You'll first need to self-register at <a href="http://www.sketchengine.co.uk">http://www.sketchengine.co.uk</a>.</div>


<div><br></div><div>Regards</div><div><br></div><div>Adam<br><br><div class="gmail_quote">On 11 November 2011 15:01, Tetreault, Joel <span dir="ltr">&lt;<a href="mailto:JTetreault@ets.org">JTetreault@ets.org</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Hi Keely, instead of collecting music reviews from news corpora, it might be more effective to go to music review sites and scrape the reviews directly from their pages.  Some good ones are:<br>


<br>

<a href="http://allmusic.com" target="_blank">allmusic.com</a> - one of the largest repositories of music reviews in the world.  If you want to make a massive corpus, I would just scrape that.<br>

<br>

<a href="http://pitchfork.com" target="_blank">pitchfork.com</a> - has (indie) music reviews going back to 1999, and they have 5 or so album reviews a day.  The album reviews section is here:  <a href="http://pitchfork.com/reviews/albums/" target="_blank">http://pitchfork.com/reviews/albums/</a>  They changed their site format a few months ago, but before that I scraped all the reviews to make a music review corpus.  I could zip that up and send it to you, though it may require some post-processing here and there.  There are over 10,000 reviews in that scrape.<br>


<br>

<a href="http://metacritic.com" target="_blank">metacritic.com</a> - is a review aggregator site for movies, music, games, etc.  It has links to reviews on other websites and then normalizes the scores from each website to give a composite score.<br>


<br>

<a href="http://nme.com" target="_blank">nme.com</a> / <a href="http://spin.com" target="_blank">spin.com</a> / Rolling Stone - all have music reviews on their websites and are another good source for web scraping.<br>

<br>

Joel<br>

<br>

------------------------------<br>

<br>

Message: 5<br>

Date: Thu, 10 Nov 2011 14:51:38 -0500<br>

From: Keely &lt;<a href="mailto:km.mimnagh@gmail.com">km.mimnagh@gmail.com</a>&gt;<br>

Subject: [Corpora-List] North america newspaper corpus<br>

To: <a href="mailto:corpora@uib.no">corpora@uib.no</a><br>

<br>

Hi, I am a master's student running a study on the language of music<br>

critics. Does anyone know of a corpus that breaks down newspapers by<br>

section, so that I can extract the music reviews?<br>

<br>

Any help would be much appreciated.<br>

<br>

<br>

Keely<br>

<font color="#888888"><br>

--<br>

Keely Mimnagh<br>

M.A. Candidate<br>

Music and Culture<br>

Carleton University<br>

Ottawa, Ontario<br>


</font></blockquote></div><br><br clear="all"><div><br></div>-- <br>========================================<br><a href="http://www.kilgarriff.co.uk/" target="_blank">Adam Kilgarriff</a>                  <a href="mailto:adam@lexmasterclass.com" target="_blank">adam@lexmasterclass.com</a>                                             <br>


Director                                    <a href="http://www.sketchengine.co.uk/" target="_blank">Lexical Computing Ltd</a>                <br>Visiting Research Fellow                 <a href="http://leeds.ac.uk" target="_blank">University of Leeds</a>     <div>


<i><font color="#006600">Corpora for all</font></i> with <a href="http://www.sketchengine.co.uk" target="_blank">the Sketch Engine</a>                 </div><div>                        <i><a href="http://www.webdante.com" target="_blank">DANTE: <font color="#009900">a lexical database for English</font></a><font color="#009900"> </font>                 </i><div>


========================================</div></div><br>

</div>