Hi Martin,<div><br></div><div>We have a Serbian corpus in the Sketch Engine, so all she needs to do is upload her corpus and then run 'keywords' to compare hers with the reference.</div><div><br></div><div>The one that is currently available is not lemmatised, so comparisons there would be wordform-based. However, we are lemmatising and POS-tagging a newer, bigger dataset (courtesy of Nikola Ljubešić) as we speak, so we can make that available too, and then she can get key lemmas. If you or she ask, we can make a big sample of the lemmatised material available at a day or two's notice.</div>
<div><br></div><div>Best</div><div><br></div><div>Adam</div><div><br><br><div class="gmail_quote">On 22 February 2013 15:39, Martin Wynne <span dir="ltr">&lt;<a href="mailto:martin.wynne@it.ox.ac.uk" target="_blank">martin.wynne@it.ox.ac.uk</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I would like to pose a question on behalf of a student who would like to generate keywords by comparing her corpus of contemporary online personal ads in Serbian with a reference corpus.<br>
<br>
Does anyone know of any freely available wordlists for the modern Serbian language? Ideally, we'd like a lemma frequency list generated from a general reference corpus, although lists from various other text types could be useful. We'd be interested if there is a corpus available to use as well.<br>
<br>
Many thanks for any help.<br>
<br>
<br>
-- <br>
Martin Wynne<br>
IT Services, University of Oxford<br>
Oxford e-Research Centre<br>
Faculty of Linguistics, Philology and Phonetics<br>
<br>
<a href="mailto:martin.wynne@it.ox.ac.uk" target="_blank">martin.wynne@it.ox.ac.uk</a><br>
<br>
<br>
<br>
_______________________________________________<br>
UNSUBSCRIBE from this page: <a href="http://mailman.uib.no/options/corpora" target="_blank">http://mailman.uib.no/options/corpora</a><br>
Corpora mailing list<br>
<a href="mailto:Corpora@uib.no" target="_blank">Corpora@uib.no</a><br>
<a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>========================================<br><a href="http://www.kilgarriff.co.uk/" target="_blank">Adam Kilgarriff</a> <a href="mailto:adam@lexmasterclass.com" target="_blank">adam@lexmasterclass.com</a> <br>
Director <a href="http://www.sketchengine.co.uk/" target="_blank">Lexical Computing Ltd</a> <br>Visiting Research Fellow <a href="http://leeds.ac.uk" target="_blank">University of Leeds</a> <div>
<i><font color="#006600">Corpora for all</font></i> with <a href="http://www.sketchengine.co.uk" target="_blank">the Sketch Engine</a> </div><div> <i><a href="http://www.webdante.com" target="_blank">DANTE: <font color="#009900">a lexical database for English</font></a><font color="#009900"> </font> </i><div>
========================================</div></div>
</div>