<div dir="ltr"><div><font color="#222222" face="Arial, sans-serif"><span style="line-height:15.994318008422852px">Dear Adam, Eric, and Marc</span></font></div><div><font color="#222222" face="Arial, sans-serif"><span style="line-height:15.994318008422852px">Thank you for your responses. </span></font></div>
<div><font color="#222222" face="Arial, sans-serif"><span style="line-height:15.994318008422852px"><br></span></font></div><div><font color="#222222" face="Arial, sans-serif"><span style="line-height:15.994318008422852px">Suppose I use the </span></font><span style="color:rgb(34,34,34);font-family:Arial,sans-serif;line-height:15.994318008422852px">Corpus of Contemporary Arabic for some NLP or corpus linguistics purpose. Would it be strange to cite it as follows:</span></div>
<div><span style="color:rgb(34,34,34);font-family:Arial,sans-serif;line-height:15.994318008422852px">Al-Sulaiti, L., & Atwell, E. S. (2006). Corpus of Contemporary Arabic (CCA). Leeds, UK: University of Leeds. Retrieved from <a href="http://www.comp.leeds.ac.uk/eric/latifa/research.htm">http://www.comp.leeds.ac.uk/eric/latifa/research.htm</a></span></div>
<div><br></div><div><font color="#222222" face="Arial, sans-serif"><span style="line-height:15.994318008422852px">instead of:</span></font></div><div><span style="font-size:13px;line-height:16px;color:rgb(34,34,34);font-family:Arial,sans-serif">Al-Sulaiti, L., & Atwell, E. S. (2006). The design of a corpus of contemporary Arabic. </span><i style="font-size:13px;line-height:16px;color:rgb(34,34,34);font-family:Arial,sans-serif">International Journal of Corpus Linguistics</i><span style="font-size:13px;line-height:16px;color:rgb(34,34,34);font-family:Arial,sans-serif">, </span><i style="font-size:13px;line-height:16px;color:rgb(34,34,34);font-family:Arial,sans-serif">11</i><span style="font-size:13px;line-height:16px;color:rgb(34,34,34);font-family:Arial,sans-serif">(2), 135-171.</span></div>
<div><br></div><div>?</div><br><div class="gmail_quote">On Thu, Mar 7, 2013 at 11:33 AM, Marc Brysbaert <span dir="ltr"><<a href="mailto:marc.brysbaert@ugent.be" target="_blank">marc.brysbaert@ugent.be</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="NL-BE" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hi,<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Researchers get most credit for their work when it is published
in a journal that features in ISI or Scopus, as it is then used for all types
of metrics (whether you like this or not). From my own experience, I’ve
noticed that it is not so easy, however, to get manuscripts on corpora (or word
frequency lists) published, even though they are well cited. Does anyone have a
list of ISI journals that publish information on corpora? Thus far I have
published most of my findings in Behavior Research Methods, but this is aimed
at a psychological audience (and hence will only accept papers that are
interesting for them).<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Best, marc<u></u><u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u></span></p></div></div></blockquote><div><br></div><div><div class="gmail_quote">
On Thu, Mar 7, 2013 at 11:04 AM, Eric Atwell <span dir="ltr"><<a href="mailto:E.S.Atwell@leeds.ac.uk" target="_blank">E.S.Atwell@leeds.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Morteza,<br><br>This question is timely in the UK where we are preparing for REF.<br>Whatever Corpus Linguists may think, the wider academic world expects<br>citations of published journal/conference papers or books. So, when a<br>
corpus is created, the developers should also publish a paper or book on the research undertaken to develop the corpus, and this is what you<br>should cite. Even if you don't directly quote from the paper, you are<br>
citing the academic research idea embodied in the paper. Sometimes a corpus project can lead to several publications.<br>It is good practice for creators of a corpus to nominate a specific paper<br>which should be cited by users of the corpus, e.g. on the website where<br>
you get the corpus from. This helps people like you who want to know<br>what to cite, and it helps the corpus creators to accumulate due credit for their work. For example, for REF we nominate up to 4 key papers for<br>assessment, so it helps if others cite these specific 4 papers.<br>
<br>Eric Atwell, Leeds University</blockquote><div><span style="color:rgb(136,136,136)">-- </span></div><span style="color:rgb(136,136,136)">Eric Atwell, Associate Professor, Language research group,</span><br style="color:rgb(136,136,136)">
<span style="color:rgb(136,136,136)"> I-AIBS Institute for Artificial Intelligence and Biological Systems</span><br style="color:rgb(136,136,136)"><span style="color:rgb(136,136,136)"> School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS</span><br style="color:rgb(136,136,136)">
<span style="color:rgb(136,136,136)"> Leeds LS2 9JT, England. TEL: 0113-3435430 FAX: 0113-3435468</span><br style="color:rgb(136,136,136)"><span style="color:rgb(136,136,136)"> WWW: </span><a href="http://www.comp.leeds.ac.uk/eric" target="_blank">http://www.comp.leeds.ac.uk/<u></u>eric</a><br style="color:rgb(136,136,136)">
<span style="color:rgb(136,136,136)"> </span><a href="http://www.comp.leeds.ac.uk/nlp" target="_blank">http://www.comp.leeds.ac.uk/<u></u>nlp</a><br style="color:rgb(136,136,136)"><span style="color:rgb(136,136,136)"> </span><a href="http://www.comp.leeds.ac.uk/arabic" target="_blank">http://www.comp.leeds.ac.uk/<u></u>arabic</a><br style="color:rgb(136,136,136)">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> </blockquote></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="NL-BE" link="blue" vlink="purple"><div><p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> <u></u></span></p>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> <a href="mailto:corpora-bounces@uib.no" target="_blank">corpora-bounces@uib.no</a>
[mailto:<a href="mailto:corpora-bounces@uib.no" target="_blank">corpora-bounces@uib.no</a>] <b>On Behalf Of </b>Adam Kilgarriff<br>
<b>Sent:</b> 07 March 2013 08:37<br>
<b>To:</b> M. Rezaei<br>
<b>Cc:</b> <a href="mailto:corpora@uib.no" target="_blank">corpora@uib.no</a><br>
<b>Subject:</b> Re: [Corpora-List] Question: Citing Linguistic Corpora<u></u><u></u></span></p>
</div><div><div class="h5">
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Dear Morteza,<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Yes, you definitely should cite the corpus.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">It is always likely that your POS-tagger will have failings
because of characteristics of the corpus it was trained on. People should
be able to look at it in this light, with an account of how the corpus was
prepared available to them.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Sometimes there is no obvious way to cite the corpus.
Sometimes a URL is best (which is what I do for example for the BNC, as
the website is long-life and with full and good documentation, and the only
alternative is a technical report that no one is actually going to track
down). As a producer of corpora, I aim to write them up in a paper that
is easy to find and to read and serves as a reference.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"> Adam</p></div></div></div></div></div></blockquote><p class="MsoNormal">-- <br>========================================<br><a href="http://www.kilgarriff.co.uk/" target="_blank">Adam Kilgarriff</a> <a href="mailto:adam@lexmasterclass.com" target="_blank">adam@lexmasterclass.com</a> <br>
Director <a href="http://www.sketchengine.co.uk/" target="_blank">Lexical Computing Ltd</a> <br>Visiting Research Fellow <a href="http://leeds.ac.uk/" target="_blank">University of Leeds</a> <u></u><u></u></p>
<div><p class="MsoNormal"><i><span style="color:rgb(0,102,0)">Corpora for all</span></i> with <a href="http://www.sketchengine.co.uk/" target="_blank">the Sketch Engine</a> <u></u><u></u></p></div><div><p class="MsoNormal">
<i><a href="http://www.webdante.com/" target="_blank">DANTE: <span style="color:rgb(0,153,0)">a lexical database for English</span></a><span style="color:rgb(0,153,0)"> </span> </i><u></u><u></u></p>
</div><div>======================================== </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="NL-BE" link="blue" vlink="purple"><div><div><div class="h5">
<div><p class="MsoNormal"><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">On 7 March 2013 06:27, M. Rezaei <<a href="mailto:mrezaeis@mehr.sharif.ir" target="_blank">mrezaeis@mehr.sharif.ir</a>>
wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Tahoma","sans-serif"">Dear all,</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Tahoma","sans-serif"">Salam.<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Tahoma","sans-serif"">Suppose I
use a text corpus and I extract some statistical information from it or I train
a POS tagger based on it. Well, I have used the corpus, but I have not directly
used the paper which describes it, i.e. I have not quoted a paragraph from the
paper in my research. Is there any standard style for citing the corpus itself,
as a data set? Is it a good idea to do so? What about the corpus authors, do
they prefer users to cite their paper rather than the corpus itself?</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Tahoma","sans-serif"">Looking
forward to receiving your responses.</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Tahoma","sans-serif"">Best Regards</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Tahoma","sans-serif"">Morteza
Rezaei</span></p></div></div></div></div></div></div></div></div></blockquote></div></div>