[Corpora-List] Question: Citing Linguistic Corpora
M. Rezaei
mrezaeis at mehr.sharif.ir
Thu Mar 7 08:29:30 UTC 2013
Dear Adam, Eric, and Marc
Thank you for your responses.
Suppose I use the Corpus of Contemporary Arabic for some NLP or corpus
linguistics purpose. Would it be strange to cite it as follows:
Al-Sulaiti, L., & Atwell, E. S. (2006). Corpus of Contemporary Arabic
(CCA). Leeds, UK: University of Leeds. Retrieved from
http://www.comp.leeds.ac.uk/eric/latifa/research.htm
instead of:
Al-Sulaiti, L., & Atwell, E. S. (2006). The design of a corpus of
contemporary Arabic. *International Journal of Corpus Linguistics*, *11*(2),
135-171.
?
On Thu, Mar 7, 2013 at 11:33 AM, Marc Brysbaert <marc.brysbaert at ugent.be> wrote:
> Hi,
>
> Researchers get most credit for their work when it is published in a
> journal that features in ISI or Scopus, as it is then used for all types of
> metrics (whether you like this or not). From my own experience, I’ve
> noticed that it is not so easy, however, to get manuscripts on corpora (or
> word frequency lists) published, even though they are well cited. Does
> anyone have a list of ISI journals that publish information on corpora?
> Thus far I have published most of my findings in Behavior Research Methods,
> but this is aimed at a psychological audience (and hence will only accept
> papers that are interesting for them).
>
> Best, marc
>
On Thu, Mar 7, 2013 at 11:04 AM, Eric Atwell <E.S.Atwell at leeds.ac.uk> wrote:
> Morteza,
>
> This question is timely in the UK where we are preparing for REF.
> Whatever Corpus Linguists may think, the wider academic world expects
> citations of published journal/conference papers or books. So, when a
> corpus is created, the developers should also publish a paper or book on
> the research undertaken to develop the corpus, and this is what you
> should cite. Even if you don't directly quote from the paper, you are
> citing the academic research idea embodied in the paper. Sometimes a
> corpus project can lead to several publications.
> It is good practice for creators of a corpus to nominate a specific paper
> which should be cited by users of the corpus, e.g. on the website where
> you get the corpus from. This helps people like you who want to know
> what to cite; and it helps the corpus creators to accumulate due credit
> for their work. For example for REF, we nominate up to 4 key papers for
> assessment, so it helps if others cite these specific 4 papers.
>
> Eric Atwell, Leeds University
--
Eric Atwell, Associate Professor, Language research group,
I-AIBS Institute for Artificial Intelligence and Biological Systems
School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
Leeds LS2 9JT, England. TEL: 0113-3435430 FAX: 0113-3435468
WWW: http://www.comp.leeds.ac.uk/eric
http://www.comp.leeds.ac.uk/nlp
http://www.comp.leeds.ac.uk/arabic
>
> From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Adam Kilgarriff
> Sent: 07 March 2013 08:37
> To: M. Rezaei
> Cc: corpora at uib.no
> Subject: Re: [Corpora-List] Question: Citing Linguistic Corpora
>
> Dear Morteza,
>
> Yes, you definitely should cite the corpus.
>
> It is always likely that your POS-tagger will have failings because of
> characteristics of the corpus it was trained on. People should be able to
> look at it in this light, with an account of how the corpus was prepared,
> available to them.
>
> Sometimes there is no obvious way to cite the corpus. Sometimes a URL is
> best (which is what I do for example for the BNC, as the website is
> long-life and with full and good documentation, and the only alternative is
> to a technical report that no-one is actually going to track down). As a
> producer of corpora, I aim to write them up in a paper that is easy to find
> and to read and serves as a reference.
>
> Adam
>
--
========================================
Adam Kilgarriff <http://www.kilgarriff.co.uk/>
adam at lexmasterclass.com
Director Lexical Computing Ltd <http://www.sketchengine.co.uk/>
Visiting Research Fellow University of Leeds <http://leeds.ac.uk/>
Corpora for all with the Sketch Engine <http://www.sketchengine.co.uk/>
DANTE: a lexical database for English <http://www.webdante.com/>
========================================
>
> On 7 March 2013 06:27, M. Rezaei <mrezaeis at mehr.sharif.ir> wrote:
>
> Dear all,
>
> Salam.
>
> Suppose I use a text corpus and I extract some statistical information
> from it or I train a POS tagger based on it. Well, I have used the corpus,
> but I have not directly used the paper which describes it, i.e. I have not
> quoted a paragraph from the paper in my research. Is there any standard
> style for citing the corpus itself, as a data set? Is it a good idea to do
> so? What about the corpus authors, do they prefer users to cite their paper
> rather than the corpus itself?
>
> Looking forward to receiving your responses.
>
> Best Regards
>
> Morteza Rezaei
>