[Corpora-List] Call for papers - ESSE-10 - Scottish corpora

John Corbett j.corbett at englang.arts.gla.ac.uk
Tue Nov 10 18:18:50 UTC 2009


 
Call for papers
European Society for the Study of English (ESSE)
Turin/Torino

24-28 AUGUST 2010

The 10th Conference of the European Society for the Study of English will be hosted by the University of Torino, in the first capital city of Italy, a country which will celebrate 150 years as a nation in 2011.

Seminar S.62: New Developments in Digital Resources for Researching and Teaching Scottish Language and Literature

Papers are invited exploring recent developments in the use of corpus linguistics and other electronic resources to teach and research Scots, Scottish English and Ulster-Scots. Papers may address literary and/or non-literary texts, and cover the periods of Older Scots (pre-1700) and Modern Scots (post-1700). 

Possible issues might include:

* the structure and development of Scots/Scottish English digital resources (corpora, dictionaries, thesauri, grammars, electronic editions, etc.);
* the interplay between different digital resources;
* research findings derived from the use of digital resources;
* teaching applications of such tools;
* the use of such tools in language planning and promotion.

Convenors
John CORBETT (University of Glasgow, Scotland) j.corbett at englang.arts.gla.ac.uk;
Marina DOSSENA (Università di Bergamo, IT) marina.dossena at unibg.it

Procedure for submitting proposals for papers:

Those wishing to participate in the Conference are invited to submit 200-word abstracts of their proposed papers directly to all convenors of the panel in question before 31 January 2010. The convenors will let the proponents know whether their proposals have been accepted by no later than 28 February 2010.

Please note that authors of seminar papers will be expected to give an oral presentation of not more than 15 minutes' duration, rather than simply reading their papers aloud. Convenors should ensure that reduced versions of the papers are circulated among all speakers in advance of the seminar in question. There will be a maximum of 5 papers in each two-hour seminar session, and convenors should plan so that there is time for discussion between speakers and with the audience.

It is possible that we may be able to extend some seminars over two sessions, but this is very much dependent on the proposals received and on the way the programme as a whole develops, and cannot be determined until after all convenors have reported to the Academic Programme Committee in February.

ESSE members (or other participants) can only make one paper proposal per conference. Organising Committees should ensure that this is implemented. Those giving lectures might be encouraged to be respondents in Seminars or to participate in Round Tables. A speaker at a Seminar can participate in a Round Table or be the co-convenor in a different Seminar.

For a full programme, see:

http://www.essenglish.org/cfp/ESSE-10-Torino.pdf

John Corbett
Professor of Applied Language Studies
Head of the Department of English Language
School of Scottish and English Language and Literature 
University of Glasgow
12 University Gardens, GLASGOW G12 8QQ, 
Tel: +44 (0)141 330 6340/2978 Fax: +44 (0)141 330 3531

http://www.glasgow.ac.uk/englishlanguage/  

www.scottishcorpus.ac.uk

The University of Glasgow, charity number SC004401

-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of corpora-request at uib.no
Sent: 10 November 2009 11:00
To: corpora at uib.no
Subject: Corpora Digest, Vol 29, Issue 10

Today's Topics:

   1.  New Portuguese-English corpus available (Diana Santos)
   2.  CFP: SNPD, London, June 9-11, 2010, IEEE Proc. (Grigori Sidorov)
   3.  Linguistics and corpus linguistics at Univ. of	Calif., Santa
      Barbara (Stefan Th. Gries)
   4.  BNC n-grams (Mark Davies)
   5. Re:  BNC n-grams (Serge Sharoff)
   6.  Text Mining for Scholarly Communications and	Repositories
      workshop (Sophia Ananiadou)


----------------------------------------------------------------------

Message: 1
Date: Mon, 9 Nov 2009 13:06:16 +0100
From: Diana Santos <Diana.Santos at sintef.no>
Subject: [Corpora-List] New Portuguese-English corpus available
To: "corpora at uib.no" <corpora at uib.no>


We are pleased to announce that CorTrad, a subproject of the COMET project, is now available for search on the Web at http://www.fflch.usp.br/dlm/comet/consulta_cortrad.html

Making CorTrad available on the Web is a joint USP (http://www.usp.br/internacional/home.php?&idioma=en), NILC (http://www.nilc.icmc.usp.br/nilc/) and Linguateca (http://www.linguateca.pt) project, using the DISPARA system.

CorTrad features two special properties:
- it is multiversion (with several versions of a translated text)
- it has specific search capabilities relative to the structure of the specific texts

Currently it comprises three subcorpora:
- technical text, with a Brazilian cookbook translated into English
- scientific magazine, with Brazilian short research news translated into English
- Australian short stories translated into Portuguese

The corpora are annotated with PALAVRAS for Portuguese and with CLAWS for English.

We welcome comments and feedback!

The CorTrad team
Diana Santos, Elisa D. Teixeira, Sandra Aluísio and Stella E.O.Tagnin


------------------------------

Message: 2
Date: Mon, 9 Nov 2009 10:14:28 -0600
From: "Grigori Sidorov" <sidorov at cic.ipn.mx>
Subject: [Corpora-List] CFP: SNPD, London, June 9-11, 2010, IEEE Proc.
To: "'Corpora list'" <corpora at uib.no>

CALL FOR PAPERS
11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD 2010)
9-11 June, 2010
The University of Greenwich, London, United Kingdom
In Cooperation with the IEEE Computer Society
Sponsored by the International Association for Computer and Information Science (ACIS)
 
SNPD 2010 Proceedings will be published by IEEE, and indexed by EI, INSPEC and DBLP.
Five to ten of the best SNPD 2010 papers will be recommended for publication in the International Journal of Computer and Information Science (IJCIS).
15-18 outstanding papers will be selected for publication in Springer's SCI.
 
 

The 11th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD 2010) brings together researchers, scientists, engineers, industry practitioners, and students to discuss, encourage and exchange new ideas, research results, and experiences on all aspects of computer and information science. SNPD 2010 aims to facilitate cross-fertilization among, and is soliciting papers in, the key technology enabling areas.
Topics of interest include, but are not limited to:
 

Algorithms
Information Assurance
Artificial Intelligence
Internet Technology and Applications
Audio and Video Technology
Mobile/Wireless/Ad-Hoc Networks
Case-Based Reasoning
Natural Language Processing
Collaborative Computing
Neural Networks & Genetic Algorithms
Communication Systems and Networks
Operating Systems
Component-Based Software Engineering
Parallel and Distributed Computing
Cryptography and Network Security
Software Specification & Architecture
Data Mining and Machine Learning
Software Testing
Database System Management
Visual and Multimedia Computing
E-Commerce and its applications
Voice-over-IP and Security
Embedded Systems
User-Centered Design Methods
Image, Speech, and Signal Processing
Web-Based Applications
 
Conference Chairs:
Liz Bacon, The University of Greenwich, U.K.
Roger Lee, Central Michigan University, U.S.A.

Program Chairs:
Jixin Ma, The University of Greenwich, U.K.
Miltos Petridis, The University of Greenwich, U.K.

Publicity co-Chairs:
Elena Navarro, University of Castilla-La Mancha, Spain
Xiao Bai, Beijing University of Aeronautics & Astronautics, China
Kerstin Bach, University of Hildesheim, Germany
Laszlo Böszörmenyi, Klagenfurt University, Austria
Ricardo Campos, Polytechnic Institute of Tomar, Portugal
Wencai Du, Hainan University, China
Grigori Sidorov, National Polytechnic Institute, Mexico
Alfredo Cuzzocrea, University of Calabria, Italy
Michael Biehl, University of Groningen, Netherlands
Tunga Gungor, Bogazici University, Turkey

Local Arrangements co-Chairs:
Markus Wolf, The University of Greenwich, U.K.
Aihua Zheng, The University of Greenwich, U.K.
Elena Teodorescu, The University of Greenwich, U.K.
Xiaoyi Zhou, The University of Greenwich, U.K.
 
IMPORTANT DEADLINES

Full Paper Submission: 16 January, 2010
Acceptance Notification: 15 February, 2010
Early Registration: 05 March, 2010
Camera-ready Paper & Final Registration: 15 March, 2010
Conference: 09 - 11 June, 2010





------------------------------

Message: 3
Date: Mon, 9 Nov 2009 11:54:30 -0800
From: "Stefan Th. Gries" <stgries at gmail.com>
Subject: [Corpora-List] Linguistics and corpus linguistics at Univ. of
	Calif., Santa Barbara
To: corpora at uib.no

Apologies for cross-posting ...

Linguistics at the University of California, Santa Barbara

The Department of Linguistics at UCSB offers a Ph.D. program with a functional theoretical orientation and a strong commitment to the principle that linguistic theory should be based on language use. We seek explanations for the linguistic structures of the world's languages in discourse and interaction, the sociocultural, cognitive, and physical forces shaping language use, and the ways in which these forces motivate language change.

Our recently restructured graduate program offers Ph.D. tracks in structural, sociocultural, cognitive, and corpus linguistics. Following a rigorous two-year Master's program including courses in all four areas, students take doctoral-level courses in their chosen track, with flexibility reflecting their individual interests, and advance to Ph.D. candidacy by the end of their fourth year. Training in empirical methodologies is an essential component of our program; in addition to Master's level courses in discourse transcription and basic statistics for linguistics, each Ph.D. track features relevant methods courses, such as field methods, sociocultural methods, and advanced statistics.

Our department has a strong tradition of language documentation and description and, in addition to field methods, offers courses in typology, language contact, grammar writing, and documentary linguistics. The department also has an international reputation in sociocultural linguistics, a broadly interdisciplinary specialization originating at UCSB that encompasses the traditional fields of sociolinguistics, linguistic anthropology, socially oriented discourse analysis, and related areas. We now offer a corpus linguistics track that gives students in-depth training on how to transcribe, annotate, and retrieve corpus data, how to compile a corpus, and how to analyze corpus data of different types using the most current statistical techniques. Our cognitive track provides training in the cognitive and psycholinguistic underpinnings of language, including language acquisition, production, and comprehension, as well as advanced statistical analysis.

We welcome applications to our graduate program. Be sure to note our application deadline of December 1. UCSB offers four- and five-year central fellowships to qualified applicants; smaller awards are also available.

Please visit our website, www.linguistics.ucsb.edu, for further information about our graduate program, faculty, research specializations, and language areas.



------------------------------

Message: 4
Date: Mon, 9 Nov 2009 22:41:15 -0700
From: Mark Davies <Mark_Davies at byu.edu>
Subject: [Corpora-List] BNC n-grams
To: "'Corpora list'" <corpora at uib.no>

Is anyone aware of a source for n-grams (2-grams and 3-grams) from the BNC? I'm aware of Phrases in English (pie.usna.edu), but I'm referring to the full set of n-grams, e.g. a downloadable file with all 15,000,000+ 2-grams in the BNC. I can generate and distribute these n-grams from my BYU-BNC (http://corpus.byu.edu/bnc), but I first wanted to see whether they're already available somewhere else. I've googled this, but haven't found anything.

I guess the more basic question is whether this data would be useful. We already have, of course, the Google ngrams data, based on a "corpus" tens of thousands of times as large as the BNC. As I see it, though, the ngrams data from a structured 100-500 million word corpus might have the following advantages over the Google data:

-- at 10-15 million rows (for 2-grams; 30-40m 3-grams (??) ), small enough to actually load on most machines
-- it could include separate frequency figures for different genres (e.g. spoken, fiction, newspaper, academic)
-- since the BNC is tagged (and in my version, lemmatized as well), it would have an advantage over the untagged and unlemmatized Google data
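
To make the second point concrete, here is a minimal sketch of how 2-gram counts with separate frequency figures per genre could be tabulated. It assumes the corpus is already available as (genre, token list) pairs; the function names and the toy sentences are invented for illustration and are not taken from the BNC or from BYU-BNC.

```python
from collections import Counter, defaultdict


def bigrams(tokens):
    """Return the adjacent token pairs of one sentence."""
    return zip(tokens, tokens[1:])


def count_bigrams_by_genre(sentences):
    """Map each bigram to a Counter of genre -> frequency.

    `sentences` is an iterable of (genre, [token, ...]) pairs;
    this input shape is an assumption made for the sketch.
    """
    table = defaultdict(Counter)
    for genre, tokens in sentences:
        for pair in bigrams(tokens):
            table[pair][genre] += 1
    return table


# Toy input, invented for illustration; not BNC data.
sample = [
    ("spoken", ["you", "know", "what", "i", "mean"]),
    ("fiction", ["you", "know", "the", "way"]),
    ("academic", ["the", "way", "forward"]),
]

table = count_bigrams_by_genre(sample)
for pair, by_genre in sorted(table.items()):
    # One row per bigram: the pair, its total, then per-genre figures.
    print(" ".join(pair), sum(by_genre.values()), dict(by_genre))
```

With tagged or lemmatized input, the tokens could simply be lemmas or word/tag strings instead of word forms; the counting step itself would not change.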

Comments?

============================================
Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906

http://davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================ 





------------------------------

Message: 5
Date: Tue, 10 Nov 2009 09:04:33 +0000
From: Serge Sharoff <s.sharoff at leeds.ac.uk>
Subject: Re: [Corpora-List] BNC n-grams
To: Mark Davies <Mark_Davies at byu.edu>
Cc: 'Corpora list' <corpora at uib.no>

I did this quite some time ago, but I never thought of this as an achievement, since it's trivial to produce.  In case you need them, http://corpus.leeds.ac.uk/frqc/bnc-bi.gz
(it's based on lemmas, but I didn't use POS tags).

Another advantage of the BNC over the Google data is the absence of noise coming from navigation frames (Have your say, Click here) as well as from duplicate pages (Stefan Evert published some examples of this, but nothing comes to mind off the top of my head).  The disadvantages of the BNC are obviously the time frame (the Soviet Union is still quite prominent there) and the restriction to British English.
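
As a rough illustration of the duplicate-page problem in web-derived counts, the sketch below drops exact repeats (after lowercasing and whitespace normalisation) before counting 2-grams. The page texts and function names are invented for the example; this is not how either the Google data or the Leeds list was produced, and real near-duplicate detection would need more than an exact hash.

```python
import hashlib
from collections import Counter


def fingerprint(text):
    """Hash the text after lowercasing and collapsing whitespace."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def unique_documents(documents):
    """Yield each document once, skipping exact (normalised) repeats."""
    seen = set()
    for doc in documents:
        key = fingerprint(doc)
        if key not in seen:
            seen.add(key)
            yield doc


def bigram_counts(documents):
    """Count adjacent word pairs over whitespace-tokenised documents."""
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical page texts; the second is a duplicate of the first.
pages = [
    "Click here to read more",
    "click  here to read more",
    "The Soviet Union",
]

raw = bigram_counts(pages)
deduped = bigram_counts(unique_documents(pages))
print(raw[("click", "here")], deduped[("click", "here")])
```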
Serge






------------------------------

Message: 6
Date: Tue, 10 Nov 2009 10:37:31 +0000
From: "Sophia Ananiadou" <Sophia.Ananiadou at manchester.ac.uk>
Subject: [Corpora-List] Text Mining for Scholarly Communications and
	Repositories workshop
To: "corpora at uib.no" <corpora at uib.no>

 

Slides of the presentations of the Text Mining for Scholarly Communications and Repositories workshop are now available to download. 

http://www.nactem.ac.uk/tm-ukoln.php 

 

 

=========================================================

Professor Sophia Ananiadou, School of Computer Science,

Director, National Centre for Text Mining,

Manchester Interdisciplinary Biocentre

University of Manchester

131 Princess Street, M1 7DN

www.nactem.ac.uk <http://www.nactem.ac.uk> 

sophia.ananiadou at manchester.ac.uk <mailto:sophia.ananiadou at manchester.ac.uk>  

tel: +44 161 306 3092

PA Paul Thompson paul.thompson at manchester.ac.uk <mailto:paul.thompson at manchester.ac.uk>  

 

 


----------------------------------------------------------------------
Send Corpora mailing list submissions to
	corpora at uib.no

To subscribe or unsubscribe via the World Wide Web, visit
	http://mailman.uib.no/listinfo/corpora
or, via email, send a message with subject or body 'help' to
	corpora-request at uib.no

You can reach the person managing the list at
	corpora-owner at uib.no

When replying, please edit your Subject line so it is more specific than "Re: Contents of Corpora digest..."


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


End of Corpora Digest, Vol 29, Issue 10
***************************************
