[Corpora-List] corpora of grammatical errors

Krishnamurthy, Ramesh r.krishnamurthy at aston.ac.uk
Sun Apr 15 12:42:20 UTC 2012


Hi Anabela

#1 Do ALL the currently available public corpora not 'contain sentences with grammatical errors'?
Very few texts (if any) will be 100% grammatically 'correct' (whichever model of grammar you use).
So BNC, COCA, etc. should be OK for you?
But the specific 'errors' your system identifies will of course depend on your choice of model.

#2 If you want a corpus with a high proportion of 'errors', would any available LANGUAGE LEARNER,
NON-NATIVE-SPEAKER, NON-STANDARD, or VARIETAL corpus be sufficient for your purposes? These
corpora should be easy to find via Google by specifying one of those attributes.

Hope this helps
Ramesh

Ramesh Krishnamurthy
Visiting Academic Fellow, School of Languages and Social Sciences, Aston University, Birmingham B4 7ET

Director, ACORN (Aston Corpus Network project): http://acorn.aston.ac.uk/
Corpus Analyst:
(a) GeWiss (Volkswagen Foundation) project: http://www1.aston.ac.uk/lss/research/research-projects/gewiss-spoken-academic-discourse/
(b) Discourse of Climate Change: http://www1.aston.ac.uk/lss/research/research-projects/discourse-of-climate-change-project/
(c) Feminism: http://acorn.aston.ac.uk/projects.html
(d) COMENEGO (Corpus Multilingüe de Economía y Negocios) - Multilingual Corpus of Business and Economics: http://dti.ua.es/comenego
(e) European Phraseology Project: http://labidiomas3.ua.es/phraseology/login/login.php
-------------------------------------------------------------------------------------------------------------------------


Date: Sat, 14 Apr 2012 10:24:50 +0000
From: Anabela Barreiro <barreiro_anabela at hotmail.com>
Subject: [Corpora-List] corpora of grammatical errors
To: corpora at uib.no

Dear Corpora List Members,

I am looking for public corpora containing sentences with grammatical errors.

I plan to use the corpora as input to grammar checking and correction routines.

The corpora can be in English or Romance languages. I would appreciate any pointers to where I can find such corpora. Thank you!

Anabela M. Barreiro

Personal webpage: https://www.l2f.inesc-id.pt/wiki/index.php/Anabela_Barreiro
LinkedIn: http://www.linkedin.com/in/anabelabarreiro
