[Corpora-List] Auto-generation and how to spot it

Ramesh Krishnamurthy r.krishnamurthy at aston.ac.uk
Mon Nov 13 14:57:46 UTC 2006


I don't know if it helps, but via Google I discovered that:

>My eyes tell me that there are fabulous talents in every decade, 
>including this one

is from
http://www.hoopshype.com/columns/caste_hans.htm

>You have to remember where these young guys were picked
no hits
>You know things  are different when there's a press seat assigned to 
>someone representing lebronjames
no hits
>Like many sports, you are going to have writers who are too 
>close  to the teams they cover and writers who aren't
no hits
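
The sentence-by-sentence lookup above could be approximated offline along these lines. This is only a sketch: it assumes a small local collection of reference texts instead of Google (the `documents` dict is hypothetical), the function names are illustrative, and the sentence splitter is deliberately naive.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def normalise(s):
    """Lowercase and collapse whitespace so spacing quirks do not block a match."""
    return re.sub(r"\s+", " ", s).strip().lower()


def phrase_hits(passage, documents):
    """Map each sentence of the passage to the documents that contain it verbatim.

    `documents` is a hypothetical {name: full_text} dict standing in for a
    search engine index.
    """
    normalised_docs = {name: normalise(body) for name, body in documents.items()}
    hits = {}
    for sentence in split_sentences(passage):
        # Drop the final punctuation so a sentence can match mid-paragraph.
        key = normalise(sentence).rstrip(".!?")
        hits[sentence] = [
            name for name, body in normalised_docs.items() if key in body
        ]
    return hits
```

A passage whose sentences each match a different document, or match nothing at all, would at least be consistent with text stitched together from unrelated sources, though an empty result proves little on its own.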

Best
Ramesh

At 12:06 13/11/2006, you wrote:
>"My eyes tell me that there are fabulous talents in every decade, 
>including this one. You have to remember where these young guys were 
>picked. You know things  are different when there's a press seat 
>assigned to someone representing lebronjames. Like many sports, you 
>are going to have writers who are too close  to the teams they cover 
>and writers who aren't."
>
>
>This is the start of a spam which I (and presumably several thousand 
>other people) just received. My suspicion is that the text has been 
>automatically generated from a reasonably large corpus of authentic 
>email material (in this case, presumably, from some collection of 
>sports writing). The interesting question for this list is: how do I 
>know it's artificially generated? I'm guessing that the lack of 
>coherence has something to do with it, but what are the factors 
>which indicate that? And how much text would you need to scan before 
>determining that there was no natural coherence amongst its components?
>
>It's a question that several spam filter makers would probably pay 
>good money for an answer to.

Ramesh Krishnamurthy

Lecturer in English Studies, School of Languages and Social Sciences, 
Aston University, Birmingham B4 7ET, UK
[Room NX08, North Wing of Main Building] ; Tel: +44 (0)121-204-3812 ; 
Fax: +44 (0)121-204-3766
http://www.aston.ac.uk/lss/staff/krishnamurthyr.jsp

Project Leader, ACORN (Aston Corpus Network): http://corpus.aston.ac.uk/ 


