FYI: Google's ngram viewer

Alexander King a.king at ABDN.AC.UK
Fri Dec 17 11:11:47 UTC 2010


Hmm. No mention of corpus linguistics, which strikes me as a well-established subfield that has been doing similar research for a generation or two. I don't see any computational linguists listed as authors on the Science article, either, but maybe they are working as psychologists or Google employees? Didn't we already know that verb morphology in English has been trending toward 'regularization'?

The NYT seems to be banging on about memes, and we all know how useful that idea is for the humanities. Maybe if they had real linguists as friends/colleagues, they wouldn't have "had to scrutinize stacks of Anglo-Saxon texts page by page. The process took 18 months. 'We were exhausted,' Mr. Lieberman Aiden said. That painstaking work 'was a total Hail Mary pass; we could have collected this data set and proved nothing.'" Wasn't the entire corpus of all known words (texts) of Anglo-Saxon and Old English digitized in the late 1980s? I remember reading somewhere that the corpus is so small, as is the number of experts, that they quickly got onto the digital bandwagon and were posting texts to Usenet and FTP sites and printing CDs.

I will have to go read the paper copy of Science in my library, because our subscription doesn't cover electronic access to recent articles. It is nice to have more data, in any case. I just went to a fascinating talk on corpus linguistics based on the 1641 depositions in Ireland, looking at Irish EME; one of the project's principal tasks was investigating change in verb phrase syntax. This is a small project here at Aberdeen (http://www.abdn.ac.uk/news/archive-details-9305.php). The larger project is here: http://1641.tcd.ie/

Alex


On 17 Dec 2010, at 4:25 am, Kerim Friedman wrote:

> And some good discussion here:
> 
> http://goo.gl/en64C
> 
> The cultural genome: Google Books reveals traces of fame, censorship
> and changing languages
> 
> kerim
> 
> On Fri, Dec 17, 2010 at 11:50 AM, Adam Hodges <adamhodges at cmu.edu> wrote:
>> 
>> In 500 Billion Words, New Window on Culture
>> By PATRICIA COHEN
>> New York Times
>> December 16, 2010
>> 
>> With little fanfare, Google has made a mammoth database culled from
>> nearly 5.2 million digitized books available to the public for free
>> downloads and online searches, opening a new landscape of
>> possibilities for research and education in the humanities.
>> http://www.nytimes.com/2010/12/17/books/17words.html?hp
>> 
>> Google's ngram viewer:  http://ngrams.googlelabs.com/
> 
> -- 
> 
> P. Kerim Friedman 傅可恩
> 
> Assistant Professor
> Department of Indigenous Cultures
> College of Indigenous Studies
> National DongHwa University, TAIWAN
> 助理教授國立東華大學民族文化學系

- tel:+44(1224)27 2732, fax:+44(1224)27 2552 - http://www.koryaks.net - http://www.abdn.ac.uk/anthropology


