[Corpora-List] Another "Search Inside" tool: Google Print...

Ute Römer ute.roemer at anglistik.uni-hannover.de
Thu Jun 16 11:40:40 UTC 2005


Dear all, 

David's message (Thanks, David! I didn't know about the update) reminded me
of a related search tool which might also be of interest for some of you
(maybe you know about it already, but I only discovered it a few weeks ago):
Google Print (check http://print.google.com/ and
http://print.google.com/googleprint/about.html). The system allows you to
search the full text of a huge number of books (apparently, they collaborate
with publishers and libraries; they don't say how many books have been
scanned and uploaded so far though) and gives you selected pages from those
books which contain your search string. It's not really a concordancing
facility, but it is certainly a new way of doing (literature) research. 

A search for "corpus linguistics", for instance, retrieves 3,040 hits (with
the Biber/Conrad/Reppen 1998 textbook topping the list):
http://print.google.com/print?ie=UTF-8&q=%22corpus+linguistics%22&btnG=Search
You can then follow a link and separately search within each of the
"corpus linguistics" books. For example, you find that "register variation"
occurs on 45 different pages in Biber/Conrad/Reppen 1998, and there are
links that take you to the scanned image of each of the relevant pages (with
the search item highlighted). That option is also very useful when you need
to check the page number of a quote and don't have the book at hand. You can
also see which library near you has this book -- and, of course, where you
can buy it. 
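
The search URL quoted above is an ordinary query string, so it can also be
built programmatically. What follows is only a minimal sketch: it assumes
that the three parameters visible in that URL (ie, q, btnG) are all the
service needs, which is not confirmed anywhere on the site, and the function
name google_print_url is made up for illustration.

```python
from urllib.parse import urlencode

# Base address taken from the search URL quoted in the message above.
BASE = "http://print.google.com/print"


def google_print_url(phrase):
    """Build a phrase-search URL (hypothetical helper, not an official API)."""
    # Wrapping the phrase in double quotes asks for an exact-phrase match;
    # urlencode turns the quotes into %22 and the space into +.
    params = {"ie": "UTF-8", "q": f'"{phrase}"', "btnG": "Search"}
    return BASE + "?" + urlencode(params)


print(google_print_url("corpus linguistics"))
```

For "corpus linguistics" this reproduces the URL given above.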

Best wishes... Ute   


********************************************

Ute Römer
English Department
University of Hanover
Königsworther Platz 1
30167 Hannover
Germany
 
Phone: +49 (0)511 762 2997
Fax: +49 (0)511 762 2996
E-mail: ute.roemer at anglistik.uni-hannover.de
http://www.uteroemer.de 
http://www.fbls.uni-hannover.de/angli/ 
 
> -----Original Message-----
> From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
> Behalf Of David Oakey
> Sent: Thursday, June 16, 2005 12:06 PM
> To: CORPORA at UIB.NO
> Subject: [Corpora-List] Additions to amazon.com "Search Inside" feature
> 
> Apologies if I'm reporting something that everyone already knows
> about except me, but Amazon.com's "Inside this book" feature now
> provides - for all books in its "Search Inside" scheme - a concordance
> (in the sense of a frequency list rather than KWIC citations), text
> statistics, and statistically improbable phrases (SIPs). A SIP works a
> bit like an n-gram version of a keyword in WordSmith Tools, with the
> reference corpus being all the books in Amazon's "Search Inside" corpus.
> If Amazon finds "a phrase that occurs a large number of times in a
> particular book relative to all Search Inside books, that phrase is a
> SIP in that book." On the shopping page for the book "Into the void with
> Ace Frehley," (the notoriously spaced former guitarist in the rock band
> KISS) for example, the SIP they list is "black nail polish". This is
> impressive - and not at all improbable - if you know much about the
> career of Ace Frehley.
> 
> The concordance results are presented alphabetically, with more frequent
> words shown in a larger font size. Text statistics include standard
> readability indices (the Fog Index seems apt here) and they have a "fun
> stats" section where they calculate words per dollar and words per ounce
> (words per pound and words per kilo on amazon.co.uk). More information
> on the Amazon site about the number of books in the scheme (yes, 120,000
> books, 33 million pages etc., but that was nearly 2 years ago), their
> subject areas, authorship details etc. would of course be useful. While
> this is intended as a marketing feature (it "allows you to search
> millions of pages to find exactly the book you want to buy"), I believe
> it would be interesting to corpora list members in itself.
> 
> Best wishes,
> 
> David Oakey
> ------------------------------
> Lecturer in English Language
> English for International Students Unit
> University of Birmingham, UK
> phone: + 44 121 4145703
> email: d.j.oakey at bham.ac.uk
> http://www.eisu.bham.ac.uk/staff/oakeydavid.htm
> ------------------------------
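
The SIP idea in David's message (an n-gram that is frequent in one book
relative to a reference corpus) can be sketched in a few lines. This is not
Amazon's method, which they do not document. It is a rough illustration that
assumes whitespace-tokenised text and swaps in the log-likelihood keyness
statistic familiar from keyword analysis. The function names and the toy
texts are invented for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score.

    a, b: frequency of the phrase in the book and in the reference corpus.
    c, d: total number of n-grams in the book and in the reference corpus.
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the book
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def improbable_phrases(book_tokens, reference_tokens, n=3, top=5):
    """Rank n-grams that are overused in the book relative to the reference."""
    book = Counter(ngrams(book_tokens, n))
    ref = Counter(ngrams(reference_tokens, n))
    book_total = sum(book.values())
    ref_total = sum(ref.values())
    if book_total == 0 or ref_total == 0:
        return []
    scored = []
    for phrase, a in book.items():
        b = ref[phrase]  # Counter returns 0 for unseen phrases
        # Keep only phrases that are relatively more frequent in the book.
        if a / book_total > b / ref_total:
            scored.append((log_likelihood(a, b, book_total, ref_total), phrase))
    scored.sort(reverse=True)
    return [phrase for _, phrase in scored[:top]]


# Toy data, invented purely for illustration.
book = "he wore black nail polish on stage and black nail polish at home".split()
reference = "he wore a hat on stage and a coat at home and a scarf in town".split()
print(improbable_phrases(book, reference, n=3, top=1))  # ['black nail polish']
```

With a real reference corpus the same ranking would need a frequency cut-off
as well, since rare n-grams otherwise dominate the list.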


