[Corpora-List] Google "region"-based searches

Mark Davies Mark_Davies at byu.edu
Tue Nov 27 14:34:10 UTC 2012


I'm looking at creating a corpus based on the web pages from a particular country, and I'd like to use the "region" field in Google's advanced search to limit the results to that country (https://www.google.com/advanced_search, see http://www.googleguide.com/sharpening_queries.html#region). Supposedly, this restricts pages by IP address, rather than just by TLD (such as .sg or .sk).

Has anyone heard how accurate this region field is? I ask because region-based searches for countries other than the US (e.g. Singapore or Sri Lanka) are returning links to (for example) *.blogspot.com. For Google to be accurate in these cases, presumably blogspot.com (or any other such domain) has servers in those countries, blogs created by people there are stored on the in-country servers, and Google recognizes their location by IP address rather than just by domain. The same would have to hold for any US- or UK-based domain that shows up in results for other countries.
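
One rough way to see how far the region filter goes beyond the TLD is to tally the hostnames in a set of results by whether they fall under the country's ccTLD. The following is only a minimal sketch in Python: the function name tally_by_tld and the sample URLs are made up for illustration, and it says nothing about how Google actually assigns pages to a region.

```python
from collections import Counter
from urllib.parse import urlsplit


def tally_by_tld(urls, cc_tld):
    """Count result URLs inside and outside a country-code TLD.

    Hypothetical helper for illustration; it looks only at the hostname
    suffix and performs no IP-based geolocation.
    """
    suffix = "." + cc_tld.lower().lstrip(".")
    counts = Counter()
    for url in urls:
        # hostname is None for URLs without a network location
        host = (urlsplit(url).hostname or "").lower()
        key = "in_cctld" if host.endswith(suffix) else "outside_cctld"
        counts[key] += 1
    return counts


# Made-up URLs standing in for the results of a region-limited search
sample = [
    "http://example.com.sg/page",
    "http://someblog.blogspot.com/2012/11/post.html",
    "http://www.example.sg/",
]
print(tally_by_tld(sample, "sg"))
```

A large "outside_cctld" share would show that the region field is not simply a TLD filter, though it would not by itself show that those pages really come from the country in question.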

Thanks in advance,

Mark Davies

============================================
Mark Davies
Professor of Linguistics / Brigham Young University
http://davies-linguistics.byu.edu/

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================