[Corpora-List] Google "region"-based searches

Tristan Miller miller at ukp.informatik.tu-darmstadt.de
Wed Nov 28 12:56:12 UTC 2012


Greetings.

On 28/11/12 01:25 PM, Trevor Jenkins wrote:
> On 28 Nov 2012, at 11:48, Roland Schäfer <roland.schaefer at fu-berlin.de> wrote:
> 
>> Whatever Google use: IP-based geolocation is totally unreliable as far
>> as language varieties are concerned.
> 
> Definitely. My current ISP has various nodes connecting to the Internet. 
> My connections appear to be in either Bangor in north Wales or in
> Winchester in southern England but never where I'm actually located.

I don't think you can use single cases like this to make blanket
statements about the "total unreliability" of geolocation.  Sure, the
user of any one IP can't be pinpointed with certainty to the nearest
square centimetre, but neither is geolocation totally random.  Were we
to analyze a large enough sample of geolocations, we could probably
conclude that m% of all IPs can be correctly resolved geographically to
within an n-kilometre radius.  For large enough areas (say, entire
countries) the accuracy of geolocation may be high enough for one's
purposes to make some informed estimates on the distribution of
coarse-grained language varieties.  For example, given a large enough
random sample of English texts written by people whose IPs resolve to
Ireland, could we not reasonably expect the distribution of language
varieties in those texts to roughly match that of the Irish population
in general, or at least that portion of it which is online?
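
To make the "m% within n kilometres" idea concrete, here is a minimal
sketch in Python.  It assumes one already has a sample of IPs for which
both the true location and the geolocated position are known; the
function names, the data layout, and the coordinates are hypothetical
and chosen only for illustration.  It reports the observed share of IPs
resolved within a given radius, together with a Wilson score interval
to show how uncertain that share is for the sample size.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def share_within(sample, radius_km, z=1.96):
    """Return (share, low, high) for IPs geolocated within radius_km.

    sample: list of ((true_lat, true_lon), (geo_lat, geo_lon)) pairs.
    z: normal quantile for the interval (1.96 is roughly 95%).
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    hits = sum(1 for true, geo in sample
               if haversine_km(*true, *geo) <= radius_km)
    p = hits / n
    # Wilson score interval for a binomial proportion.
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Made-up example: one IP resolved exactly, one resolved to the wrong city.
DUBLIN = (53.35, -6.26)
CORK = (51.90, -8.47)
toy_sample = [(DUBLIN, DUBLIN), (DUBLIN, CORK)]
share, low, high = share_within(toy_sample, radius_km=50)
```

With only two observations the interval is very wide, which is the
point: a single anecdote tells us little either way, whereas a large
sample would narrow the estimate considerably.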

Regards,
Tristan

-- 
Tristan Miller, Doctoral Researcher
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
Department of Computer Science, Technische Universität Darmstadt
Tel: +49 6151 16 6166 | Web: http://www.ukp.tu-darmstadt.de/
