[Corpora-List] Re: problems with Google

Marian Olteanu mou_softwin at yahoo.com
Fri Mar 18 07:47:13 UTC 2005


Well, the results from the Google API have ALWAYS been a little bit (or more than a little)
different from those reported by http://www.google.com/ . You will see a different ordering of
the results, and a small (or big) difference in the hit counts, which are what we are
interested in.

--- "Deane, Paul" <pdeane at ets.org> wrote:

> Has anybody checked whether the behavior with Google's Web API and its
> standard search is different?
>
> I have code using the Java Web API which makes use of the asterisk to blank
> out a single word (not an unrestricted wildcard.) As of yesterday, when I
> tested the code, it still appeared to be working as designed.
>
> -----Original Message-----
> From: Andrew Kehoe [mailto:Andrew.Kehoe at uce.ac.uk]
> Sent: Thursday, March 17, 2005 9:27 AM
> To: CORPORA at uib.no
> Subject: RE: [Corpora-List] Re: problems with Google
>
>
>
> John
>
> Even if you put double quotes around the wildcard character Google will
> ignore it. When you search for:
>
> "what does "*" mean"
>
> Google is actually searching for 2 'phrases': "what does " and " mean". You
> cannot nest double quotes in Google so the double quotes around the * are
> actually closing your initial quote and beginning a new quote, with the
> wildcard ignored completely.
>
> It may be the case that SOME of the pages Google returns will contain "what
> does", followed by one other word, followed by "mean" but your query does
> not ask for this specifically. Google could (and does) also return pages
> containing "mean" and "what does" in the opposite order, or with multiple
> words in between.
>
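
Since the query itself does not enforce "exactly one word in between", one
option is to check the returned text locally. The following is only a
minimal sketch in Python, assuming you already have snippet strings in hand;
the function name gap_pattern and the sample snippets are hypothetical and
not part of any Google API.

```python
import re

def gap_pattern(before, after, gap_words=1):
    """Build a regex for: `before`, then exactly `gap_words` tokens, then `after`.

    A "word" here is simply a run of non-whitespace characters, which is a
    simplifying assumption (punctuation attached to a word counts as part of it).
    """
    gap = r"\s+".join([r"\S+"] * gap_words)
    return re.compile(
        rf"\b{re.escape(before)}\s+{gap}\s+{re.escape(after)}\b",
        re.IGNORECASE,
    )

# Hypothetical snippets standing in for text returned by a search.
snippets = [
    "What does serendipity mean in English?",
    "what does it really mean to be free",
    "I mean, what does he want?",
]

one_word = gap_pattern("what does", "mean", gap_words=1)
kept = [s for s in snippets if one_word.search(s)]
```

Here only the first snippet survives the one-word filter; calling
gap_pattern with gap_words=2 would accept the second one instead.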
> Similarly, "what does "*" "*" mean" is actually searching for 3 'phrases':
> 1) "what does ", 2) " " (a space), and 3) " mean".
>
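
The quote pairing described above can be illustrated with a small Python
sketch. It only models the left-to-right pairing of double quotes; it is an
assumption about the behaviour described here, not Google's actual parser,
and the function name split_on_quotes is made up for the example.

```python
def split_on_quotes(query):
    """Pair double quotes left to right.

    Returns (phrases, unquoted): the text inside each quote pair, and the
    non-blank text left outside any pair. If the number of quotes is odd,
    the final unterminated piece is treated as a phrase.
    """
    parts = query.split('"')
    phrases = parts[1::2]                              # inside a quote pair
    unquoted = [p for p in parts[0::2] if p.strip()]   # outside any pair
    return phrases, unquoted

phrases1, outside1 = split_on_quotes('"what does "*" mean"')
phrases2, outside2 = split_on_quotes('"what does "*" "*" mean"')
```

For the first query the phrases come out as "what does " and " mean", with
the asterisk left outside the quotes; for the second there are three
phrases, the middle one being a single space.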
> So, Google hasn't retained support for wildcards at all I'm afraid, and this
> is why we are developing our own search engine in WebCorp, as Antoinette
> Renouf mentioned yesterday.
>
> Andrew Kehoe
> Research and Development Unit for English Studies
> University of Central England in Birmingham
>
> http://www.webcorp.org.uk/
>
> -----Original Message-----
> From: owner-corpora at lists.uib.no on behalf of John Milton
> Sent: Thu 17/03/2005 13:39
> To: CORPORA at uib.no
> Cc:
> Subject: [Corpora-List] Re: problems with Google
>
>
>
> I just discovered that Google seems to have retained some use of the
> wildcard for words if you use double quotes with the asterisk. A search
> for "what does "*" mean" and "what does "*" "*" mean" returns MAINLY
> phrases with any one word and any two words, respectively, in place of
> the asterisk. If anyone else is using web searches as language
> learning/teaching resources, this also looks promising:
> http://www.findforward.com/
>
> John Milton
> Hong Kong University of Science & Technology


Marian
http://www.utdallas.edu/~mgo031000/


		


