[Corpora-List] Re: problems with Google

Pascal Soucy pascal.soucy.1 at ulaval.ca
Thu Mar 17 15:07:37 UTC 2005


Google does that with all stopwords. If you search for:

what does "the" "the" mean, you'll get the same behavior. Google ignores
stopwords (and * seems to be treated as a stopword).

Both queries:

what does "*" mean

and

what does "*" "*" mean

result in about the same list of documents. The difference between the two
occurs in the ranking process. The ranking algorithm likely uses term proximity
to better match the query as it is written, and it keeps the positions of
stopwords in the query to do that.
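
To make the idea concrete, here is a minimal sketch in Python. It is only my
guess at the general mechanism, not Google's actual algorithm: the stopword
list, the function names and the scoring formula are all assumptions made up
for illustration. Stopwords are dropped when selecting documents, but their
positions in the query still set the gap expected between the remaining terms.

```python
# Hypothetical stopword list; "*" is treated like any other stopword.
STOPWORDS = {"the", "*"}

def parse_query(query):
    """Tokenize a query, keeping every token's position (stopwords included)."""
    tokens = query.lower().replace('"', " ").split()
    return list(enumerate(tokens))

def content_terms(query):
    """Return (position, term) pairs for the non-stopword terms only."""
    return [(i, t) for i, t in parse_query(query) if t not in STOPWORDS]

def matches(query, doc):
    """A document matches if it contains every non-stopword query term."""
    doc_tokens = doc.lower().split()
    return all(t in doc_tokens for _, t in content_terms(query))

def proximity_score(query, doc):
    """Score a matching document: 0 is best, more negative is worse.

    For each pair of consecutive content terms, compare the gap between
    them in the query (which counts the stopwords in between) with the
    closest available gap in the document.
    """
    doc_tokens = doc.lower().split()
    terms = content_terms(query)
    penalty = 0
    for (qi, a), (qj, b) in zip(terms, terms[1:]):
        wanted_gap = qj - qi
        pos_a = [k for k, t in enumerate(doc_tokens) if t == a]
        pos_b = [k for k, t in enumerate(doc_tokens) if t == b]
        penalty += min(abs((pb - pa) - wanted_gap)
                       for pa in pos_a for pb in pos_b)
    return -penalty

def rank(query, docs):
    """Select matching documents, then order them by proximity score."""
    hits = [d for d in docs if matches(query, d)]
    return sorted(hits, key=lambda d: proximity_score(query, d), reverse=True)

docs = ["what does lol mean", "what does carpe diem mean"]
one_gap = rank('what does "*" mean', docs)
two_gaps = rank('what does "*" "*" mean', docs)
```

With these two toy documents, both queries select the same set, but the
one-word answer ranks first for the first query and the two-word answer ranks
first for the second.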

Pascal Soucy
Coveo

According to John Milton <lcjohn at ust.hk>, 17.03.2005:

> I just discovered that Google seems to have retained some use of the
> wildcard for words if you use double quotes with the asterisk. A search
> for "what does "*" mean" and "what does "*" "*" mean" results MAINLY in
> any one and two words respectively. If anyone else is using web searches
> as language learning/teaching resources, this also looks promising:
> http://www.findforward.com/
>
> John Milton
> Hong Kong University of Science & Technology