[Corpora-List] re: pronunciation (caveat)

Damon Allen Davison linguist at socal.rr.com
Wed Jul 24 16:08:03 UTC 2002


A caveat to all about relying too much on Google (and other search
engines) for corpus research:

Although Google allows you to define the page language for searches, it
looks at ISO tags in the HTML source to determine this.  Many people who
have their own web sites use software that by default inserts an
English-language ISO tag into their source.  Therefore, a hit for any
spelling that also happens to be a word in another language may in fact
come from a page written in that other language, despite what the search
engine claims.
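
To make the problem concrete, here is a minimal Python sketch of reading
the language a page declares about itself.  It is only an illustration:
the helper names (DeclaredLanguageParser, declared_language) and the
sample page are made up for this example, and it is not a description of
how Google or any other search engine actually determines page language.

```python
from html.parser import HTMLParser


class DeclaredLanguageParser(HTMLParser):
    """Hypothetical helper: records the language a page declares for itself."""

    def __init__(self):
        super().__init__()
        self.declared = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # <html lang="en"> is the usual place for a language declaration.
        if tag == "html" and attrs.get("lang") and self.declared is None:
            self.declared = attrs["lang"].lower()
        # Older pages often use <meta http-equiv="Content-Language" content="en">.
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-language":
            if self.declared is None and attrs.get("content"):
                self.declared = attrs["content"].lower()


def declared_language(html_source):
    """Return the declared language code, or None if the page declares none."""
    parser = DeclaredLanguageParser()
    parser.feed(html_source)
    return parser.declared


# Invented sample: a German page whose authoring tool inserted an English tag.
page = '<html lang="en"><body><p>Das Kind ist mit dem Rad gefahren.</p></body></html>'
print(declared_language(page))  # prints "en", although the text is German
```

Anything that trusts only the declared tag would file this page under
English, so it is worth reading a sample of the hits yourself before
counting them as evidence for one language.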


