[Corpora-List] re: pronunciation (caveat)

Sampo Nevalainen samponev at cc.joensuu.fi
Thu Jul 25 06:46:35 UTC 2002


At 09:08 24.7.2002 -0700, Damon Allen Davison wrote:
>A caveat to all about relying too much on Google (and other search
>engines) for corpus research:

Yes, that's very true. Consulting the web is like consulting your
neighbour, or the friend of the guy you used to go fishing with... There is
always a bit of truth mixed with a lot of noise, and separating them is not
always an easy task. Ugh, this is not scientific at all. But, on the other
hand, I think consulting the web is also like taking a random poll on
the pronunciation of "pronunciation" on the streets of a big city in this
globalized world - there will always be some French-oriented fellows
making their own contribution to the distribution... Supposedly, it tells
something about the language as it is used these days. Isn't that what
linguists want (or, at least, should want)? Or... is it just easier for us
to trust frequencies based on a "well-balanced" and/or "representative"
corpus?

sincerely,
sampo


