[Corpora-List] Questions for Google syntactic N-grams corpus
John F Sowa
sowa at bestweb.net
Wed Nov 13 12:33:32 UTC 2013
On 11/13/2013 4:44 AM, Adam Kilgarriff wrote:
> While N-grams is a fascinating resource, it is not full sentences (and
> I'm not sure how much not-text and duplication it includes, this was
> a problem with the first version) so what you can do is constrained...
The N-grams also contain accidental patterns that merely happen to
occur with high frequency on the WWW.
Peter Norvig at Google cited examples of advertising slogans such as
"Life is better with XYZ", where XYZ is a product name. Under certain
conditions, a phrase that matched the first part of the slogan would
be translated with the rest of the slogan attached, giving XYZ some
free advertising.
John