[Corpora-List] RE: Lesser (sic) used languages

Matthew Hurst mhurst at intelliseek.com
Fri Feb 11 18:54:29 UTC 2005


A corpus of 8bn documents gives the following numbers:

less known - 270k
lesser known - 1,160k

less used - 107k
lesser used - 52.5k
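
For anyone who would rather see proportions than raw hits, here is a minimal sketch in Python. It uses only the four counts listed above (in thousands, presumably document hits rather than token frequencies); the dictionary layout and the lesser_share helper are hypothetical names chosen for illustration.

```python
# Counts as reported above, in thousands (k). Nothing else is assumed
# about how the hits were collected.
counts_k = {
    ("known", "less"): 270,
    ("known", "lesser"): 1160,
    ("used", "less"): 107,
    ("used", "lesser"): 52.5,
}


def lesser_share(participle):
    """Fraction of the pair's combined hits that use the 'lesser' variant."""
    less = counts_k[(participle, "less")]
    lesser = counts_k[(participle, "lesser")]
    return lesser / (less + lesser)


for participle in ("known", "used"):
    print(f"lesser {participle}: {lesser_share(participle):.1%} of the pair")
```

On these figures "lesser" is the majority variant before "known" and the minority variant before "used".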

Matt Hurst


Nancy Ide wrote:
>
> On Feb 10, 2005, at 7:16 PM, Somers, Harold wrote:
>
>
>>That's funny. My personal reaction to any marginally acceptable
>>collocation that I personally don't use is that it's American ;-)
>>
>>
>>
>>
>
> ...and the Americans take the challenge!
>
> The 11m words of the ANC First release include "less used" only once,
> and "lesser used" does not appear at all. But  "lesser known" follows
> the pattern  found in the BNC for "lesser used":
>
> the lesser known people
> the lesser known though no less captivating part of the Lake District .
> lesser known but high - caliber musicians
> lesser known but accomplished singers
> lesser known works
> the lesser known of the two
>
> But we do have one attributive use of "less known":
>
> a less known and common signification
>
> Go figure (as we Americans say). When our next 10m words are ready in a
> month or two, we'll have another look.
>
> Nancy Ide
>
> =======================================================
>
> Nancy Ide
>
> Professor  of Computer Science
> Vassar College
> Poughkeepsie, NY 12604-0520 USA
> Tel: +1 845 437-5988 Fax: +1 845 437-7498
> ide at cs.vassar.edu
>
> Chercheur Associe
> Equipe Langue et Dialogue, LORIA/CNRS
> Campus Scientifique - BP 239
> 54506 Vandoeuvre-les-Nancy FRANCE
> Tel: +33 (0)3 83 59 20 47 Fax: +33 (0)3 83 41 30 79
> ide at loria.fr
>
> =======================================================
>
>
>
>
>


