[Corpora-List] Ambiguous words in English and their frequency

Eckhard Bick eckhard.bick at mail.dk
Thu Jan 26 09:30:57 UTC 2012


Hello,

It depends on what you call ambiguity, of course, and on whether you 
count types or tokens.

If ambiguity is meant to include in-lemma inflexion ambiguity such as 
participle vs. past tense, and infinitive versus finite verb, then a 
quick mini-run, using our Constraint Grammar analysis on Leipzig 
internet corpus data, yields an ambiguity of 2.11 readings per English 
word token, punctuation excluded.
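
To make the measure concrete, here is a minimal Python sketch of how a 
readings-per-token figure of this kind could be computed. It is not the 
Constraint Grammar pipeline itself: the input format (a list of 
token/readings pairs), the function names and the toy tags are all 
assumptions made for illustration, and the sample data is invented.

```python
import string


def is_punctuation(token):
    """True if the token is empty or consists only of ASCII punctuation."""
    return all(ch in string.punctuation for ch in token)


def mean_readings_per_token(analysed):
    """Token-level ambiguity: average number of readings per word token,
    with punctuation tokens excluded from the count."""
    counts = [len(readings) for token, readings in analysed
              if not is_punctuation(token)]
    if not counts:
        raise ValueError("no non-punctuation tokens in input")
    return sum(counts) / len(counts)


def ambiguous_type_share(analysed):
    """Type-level ambiguity: share of distinct word forms (lower-cased)
    that have more than one reading."""
    readings_by_type = {}
    for token, readings in analysed:
        if not is_punctuation(token):
            readings_by_type.setdefault(token.lower(), set()).update(readings)
    if not readings_by_type:
        raise ValueError("no non-punctuation tokens in input")
    ambiguous = sum(1 for r in readings_by_type.values() if len(r) > 1)
    return ambiguous / len(readings_by_type)


# Hypothetical toy analysis; the tags are illustrative, not real CG output.
sample = [
    ("they", ["PRON"]),
    ("run", ["V INF", "V PRES", "N SG"]),
    ("fast", ["ADJ", "ADV"]),
    (".", ["PUNCT"]),
]

token_ambiguity = mean_readings_per_token(sample)
type_ambiguity = ambiguous_type_share(sample)
```

The two functions correspond to the token and type views mentioned 
above: the first weights every occurrence equally, the second counts 
each distinct word form once.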

Best regards,
Eckhard


On 2012-01-25 20:33, FORT, Karen wrote:
> Hi all,
>
> I need to find this information (the proportion of ambiguous words in English and their frequency).
> For example, we know that in French 8% of the words represent 30% of the ambiguity.
> Of course, it's very rough, but it's only to have a rough idea.
>
> Can somebody help me with this (of course, I searched for a ref but could not find anything precise)?
>
> Thank you in advance,
>
> Regards,
>
>
> Karën FORT
> Ingénieure/Engineer et/and doctorante/PhD student
> INIST-CNRS / LIPN
> 2, allée de Brabois
> 54500 Vandoeuvre-lès-Nancy
> France
> Bureau/Office: H112
> +33 (0)3 83 50 46 36
>
> http://www-lipn.univ-paris13.fr/~fort/
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>


-- 
Eckhard Bick,
cand.med., dr.phil.
University of Southern Denmark
e-mail: eckhard.bick at mail.dk
web: http://beta.visl.sdu.dk

