[Corpora-List] Frequency of the pronoun I

Mike Scott mike at lexically.net
Thu Sep 15 07:42:42 UTC 2011


There are also English texts without THE (lists of products, election 
results, etc.), so the computation either way would need to avoid dividing 
by zero...
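
A minimal sketch of such a guard, in Python. The function name safe_ratio, the lowercased whitespace tokens, and the choice to return None for an undefined ratio are all assumptions made for illustration; none of this is taken from WordSmith or any other tool.

```python
from collections import Counter


def safe_ratio(tokens, numerator="i", denominator="the"):
    """Return count(numerator) / count(denominator), or None if undefined.

    Hypothetical helper: tokens are compared case-insensitively, and a
    text that never uses the denominator word yields None instead of
    raising ZeroDivisionError or reporting an infinite value.
    """
    counts = Counter(token.lower() for token in tokens)
    num = counts[numerator]
    den = counts[denominator]
    if den == 0:
        # e.g. a product list or table of election results with no THE
        return None
    return num / den


# Whitespace splitting is a simplification; real tokenisation would
# also need to strip punctuation.
diary = "I saw the cat and the dog".split()
patent = "the claim of the patent".split()
price_list = "Widgets 12 Gadgets 7".split()

print(safe_ratio(diary))       # I/the for a text that uses both words
print(safe_ratio(patent))      # 0.0: THE occurs, I does not
print(safe_ratio(price_list))  # None: no THE, so I/the is undefined
```

Passing numerator="the", denominator="i" gives the "the/I" direction with the same guard, so a text that never uses "I" comes back as None rather than as an infinite value.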

What a useful discussion. Clarified a particularly cluttered and dusty 
corner of my own thinking.

Cheers -- Mike

On 13/09/2011 19:19, Rich Cooper wrote:
>
> Using "the/I" can lead to infinite values in corpora (scientific lit, 
> patents) that never use the pronoun "I".  It might be better practice 
> to use the inverse, i.e. the "I/the" ratio, which would be 0.0 for 
> such corpora.
>

-- 
Mike Scott

***
If you publish research which uses WordSmith, do let me know so I can include it at
http://www.lexically.net/wordsmith/corpus_linguistics_links/papers_using_wordsmith.htm
***
University of Aston and Lexical Analysis Software Ltd.
mike.scott at aston.ac.uk
www.lexically.net
