Analyze word and phrase frequency

Tom Zurinskas truespel at HOTMAIL.COM
Thu Apr 2 03:35:25 UTC 2009


For my data in truespel book 4 I used word frequency to determine phoneme frequency in general running text.  The database was the Collins Cobuild word count.  The top 5k words had 15.4 million instances, with the most popular word, "thu", having 1 million instances.  I estimate the top 5k words take up about 90% of words on a page.

If anyone uses the program below to do a word count of general text, such as newspapers, I'd like to compare results.  As far as I know my analysis is unique.  I'd like to hear of any others.
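
For anyone who wants to try a comparable count without the program mentioned below, here is a minimal Python sketch. It is not the method used for the truespel data and not how Primitive Word Counter works internally; the function names (word_counts, top_n_coverage), the tokenizing rule, and the sample sentence are all assumptions made for illustration. It counts words in running text, with an optional minimum word length and an option to ignore numbers, and then reports what share of all counted words the top N words account for.

```python
import re
from collections import Counter


def word_counts(text, min_length=1, ignore_numbers=True):
    """Count lowercase word tokens in text (hypothetical helper)."""
    # Assumed tokenizing rule: runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for tok in tokens:
        tok = tok.strip("'")  # drop stray quote marks around a word
        if not tok or len(tok) < min_length:
            continue
        if ignore_numbers and tok.isdigit():
            continue
        counts[tok] += 1
    return counts


def top_n_coverage(counts, n):
    """Fraction of all counted words covered by the n most frequent words."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(c for _, c in counts.most_common(n))
    return top / total


# Made-up sample text, only to show the calls.
sample = "The cat sat on the mat. The dog saw the cat in 2009."
counts = word_counts(sample)
print(counts.most_common(2))
print(top_n_coverage(counts, 2))
```

On a large body of text, top_n_coverage(counts, 5000) would give a figure to set beside the 90% estimate above, keeping in mind that the result depends on how words are split and on what kind of text is counted.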


Tom Zurinskas, USA - CT20, TN3, NJ33, FL5+
see truespel.com





> ---------------------- Information from the mail header -----------------------
> Sender: American Dialect Society
> Poster: James Harbeck
> Subject: Fwd: Analyze word and phrase frequency
> -------------------------------------------------------------------------------
>
> This looks like it could be useful for some kinds of analysis.
>
> -----Original Message-----
>
> http://lifehacker.com/5190716/primitive-word-counter-analyzes-word-and-phrase-frequency
>
> You can check the number of words in just about any word processing
> program, but what about the distribution of those words?
>
> Primitive Word Counter analyzes text from your clipboard or file and
> returns the frequency of words and phrases in the text. You can set a
> minimum word length and have it ignore numbers to trim down the
> volume of replies it returns.
>
> ------------------------------------------------------------
> The American Dialect Society - http://www.americandialect.org



