[Corpora-List] Parallel Word Lists

David L. Hoover david.hoover at nyu.edu
Mon Oct 19 15:22:44 UTC 2009


I often need what I'll call a parallel word list: a combined word 
frequency list for a corpus of texts, with an entry for the frequency 
of each word in each text, including zero frequencies. For example 
(the entries are in descending frequency order for the entire corpus):


        Text 1    Text 2    Text 3
the     0.0610    0.0428    0.0551
and     0.0387    0.0294    0.0249
to      0.0265    0.0287    0.0272
of      0.0252    0.0291    0.0326
a       0.0239    0.0238    0.0207
city    0.0000    0.0015    0.0002
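
To pin down what such a list contains, here is a minimal Python sketch. It only illustrates the layout and is not the kind of non-programming tool being asked about. Several details are assumptions rather than anything fixed by the example above: the names tokenize and parallel_word_list are hypothetical, words are taken to be lowercase runs of letters and apostrophes, each entry is a word's count divided by the number of tokens in that text, and the corpus ordering uses raw counts summed over all texts.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: a "word" is a lowercase run of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def parallel_word_list(texts):
    """Return [(word, [relative frequency in each text])].

    `texts` maps a text name to its contents. Rows are sorted by
    descending frequency in the whole corpus (assumed here to mean
    raw counts summed over all texts), with ties broken alphabetically.
    """
    counts = {name: Counter(tokenize(body)) for name, body in texts.items()}
    sizes = {name: sum(c.values()) for name, c in counts.items()}

    totals = Counter()
    for c in counts.values():
        totals.update(c)

    rows = []
    for word in sorted(totals, key=lambda w: (-totals[w], w)):
        # A Counter returns 0 for a missing word, which yields the
        # zero-frequency entries.
        freqs = [
            counts[name][word] / sizes[name] if sizes[name] else 0.0
            for name in texts
        ]
        rows.append((word, freqs))
    return rows


if __name__ == "__main__":
    sample = {
        "Text 1": "the cat and the dog",
        "Text 2": "the city and a city",
    }
    print("\t" + "\t".join(sample))
    for word, freqs in parallel_word_list(sample):
        print(word + "\t" + "\t".join(f"{f:.4f}" for f in freqs))
```

Words absent from a text still get a row entry of 0.0000, as with "city" in Text 1 above.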


I have my own methods of doing this, and I know that WordSmith Tools 
will produce such a list using the "Detailed Consistency List" function, 
with View Column Totals, but I wonder if there are especially good 
publicly available (free) methods out there that I just haven't found.

Also, to be clear, I'm looking for a simple tool for users without any 
programming experience, so no Perl scripts, no UNIX, etc.

Thanks,
David Hoover

-- 
          David L. Hoover, Professor of English, NYU
       212-998-8832       http://homepages.nyu.edu/~dh3/

    Most of her friends had an anxious, haggard look, . . .  
Basil Ransom wondered who they all were; he had a general idea 
they were mediums, communists, vegetarians. 
           -- Henry James, The Bostonians (1886)


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


