[Corpora-List] Word frequencies for classical and New Testament Greek

Orion Montoya orion at mdcclv.com
Thu Mar 3 23:05:30 UTC 2011


The ARTFL project at the University of Chicago has loaded the Perseus texts into its PhiloLogic corpus-analysis package. Here are the frequencies for the New Testament (warning: big pageload):

http://perseus.uchicago.edu/cgi-bin/philologic/getwordcount.pl?GreekFeb2011.255.1

Helma Dik's blog post at
http://cybergreek.uchicago.edu/index.html/?q=node/26
has links to several frequency lists for Greek prose, but they all focus on high-frequency words. I munged the URL query string to get you a list of just the hapax legomena:
http://bit.ly/h88GbZ (also a big pageload)
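
For the original question (how many NT hapax legomena are still hapax in a larger corpus), the comparison is short once both frequency lists are saved locally. This is only a minimal sketch. It assumes each list has been parsed into a word-to-count mapping, that both lists use the same tokenization and normalization, and that the larger corpus already includes the NT. The function names and the toy counts are invented for illustration.

```python
def hapax(freq):
    """Words that occur exactly once in a word -> count mapping."""
    return {word for word, count in freq.items() if count == 1}


def surviving_hapax(nt_freq, corpus_freq):
    """NT hapax whose count in the larger corpus is still 1.

    Assumes corpus_freq already includes the NT occurrences.
    """
    return {w for w in hapax(nt_freq) if corpus_freq.get(w, 0) == 1}


# Toy counts, invented for illustration only (not real Greek data).
nt_freq = {"alpha": 1, "beta": 1, "gamma": 4}
corpus_freq = {"alpha": 1, "beta": 7, "gamma": 90, "delta": 1}

print(sorted(hapax(nt_freq)))                         # ['alpha', 'beta']
print(sorted(surviving_hapax(nt_freq, corpus_freq)))  # ['alpha']
```

If the larger corpus does not include the NT, the test would instead be a count of 0 there.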

Tricks:
• Setting &displaymorethan=0&displaylessthan=2 made it show nothing, I guess because 0 is taken as "don't apply this filter". Changing it to displaymorethan=0.5 persuaded it to show the hapax.
• Dr. Dik's query is only for prose and excludes a bunch of pre- and post-classical authors; if you want a pan-Hellenic/pan-genre search you may wish to modify the query.
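
The query-string edit in the first trick can be scripted rather than done by hand. A rough sketch follows: only the parameter names displaymorethan and displaylessthan come from the URL I edited; the example.org address and the corpus parameter are placeholders, and the helper name is made up. It also assumes the query is made of key=value pairs, so a bare query like the one in the NT link above would need different handling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_frequency_window(url, more_than=0.5, less_than=2):
    """Return url with the two frequency-threshold parameters set.

    Uses 0.5 rather than 0 as the lower bound, since 0 appeared
    to switch the filter off.
    """
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["displaymorethan"] = str(more_than)
    params["displaylessthan"] = str(less_than)
    return urlunsplit(parts._replace(query=urlencode(params)))


# Placeholder URL, not the real PhiloLogic endpoint.
example = "http://example.org/wordcount?corpus=demo&displaymorethan=0"
print(set_frequency_window(example))
```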

You can also get collocations out of this system, but I don't think that makes much sense for hapax legomena (although it might be interesting to see if there are words that co-occur with many hapax).
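
If you have a tokenized text locally, the "words that co-occur with many hapax" idea can be tried without the web interface. This is a sketch under my own assumptions (a flat token list, a symmetric window, hapax defined within that one text), not a description of how PhiloLogic computes collocations; the function name and sample sentence are invented.

```python
from collections import Counter


def hapax_neighbours(tokens, window=2):
    """For each non-hapax word, count the distinct hapax legomena
    that occur within `window` tokens of it."""
    freq = Counter(tokens)
    hapax = {w for w, n in freq.items() if n == 1}
    neighbours = {}
    for i, tok in enumerate(tokens):
        if tok not in hapax:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            other = tokens[j]
            if j != i and other not in hapax:
                neighbours.setdefault(other, set()).add(tok)
    return Counter({w: len(h) for w, h in neighbours.items()})


# Invented sample text, for illustration only.
tokens = "the cat and the dog and the emu".split()
print(hapax_neighbours(tokens, window=1).most_common())
# [('the', 3), ('and', 2)]
```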

Enjoy,

Orion

On Mar 3, 2011, at 4:23 AM, Eric Atwell wrote:

> Graham,
> 
> The Perseus project at Tufts University has a growing collection of
> classical texts, see:
> 
> http://www.perseus.tufts.edu/hopper/collection?collection=Perseus:collection:Greco-Roman
> 
> - this includes "New Testament. Brooke Foss Westcott, Fenton John
>  Anthony Hort. (Greek)" - with a search facility
> http://www.perseus.tufts.edu/hopper/search?doc=Perseus%3atext%3a1999.01.0155
> 
> ... but I don't know if the website includes word frequency lists;
> you could try the help center
> http://www.perseus.tufts.edu/hopper/help
> 
> or if that fails, email the webmaster
> 
> If you find the wordlist, please let me know :-)
> 
> 
> eric atwell, Leeds University
> 
> 
> 
> On Wed, 2 Mar 2011, Graham White wrote:
> 
>> I wonder if anyone could point me in the direction of some information
>> I'm looking for. What I would like is word frequencies for New Testament
>> Greek, together with word frequencies for a larger corpus including the
>> New Testament (the TLG would be great): what I'm particularly interested
>> in is how many hapax legomena in the NT remain hapax legomena
>> in the larger corpus. (I'm doing this for a historical article I'm
>> writing, on Schleiermacher, hence the preference for the New Testament).
>> 
>> Thanks
>> 
>> Graham
>> 
>> --------------------------------------
>> 
>> Graham White
>> Electronic Engineering and Computer Science
>> Queen Mary University of London
>> 
>> _______________________________________________
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>> 
> 
> -- 
> Eric Atwell, Senior Lecturer, Language research group,
> I-AIBS Institute for Artificial Intelligence and Biological Systems
> School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
> Leeds LS2 9JT, England.        TEL: 0113-3435430  FAX: 0113-3435468
> 
