[Corpora-List] The meta book
John F. Sowa
sowa at bestweb.net
Mon Dec 14 13:06:14 UTC 2009
I came across the following article through a news item on the BBC,
but I couldn't find any mention of it on Corpora List:
http://arxiv.org/PS_cache/arxiv/pdf/0909/0909.4385v1.pdf
The meta book and size-dependent properties of written language
Following is the summary from the BBC with some comments from
an interview with the first author:
http://news.bbc.co.uk/2/hi/science/nature/8404025.stm
Rare words 'author's fingerprint'
The authors are physicists, and they published the article in
a physics journal. I wondered how it compares to other studies
by people on this list.
Following is the abstract.
John Sowa
__________________________________________________________________
The meta book and size-dependent properties of written language
Authors: Sebastian Bernhardsson, Luis Enrique Correa da Rocha,
Petter Minnhagen
New J. Phys. 11 (2009) 123015
Abstract: Evidence is given for a systematic text-length dependence of
the power-law index gamma of a single book. The estimated gamma values
are consistent with a monotonic decrease from 2 to 1 with increasing
length of a text. A direct connection to an extended Heap's law is
explored. The infinite book limit is, as a consequence, proposed to be
given by gamma = 1 instead of the value gamma=2 expected if the Zipf's
law was ubiquitously applicable. In addition we explore the idea that
the systematic text-length dependence can be described by a meta book
concept, which is an abstract representation reflecting the
word-frequency structure of a text. According to this concept the
word-frequency distribution of a text, with a certain length written by
a single author, has the same characteristics as a text of the same
length pulled out from an imaginary complete infinite corpus written by
the same author.
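
For anyone who wants to try the quantities named in the abstract on
their own corpus, here is a minimal Python sketch. It is not the
authors' method. It assumes a crude regex tokenizer, it reads gamma as
the exponent of the distribution of word counts (how many distinct
words occur exactly k times), and it imitates "a text pulled out from
a larger corpus" by sampling tokens at random without replacement.
All function names are hypothetical.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (crude, assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_growth(tokens, step):
    """Return (M, N) pairs: N distinct words among the first M tokens.

    This is the curve that a Heaps-type law describes.
    """
    seen = set()
    points = []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i % step == 0 or i == len(tokens):
            points.append((i, len(seen)))
    return points


def count_spectrum(tokens):
    """Return {k: number of distinct words that occur exactly k times}."""
    word_counts = Counter(tokens)
    return dict(Counter(word_counts.values()))


def random_section(tokens, length, seed=0):
    """Draw `length` tokens without replacement from a larger token list.

    Assumption: this stands in for pulling a shorter text out of a
    bigger body of writing by the same author.
    """
    rng = random.Random(seed)
    return rng.sample(tokens, length)
```

One way to use it: tokenize a long text, compare count_spectrum() of
its first M tokens with count_spectrum() of random_section(tokens, M),
and repeat for several values of M to see whether the shape of the
spectrum changes with text length.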