22.2332, Disc: Error in the Fernandez Huerta Readability Formula
linguist at LINGUISTLIST.ORG
Thu Jun 2 16:27:56 UTC 2011
LINGUIST List: Vol-22-2332. Thu Jun 02 2011. ISSN: 1068 - 4875.
Subject: 22.2332, Disc: Error in the Fernandez Huerta Readability Formula
Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews: Veronika Drake, U of Wisconsin-Madison
Monica Macaulay, U of Wisconsin-Madison
Rajiv Rao, U of Wisconsin-Madison
Joseph Salmons, U of Wisconsin-Madison
Anja Wanner, U of Wisconsin-Madison
<reviews at linguistlist.org>
Homepage: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.
Editor for this issue: Elyssa Winzeler <elyssa at linguistlist.org>
================================================================
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.cfm.
===========================Directory==============================
1)
Date: 27-May-2011
From: Gwillim Law [glaw at measinc.com]
Subject: Error in the Fernandez Huerta Readability Formula
-------------------------Message 1 ----------------------------------
Date: Thu, 02 Jun 2011 12:26:09
From: Gwillim Law [glaw at measinc.com]
Subject: Error in the Fernandez Huerta Readability Formula
José Fernández Huerta published his formula for calculating the readability
of text in Spanish in 1959. It is still widely used. I found six web pages
that contain the actual formula, and many more where it is cited. I have
two strong reasons for believing that the formula contains an error. I
would be interested in getting additional feedback on the matter. Also, if
there is a consensus that the formula does contain an error, where would I
go to report it?
The Huerta score was an adaptation of the Flesch Reading Ease score for
Spanish. The Flesch formula for English text, first published in 1948, is:
Flesch = 206.835 - 84.6 * syllables/words - 1.015 * words/sentences
Scores run roughly from 0 to 100, with higher scores being easier to read.
This makes sense. Sentences with more words in them will produce a lower
score; words with more syllables in them will produce a lower score.
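As a minimal sketch of the Flesch formula quoted above, written in Python
(the function name and the sample counts are illustrative assumptions, not
part of Flesch's publication):

```python
def flesch_reading_ease(syllables: int, words: int, sentences: int) -> float:
    """Flesch Reading Ease as quoted above, computed from raw counts."""
    return 206.835 - 84.6 * (syllables / words) - 1.015 * (words / sentences)


# Hypothetical counts: the same 100 words and 150 syllables, split into
# 5 long sentences versus 10 short ones.
long_sentences = flesch_reading_ease(150, 100, 5)
short_sentences = flesch_reading_ease(150, 100, 10)

# Shorter sentences score higher (easier), as the formula intends.
print(long_sentences < short_sentences)  # True
```

Both terms are subtracted, so raising either syllables per word or words
per sentence can only lower the score.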
The Huerta formula is usually presented as 206.84 - (0.60 * P) - (1.02 *
F), where P is the number of syllables and F is the number of sentences,
both counted in a 100-word sample. Applying the same sanity check, we see
that if the number of syllables per word increases, the score decreases, as
expected; but if the number of sentences increases, the score also
decreases. Now, if the number of sentences in a 100-word sample increases,
each sentence must be getting shorter. That should make the readability of
the passage increase, not decrease. (Reason 1.)
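The sanity check in Reason 1 can be sketched in Python. The function name
and the sample values (200 syllables per 100 words) are assumptions chosen
only for illustration:

```python
def huerta_as_published(p: float, f: float) -> float:
    """Huerta score as it is usually presented.

    p: syllables in a 100-word sample
    f: sentences in the same 100-word sample
    """
    return 206.84 - 0.60 * p - 1.02 * f


# Hypothetical sample: 200 syllables per 100 words, with the 100 words
# divided into more and more (hence shorter and shorter) sentences.
scores = [huerta_as_published(200, f) for f in (5, 10, 20)]

# The score falls as the sentences get shorter, the opposite of Flesch.
print(scores[0] > scores[1] > scores[2])  # True
```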
In its original form, the Huerta formula is not scalable: it applies only
to a 100-word sample. To compare it to the Flesch formula, one has to
convert it to a formula that works for a passage containing any number of
words. Substituting P = 100 * syllables/words and F = 100 * sentences/words,
I come up with the formula
Huerta = 206.84 - 60 * syllables/words - 102 * sentences/words
Note that if words = 100, this works out the same as the original Huerta
formula. Note also that it matches the Flesch formula almost term for term.
The coefficients of (syllables/words) in the two formulas differ by about
40% (84.6 versus 60), but that's understandable, because the average number
of syllables per word is greater in Spanish than in English. It's the last
term that looks wrong. The coefficients are very different (1.015 and 102),
but that's only because I converted the Huerta formula to make it scalable;
in its original form, the coefficient was 1.02. The real difference is that
the fraction is inverted. (Reason 2.) Since Fernández Huerta avowedly
based his work on that of Flesch, it seems to me that the obvious
conclusion is that he made a mistake. When he decided to stipulate a sample
of 100 words, he got confused and didn't realize that he had inverted the
fraction. Perhaps he tested his formula using a sample with 10 sentences,
in which case the inverted and uninverted terms give the same result:
1.02 * 10 = 1.02 * 100 / 10 = 10.2.
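The coincidence at 10 sentences per 100 words can be checked numerically.
In the Python sketch below, huerta_scalable is the rescaled formula given
above, while huerta_flesch_style is a hypothetical variant with the
sentence fraction turned over to mirror Flesch; it is an assumption made
for comparison, not a published formula. The sample counts are likewise
illustrative.

```python
import math


def huerta_scalable(syllables: float, words: float, sentences: float) -> float:
    """Huerta formula as usually presented, rescaled to any sample size."""
    return 206.84 - 60 * syllables / words - 102 * sentences / words


def huerta_flesch_style(syllables: float, words: float, sentences: float) -> float:
    """Hypothetical variant using words/sentences, as Flesch does.

    Assumption for illustration only; not a published formula.
    """
    return 206.84 - 60 * syllables / words - 1.02 * words / sentences


# Hypothetical 100-word sample with 200 syllables and a varying number
# of sentences. Each row: sentences, both scores, and whether they agree.
for sentences in (5, 10, 20):
    a = huerta_scalable(200, 100, sentences)
    b = huerta_flesch_style(200, 100, sentences)
    print(sentences, round(a, 2), round(b, 2), math.isclose(a, b))
```

Only the 10-sentence row agrees. With fewer or more sentences the two
versions move in opposite directions, so a test on a single sample of 10
sentences per 100 words would not have exposed the difference.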
Gwillim Law
References:
Original publication of the Huerta formula:
Fernández Huerta, José. Medidas sencillas de lecturabilidad. Consigna 1959;
(214): 29-32.
Some web pages describing or using the Huerta formula:
http://www.ideosity.com/SEO/SEO-Readability-Tests.aspx
http://www.standards-schmandards.com/exhibits/rix/
http://www.utexas.edu/disability/ai/resource/readability/manual/huerta-calculate-English.html
http://scielo.isciii.es/scielo.php?script=sci_arttext&pid=S1135-57272002000400007&lng=en&nrm=iso
http://www.faculty.de.gcsu.edu/~cbader/5210/fryforeign.htm
Linguistic Field(s): Applied Linguistics