high versus low frequency words

Marita Boehning boehning at kronos.ling.uni-potsdam.de
Thu Apr 18 12:19:33 UTC 2002


Dear Info-Childes members,
    does anyone have advice on the following problem?
I have generated a list of German words from the CELEX database, for
which frequency counts are given.
I need these words for an experiment I want to conduct, and I need to
choose a certain number of "low" and a certain number of "high
frequency" words. Is there a kind of "rule" for a cut-off criterion so
that one can decide: these words belong to my high frequency category
and these belong to my low frequency category?
One way I thought of is to look at the distribution and take the lowest
x% and the highest x% of the distribution (i.e. the very high frequency
words and the very low frequency words) and leave out all words in the
middle. This causes the problem that I would "lose" a lot of possible
items for my experiment, as these "middle frequency words" would fit
neither the high frequency category nor the low frequency category.
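
To make the percentile idea concrete, here is a minimal sketch in
Python. The function name split_by_percentile, the cut-off x = 20 and
the toy counts are all invented for illustration; they are not CELEX
values, and this is only one possible way to do the split.

```python
import math


def split_by_percentile(freqs, x):
    """Split words into (low, middle, high) using the lowest and highest x%.

    freqs maps each word to its frequency count; x is a percentage in (0, 50].
    """
    if not 0 < x <= 50:
        raise ValueError("x must be greater than 0 and at most 50")
    # Rank words from least to most frequent; ties are broken alphabetically.
    ranked = sorted(freqs, key=lambda word: (freqs[word], word))
    k = math.floor(len(ranked) * x / 100)  # number of words in each tail
    low = ranked[:k]
    high = ranked[len(ranked) - k:]
    middle = ranked[k:len(ranked) - k]  # the items that would be left out
    return low, middle, high


# Hypothetical counts, made up for this example only.
toy_counts = {
    "und": 9000, "haus": 400, "kind": 350, "tisch": 40, "apfel": 35,
    "lampe": 30, "ziege": 12, "amboss": 3, "dachs": 2, "zwirn": 1,
}

low, middle, high = split_by_percentile(toy_counts, 20)
print("low frequency: ", low)
print("high frequency:", high)
print("left out:      ", len(middle), "of", len(toy_counts), "words")
```

With x = 20, six of the ten toy words end up in the unused middle,
which is the loss of items described above. Note also that words with
identical counts near a cut-off are separated arbitrarily (here
alphabetically), which matters when many words share similar counts.
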
A big problem with the word list is that there are many words of
similarly medium or low frequency and only a few with really high
frequency.
Any suggestions, or publications that had to deal with the same problem?

Thank you!

Marita Böhning



******************************
Marita Boehning
Department of Linguistics
University of Potsdam
P.O. Box 60 15 53
D - 14415 Potsdam
Germany
Phone: +49 331 977 2929
Fax:  +49 331 977 2095
*****************************


