9.112, Disc: Vocabulary Density
The LINGUIST List
linguist at linguistlist.org
Fri Jan 23 12:58:40 UTC 1998
LINGUIST List: Vol-9-112. Fri Jan 23 1998. ISSN: 1068-4875.
Subject: 9.112, Disc: Vocabulary Density
Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at linguistlist.org>
Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>
T. Daniel Seely: Eastern Michigan U. <seely at linguistlist.org>
Review Editor: Andrew Carnie <carnie at linguistlist.org>
Associate Editor: Ljuba Veselinova <ljuba at linguistlist.org>
Assistant Editors: Martin Jacobsen <marty at linguistlist.org>
Brett Churchill <brett at linguistlist.org>
Anita Huang <anita at linguistlist.org>
Julie Wilson <julie at linguistlist.org>
Elaine Halleck <elaine at linguistlist.org>
Software development: John H. Remmers <remmers at emunix.emich.edu>
Zhiping Zheng <zzheng at online.emich.edu>
Home Page: http://linguistlist.org/
Editor for this issue: Elaine Halleck <elaine at linguistlist.org>
=================================Directory=================================
1)
Date: Thu, 22 Jan 98 10:46:16 +0300
From: "solovyev" <solovyev at open.ksu.ras.ru>
Subject: Vocabulary density
-------------------------------- Message 1 -------------------------------
Date: Thu, 22 Jan 98 10:46:16 +0300
From: "solovyev" <solovyev at open.ksu.ras.ru>
Subject: Vocabulary density
Dear Linguists,
How densely packed can vocabularies be?
What kind of discussion has there been about
'vocabulary density' in natural and artificial
languages?
Density could be depicted crudely as the ratio
of recognised words to possible sound-strings in
each size class. For example, of all the
one-syllable sound-strings possible in a language,
only a subset are recognised as 'words';
likewise, of the two-syllable sound-strings, only
a subset have established meanings.
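
As a rough illustration of the ratio described above, here is a minimal
Python sketch. It rests on loud assumptions that are not in the original
query: letters stand in for phonemes, string length stands in for the
size class, every combination of symbols counts as a possible
sound-string, and the alphabet and lexicon are invented toy data. The
function name `density` is hypothetical.

```python
def density(lexicon, alphabet, length):
    """Share of all possible strings of a given length that are words.

    Assumption: every combination of symbols from `alphabet` is a
    possible sound-string, so there are len(alphabet) ** length of them.
    """
    symbols = set(alphabet)
    possible = len(symbols) ** length
    # Count only lexicon entries of the right length built from the alphabet.
    words = {w for w in lexicon if len(w) == length and set(w) <= symbols}
    return len(words) / possible


# Toy data, not a real language.
alphabet = "abt"
lexicon = {"at", "ta", "ab", "bat", "tab"}

two = density(lexicon, alphabet, 2)    # 3 words out of 9 possible strings
three = density(lexicon, alphabet, 3)  # 2 words out of 27 possible strings
print(two, three)
```

A real language also forbids many sound combinations, so a more
careful count of possible strings would be smaller than this sketch
assumes, and the resulting density correspondingly higher.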
Presumably the closer the ratio is to 1.0, that
is, the more densely packed the vocabulary is,
the more ambiguity problems users and translators
of the language will encounter. In a dense
language, a single phonetic or spelling error
is more likely to create another meaningful
word inadvertently.
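
The point about single errors can be sketched the same way. The
following hypothetical function, `confusable_fraction`, counts how often
replacing one symbol of a word produces another word in the lexicon.
Again the assumptions are mine, not the query's: letters stand in for
phonemes, only substitutions are considered (no insertions or
deletions), and the lexicons are toy data.

```python
def confusable_fraction(lexicon, alphabet):
    """Share of single-symbol substitutions that turn a word into another word."""
    hits = 0
    total = 0
    for word in lexicon:
        for i, original in enumerate(word):
            for symbol in alphabet:
                if symbol == original:
                    continue  # not an error: the word is unchanged
                total += 1
                variant = word[:i] + symbol + word[i + 1:]
                if variant in lexicon:
                    hits += 1
    return hits / total if total else 0.0


# Fully dense toy vocabulary: every two-symbol string is a word.
dense = {"aa", "ab", "ba", "bb"}
# Sparse toy vocabulary: each word is surrounded by non-words.
sparse = {"aa", "bb"}

print(confusable_fraction(dense, "ab"))
print(confusable_fraction(sparse, "ab"))
```

In the dense toy vocabulary every substitution lands on another word,
while in the sparse one none does, so any single error there is
detectable as a non-word.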
A lot of redundancy in a vocabulary - that is
plenty of dummy words, made up of sound-strings
not recognised as words, clustered 'around' each
recognised word - may make a language more
efficient for communicating.
There must be many papers examining these
issues, and I am hoping to find them on the
web and catch up on where the discussion
stands. I guess it must be somewhere
where linguistics meets information theory?
Mark Griffith, journalist
markgriffith at yahoo.com
This question was posted on the web site of the
"Web Journal of Formal, Computational and
Cognitive Linguistics" (FCCL). Please also send
your opinions to the e-mail address
<solovyev at tatincom.ru>.
Valery Solovyev, Editor of the FCCL.
---------------------------------------------------------------------------