"Vocabulary Density"

Wed Aug 11 18:17:48 UTC 1999

Some initial thoughts:

(a) Languages differ enormously from one another in their vocabulary density
and in how their vocabulary is distributed across the total "semantic
space", as you put it.   If we are going to compare them with reconstructed
languages, don't we need to establish some kind of benchmark for existing
languages first?

(b) I can only speak for PIE amongst the reconstructed languages.  Its
reconstructed vocabulary is certainly odd.   Not including obvious root
extensions:
   (i) There are 18 roots for glisten/glitter, and 12 for shine (total 30)
   (ii) There are 8 for goat
   (iii) There are 8 or 9 for grow
   (iv) There are 23 for hit
   (v) There are 10 for jump
   (vi) There are 11 for weave/plait
   (vii) There are 12 for pull
   (viii) There are 11 for press
   (ix) There are 24 for turn
   (x) There are 17 for swell
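
For anyone who wants to play with the figures, the counts in the list above can be tallied with a few lines of Python. This is only a rough sketch: the names `root_counts` and `summarize` are made up for illustration, "grow" is entered as 8 (the lower of "8 or 9"), and glisten/glitter and shine are kept as separate entries, as in item (i).

```python
# Root counts per concept, copied from the list above.
# Assumption: "grow" is given as "8 or 9"; the lower figure is used here.
root_counts = {
    "glisten/glitter": 18,
    "shine": 12,
    "goat": 8,
    "grow": 8,
    "hit": 23,
    "jump": 10,
    "weave/plait": 11,
    "pull": 12,
    "press": 11,
    "turn": 24,
    "swell": 17,
}


def summarize(counts):
    """Return (total roots, mean roots per concept, concept with most roots)."""
    total = sum(counts.values())
    mean = total / len(counts)
    largest = max(counts, key=counts.get)
    return total, mean, largest


total, mean, largest = summarize(root_counts)
print(f"{total} roots over {len(root_counts)} concepts, mean {mean:.1f}, most: {largest}")
```

These are only the hand-picked extreme cases, so the mean here says nothing about PIE as a whole. The same tally run over a full concept list for an existing language would be one way to get the kind of benchmark asked for in (a).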

A large number of roots have similar semantic connotations (over
half the semantic concepts have at least two independent reconstructed
roots).   Some concepts have large numbers of these "pseudo-synonyms".
Given the patchy and limited nature of what we can reconstruct, it certainly
seems that reconstructed PIE has its words clustered around some concepts at
the expense of others.

So this raises two questions:
(A) Is this pattern anything like that found in natural languages?
(B) Is the overall average anything like the overall average in natural
languages?

Peter