[Corpora-List] Frequency of masc./fem/neut. in German

Andras Kornai andras at kornai.com
Fri Apr 17 15:54:11 UTC 2009


On Wed, Apr 15, 2009 at 10:35:13AM -0700, Dan I. Slobin wrote:
> 
>    How does this count treat noun compounds?  E.g., das Werk, der
>    Werkfuehrer, die Werkstatt... / die Kammer, das Kammerwasser, der
>    Kammerbeamter...

Dan,

to the extent that compounds inherit their gender from their head, it
is extremely unlikely that the overall numbers would change much: this
would require some special effect that impacts the productivity of
masc, fem, or neut bases differentially. You can observe the same broad
tendency (neuters contributing only about 15%, the rest being split
about equally between fem and masc) by simply counting die, der, and
das in running text. In 10.1m words from Project Gutenberg (typically
19th c. or earlier material) you find

242894  die
238893  der
106332  das

and similarly for 1990s newspaper text (8.4m words of Der Spiegel)

284777  die
265051  der
86214  das
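
A minimal sketch of this kind of count, in Python. How the figures above were actually produced is not stated, so the tokenization and the case folding here are assumptions, and count_articles and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

ARTICLES = ("die", "der", "das")

def count_articles(text):
    """Count whole-word occurrences of die/der/das, ignoring case.

    Assumption: a token is a maximal run of letters; the original
    counts may have used a different tokenizer or been case-sensitive.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(t for t in tokens if t in ARTICLES)
    return {a: counts[a] for a in ARTICLES}

# Hypothetical sample, not taken from any of the corpora above.
sample = ("Der Werkfuehrer geht in die Werkstatt und prueft das Werk. "
          "Die Kammer ist leer.")
print(count_articles(sample))
```

For a real corpus you would read the files and sum the per-file counts; nothing else changes.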

Such numbers are easily swayed by style: compare a 14.9m word sample
from Frankfurter Rundschau from the same year, which has

501637  der
497189  die
143069  das

Given this, and the fact that plurals would favor die over der, the
numbers are largely consistent with Sven's findings (but are obtained
with far less work).
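
To make the comparison concrete, the three sets of counts can be turned into percentages. This is only a sketch: the numbers are the ones quoted above, while article_shares and the dictionary layout are illustrative names.

```python
def article_shares(counts):
    """Return each form's share (in percent) of the die/der/das total."""
    total = sum(counts.values())
    return {word: 100.0 * n / total for word, n in counts.items()}

# Token counts exactly as quoted in this message.
CORPORA = {
    "Project Gutenberg": {"die": 242894, "der": 238893, "das": 106332},
    "Der Spiegel": {"die": 284777, "der": 265051, "das": 86214},
    "Frankfurter Rundschau": {"die": 497189, "der": 501637, "das": 143069},
}

for name, counts in CORPORA.items():
    shares = article_shares(counts)
    print(name, {word: round(pct, 1) for word, pct in shares.items()})
```

On these counts das comes out at roughly 18%, 14%, and 13% of the three-way total, which brackets the "about 15%" figure.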

>      Here are some type counts based on noun readings (and not noun
>      lemmas) in two computational lexica for German, ignoring
>      readings with more than 1 possible gender:
>
>                       fem   masc   neut
>      HaGenLex        6409   4702   1723
>      CELEX+HaGenLex 23311  15846  10064

Altogether, the effect of usage (masculine nouns seem to be used more
frequently than their frequency among stems would dictate) appears to
be considerably greater than the effects of compounding. But this is
just a rough order-of-magnitude impression; it would take quite a bit
of work to unravel the impact of these factors across genres and
styles.
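
The usage effect can be eyeballed by putting the masculine share of lexicon types next to the share of der among article tokens. This is a rough sketch only: it treats der as a crude proxy for masculine, although der also occurs in feminine and plural case forms, and the helper name share is illustrative.

```python
def share(counts, key):
    """Percentage of counts[key] within the total of all counts."""
    return 100.0 * counts[key] / sum(counts.values())

# Type counts from the quoted lexicon table.
LEXICA = {
    "HaGenLex": {"fem": 6409, "masc": 4702, "neut": 1723},
    "CELEX+HaGenLex": {"fem": 23311, "masc": 15846, "neut": 10064},
}

# Token counts of die/der/das from the three corpora in this message.
TOKENS = {
    "Project Gutenberg": {"die": 242894, "der": 238893, "das": 106332},
    "Der Spiegel": {"die": 284777, "der": 265051, "das": 86214},
    "Frankfurter Rundschau": {"die": 497189, "der": 501637, "das": 143069},
}

for name, counts in LEXICA.items():
    print(f"{name}: masc types {share(counts, 'masc'):.1f}%")
for name, counts in TOKENS.items():
    print(f"{name}: der tokens {share(counts, 'der'):.1f}%")
```

In every pairing the der token share is higher than the masc type share, which is the direction of the effect described above; the size of the gap should not be read as more than an order-of-magnitude hint.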

Andras Kornai



