How to select words for a bilingual dictionary (was: [Lexicog] Digest Number 66)

Mike Maxwell maxwell at LDC.UPENN.EDU
Thu Mar 11 16:44:14 UTC 2004


Mery Martinelli wrote:
> ...a lemma may represent more than a word. For example
> in Italian some adjectives have 4 different forms and to
> describe them the corpus should provide us at least 80
> instances,  20 occurrences for each form.

I'm not understanding something here.  What are the four different "forms"?
Are they just the inflected variants (masculine singular, masculine plural,
feminine singular, feminine plural)?  If so, I don't see any need for
finding instances in corpora of all four forms.  All four forms are just
variants of a single lemma, and unless there is something really different
about, say, the meaning of the singular and plural (which I doubt), then
we're really talking about needing 20 occurrences of the lemma in total,
pooled across its forms (given your criterion that
"To describe the behaviour of a word we need at least 20 instances").

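To make the counting concrete, here is a minimal sketch in Python of pooling
inflected forms under one lemma before applying the 20-instance threshold.
Everything in it is an assumption for illustration: the adjective "rosso",
the hand-written form-to-lemma table, the function names, and the toy token
counts are made up and do not come from any real corpus.

```python
from collections import Counter

# Hypothetical form-to-lemma table for one Italian adjective ("rosso", 'red').
# A real project would get this from a morphological analyser or lexicon.
FORM_TO_LEMMA = {
    "rosso": "rosso",  # masculine singular
    "rossi": "rosso",  # masculine plural
    "rossa": "rosso",  # feminine singular
    "rosse": "rosso",  # feminine plural
}

MIN_INSTANCES = 20  # the threshold quoted in the message above


def lemma_counts(tokens, form_to_lemma):
    """Count corpus tokens per lemma, pooling all inflected forms."""
    counts = Counter()
    for tok in tokens:
        lemma = form_to_lemma.get(tok.lower())
        if lemma is not None:  # skip tokens we have no lemma for
            counts[lemma] += 1
    return counts


def has_enough_evidence(tokens, lemma, form_to_lemma, minimum=MIN_INSTANCES):
    """True if the lemma reaches the threshold across all of its forms."""
    return lemma_counts(tokens, form_to_lemma)[lemma] >= minimum


# Toy corpus with invented counts: no single form reaches 20,
# but the lemma as a whole does.
toy_tokens = ["rosso"] * 9 + ["rossa"] * 7 + ["rossi"] * 3 + ["rosse"] * 1

form_counts = Counter(toy_tokens)
print(form_counts)
print(lemma_counts(toy_tokens, FORM_TO_LEMMA))
print(has_enough_evidence(toy_tokens, "rosso", FORM_TO_LEMMA))
```

Counting per form would reject this adjective four times over; counting per
lemma accepts it once, which is the point being argued here.
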
Note that this is different from the question of irregular forms--if a form
is irregular, it should be listed in the dictionary.  But IMO it isn't
necessary to find 20 instances of an irregular form.  It shouldn't even be
necessary to find one instance of an irregular form if all you want to do is
show that its form is irregular; it would be perfectly sufficient to use
your native (or non-native) intuition to tell you that.  (Having said that,
irregular forms tend to be common, else they ain't gonna maintain their
irregularity across generations of speakers.)

    Mike Maxwell
    Linguistic Data Consortium
    maxwell at ldc.upenn.edu



