[Lexicog] Classifying compounds

Mike Maxwell maxwell at LDC.UPENN.EDU
Thu Jun 16 13:36:57 UTC 2005


Kenneth Keyes wrote:
> One of the many exciting features of Fieldworks is the possibility of
> including a more rigorous treatment of complex lexical items, among
> them, compounds as subentries in the lexical database. So far the
> developers have included "A MoEndoCentric Compound." What other types of
> compounds are there? How do I find out? Can anyone suggest some
> resources to research?

I can speak to the issue of what the underlying model allows, but not to
when any of this will be implemented.  The following is taken from the
model description I wrote several years ago; references are at the end.

-----------------------

The typology of compounds is based on the types in Spencer (1991: 310ff.); some of
the compounds he discusses shade into idioms.  The model does not attempt
to account for the semantics of compounds in the compounding rule, since
this is largely unpredictable. Nor do we attempt to account for argument
linking (e.g. 'drawbridge', where 'bridge' is the internal argument of 'draw').

At a general level, the model distinguishes between binary branching
compounds and coordinate compounds.  Most compounds are binary branching
(although see the cautions in Fabb (1998), section 2.3). Binary compounds
which contain more than two constituents must therefore be built up
recursively: 'manhole cover' has the structure [[man hole] cover], rather
than a flat structure.
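
The recursive build-up can be pictured with a small sketch. This is only
an illustration, not the Fieldworks model itself; the class names Stem and
BinaryCompound and the function bracket are made up for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Stem:
    """A simple (non-compound) constituent. Hypothetical class name."""
    form: str


@dataclass(frozen=True)
class BinaryCompound:
    """A compound with exactly two constituents, each of which may itself
    be a compound. Hypothetical class name."""
    left: "Node"
    right: "Node"


Node = Union[Stem, BinaryCompound]


def bracket(node: Node) -> str:
    """Render the constituent structure as a labelled bracketing."""
    if isinstance(node, Stem):
        return node.form
    return f"[{bracket(node.left)} {bracket(node.right)}]"


# Three constituents, built up from two binary steps.
manhole = BinaryCompound(Stem("man"), Stem("hole"))
manhole_cover = BinaryCompound(manhole, Stem("cover"))
print(bracket(manhole_cover))  # [[man hole] cover]
```

Because a BinaryCompound can only hold two members, a flat three-way
structure simply cannot be represented; nesting is the only option.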

Among binary compounds, the model distinguishes endocentric and exocentric
compounds.

In endocentric compounds (the part of the model that handles this also
handles incorporation), the morphosyntactic properties of the head
constituent determine the morphosyntactic properties of the compound
structure (the head's morphosyntactic features "percolate").  Most English
compounds are of this type.
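
A minimal sketch of percolation follows. It assumes a right-hand head
(as in most English compounds) and a flat feature dictionary; the names
Word and endocentric_compound are invented for the illustration, and
'blackbird' is my own example rather than one from the model description.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Word:
    """A form plus its morphosyntactic features. Hypothetical class name."""
    form: str
    features: Dict[str, str] = field(default_factory=dict)


def endocentric_compound(non_head: Word, head: Word) -> Word:
    """Build a compound whose features are copied ("percolated") from the head.

    Assumption: the head is the right-hand member.
    """
    return Word(form=non_head.form + head.form, features=dict(head.features))


black = Word("black", {"cat": "A"})
bird = Word("bird", {"cat": "N"})
blackbird = endocentric_compound(black, bird)
print(blackbird.features)  # {'cat': 'N'}
```

The features of the non-head play no part in the result, which is the
point of calling the other member the head.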

Note that the head constituent is best defined in terms of the relationship
between the head and the whole, rather than the relationship between the
two constituents; thus, the English 'killjoy' is not endocentric, despite
the fact that 'kill' presumably selects 'joy'. See Fabb (1998: 70).

This class does not provide a way to override the percolation of the head’s
morphosyntactic properties to the output structure, this being essentially
the definition of ‘head’. However, this may be too strict a limitation, in
that a construction might override the head properties by imposing a minor
modification on the morphosyntactic properties of the output. For example,
in languages with (true) incorporation, incorporation of the direct object
may or may not make the resulting verb intransitive (Baker 1996: 31).  It
may therefore be necessary to provide for overriding the percolation of
features, or (better) to change the subcategorization list of the head.

Endocentric compounds are inflected on their heads (Scalise 1986: 124).
Non-heads of endocentric compounds are usually uninflected, even when the
word in question is always inflected in isolation. English examples are
pluralia tantum words like 'scissors' and 'trousers', which appear in their
singular forms in compounds: scissor-handle and trouser-leg (example (68)
in Scalise 1986: 123). Exceptions sometimes occur with irregular plurals:
teethmarks (but cf. toothbrush, *teethbrush).

Exocentric compounds are compounds like the English 'killjoy' or Spanish
'paracaidas' "parachute", in which neither constituent appears to be the head.

To my knowledge, exocentric compounds are not inflected for their syntactic
function, although the individual members of the compound may have their
own inflection. For instance, the Spanish example of the previous paragraph
is made up of the verb form 'para' "stops" and a feminine plural noun
'caidas' "falls"; the compound itself is masculine and ambiguous for
number, but is not so inflected. Other Spanish examples include
'lavaplatos' "dish washer", consisting of the third person singular present
indicative verb 'lava' "washes" and the masculine plural noun 'platos'
"dishes"; and 'sacamuelas' "dentist", consisting of the third person
singular present indicative verb 'saca' "removes" and the plural noun
'muelas' "teeth".  'Lavaplatos' is masculine and ambiguous for number,
while 'sacamuelas' is ambiguous for both gender and number.
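
One way to picture the difference from the endocentric case: since neither
member is the head, the features of the whole have to be stated by the
compounding rule itself. The sketch below assumes that, and uses invented
names (Word, exocentric_compound) and an invented feature inventory; a
feature that is "ambiguous" is simply left out of the dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Word:
    """A form plus its morphosyntactic features. Hypothetical class name."""
    form: str
    features: Dict[str, str] = field(default_factory=dict)


def exocentric_compound(first: Word, second: Word,
                        output_features: Dict[str, str]) -> Word:
    """Build a compound whose features come from the rule, not from a member."""
    return Word(form=first.form + second.form, features=dict(output_features))


lava = Word("lava", {"cat": "V", "person": "3", "number": "sg"})
platos = Word("platos", {"cat": "N", "gender": "masc", "number": "pl"})

# Masculine noun; number is deliberately left unspecified.
lavaplatos = exocentric_compound(lava, platos, {"cat": "N", "gender": "masc"})
print(lavaplatos.form)                   # lavaplatos
print("number" in lavaplatos.features)   # False
```

The members keep their own inflection inside the compound ('platos' is
still plural), but none of it is visible on the compound as a whole.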

Coordinate compounds are compounds of which all the members are heads;
branching may be non-binary.  There are several linguistic terms for this
sort of compound, including ‘co-ordinate compounds’, ‘appositional
compounds’, and ‘dvandva compounds’. An example (taken from Fabb 1998: 74;
see also section 1.1.2) is the Tamil vira-tira-cakacan-kal "courage,
bravery and valour".  [I've had to remove all the special characters in
this Tamil word--I had it in a Roman-style transliteration, not the Tamil
characters, but even that I couldn't put in this email :-(.]
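
As a rough sketch of how a coordinate compound differs structurally from a
binary one: the members form a list of any length (two or more), and every
member counts as a head. The class name CoordinateCompound is invented for
the illustration, and leaving the final -kal out of the member list is my
assumption that it attaches to the compound as a whole.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CoordinateCompound:
    """A compound with two or more members, all of them heads.
    Hypothetical class name."""
    members: Tuple[str, ...]

    def __post_init__(self) -> None:
        if len(self.members) < 2:
            raise ValueError("a compound needs at least two members")

    def heads(self) -> Tuple[str, ...]:
        """Every member of a coordinate compound is a head."""
        return self.members


# Assumption: the final -kal is not one of the coordinated members.
tamil = CoordinateCompound(("vira", "tira", "cakacan"))
print(len(tamil.heads()))  # 3
```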

References:

Baker, Mark. 1996. The Polysynthesis Parameter. Oxford Studies in
Comparative Syntax. Oxford: Oxford University Press.

Fabb, Nigel. 1998. "Compounding." Pages 66-83 in Spencer and Zwicky (1998).

Scalise, Sergio. 1986. Generative Morphology. Studies in Generative Grammar
18. Dordrecht: Foris.

Spencer, Andrew. 1991. Morphological Theory. An Introduction to Word
Structure in Generative Grammar. Oxford: Basil Blackwell.

Spencer, Andrew; and Arnold M. Zwicky (eds.). 1998. The Handbook of
Morphology. Blackwell Handbooks in Linguistics. Oxford: Blackwell Publishers.
--
	Mike Maxwell
	Linguistic Data Consortium
	maxwell at ldc.upenn.edu



