[Corpora-List] Incidence of MWEs

Will Fitzgerald will.fitzgerald at gmail.com
Wed Mar 15 16:40:53 UTC 2006


The thing is, the various meanings of 'pencil sharpener', 'crayon
sharpener' and 'stick sharpener' are all predictable, just not from
their immediate lexical items. I think that any 'tool for Verbing
Noun' or a 'tool for Verbing, shaped like a Noun' will apply in Noun
Verb-er expressions. Certainly, because there is a greater need for
pencil sharpeners, pencil sharpeners tend to have standard shapes &
components, but a pencil sharpener that worked via laser beams would
still be a pencil sharpener. And imagine a tool for sharpening knives
that had a graphite core; in the proper context, 'pencil sharpener'
(or maybe even 'pencil knife sharpener') is ok.

The point is that general real-world knowledge, plus rules of phrasal
combination, create predictable meanings for some expressions that are
not predictable based on the lexical meanings.
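
To make that concrete, here is a rough sketch (mine, in Python, not a
worked-out system). It generates both template readings for a Noun
Verb-er expression and lets a small table of real-world knowledge pick
between them. The function names, the ATTESTED table and the naive
'-er' to '-ing' morphology are all assumptions made for illustration;
the table holds only the two cases discussed in this thread.

```python
from typing import Dict, List, Tuple


def gerund(agent_noun: str) -> str:
    """Naive morphology: 'sharpener' -> 'sharpening'.

    Assumption: the head is a plain '-er' agent noun; no spelling
    rules (doubling, final 'e') are handled.
    """
    if not agent_noun.endswith("er"):
        raise ValueError(f"expected an '-er' noun, got {agent_noun!r}")
    return agent_noun[:-2] + "ing"


def candidate_readings(noun: str, agent_noun: str) -> Dict[str, str]:
    """Return both template glosses for a 'Noun Verb-er' expression."""
    verbing = gerund(agent_noun)
    return {
        "purpose": f"tool for {verbing} {noun}",
        "shape": f"tool for {verbing}, shaped like a {noun}",
    }


# Hypothetical stand-in for real-world knowledge: which template is
# the established reading for a given compound.
ATTESTED: Dict[Tuple[str, str], str] = {
    ("pencil", "sharpener"): "purpose",
    ("stick", "sharpener"): "shape",
}


def interpret(noun: str, agent_noun: str) -> List[str]:
    """Pick the attested reading if one is known, else return both."""
    readings = candidate_readings(noun, agent_noun)
    known = ATTESTED.get((noun, agent_noun))
    if known is None:
        return list(readings.values())
    return [readings[known]]
```

With no entry in the table (say, for 'crayon sharpener'), interpret()
returns both glosses and leaves the choice to context, which is about
all the lexical items alone can support.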

Oh, by the way, here is a 'pencil pencil sharpener':
<http://www.shop-eds.com/ProductDetail.aspx?prntdid=1810&did=1828&pid=23623>



On 3/15/06, Amsler, Robert <Robert.Amsler at hq.doe.gov> wrote:
> I have found published dictionaries' judgments as to what constitute MWEs
> to be both dated and biased against declaring MWEs to exist. Until I
> actually went through a number of texts to extract MWEs by hand and
> compared those MWEs I found against those listed in dictionaries I used
> to think the lexicographic coverage was adequate and followed the rule
> that "if you can predict its meaning from its constituent parts, it
> doesn't need a separate entry" to be correct. What I found was that not
> only didn't the rule seem to be applied consistently, but that MWEs
> appeared to be a much neglected area of lexicography with many more
> undocumented MWEs being used in text than were in the dictionaries. It
> was as though dictionaries reviewed their MWE entries far less often and
> less diligently than they did their isolated word entries.
>
> There are probably good reasons against dictionary publishers declaring
> MWEs to exist. Namely, MWEs greatly increase the size of a dictionary
> for a small gain in clarity, perhaps only useful to Speakers of English
> as a Foreign Language (and practitioners of computational linguistics,
> information retrieval and artificial intelligence). The "prediction"
> rule used to discount MWEs needing entries seems to beg the question of
> what algorithm can predict these and what does that algorithm predict.
> There is a big difference between believing you are excluding MWEs
> because they are understandable without definitions and having an
> algorithm that can generate the definition you would have written from
> the separate dictionary entries for the component words.
>
> Take an MWE such as "pencil sharpener". Most dictionaries don't define
> this since according to the prediction rule, it could be assumed to be
> just "a sharpener for pencils". However, that denies the fact that we
> all know pencil sharpeners are a specific category of manufactured
> product and if you look for a photo of a pencil sharpener it will have
> one of several distinct models. We also know details about how pencil
> sharpeners work. In contrast, things like a "stick sharpener" or a
> "crayon sharpener" are novel creations without long-standing precedent
> (I just checked the web, and, sigh, they both exist, but a "stick
> sharpener" isn't a tool for sharpening sticks, it is a knife sharpener
> whose shape resembles a stick, i.e., a thin cylindrical file.)
>
> A pencil sharpener would be something like "an electrical, mechanical or
> manual device with sharpened blades into which pencils can be inserted
> and which when operated creates a tapered conical pointed tip on the
> pencil which initializes or renews its ability to be used as a writing
> implement"
>
> Here is where I would say computational linguistics has to take its
> leave of lexicography (or at least published lexicographic practice) and
> declare "pencil sharpener" to be a useful and necessary MWE. I would
> even go so far as to say that every MWE for which an explicit definition
> can be written, should have an explicit definition and that ONLY when
> the explicit definitions show no differentiation should they be
> eliminated in favor of entries for the separate word elements. That is,
> REVERSE the "prediction" rule to assume you cannot predict the meaning
> of an MWE until you fail to find anything to say in its definition that
> is not formulaic.
>
> I don't believe published dictionaries contain sufficient information to
> correctly understand the MWEs they fail to explicitly list. I don't
> believe published dictionaries actually think about MWEs consistently or
> conscientiously.


--
Will Fitzgerald
weblog: <http://www.entish.org/willwhim>


