[Corpora-List] Incidence of MWEs
Rob Freeman
lists at chaoticlanguage.com
Mon Mar 20 08:44:00 UTC 2006
Hi Mike,
Thanks for this. You have thought about my post, which means I may be able to
explain my position.
On Monday 20 March 2006 03:55, Mike Maxwell wrote:
> Rob Freeman wrote:
> > Surely the question is are tags sensible parameters of language in
> > the first place.
>
> I am not sure what you mean by "parameters".
I mean the same as you mean. The difference in our thinking is not in the
meaning of the word "parameters". The difference is in the nature of the
parameters we think are appropriate to language.
Briefly...
You think generalizations of usage parametrize language.
I think usage parametrizes language by virtue of various generalizations.
They are not the same.
Remember I said my differentiating point was "...it is not enough that your
model be based on generalizations of usage, you must also allow for the
possibility of discontinuous change in those generalizations."
To push your rope metaphor, wound and unwound are not just two ways of looking
at the same thing. Sometimes the way you wind the rope makes a difference.
Your metaphor suggests there is only one way of winding language usage into a
"rope", and the rope itself, wound in this way, is a sufficient parameter. In
contrast I think there are many different ways of winding that rope, and we
need them all.
In my opinion, overlooking this potential for different "windings" is the
single greatest misapprehension preventing language engineering (in
particular) from moving forward.
One "winding" for each MWE.
> > Instead of worrying where MWE's start and stop, let's accept that
> > MWE's cover all of language. All language is an MWE.
>
> Except for this, which isn't an MWE. And except for your posting, which
> isn't an MWE either (at least not one that I've ever seen before).
This is easy to refute. At least, the refutation is standard theory in some
areas of linguistics.
"Except for this" is an MWE. If you replace "except" with "also", what does
"also for this" mean? If you replace "for" with "to benefit", what does
"except to benefit this" mean? If you replace "this" with "it", is "except for
it" acceptable usage? Is "except for here" acceptable usage, or should that
just be "except here"? Is "excepting for this" proper usage? If so, is
"including for this" meaningful?
"Except for this" is an MWE, and so is everything else. I could go on. Pawley
and Syder is a classic reference on this.
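
The substitution questions asked about "except for this" can be generated
mechanically. What follows is only a rough sketch in Python: the function name
and the table of alternatives are made up for illustration, and the code makes
no acceptability judgments of its own. Those still have to come from a speaker
or from corpus counts.

```python
def substitution_variants(phrase, alternatives):
    """Return every phrase obtained by swapping exactly one word for an alternative.

    `alternatives` maps a word to a list of candidate replacements. It is a
    hypothetical, hand-written table, not the output of any real resource.
    """
    words = phrase.split()
    variants = []
    for i, word in enumerate(words):
        for alt in alternatives.get(word, []):
            # Replace only position i; keep the rest of the phrase intact.
            variants.append(" ".join(words[:i] + [alt] + words[i + 1:]))
    return variants


# Illustrative alternatives, taken from the substitutions tried by hand.
ALTERNATIVES = {
    "except": ["also"],
    "for": ["to benefit"],
    "this": ["it", "here"],
}

if __name__ == "__main__":
    for variant in substitution_variants("except for this", ALTERNATIVES):
        print(variant)
```

Each variant printed is a candidate to check against usage. If few of them
survive, the original string is behaving as a unit rather than as a free
combination of its parts.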
> > Explain MWE's in terms of generalizations over usage and let's start
> > thinking about how we can use these generalizations over usage
>
> Uh, let's see. Here's a generalization over usage: the MWE "kick the
> bucket" has a distribution much like the MWE "fire off a shot", which
> has a distribution much like the MWE "pick up the pace", etc. Let's
> make up a label for these MWEs that obey this generalization--I dunno,
> maybe "VP".
> ...
I think I covered this (different "windings") above, but it is worth
repeating.
This is the important bit. This is the core of the difference between us.
This is exactly the mistake which is made by the popular modern sub-field of
Grammatical Inference, and statistical NLP in general. This is where they
fail.
You are making generalizations, and assuming these generalizations can be
treated as sufficient parameters.
The problem is there is no single complete set of generalizations of this
kind to be made. You can make generalizations, but the generalizations are
necessarily multiple and inconsistent.
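
To make "multiple and inconsistent" concrete, here is a small Python sketch.
It rests on loud assumptions: the frames F1 to F3 and the attestation table
are invented, not corpus data, and the function names are mine. Each frame
induces a class of the items that occur in it. Two classes "cross" when they
overlap but neither contains the other. Crossing classes cannot both be
labels in one consistent partition or one hierarchy of nested labels.

```python
from itertools import combinations


def classes_by_frame(attested):
    """Invert item -> frames into frame -> set of items sharing that frame."""
    classes = {}
    for item, frames in attested.items():
        for frame in frames:
            classes.setdefault(frame, set()).add(item)
    return classes


def crossing_pairs(classes):
    """Return pairs of frames whose classes overlap without one nesting in the other."""
    pairs = []
    for f1, f2 in combinations(sorted(classes), 2):
        a, b = classes[f1], classes[f2]
        if a & b and not (a <= b or b <= a):
            pairs.append((f1, f2))
    return pairs


# Invented toy data: which abstract frames each item is attested in.
ATTESTED = {
    "kick the bucket": {"F1", "F3"},
    "fire off a shot": {"F1", "F2"},
    "pick up the pace": {"F2", "F3"},
}

if __name__ == "__main__":
    classes = classes_by_frame(ATTESTED)
    for f1, f2 in crossing_pairs(classes):
        print(f1, sorted(classes[f1]), "crosses", f2, sorted(classes[f2]))
```

In this toy table every pair of classes crosses, so a single label such as
"VP" can respect at most one of the three generalizations at a time. Real
distributions will be less tidy, but the check itself is the same.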
Of historical interest, I understand this is related to the result which
Chomsky observed (for phonology) and used to force the abandonment of context
distributions as a parameter of grammatical inference in the '50s.
Chomsky used this result to force the abandonment of rules based on
generalizations over usage. We must now use the same result (in contrast to
Chomsky) to give precedence back to generalizations over usage and to refute,
instead, the idea of a single consistent set of rules.
Whatever we do, we must not ignore the result itself (inconsistent
generalizations) any longer. That would indeed be reason for Chomsky to
(continue to) hold us in contempt.
-Rob