[Corpora-List] Incidence of MWEs

Rob Freeman lists at chaoticlanguage.com
Mon Mar 20 08:44:00 UTC 2006


Hi Mike,

Thanks for this. You have thought about my post, which means I may be able to 
explain my position.

On Monday 20 March 2006 03:55, Mike Maxwell wrote:
> Rob Freeman wrote:
> > Surely the question is are tags sensible parameters of language in
> > the first place.
>
> I am not sure what you mean by "parameters".

I mean the same as you mean. The difference in our thinking is not in the 
meaning of the word "parameters". The difference is in the nature of the 
parameters we think are appropriate to language.

Briefly...

You think generalizations of usage parametrize language.

I think usage parametrizes language by virtue of various generalizations.

They are not the same.

Remember I said my differentiating point was "...it is not enough that your 
model be based on generalizations of usage, you must also allow for the 
possibility of discontinuous change in those generalizations."

To push your rope metaphor, wound and unwound are not just two ways of looking 
at the same thing. Sometimes the way you wind the rope makes a difference.

Your metaphor suggests there is only one way of winding language usage into a 
"rope", and the rope itself, wound in this way, is a sufficient parameter. In 
contrast I think there are many different ways of winding that rope, and we 
need them all.

In my opinion the failure to recognize this potential for different "windings" 
is the single greatest misapprehension preventing language engineering (in 
particular) from moving forward.

One "winding" for each MWE.

> > Instead of worrying where MWE's start and stop, let's accept that
> > MWE's cover all of language. All language is an MWE.
>
> Except for this, which isn't an MWE.  And except for your posting, which
> isn't an MWE either (at least not one that I've ever seen before).

This is easy to refute. At least, it is standard theory in some areas of 
linguistics.

"Except for this" is an MWE. Replacing "except" with "also" what does "also 
for this" mean? Replacing "for" with "to benefit", what does "except to 
benefit this" mean? Replacing "this" with "it", is "except for it" acceptable 
usage? Is "except for here" acceptable usage, or should that just be "except 
here"? Is "excepting for this" proper usage? If so, is "including for this" 
meaningful?

"Except for this" is an MWE, and so is everything else. I could go on. Pawley 
and Syder is a classic reference on this.

> > Explain MWE's in terms of generalizations over usage and let's start
> > thinking about how we can use these generalizations over usage
>
> Uh, let's see.  Here's a generalization over usage: the MWE "kick the
> bucket" has a distribution much like the MWE "fire off a shot", which
> has a distribution much like the MWE "pick up the pace", etc.  Let's
> make up a label for these MWEs that obey this generalization--I dunno,
> maybe "VP".
> ...

I think I covered this (different "windings") above, but it is worth 
repeating.

This is the important bit. This is the core of the difference between us.

This is exactly the mistake which is made by the popular modern sub-field of 
Grammatical Inference, and statistical NLP in general. This is where they 
fail.

You are making generalizations, and assuming these generalizations can be 
treated as sufficient parameters. 

The problem is there is no single complete set of generalizations of this 
kind to be made. You can make generalizations, but the generalizations are 
necessarily multiple and inconsistent.
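
A toy sketch of what "multiple and inconsistent" can look like, assuming a 
handful of made-up observations (the words, the context frames, and the 
function names are all hypothetical, not taken from any corpus or library). 
Grouping words by one shared context gives one class; grouping by another 
context gives a different class that overlaps the first without matching it, 
so the classes cannot be collapsed into a single set of labels:

```python
from collections import defaultdict

# Hypothetical toy observations: (word, context frame it was seen in).
observations = [
    ("run", "the _ was long"),
    ("walk", "the _ was long"),
    ("table", "the _ was long"),
    ("run", "they _ daily"),
    ("walk", "they _ daily"),
    ("eat", "they _ daily"),
]

def classes_by_context(obs):
    """Group words into one class per context frame they share."""
    groups = defaultdict(set)
    for word, context in obs:
        groups[context].add(word)
    return dict(groups)

def is_consistent_partition(groups):
    """True only if any two classes are either identical or disjoint."""
    sets = list(groups.values())
    for i, a in enumerate(sets):
        for b in sets[i + 1:]:
            if a != b and a & b:  # partial overlap: no single labelling fits
                return False
    return True

classes = classes_by_context(observations)
for context, words in classes.items():
    print(context, sorted(words))
print("single consistent set of classes:", is_consistent_partition(classes))
```

Here "run" and "walk" fall into both classes while "table" and "eat" fall 
into only one each, which is the kind of partial overlap a single tag set has 
to ignore.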

Of historical interest, I understand this is related to the result which 
Chomsky observed (for phonology) and used to force the abandonment of context 
distributions as a parameter of grammatical inference in the '50s.

Chomsky used this result to force the abandonment of rules based on 
generalizations over usage. We must now use this result (in contrast to 
Chomsky) to give precedence back to generalizations over usage, and refute 
the idea of a single consistent set of rules, instead.

Whatever we do, we must not ignore the result (inconsistent generalizations) 
itself anymore. That would be reason for Chomsky to (continue to) hold us in 
contempt, indeed.

-Rob


