[Corpora-List] Chomsky and computational linguistics

Rob Freeman lists at chaoticlanguage.com
Fri Aug 3 02:07:34 UTC 2007


I see where you are coming from with this Chris. You are thinking of "one
grammar" as the range of predictive features used to learn a model. I hadn't
thought of it like that. In that sense people in this area are quite
flexible, I agree. I recall a presentation by... Eric Brill(?) for SIGDial,
at the ACL in HK a few years back, where he cast a very wide net of features
and tried to sift out some (any?) with the predictive properties he was
looking for.

But I'm afraid even this fits my definition of "one grammar", yes, because
for all of this kind of work, however wide the net of features they cast,
the goal has been the same. The goal of all the work I have seen has been
to learn a model: one model, singular, complete, explaining/predicting all
the structure of a given language in one go.

I don't think the features matter a whole lot. We may find we need to refine
them a bit to get the last drop. To get everything, you probably need to go
right into social relations and the details of biology, as the embodiment
guys in the functional and cognitive schools claim.

But I don't think the problem has been features. I think the problem is in
the goal of a single complete generalization over those features.

Where is the work which drops the assumption of a single complete and
consistent generalization over a set of features?

The adaptor grammar stuff looks interesting. They seem to be saying: relax
your initial assumption of a fixed global model and allow your model to
change. That could be good. But it is quite limited, really just an easier
way of getting at existing Bayesian models.
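
For concreteness, the "allow your model to change" part looks to me like
the usual nonparametric Bayesian story, where the number of reusable units
is not fixed in advance but grows with the data. A toy Chinese restaurant
process sketch of that growth (my own illustration, in Python, and nothing
like their actual adaptor machinery):

import random

def crp_tables(n_customers, alpha=1.0, seed=0):
    # Toy Chinese restaurant process: the number of "tables" (reusable
    # units) is not fixed in advance; it grows with the data.
    rng = random.Random(seed)
    tables = []                     # tables[k] = customers at table k
    for n in range(n_customers):
        if rng.random() < alpha / (n + alpha):
            tables.append(1)        # open a new table
        else:
            r = rng.random() * n    # otherwise join an existing table,
            acc = 0                 # in proportion to its size
            for k, size in enumerate(tables):
                acc += size
                if r < acc:
                    tables[k] += 1
                    break
    return tables

print(crp_tables(100))  # how many tables you end up with is data-driven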

At any given moment they require their model to be globally complete and
consistent.

Discontinuous change in particular, flipping from one perspective to
another, would surely be quite hard to do like this, even if the posterior
distribution, and not context, were the appropriate parameter of change.

From my point of view stochastic models are only necessary at all because we
insist on global consistency. We try to fit "black" = "strong" and "black"
!= "strong" into the same model, and the result is random. If we relax the
condition of global consistency our model will be perfectly deterministic
(though incomplete, in any given instance.)
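
To make that concrete, here is a toy Python sketch, with made-up
substitutability judgments (my illustration, not anybody's actual model).
One generalization over every context is forced to average them and comes
out as a probability; a context-local generalization is deterministic, but
silent about contexts it has not seen:

# Hypothetical judgments: can "strong" stand in for "black" here?
observations = {
    "coffee": True,    # "black coffee" ~ "strong coffee"
    "tea":    True,    # "black tea"    ~ "strong tea"
    "hair":   False,   # "black hair"   is not "strong hair"
    "belt":   False,   # "black belt"   is not "strong belt"
}

def global_model(obs):
    # One generalization over all contexts: forced to be a probability.
    return sum(obs.values()) / len(obs)

def local_model(obs, context):
    # A per-context generalization: deterministic but incomplete,
    # since it says nothing about contexts it has not seen.
    return obs.get(context)

print(global_model(observations))           # 0.5  -- looks random
print(local_model(observations, "coffee"))  # True -- deterministic
print(local_model(observations, "shoes"))   # None -- incomplete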

Where is the work which looks at it this way?

-Rob

On 8/2/07, chris brew <cbrew at acm.org> wrote:
>
>
> >
> > I don't think the problem has been our techniques, which have been good
> > for 50 years or more (until Chomsky noticed they gave multiple inconsistent
> > results.) The problem has been our goals. If we change our goal from that of
> > finding one complete grammar (and the functional, cognitive guys are just as
> > guilty of this, not to mention the engineers and machine learning guys),
>
>
>
> I might agree with most of this. But not with the account I think Rob is
> giving of what the engineers are doing. I say "think" because I assume
> Rob's "not to mention" is
> saying that the engineers are in the same boat as the rest on the
> philosophical issues about grammar. Anyway, only under a very liberal
> interpretation of what it means to have "one complete grammar" can the kinds
> of models that dominated this year's CL conferences be seen as pursuit of
> the goal that Rob ascribes. Rather, the engineers and machine learning
> people seem to me entirely open to using whatever structures and predictive
> features can drive up their performance figures. It's taken a while, but my
> impression is that syntactically inspired notions such as dependency
> relations are gaining a bit more traction in this activity than was the case
> a couple of years ago.  The relevant syntactic notions have been around
> essentially forever, but only lately are people understanding how to exploit
> them well enough to get benefit. There is a lot of variation in how able and
> willing the engineers are to put in the effort to reduce unfamiliar concepts
> expressed in the terminology of other fields to effective engineering
> practice. That should be no surprise.
>
> Also, although such things are obviously a minority taste at ACL,  Mark
> Johnson and colleagues have been doing work that opens up a few options that
> theoreticians might want to think about. In particular, they are working
> with rather flexible grammars for which no exact inference algorithms are
> known, and instead use Markov Chain Monte Carlo to obtain posterior
> distributions over grammatical analyses. Open-minded syntacticians reading
> this list might want to meditate on whether this way of doing business is a
> relevant and interesting challenge to their standard assumptions, and if so,
> how to respond in a constructive way.
>
> Chris
>
> Here's a citation for some of the work by Johnson and colleagues. There's
> more on his publications page.
>
> Mark Johnson, Thomas L. Griffiths and Sharon Goldwater (2007) Adaptor
> Grammars: A Framework for Specifying Compositional Nonparametric Bayesian
> Models, to appear in Proceedings of NIPS.
> <http://www.cog.brown.edu/%7Emj/papers/JohnsonGriffithsGoldwater06AdaptorGrammars.pdf>
>