I see where you are coming from with this, Chris. You are thinking of "one grammar" as the range of predictive features used to learn a model. I hadn't thought of it like that. In that sense people in this area are quite flexible, I agree. I recall a presentation by... Eric Brill(?) for SIGDial, at the ACL in HK a few years back, where he cast a very wide net of features and tried to sift out some (any?) with the predictive properties he was looking for.
But I'm afraid even this fits my definition of "one grammar", yes. Because for all of this kind of work, while they might have cast a very wide net of features, their goals have been the same. The goal of all the work I have seen has been to learn a model: one model, singular, complete, explaining/predicting all the structure of a given language in one go.
I don't think the features matter a whole lot. We may find we need to refine them a bit to get the last drop. Probably to get everything you would need to go right into social relations, details of biology and all, as the embodiment guys in the functional and cognitive schools claim.
But I don't think the problem has been features. I think the problem is in the goal of a single complete generalization over those features.

Where is the work which drops the assumption of a single complete and consistent generalization over a set of features?
The adaptor grammar stuff looks interesting. They seem to be saying: relax your initial assumption of a fixed global model and allow your model to change. That could be good. But it is quite limited, just an easier way to get existing Bayesian models.
At any given moment they require their model to be globally complete and consistent.

Discontinuous change in particular would surely be quite hard to do like this: flipping from one perspective to another, even if the posterior distribution, and not context, were the appropriate parameter of change.
From my point of view, stochastic models are only necessary at all because we insist on global consistency. We try to fit "black" = "strong" and "black" != "strong" into the same model, and the result is random. If we relax the condition of global consistency, our model will be perfectly deterministic (though incomplete in any given instance).
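To make the contrast concrete, here is a toy sketch (the contexts and counts are invented for illustration, not real data): a single pooled model can only return a probability for whether "black" patterns with "strong", while a set of context-local generalizations is deterministic but incomplete.

    from collections import Counter

    # Toy observations (invented): in some contexts "black" patterns with
    # "strong", in others it does not.
    observations = [
        ("coffee", "black", True),   # "black" = "strong"
        ("coffee", "black", True),
        ("tea",    "black", False),  # "black" != "strong"
    ]

    # One global, complete model: forced to pool everything, so the best it
    # can do is a probability -- this is where the "randomness" comes from.
    pooled = Counter(strong for _, _, strong in observations)
    print("global model: P(strong | black) =",
          pooled[True] / len(observations))          # prints ~0.67

    # Relax global consistency: one small generalization per context.
    # Each answer is exact, but unseen contexts get no answer at all.
    per_context = {(ctx, word): strong for ctx, word, strong in observations}
    print(per_context.get(("coffee", "black")))      # True  (deterministic)
    print(per_context.get(("tea", "black")))         # False (deterministic)
    print(per_context.get(("night", "black")))       # None  (incomplete)

The sketch is only meant to pin down what "deterministic but incomplete" means here; it says nothing about how the context-local generalizations would be learned.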
Where is the work which looks at it this way?

-Rob

On 8/2/07, chris brew <cbrew@acm.org> wrote:
> I don't think the problem has been our techniques, which have been good
> for 50 years or more (until Chomsky noticed they gave multiple
> inconsistent results.) The problem has been our goals. If we change our
> goal from that of finding one complete grammar (and the functional,
> cognitive guys are just as guilty of this, not to mention the engineers
> and machine learning guys),
I might agree with most of this. But not with the account I think Rob is giving of what the engineers are doing. I say "think" because I assume Rob's "not to mention" is saying that the engineers are in the same boat as the rest on the philosophical issues about grammar. Anyway, only under a very liberal interpretation of what it means to have "one complete grammar" can the kinds of models that dominated this year's CL conferences be seen as pursuit of the goal that Rob ascribes. Rather, the engineers and machine learning people seem to me entirely open to using whatever structures and predictive features can drive up their performance figures. It's taken a while, but my impression is that syntactically inspired notions such as dependency relations are gaining a bit more traction in this activity than was the case a couple of years ago. The relevant syntactic notions have been around essentially forever, but only lately are people understanding how to exploit them well enough to get benefit. There is a lot of variation in how able and willing the engineers are to put in the effort to reduce unfamiliar concepts, expressed in the terminology of other fields, to effective engineering practice. That should be no surprise.
Also, although such things are obviously a minority taste at ACL, Mark Johnson and colleagues have been doing work that opens up a few options that theoreticians might want to think about. In particular, they are working with rather flexible grammars for which no exact inference algorithms are known, and instead use Markov Chain Monte Carlo to obtain posterior distributions over grammatical analyses. Open-minded syntacticians reading this list might want to meditate on whether this way of doing business is a relevant and interesting challenge to their standard assumptions, and if so, how to respond in a constructive way.
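To give a feel for the mechanics of that style of inference, here is a minimal sketch: a hand-rolled Metropolis sampler over a made-up two-way ambiguity, with invented weights, standing in for the much richer adaptor-grammar machinery in the paper cited below. Competing analyses are scored, moves between them are proposed and accepted or rejected, and the sample frequencies approximate the posterior distribution over analyses.

    import random

    # Unnormalised posterior weights for two competing analyses of one
    # sentence (numbers invented purely for illustration).
    weights = {
        "PP attaches to verb": 3.0,
        "PP attaches to noun": 1.0,
    }
    analyses = list(weights)

    def metropolis(n_samples, seed=0):
        rng = random.Random(seed)
        current = rng.choice(analyses)
        counts = dict.fromkeys(analyses, 0)
        for _ in range(n_samples):
            proposal = rng.choice(analyses)        # symmetric proposal
            if rng.random() < min(1.0, weights[proposal] / weights[current]):
                current = proposal                 # accept the move
            counts[current] += 1
        return {a: c / n_samples for a, c in counts.items()}

    print(metropolis(100_000))
    # roughly {'PP attaches to verb': 0.75, 'PP attaches to noun': 0.25}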
Chris

Here's a citation for some of the work by Johnson and colleagues. There's more on his publications page.
Mark Johnson, Thomas L. Griffiths and Sharon Goldwater (2007). Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Models. To appear in Proceedings of NIPS.
http://www.cog.brown.edu/%7Emj/papers/JohnsonGriffithsGoldwater06AdaptorGrammars.pdf