On 9/15/07, <b class="gmail_sendername">Mike Maxwell</b> &lt;<a href="mailto:maxwell@umiacs.umd.edu" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">maxwell@umiacs.umd.edu</a>&gt; wrote:<div><span class="gmail_quote">

</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

One advantage of splitting apart syntax and semantics ... is that you begin to see how you can understand long distance dependencies of meaning, for example<br>   The support that John brought was helpful to making our argument.

<br>'support' means one thing; whereas in the following sentence<br>   The support that John brought was helpful for bracing the building.<br>it means something quite different.  Syntax allows you to relate the

choice of sense in each case to words at an arbitrary distance away, in a way that collocation *by itself* cannot do.</blockquote><div> You can model long-distance dependencies using raw word associations (collocations), Mike. I have implemented this. All you need to do is carry the collocation features along until they are needed to make a selection.

My implementation works a bit like unification in formal grammars. The only difference is that what you unify are actual word associations observed in corpora. (Keeping things at the level of observed word associations means you can capture all the extra complexity associated with the grammatical "incompleteness" I've been talking about.) Just as in unification, each parent node inherits the collocation features of its daughters until the relevant dependency is encountered and the selection is made.
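To make the feature-carrying idea concrete, here is a minimal sketch. It is not the implementation described above, only an illustration under stated assumptions: the association counts in ASSOC, the sense labels in SENSES, and the names Node, leaf, combine and select_sense are all hypothetical, and the sentence is combined strictly left to right rather than by any real parser. It also assumes each ambiguous word occurs only once per sentence.

```python
from dataclasses import dataclass, field

# Hypothetical association counts, standing in for collocations observed in a corpus.
ASSOC = {
    ("support/backing", "argument"): 12,
    ("support/backing", "brought"): 3,
    ("support/brace", "building"): 9,
    ("support/brace", "bracing"): 7,
    ("support/brace", "brought"): 3,
}

# Hypothetical sense inventory for the one ambiguous word in the example.
SENSES = {"support": ["support/backing", "support/brace"]}


@dataclass
class Node:
    words: list
    # Unresolved features: ambiguous word -> {candidate sense: accumulated score}.
    open_feats: dict = field(default_factory=dict)
    # Selections already made: ambiguous word -> chosen sense.
    resolved: dict = field(default_factory=dict)


def leaf(word):
    """A single-word node; an ambiguous word starts with all its senses open."""
    node = Node([word])
    if word in SENSES:
        node.open_feats[word] = {sense: 0 for sense in SENSES[word]}
    return node


def combine(left, right):
    """Build a parent that inherits its daughters' features, then try to select."""
    parent = Node(left.words + right.words)
    parent.resolved = {**left.resolved, **right.resolved}
    # Inherit each daughter's open features, updated with evidence from its sister.
    for daughter, sister in ((left, right), (right, left)):
        for word, candidates in daughter.open_feats.items():
            parent.open_feats[word] = {
                sense: score + sum(ASSOC.get((sense, w), 0) for w in sister.words)
                for sense, score in candidates.items()
            }
    # Make a selection only once one candidate strictly outscores the rest.
    for word, candidates in list(parent.open_feats.items()):
        ranked = sorted(candidates.values(), reverse=True)
        if len(ranked) == 1 or ranked[0] > ranked[1]:
            parent.resolved[word] = max(candidates, key=candidates.get)
            del parent.open_feats[word]
    return parent


def select_sense(sentence, word="support"):
    """Combine the words left to right and report the sense chosen for `word`."""
    leaves = [leaf(w) for w in sentence.lower().split()]
    node = leaves[0]
    for nxt in leaves[1:]:
        node = combine(node, nxt)
    return node.resolved.get(word)


print(select_sense("The support that John brought was helpful to making our argument"))
print(select_sense("The support that John brought was helpful for bracing the building"))
```

With these toy counts, "brought" supports both senses equally, so the feature stays open and is carried upward; the choice is only made when "argument" (first sentence) or "bracing" (second sentence) is finally combined in.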

I find that word-association information carried through in this way naturally selects a parse structure of sorts. That was the point of the parser presented on my website: working only with word associations, unified and carried through over multiple combinations, I could get some kind of parse structure to emerge.
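One way to picture a structure emerging from associations alone is the sketch below. It is a stand-in technique (greedy merging of adjacent spans), not the parser referred to above, and everything in it is assumed for illustration: the pairwise scores in PAIR, and the names span_score and bracket. Each span carries the words of its daughters, so a merge can be driven by an association with a word buried deep inside a span.

```python
# Hypothetical association scores for ordered word pairs (left word, right word).
PAIR = {
    ("the", "support"): 5,
    ("john", "brought"): 4,
    ("support", "brought"): 3,
    ("that", "john"): 2,
    ("that", "brought"): 2,
    ("support", "that"): 1,
}


def span_score(left_words, right_words):
    """Total association between all words carried by two adjacent spans."""
    return sum(PAIR.get((a, b), 0) for a in left_words for b in right_words)


def bracket(words):
    """Greedily merge the most strongly associated adjacent spans into a binary tree."""
    if not words:
        raise ValueError("need at least one word")
    # Each span is (tree, words carried up from its daughters).
    spans = [(w, [w]) for w in words]
    while len(spans) > 1:
        scores = [
            span_score(spans[i][1], spans[i + 1][1])
            for i in range(len(spans) - 1)
        ]
        i = scores.index(max(scores))  # leftmost best-scoring adjacent pair
        (left_tree, left_words), (right_tree, right_words) = spans[i], spans[i + 1]
        spans[i:i + 2] = [((left_tree, right_tree), left_words + right_words)]
    return spans[0][0]


print(bracket("the support that john brought".split()))
```

With these toy scores, "that" attaches to the span "john brought" rather than to "the support", because the merged span still carries the association between "that" and "brought".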

<br><br>Since then I've moved away from parsing as a task. Like any labeling, it is at some level subjective. But the fact that I get parses, and that they are "reasonable" (up to the state of the art for symbolic grammars, according to the guy who did the Chinese implementation), indicates that the method of ad-hoc generalization really is capturing linguistic regularities, including long-distance regularities. The important thing is that it captures them with the extra complexity of ad-hoc generalization, too.

<br></div><br>-Rob</div>