[Corpora-List] ad-hoc generalization and meaning
Rob Freeman
lists at chaoticlanguage.com
Mon Sep 17 01:12:45 UTC 2007
On 9/15/07, Mike Maxwell <maxwell at umiacs.umd.edu> wrote:
>
> One advantage of splitting apart syntax and semantics ... is
> that you begin to see how you can understand long distance dependencies
> of meaning, for example
> The support that John brought was helpful to making our argument.
> 'support' means one thing; whereas in the following sentence
> The support that John brought was helpful for bracing the building.
> it means something quite different. Syntax allows you to relate the
> choice of sense in each case to words at an arbitrary distance away, in
> a way that collocation *by itself* cannot do.
You can model long distance dependencies using raw word associations
(collocations), Mike. I did an implementation of this. All you need to do is
carry the (collocation) features through until they are needed to make a
selection.
My implementation works a bit like unification in the formal grammar world.
The only difference is that what you are unifying are actual word
associations observed in corpora. (Keeping things at the level of observed
word associations means you can capture all that extra complexity associated
with grammatical "incompleteness" I've been talking about.) Anyway, just
as in unification, each parent node can be made to inherit the
(collocation) features of its daughters until the relevant dependency is
encountered and the selection is made.
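To make the idea of carrying collocation features upward concrete, here is a
minimal sketch in Python. It is not the implementation described in this
message. The sense labels, the tiny SENSE_COLLOCATES table, and the names
Node, leaf, combine and resolve are all invented for illustration, and the
combination order is simply left to right rather than anything a real parser
would choose.

```python
from dataclasses import dataclass, field

# Hypothetical association table: for each sense of an ambiguous word, the
# collocates assumed (for illustration only) to have been seen with it.
SENSE_COLLOCATES = {
    "support": {
        "backing": {"argument", "claim", "evidence"},
        "brace": {"building", "bracing", "beam"},
    }
}

@dataclass
class Node:
    words: list
    features: set = field(default_factory=set)   # collocation features carried upward
    pending: list = field(default_factory=list)  # ambiguous words not yet resolved
    senses: dict = field(default_factory=dict)   # resolved word -> chosen sense

def leaf(word):
    """A leaf carries its own word as a feature; ambiguous words start as pending."""
    pending = [word] if word in SENSE_COLLOCATES else []
    return Node(words=[word], features={word}, pending=pending)

def combine(left, right):
    """Parent inherits the features of both daughters, then tries to select senses."""
    parent = Node(
        words=left.words + right.words,
        features=left.features | right.features,
        senses={**left.senses, **right.senses},
    )
    for word in left.pending + right.pending:
        scores = {
            sense: len(collocates & parent.features)
            for sense, collocates in SENSE_COLLOCATES[word].items()
        }
        best = max(scores, key=scores.get)
        if scores[best] > 0:
            parent.senses[word] = best   # the relevant dependency has been met
        else:
            parent.pending.append(word)  # keep carrying the choice upward
    return parent

def resolve(sentence):
    """Combine words left to right (a simplification) and return chosen senses."""
    nodes = [leaf(w) for w in sentence.lower().rstrip(".").split()]
    root = nodes[0]
    for node in nodes[1:]:
        root = combine(root, node)
    return root.senses

print(resolve("The support that John brought was helpful to making our argument."))
print(resolve("The support that John brought was helpful for bracing the building."))
```

In this toy, "support" stays unresolved through every intermediate node and
is only selected once a node's inherited features include a collocate of one
of its senses, however many words away that collocate was.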
What I find is that information about word associations carried through in
this way naturally selects a parse structure of sorts. That was the point of
the parser presented on my website. I found I could work only with word
associations, unified and carried through over multiple combinations, and
have some kind of parse structure emerge.
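As a rough illustration of how a bracketing can fall out of word
associations alone, the sketch below greedily merges whichever adjacent pair
of nodes is most strongly associated, with each merged node inheriting the
words of its daughters. It is an assumption-laden toy, not the parser on the
website: the ASSOC counts are invented, and the scoring rule (strongest
single association between inherited words) and the greedy strategy are
choices made only for this example.

```python
from itertools import product

# Hypothetical pair counts, invented for illustration (not corpus data).
ASSOC = {
    ("the", "dog"): 8,
    ("dog", "chased"): 3,
    ("chased", "cat"): 5,
    ("a", "cat"): 7,
    ("chased", "a"): 1,
}

def assoc(a, b):
    """Symmetric association strength between two words."""
    return ASSOC.get((a, b), 0) + ASSOC.get((b, a), 0)

def score(left, right):
    """Strength between two nodes: the strongest association between any
    word features inherited by the left node and any inherited by the right."""
    return max(assoc(a, b) for a, b in product(left[1], right[1]))

def parse(words):
    """Greedily merge the most strongly associated adjacent pair of nodes.
    Each node is (tree, features); a parent inherits its daughters' features."""
    nodes = [(w, frozenset([w])) for w in words]
    while len(nodes) > 1:
        i = max(range(len(nodes) - 1), key=lambda k: score(nodes[k], nodes[k + 1]))
        left, right = nodes[i], nodes[i + 1]
        nodes[i:i + 2] = [((left[0], right[0]), left[1] | right[1])]
    return nodes[0][0]

print(parse(["the", "dog", "chased", "a", "cat"]))
```

With these made-up counts, "chased" attaches to the merged "a cat" node
because of its association with "cat", a feature that node inherited from a
daughter rather than from the adjacent word "a".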
Since then I've moved away from parsing as a task. Like any labeling, it is
at some level subjective. But the fact that I get parses, and that they are
"reasonable" (up to the state of the art for symbolic grammars, according to
the guy who did the Chinese implementation), indicates that the method of
ad-hoc generalization really is capturing linguistic regularities, including
long distance regularities (and, importantly, capturing them with the extra
complexity of ad-hoc generalization too).
-Rob