[Corpora-List] Is a complete grammar possible (beyond the corpus itself)?

Rob Freeman lists at chaoticlanguage.com
Tue Sep 11 02:48:33 UTC 2007


On 9/11/07, John F. Sowa <sowa at bestweb.net> wrote:
>
>
> RF> I'm just saying we should at least explore the possibility
> > formal grammars are "necessarily incomplete" descriptions of
> > corpora, that the right way to handle language is to generalize
> > grammar ad-hoc from examples, as you go.
>
> In fact, that is what many, if not most, grammar and parser
> developers have been doing for the past 50 years.  Everybody who
> is developing broad-coverage parsers starts small and generalizes
> with ad-hoc examples (usually selected from one or more corpora)
> until the coverage gets better and better.


No, no. You are simply misunderstanding me, John. When I say "generalize
grammar ad-hoc from examples as you go" I don't mean "as you develop your
grammar". I mean "from sentence to sentence."

It is not a matter of starting with a grammar where "black" is in the
same class as "strong" and then gradually removing it as you add examples.
"Black" will need to be in a class with "strong" for every "coffee" context,
and _not_ in a class with "strong" for every "cloud" context.

You need to be able to access both these generalizations. But they
contradict each other ("black" = "strong" && "black" != "strong").

The only solution is to keep the examples and generalize "black" = "strong"
or "black" != "strong", as the context demands it.
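
To make the idea concrete, here is a rough sketch in Python. It is only an
illustration under stated assumptions: the toy corpus, the names EXAMPLES,
build_index and same_class, and the crude test "both words are attested with
this head word" are all invented for this example and do not describe any
existing parser.

```python
from collections import defaultdict

# Toy corpus of (modifier, head) pairs. These examples are invented
# purely for illustration.
EXAMPLES = [
    ("black", "coffee"),
    ("strong", "coffee"),
    ("black", "cloud"),
    ("dark", "cloud"),
    ("strong", "wind"),
]


def build_index(examples):
    """Map each head word (the context) to the modifiers seen with it."""
    index = defaultdict(set)
    for modifier, head in examples:
        index[head].add(modifier)
    return index


def same_class(index, a, b, context):
    """Decide, for this context only, whether a and b pattern together.

    No global word class is stored; the answer is recomputed from the
    kept examples each time a context is presented.
    """
    seen = index.get(context, set())
    return a in seen and b in seen


INDEX = build_index(EXAMPLES)
print(same_class(INDEX, "black", "strong", "coffee"))  # True
print(same_class(INDEX, "black", "strong", "cloud"))   # False
```

Nothing is ever removed from or added to a fixed class here. Both answers come
from the same stored examples, and which one you get depends only on the
context you ask about.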

By all means disagree with me, but don't characterize this as "what parser
developers have been doing for 50 years." It is different.

Please see the difference and argue against that.

-Rob