[Corpora-List] Chomsky and computational linguistics
Rob Freeman
lists at chaoticlanguage.com
Tue Jul 31 13:17:13 UTC 2007
On 7/31/07, Oliver Mason <O.Mason at bham.ac.uk> wrote:
>
> > And the goal is good, because we all do it, every day.
>
> I would say the goal is pointless.
The goal of "generating all and only the grammatical sentences of a
language"?
Well I would agree that the goal of getting them all becomes poorly defined
in the limit. But I would like to know how I can produce as many as I like,
and most of them fairly "grammatical" by common consensus.
> Language is not a fixed formal
> mechanism, it's a dynamic and evolving system. I'm guessing here, but
> I'm pretty sure nobody in biology would care about listing all the
> possible shapes in which a tree can grow. So it's not a problem that
> is relevant to understanding how language works. Furthermore,
> language is constantly changing, so as soon as you've created a
> grammar that can generate all those sentences it's already out of date.
> A bit like counting the exact population of our planet.
>
> However, other people might disagree, and it depends very much on what
> you're looking for when analysing language. I would argue that you
> need a corpus to get a decent grammar, by which I mean one that
> describes actual usage and hence allows you to make relevance
> judgments. If a grammar describes an obscure phenomenon in great
> detail but neglects more common structures, then it's not that useful.
> And human intuition is good at neglecting routine usage in favour of
> 'interesting' and 'weird' things.
>
> Also, corpora are not irreducibly complex. We just haven't found the
> right way forwards, as we're too focused on formal methods and
> traditional grammar. And I blame Chomsky for that, boo hiss.
You've set up a couple of straw men for me here. I'm not arguing that
language is a "fixed formal system" at all.
Chomsky advocated a fixed formal system, sure. But that was not what made
him distinct. Not originally.
What made Chomsky distinct was his observation that if a fixed formal system
exists, it _cannot_ be seen in the data. (Leading him to the conclusion it
must be innate.)
By all means reject a fixed formal system. I think this is indeed how
Chomsky should have interpreted his observations.
But then this also means that corpora are irreducible (otherwise you could
get a fixed formal system out of them, by definition).
This is the point I wish us to see. We are missing it.
Corpus linguists shouldn't have a problem with this. And the machine
learning guys here shouldn't worry about it either. Once we give up the goal
of reducing corpora, new worlds of descriptive power open up for them.
-Rob