[Corpora-List] Is a complete grammar possible (beyond the corpus itself)?

Michael Maxwell maxwell at umiacs.umd.edu
Wed Sep 5 13:11:37 UTC 2007


> I've given up on the idea of a complete grammar of a language

Not sure what you mean by "a complete grammar of a language"--sounds like
you mean a grammar of all varieties of English or some other language
(including, e.g., Cockney, Scots English, Southern American English, and
sub-varieties of these).

> I now view language as an individual phenomenon.  We all have our
> own grammars, which overlap to a large degree, but are nevertheless
> distinct. This is because your language experience is different from
> mine.

But wrt this second point, I doubt that any linguist would disagree.  In
particular, I don't think Chomsky would disagree, at least in principle. 
The notorious idealization to "the ideal speaker-hearer" which he (and
Morris Halle) made was just that, an idealization, and acknowledged as
such.  Much like the idealizations that physicists make to frictionless
surfaces.

Indeed, there has been some research in the generative side of the house
into what kind of individual variation might exist, ranging from lexical
(the subcategorizations of individual verbs, for example) to the
particular categories that serve as bounding nodes.  I guess people like
David Lightfoot have also argued that individual variations in learning
are a force driving language change.  (I confess that I can't put my
finger on any of this research--it's been a long time since I was doing
syntax--but I know I've seen it :-).)

> So all you can hope for is a rough approximation (an average of
> several grammars used to produce the corpus), or possibly a grammar
> based on a particular corpus.  But the grammar derived from the next
> corpus will be different again.

When you get to corpora and engineered grammars derived from the corpora,
it's a different game.  Most of the syntactic (and for that matter,
morphological) grammars that I've seen derived from corpora have been
'learned' with very little information given to the machine learning tool.
Generativists, otoh, generally assume a fairly rich innate grammatical
component.  I would imagine that such an innate component could drive
human language learners to come up with similar grammars in the face of
different corpora (assuming that they in fact do), while the lack of much
of an innate component is what allows machine-learned grammars to differ
substantially among themselves.  Seems like this would make an
interesting, if difficult, comparison.
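To make the point concrete, here is a purely illustrative toy sketch in Python (my own, not from any real grammar-induction system): it takes "grammar" to be nothing more than the set of word-bigram transitions observed in a corpus, learned with no prior constraints at all, and shows that two small corpora yield grammars that overlap only partially--a caricature of the way unconstrained learners can diverge.

```python
# Toy illustration only: a "grammar" is just the set of adjacent-word
# pairs observed in a corpus, induced with no innate/prior constraints.

def induce_bigram_grammar(sentences):
    """Return the set of word-bigram 'rules' seen in the corpus."""
    rules = set()
    for sentence in sentences:
        words = sentence.lower().split()
        rules.update(zip(words, words[1:]))
    return rules

# Two tiny, slightly different corpora (invented examples).
corpus_a = ["the dog barks", "the cat sleeps", "a dog sleeps"]
corpus_b = ["the dog sleeps", "the bird sings", "a cat barks"]

grammar_a = induce_bigram_grammar(corpus_a)
grammar_b = induce_bigram_grammar(corpus_b)

shared = grammar_a & grammar_b   # rules both learners converged on
only_a = grammar_a - grammar_b   # rules unique to learner A
only_b = grammar_b - grammar_a   # rules unique to learner B

print(f"shared: {len(shared)}, only A: {len(only_a)}, only B: {len(only_b)}")
# -> shared: 2, only A: 4, only B: 4
```

With a rich innate component, one would expect the shared set to dominate even across different corpora; with none, as here, most of each induced grammar is corpus-specific.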

   Mike Maxwell
   CASL/ U MD


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


