[Corpora-List] Is a complete grammar possible (beyond the corpus itself)?
David Brooks
d.j.brooks at cs.bham.ac.uk
Wed Sep 5 08:46:15 UTC 2007
Oliver Mason wrote:
> I've given up on the idea of a complete grammar of a language, as I
> now view language as an individual phenomenon. We all have our own
> grammars, which overlap to a large degree, but are nevertheless
> distinct.
Although I agree with Oliver, I think Rob's initial enquiry (following
on from the previous thread) was a little different. I do think that
Oliver's suggestion can be integrated.
My interpretation is that Rob is criticising the search for a single
grammatical theory that accounts for all possible structures in
language. His criticism is that a single theory has failed to account
for language use, and his suggested remedy is that, rather than building
a single grammar, we should combine competing grammatical (or other)
interpretations according to context.
The reason I see Oliver's point as orthogonal is that, within the mind
of an individual, we might still pursue a single-grammar theory -
attempting to find a single-grammar that accounts for all language used
by the individual in question. (Though you could argue that, if all
humans do this, then each learns a subset of a single, shared grammar.)
I think you can adopt either a single-grammar theory or a combined
approach, and still claim that individuals have quite different
experiences, where those experiences are variations over model parameters.
In response to Rob's idea of storing the corpus verbatim and using
context to select interpretations of the structure, I'd like to ask a
few questions. I'd like to find answers (or at least opinions) because I
think Rob's idea is highly appealing, but is either underdeveloped, or
my exposure to the idea is inadequate. Either way, I hope a bit of
discussion will help.
First, I think that storing a corpus verbatim and attempting to recover
different information according to context is a great idea for
computational linguistics, particularly for combining machine-learning
approaches into language models. However, I'm not sure how well
it stands up (or whether it is even intended) as an account of human
language learning. Is there evidence from psycholinguistics that
supports or contradicts the claim that humans store all their linguistic
experience? Since "context" often includes the state of the world or
other beings, is the totality of human experience stored? I can imagine
a system where each experience is abstracted to existing models wherever
possible, and otherwise stored verbatim until a suitable model is learnt
(perhaps as data for a competition between models during a phase for
learning which models to use). This might account for development.
Second, among the most controversial aspects of Chomsky's approach are
those suggesting that the language faculty is innate and specialised to
deal only with language. These
still pertain (as issues to address) in a "combine several models
according to context" approach:
- which models will you use in your combination? Are they innate? Do
they represent "intelligence" that is specific to dealing with language
(as opposed to more general forms of intelligent behaviours)?
- how do you define context? I assume context is defined in relation to
a model, so again, is this innate? How do you use context to trigger events?
Finally, if we are to use a combination of different theories, which
theories should we use? For instance, would we allow dependency grammar,
head-driven phrase-structure grammar and multi-word expression grammar
(e.g. Pattern Grammar) to compete to interpret at a "syntactic" level?
If so, isn't there a problem that some of the underlying theories have
been designed as accounts of language within a single-grammar approach?
I mean, an elegant grammatical theory might make severe concessions (in
terms of massively increased complexity or restricted form) that are
tailored to its original application as a single grammar.
These concessions may be irrelevant in a combined approach, and there
may be much simpler explanations -- those developed with the aim of
modularity and interaction in mind.
I'd also like to return to one of Rob's much earlier points: that there
is little previous work on ideas akin to his. I can see a few parallels
between Rob's suggestion and the work of Rodney Brooks (no relation) in
the field of behaviour-based robotics. Brooks claimed that robots with
internal representations of the world suffered because their models were
perpetually out-of-sync with the world. He suggested a "world as its own
best model" theory, where the robot operates on percepts obtained from
the world, and avoids internal representation. I see this as similar to
Rob's suggestion of keeping the corpus, which acts as our "world", and
avoiding a single-grammar abstraction that might not fully account for
the corpus. (I would agree with Diana Santos' claim that a corpus is
only a sample -- and an impoverished sample in terms of contextual
information -- of the world at a given time.)
Brooks also suggested a "subsumption architecture" for organising
behaviours. Within this architecture, behaviours compete for the right
to control the robot, and are triggered by percepts. Again, this sounds
similar to Rob's suggestion, where linguistic context would be the basis
for percepts.
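A toy version of that control loop might look as follows (this is my
own simplification of the subsumption idea, not Brooks's formulation:
behaviours are ordered by priority, each is triggered directly by the
current percept, and there is no internal world model at all):

```python
def subsume(behaviours, percept):
    """behaviours: list of (trigger, action) pairs, highest priority
    first. The first behaviour whose trigger fires takes control."""
    for trigger, action in behaviours:
        if trigger(percept):
            return action(percept)
    return "idle"


behaviours = [
    (lambda p: p["obstacle"],  lambda p: "avoid"),     # highest priority
    (lambda p: p["goal_seen"], lambda p: "approach"),
    (lambda p: True,           lambda p: "wander"),    # default
]
```

In Rob's setting, the percept would be linguistic context and the
behaviours would be competing interpretations of it.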
I think similar models have been employed in linguistics. For instance,
Rumelhart and McClelland's Interactive Activation Model casts lexical
access as a competition between linguistic units (from phones up to
words) for the right to represent some sound-wave input. I think Rob's
suggestion could be treated within a similar framework: rather than
having phonemes compete for the right to represent a sound-wave, perhaps
different theories could compete.
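A crude sketch of that competition (winner-take-all rather than the full
activation dynamics of the IA model, and with scorers I have simply made
up for illustration): each theory rates how well it accounts for an
utterance in context, and the best-scoring theory wins the right to
interpret it.

```python
def interpret(utterance, context, theories):
    """theories: dict mapping name -> scorer(utterance, context) -> float.
    Returns the winning theory and all scores."""
    scores = {name: score(utterance, context)
              for name, score in theories.items()}
    winner = max(scores, key=scores.get)
    return winner, scores


theories = {
    # A fixed baseline score, standing in for a general syntactic theory.
    "dependency": lambda u, c: 0.4,
    # A pattern-grammar-style scorer: strong only on attested
    # multi-word expressions.
    "pattern": lambda u, c: 0.9 if u in c["seen_patterns"] else 0.1,
}

winner, _ = interpret("by and large",
                      {"seen_patterns": {"by and large"}},
                      theories)
# the pattern theory wins here, because the expression was seen before
```

The hard questions in the post remain exactly the hard parameters here:
which theories go in the dict, and how the scorers read the context.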
The problems with these approaches (as connectionist or parallel
distributed processing models) are usually couched in terms of
efficiency and the difficulty of engineering a combination of
distributed behaviours that achieves the task at hand. The latter
returns to my question about which theories should be included, and may
explain why such work is not widespread in our field.
Regards,
David
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora