<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:blue;
text-decoration:underline;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:Arial;
color:navy;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=blue>
<div class=Section1>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>There are still Chomskyans on the prowl
though:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><a
href="http://www.amazon.com/Atoms-Language-Minds-Hidden-Grammar/dp/019860632X/ref=pd_bbs_sr_1/103-3873164-6759834?ie=UTF8&s=books&qid=1189095805&sr=8-1">http://www.amazon.com/Atoms-Language-Minds-Hidden-Grammar/dp/019860632X/ref=pd_bbs_sr_1/103-3873164-6759834?ie=UTF8&s=books&qid=1189095805&sr=8-1</a><o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>-Rich<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<div>
<div class=MsoNormal align=center style='text-align:center'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>
<hr size=2 width="100%" align=center tabindex=-1>
</span></font></div>
<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2
face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'>
corpora-bounces@uib.no [mailto:corpora-bounces@uib.no] <b><span
style='font-weight:bold'>On Behalf Of </span></b>Rob Freeman<br>
<b><span style='font-weight:bold'>Sent:</span></b> Wednesday, September 05,
2007 8:01 AM<br>
<b><span style='font-weight:bold'>To:</span></b> David Brooks; CORPORA@UIB.NO<br>
<b><span style='font-weight:bold'>Subject:</span></b> Re: [Corpora-List] Is a
complete grammar possible (beyond the corpus itself)?</span></font><o:p></o:p></p>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>Hi David,<br>
<br>
Thanks for looking at this so closely.<br>
<br>
Yes, you are exactly right. What I am suggesting is slightly different to
Oliver's interpretation, though Oliver's issue (dialectal differences?) can be
integrated easily. <o:p></o:p></span></font></p>
<div>
<p class=MsoNormal><span class=gmailquote><font size=3 face="Times New Roman"><span
style='font-size:12.0pt'>On 9/5/07, <b><span style='font-weight:bold'>David
Brooks </span></b><<a href="mailto:d.j.brooks@cs.bham.ac.uk" target="_blank">d.j.brooks@cs.bham.ac.uk</a>>
wrote:</span></font></span><o:p></o:p></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
First, I think that storing a corpus verbatim and attempting to recover<br>
different information according to context is a great idea for<br>
computational linguistics, and particularly in combining machine<br>
learning approaches into language models. However, I'm not sure how well <br>
it stands up (or whether it is even intended) as an account of human<br>
language learning. Is there evidence from psycholinguistics that<br>
supports or contradicts the claim that humans store all their linguistic<br>
experience?<o:p></o:p></span></font></p>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
I think there is evidence. The importance of collocation and the detail of
phraseology could be interpreted as such evidence.<o:p></o:p></span></font></p>
</div>
<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'><br>
People will say there is clear evidence that we recall only the gist. I agree, but
think this is because we "recall" based on meaning, and meaning is
defined by _sets_ of examples (cf. exemplar theory). So we remember all the
individual examples in a set of exemplars, but can only "recall" the
set as a whole.<br>
<br>
For example, I can't "recall" everything I read verbatim, but,
anecdotally, I may hear a sentence from a book I read years ago and
"remember" it instantly (including maybe what I was doing at the time
I read it). <o:p></o:p></span></font></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>Since "context" often includes the state of the world or
other beings, is the totality of human experience stored? <o:p></o:p></span></font></p>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
I don't think everything is stored, but what is stored I think is stored
verbatim. (We may lose bits of it, but what we lose is not systematic.)<o:p></o:p></span></font></p>
</div>
<blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;
margin-left:4.8pt;margin-right:0in'>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>Second, some of the most controversial (in terms of generating debate)
aspects of Chomsky's approach are those that suggest that language faculties
are innate and specialised to deal only with language. These still pertain (as
issues to address) in a "combine several models according to context"
approach:<br>
- which models will you use in your combination? Are they innate? Do they
represent "intelligence" that is specific to dealing with language
(as opposed to more general forms of intelligent behaviours)? <br>
- how do you define context? I assume context is defined in relation to a
model, so again, is this innate?<o:p></o:p></span></font></p>
</blockquote>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
Actually I'm not so much suggesting that we integrate existing models. I'm
suggesting we focus instead on ways of finding models, or grammars: in short,
grammatical induction, especially distributional analysis.<br>
<br>
As far as contexts go, I think grammatical induction typically gets good results
even just by clustering words on their immediate contexts (e.g. one word); a toy sketch of this follows below.<br>
<br>
I only have one issue. As far as I am concerned, grammatical induction has been
held up only by the (almost?) universal assumption that it should be possible to
generalize grammar globally. <br>
<br>
Change that one assumption and I think we will immediately start to produce
very useful results.<br>
<br>
As a corollary, I don't think the generalization mechanism is specific to
language at all. I am sure it is general to all perceptual (intelligent?)
behaviour. <o:p></o:p></span></font></p>
</div>
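<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>Here is that toy sketch of clustering words on their immediate contexts.
It is only an illustration: the corpus, the left/right neighbour features and
the number of clusters are placeholder assumptions, not a description of any
real system.<o:p></o:p></span></font></p>
<pre>
# Minimal sketch: cluster words by the distribution of their immediate
# (one-word) contexts. The toy corpus, feature scheme and cluster count
# are illustrative assumptions only.
from collections import Counter, defaultdict

from sklearn.cluster import KMeans
from sklearn.feature_extraction import DictVectorizer

corpus = [
    "the black coffee was strong",
    "a black cloud covered the sun",
    "the strong coffee was hot",
    "a dark cloud brought rain",
]

# Collect left/right neighbour counts for every word.
contexts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for left, right in zip(tokens, tokens[1:]):
        contexts[right]["L=" + left] += 1
        contexts[left]["R=" + right] += 1

words = sorted(contexts)
vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform([contexts[w] for w in words])

# Words whose immediate contexts look alike should end up in the same
# (rough) distributional class.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
for cluster, word in sorted(zip(labels, words)):
    print(cluster, word)
</pre>
</div>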
<blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;
margin-left:4.8pt;margin-right:0in'>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>How do you use context to trigger events?<o:p></o:p></span></font></p>
</blockquote>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
</div>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>Crudely put, I filter the possible grammar of each word on its context.
Other than that it is done in much the same way as grammatical induction is
done now. <br>
<br>
In grammatical induction you cluster the contexts of a word to
"learn" a grammatical class for it. I do the same. It is just that
now I "learn" a different class for each word, depending on what word
is adjoining. So if "black" adjoins "coffee", I
"cluster" a different class for "black" than I would if
"cloud" were adjoining. <o:p></o:p></span></font></p>
</div>
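<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>As a toy sketch of that idea (only one possible reading of it; the
corpus and the bare left-neighbour counting are placeholder assumptions, not my
actual system), the "class" for a word can be gathered relative to the word it
currently adjoins:<o:p></o:p></span></font></p>
<pre>
# Sketch of an "ad hoc" class: instead of one global class per word, build
# the class relative to the word it currently adjoins. Here the class for
# "black" before "coffee" is the set of words attested just before "coffee".
# Toy corpus and counting scheme are placeholder assumptions.
from collections import Counter, defaultdict

corpus = [
    "strong black coffee",
    "hot black coffee",
    "sweet milky coffee",
    "a black cloud",
    "a dark cloud",
    "a grey rain cloud",
]

left_of = defaultdict(Counter)   # word -> words seen immediately before it
for sentence in corpus:
    tokens = sentence.split()
    for left, right in zip(tokens, tokens[1:]):
        left_of[right][left] += 1

def ad_hoc_class(word, adjoining):
    """Crude ad hoc class for `word`: here it depends only on the adjoining word."""
    return left_of[adjoining]

# "black" gets a different class depending on what it adjoins.
print(ad_hoc_class("black", "coffee"))  # Counter({'black': 2, 'milky': 1})
print(ad_hoc_class("black", "cloud"))   # Counter({'black': 1, 'dark': 1, 'rain': 1})
</pre>
</div>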
<blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;
margin-left:4.8pt;margin-right:0in'>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>I'd also like to return to one of Rob's much earlier points: that there
is little previous work on ideas akin to his. I can see a few parallels between
Rob's suggestion and the work of Rodney Brooks (no relation) in the field of
behaviour-based robotics. Brooks claimed that robots with internal
representations of the world suffered because their models were perpetually
out-of-sync with the world. He suggested a "world as its own best
model" theory, where the robot operates on percepts obtained from the
world, and avoids internal representation. I see this as similar to Rob's
suggestion of keeping the corpus, which acts as our "world", and
avoiding a single-grammar abstraction that might not fully account for the
corpus. (I would agree with Diana Santos' claim that a corpus is only a sample
-- and an impoverished sample in terms of contextual information -- of the
world at a given time.)<o:p></o:p></span></font></p>
</blockquote>
<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
That robot work sounds good. I agree, this sounds like the kind of thing I
mean.<br>
<br>
Language is a much better test bed for such ideas, though, because it is so accessible.
It is difficult to model the "world" of a robot.<br>
<br>
The "world" of language is just the corpus.<br>
<br>
But you are looking for precedent. <br>
<br>
There is of course Paul Hopper's "Emergent Grammar". I think this is
essentially right, but I hesitate to mention it because somehow it is always
misinterpreted. The idea of something which cannot be described in terms of
rules just seems to be too subtle, and perhaps Paul has not had the maths to
formalize it. For whatever reason, people always seem to identify his "emergence"
with "evolution" of grammar (which he specifically denies below).
Read correctly, I think the ideas are all there. Only an implementation is
missing: <br>
<br>
Here is his famous paper from 1985(?):<br>
<br>
<a href="http://eserver.org/home/hopper/emergence.html" target="_blank">http://eserver.org/home/hopper/emergence.html
</a><br>
<br>
>>><br>
I am concerned in this paper with ... the assumption of an abstract, mentally
represented rule system which is somehow implemented when we speak. <br>
<br>
...<br>
<br>
The notion of emergence is a pregnant one. It is not intended to be a standard
sense of origins or genealogy, not a historical question of 'how' the grammar
came to be the way it 'is', but instead it takes the adjective emergent
seriously as a continual movement towards structure, a postponement <br>
or 'deferral' of structure, a view of structure as always provisional, always
negotiable, and in fact as epiphenomenal, that is at least as much an effect as
a cause.<br>
<br>
...<br>
<br>
Structure, then, in this view is not an overarching set of abstract principles,
but more a question of a spreading of systematicity from individual words,
phrases, and small sets. <br>
>>><br>
<br>
In engineering terms all I have been able to find is an approach called
"similarity modeling". This had some success improving speech
recognition scores using crude bigrams (generalized ad-hoc) some years ago: <br>
<br>
e.g. <a href="http://citeseer.ist.psu.edu/dagan99similaritybased.html"
target="_blank">http://citeseer.ist.psu.edu/dagan99similaritybased.html</a> <br>
<br>
There is an earlier paper with the nicest quote. I think it is this one: <br>
<br>
Dagan, Ido, Shaul Marcus and Shaul Markovitch. Contextual word similarity and
estimation from sparse data. Computer Speech and Language, 1995, Vol. 9, pp.
123-152. <br>
<br>
p. 4:<br>
<br>
"In a general perspective, the similarity-based approach promotes an
"unstructured" point of view on the way linguistic information should
be represented. While traditional approaches, especially for semantic
classification, have the view that information should be captured by the
maximal possible generalizations, our method assumes that generalizations
should be minimized. Information is thus kept at a maximal level of
detail, and missing information is deduced by the most specific analogies,
which are carried out whenever needed. Though the latter view seems
hopeless for approaches relying on manual knowledge acquisition, it may turn
very useful for automatic corpus-based approaches, and better reflect the
nature of unrestricted language." <o:p></o:p></span></font></p>
</div>
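<div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>As a toy illustration of that similarity-based style of estimation (not
Dagan et al.'s actual model; the counts and the crude cosine weighting are only
placeholders), an unseen bigram can be estimated from the bigrams of
distributionally similar words:<o:p></o:p></span></font></p>
<pre>
# Toy sketch in the spirit of the quote above (not Dagan et al.'s model):
# an unseen bigram (w1, w2) is estimated from the bigrams of words similar
# to w1, so information stays at the level of individual co-occurrences and
# generalizations are made ad hoc, only when needed.
from collections import Counter, defaultdict

bigrams = Counter({
    ("strong", "coffee"): 4, ("strong", "tea"): 3,
    ("powerful", "computer"): 5, ("powerful", "tea"): 1,
    ("black", "coffee"): 6, ("black", "cloud"): 2,
})

right_contexts = defaultdict(Counter)
for (w1, w2), n in bigrams.items():
    right_contexts[w1][w2] += n

def similarity(a, b):
    """Crude cosine over right-context counts (placeholder measure)."""
    ca, cb = right_contexts[a], right_contexts[b]
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (sum(v * v for v in ca.values()) * sum(v * v for v in cb.values())) ** 0.5
    return dot / norm if norm else 0.0

def p_similarity(w2, w1):
    """Estimate P(w2 | w1) from words that behave like w1."""
    neighbours = [w for w in right_contexts if w != w1]
    weights = {w: similarity(w1, w) for w in neighbours}
    total = sum(weights.values()) or 1.0
    est = 0.0
    for w, weight in weights.items():
        counts = right_contexts[w]
        est += (weight / total) * counts[w2] / sum(counts.values())
    return est

# "powerful coffee" is unseen, but "powerful" behaves like "strong",
# so the estimate leans on the strong+coffee evidence.
print(p_similarity("coffee", "powerful"))
</pre>
</div>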
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><br>
I don't think they put this "similarity modeling" in a theoretical
context with Hopper at all. And as I say, they only applied this to the
estimation of bigrams. But as a crude example of ad-hoc estimation of
grammatical parameters it goes in the right direction. <br>
<br>
To go further you need to generalize the representation of non-terminal
elements so they too can be vectors of examples. I don't think that is
difficult. I've used a kind of "cross-product".<br>
<br>
I had a parser on-line for a while which worked quite well doing this.<br>
<br>
As I say, the principles are very much like what has been done already with
machine learning/grammatical induction. We can use a lot of that.<br>
<br>
I'm sure the main thing we need to change is only the assumed goal of a single
complete grammar. <br>
<br>
-Rob<o:p></o:p></span></font></p>
</div>
</div>
</body>
</html>