<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">

<head>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">

<meta name=Generator content="Microsoft Word 11 (filtered medium)">

<!--[if !mso]>

<style>

v\:* {behavior:url(#default#VML);}

o\:* {behavior:url(#default#VML);}

w\:* {behavior:url(#default#VML);}

.shape {behavior:url(#default#VML);}

</style>

<![endif]-->

<style>

<!--

 /* Font Definitions */

 @font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

 /* Style Definitions */

 p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman";}

a:link, span.MsoHyperlink

        {color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {color:blue;

        text-decoration:underline;}

span.EmailStyle18

        {mso-style-type:personal-reply;

        font-family:Arial;

        color:navy;}

@page Section1

        {size:8.5in 11.0in;

        margin:1.0in 1.25in 1.0in 1.25in;}

div.Section1

        {page:Section1;}

-->

</style>

</head>

<body lang=EN-US link=blue vlink=blue>

<div class=Section1>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>There are still Chomskyans on the prowl

though:<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><a

href="http://www.amazon.com/Atoms-Language-Minds-Hidden-Grammar/dp/019860632X/ref=pd_bbs_sr_1/103-3873164-6759834?ie=UTF8&s=books&qid=1189095805&sr=8-1">http://www.amazon.com/Atoms-Language-Minds-Hidden-Grammar/dp/019860632X/ref=pd_bbs_sr_1/103-3873164-6759834?ie=UTF8&s=books&qid=1189095805&sr=8-1</a><o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>-Rich<o:p></o:p></span></font></p>

<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>

<div>

<div class=MsoNormal align=center style='text-align:center'><font size=3

face="Times New Roman"><span style='font-size:12.0pt'>

<hr size=2 width="100%" align=center tabindex=-1>

</span></font></div>

<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;

font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2

face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'>

corpora-bounces@uib.no [mailto:corpora-bounces@uib.no] <b><span

style='font-weight:bold'>On Behalf Of </span></b>Rob Freeman<br>

<b><span style='font-weight:bold'>Sent:</span></b> Wednesday, September 05,

2007 8:01 AM<br>

<b><span style='font-weight:bold'>To:</span></b> David Brooks; CORPORA@UIB.NO<br>

<b><span style='font-weight:bold'>Subject:</span></b> Re: [Corpora-List] Is a

complete grammar possible (beyond the corpus itself)?</span></font><o:p></o:p></p>

</div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><o:p> </o:p></span></font></p>

<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3

face="Times New Roman"><span style='font-size:12.0pt'>Hi David,<br>

<br>

Thanks for looking at this so closely.<br>

<br>

Yes, you are exactly right. What I am suggesting is slightly different to

Oliver's interpretation. Though Oliver's issue (dialectal differences?) can be

integrated easily. <o:p></o:p></span></font></p>

<div>

<p class=MsoNormal><span class=gmailquote><font size=3 face="Times New Roman"><span

style='font-size:12.0pt'>On 9/5/07, <b><span style='font-weight:bold'>David

Brooks </span></b>&lt;<a href="mailto:d.j.brooks@cs.bham.ac.uk" target="_blank">d.j.brooks@cs.bham.ac.uk</a>&gt;

wrote:</span></font></span><o:p></o:p></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><br>

First, I think that storing a corpus verbatim and attempting to recover<br>

different information according to context is a great idea for<br>

computational linguistics, and particularly in combining machine<br>

learning approaches into language models. However, I'm not sure how well <br>

it stands up (or whether it is even intended) as an account of human<br>

language learning. Is there evidence from psycholinguistics that<br>

supports or contradicts the claim that humans store all their linguistic<br>

experience?<o:p></o:p></span></font></p>

<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><br>

I think there is evidence. The importance of collocation and detail of

phraseology could be interpreted as such.<o:p></o:p></span></font></p>

</div>

<p class=MsoNormal style='margin-bottom:12.0pt'><font size=3

face="Times New Roman"><span style='font-size:12.0pt'><br>

People will say there is clear evidence we recall only the gist. I agree, but

think this is because we "recall" based on meaning, and meaning is

defined by _sets_ of examples (cf. exemplar theory). So we remember all the

individual examples in a set of exemplars, but can only "recall" the

set as a whole.<br>

<br>

For example, I can't "recall" everything I read verbatim, but,

anecdotally, I may hear a sentence from a book I read years ago, and

"remember" it instantly (including maybe what I was doing at the time

I read it.) <o:p></o:p></span></font></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'>Since "context" often includes the state of the world or

other beings, is the totality of human experience stored? <o:p></o:p></span></font></p>

<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><br>

I don't think everything is stored, but what is stored I think is stored

verbatim. (We may lose bits of it, but what we lose is not systematic.)<o:p></o:p></span></font></p>

</div>

<blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;

margin-left:4.8pt;margin-right:0in'>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><o:p> </o:p></span></font></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'>Second, some of the most controversial (in terms of generating debate)

aspects of Chomsky's approach are those that suggest that language faculties

are innate and specialised to deal only with language. These still pertain (as

issues to address) in a "combine several models according to context"

approach:<br>

- which models will you use in your combination? Are they innate? Do they

represent "intelligence" that is specific to dealing with language

(as opposed to more general forms of intelligent behaviours)? <br>

- how do you define context? I assume context is defined in relation to a

model, so again, is this innate?<o:p></o:p></span></font></p>

</blockquote>

<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><br>

Actually I'm not so much suggesting that we integrate existing models. I'm

suggesting we focus instead on ways of finding models, or grammars: in short,

grammatical induction, especially distributional analysis.<br>

<br>

As far as contexts go I think typically grammatical induction gets good results

even just by clustering words on immediate contexts (e.g. one word).<br>

<br>

I only have one issue. As far as I am concerned grammatical induction has been

held up only by the (almost?) universal assumption it should be possible to

generalize grammar globally. <br>

<br>

Change that one assumption and I think we will immediately start to produce

very useful results.<br>

<br>

As a corollary I don't think the generalization mechanism is specific to

language at all. I am sure it is general to all perceptual (intelligent?)

behaviour. <o:p></o:p></span></font></p>

</div>

<blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;

margin-left:4.8pt;margin-right:0in'>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><o:p> </o:p></span></font></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'>How do you use context to trigger events?<o:p></o:p></span></font></p>

</blockquote>

<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><o:p> </o:p></span></font></p>

</div>

<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'>Crudely put, I filter the possible grammar of each word on its context.

Other than that it is done in much the same way as grammatical induction is

done now. <br>

<br>

In grammatical induction you cluster the contexts of a word to

"learn" a grammatical class for it. I do the same. It is just that

now I "learn" a different class for each word, depending on what word

is adjoining. So if "black" adjoins "coffee", I

"cluster" a different class for "black" than I would if

"cloud" were adjoining. <o:p></o:p></span></font></p>

</div>
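<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>A minimal sketch of this context-filtered clustering, under stated
assumptions: the toy corpus, the function names right_contexts and adhoc_class,
and the simple overlap-count ranking are all hypothetical and chosen only for
illustration. It is not the actual implementation, just one way the filtering
step might look.</span></font></p>

<pre>
```python
from collections import Counter, defaultdict


def right_contexts(sentences):
    """Map each word to a Counter of the words seen immediately after it."""
    ctx = defaultdict(Counter)
    for sent in sentences:
        for left, right in zip(sent, sent[1:]):
            ctx[left][right] += 1
    return ctx


def adhoc_class(word, neighbour, ctx):
    """Hypothetical helper: words attested before `neighbour`, ranked by how
    many right-hand contexts they share with `word` (ties broken by name)."""
    word_ctx = set(ctx.get(word, ()))
    scores = {}
    for other, seen in ctx.items():
        if other == word or neighbour not in seen:
            continue  # filter: keep only words that also adjoin `neighbour`
        scores[other] = len(word_ctx.intersection(seen))
    return sorted(scores, key=lambda w: (-scores[w], w))


# Toy corpus (an assumption for illustration): two-word "sentences".
sentences = [
    ["black", "coffee"], ["strong", "coffee"], ["hot", "coffee"],
    ["black", "tea"], ["strong", "tea"], ["hot", "tea"],
    ["black", "cloud"], ["dark", "cloud"], ["grey", "cloud"],
    ["black", "sky"], ["dark", "sky"], ["grey", "sky"],
]
ctx = right_contexts(sentences)

print(adhoc_class("black", "coffee", ctx))
print(adhoc_class("black", "cloud", ctx))
```
</pre>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>On this toy data "black" next to "coffee" is grouped with
"hot" and "strong", while "black" next to "cloud" is grouped with
"dark" and "grey". No single global class for "black" is ever
built.</span></font></p>

</div>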

<blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;

margin-left:4.8pt;margin-right:0in'>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><o:p> </o:p></span></font></p>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'>I'd also like to return to one of Rob's much earlier points: that there

is little previous work on ideas akin to his. I can see a few parallels between

Rob's suggestion and the work of Rodney Brooks (no relation) in the field of

behaviour-based robotics. Brooks claimed that robots with internal

representations of the world suffered because their models were perpetually

out-of-sync with the world. He suggested a "world as its own best

model" theory, where the robot operates on percepts obtained from the

world, and avoids internal representation. I see this as similar to Rob's

suggestion of keeping the corpus, which acts as our "world", and

avoiding a single-grammar abstraction that might not fully account for the

corpus. (I would agree with Diana Santos' claim that a corpus is only a sample

-- and an impoverished sample in terms of contextual information -- of the

world at a given time.)<o:p></o:p></span></font></p>

</blockquote>

<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><br>

That robot work sounds good. I agree, this sounds like the kind of thing I

mean.<br>

<br>

Language is a much better test bed for such ideas, though, because it is so accessible.

It is difficult to model the "world" of a robot.<br>

<br>

The "world" of language is just the corpus.<br>

<br>

But you are looking for precedent. <br>

<br>

There is of course Paul Hopper's "Emergent Grammar". I think this is

essentially right, but hesitate to mention it because somehow it is always

mis-interpreted. The idea of something which cannot be described in terms of

rules just seems to be too subtle, and perhaps Paul has not had the maths to

formalize it. For whatever reason, people always seem to identify his "emergence"

with "evolution" of grammar (which he specifically denies below.)

Read correctly I think the ideas are all there. Only an implementation is

missing: <br>

<br>

Here is his famous paper from 1987:<br>

<br>

<a href="http://eserver.org/home/hopper/emergence.html" target="_blank">http://eserver.org/home/hopper/emergence.html

</a><br>

<br>

>>><br>

I am concerned in this paper with ... the assumption of an abstract, mentally

represented rule system which is somehow implemented when we speak. <br>

<br>

...<br>

<br>

The notion of emergence is a pregnant one. It is not intended to be a standard

sense of origins or genealogy, not a historical question of 'how' the grammar

came to be the way it 'is', but instead it takes the adjective emergent

seriously as a continual movement towards structure, a postponement <br>

or 'deferral' of structure, a view of structure as always provisional, always

negotiable, and in fact as epiphenomenal, that is at least as much an effect as

a cause.<br>

<br>

...<br>

<br>

Structure, then, in this view is not an overarching set of abstract principles,

but more a question of a spreading of systematicity from individual words,

phrases, and small sets. <br>

>>><br>

<br>

In engineering terms all I have been able to find is an approach called

"similarity modeling". This had some success improving speech

recognition scores using crude bigrams (generalized ad-hoc) some years ago: <br>

<br>

e.g. <a href="http://citeseer.ist.psu.edu/dagan99similaritybased.html"

target="_blank">http://citeseer.ist.psu.edu/dagan99similaritybased.html</a> <br>

<br>

There is an earlier paper with the nicest quote. I think it is this one: <br>

<br>

Dagan, Ido, Shaul Marcus and Shaul Markovitch. Contextual word similarity and

estimation from sparse data, Computer Speech and Language, 1995, Vol. 9, pp.

123-152. <br>

<br>

p. 4:<br>

<br>

"In a general perspective, the similarity-based approach promotes an

"unstructured" point of view on the way linguistic information should

be represented. While traditional approaches, especially for semantic

classification, have the view that information should be captured by the

maximal possible generalizations, our method assumes that generalizations

should be minimized.  Information is thus kept at a maximal level of

detail, and missing information is deduced by the most specific analogies,

which are carried out whenever needed.  Though the latter view seems

hopeless for approaches relying on manual knowledge acquisition, it may turn

very useful for automatic corpus-based approaches, and better reflect the

nature of unrestricted language." <o:p></o:p></span></font></p>

</div>
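<div>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>To make the quoted idea concrete, here is a rough sketch of
similarity-based estimation for an unseen bigram. It is written in the spirit of
the approach and is not the authors' actual method: the cosine measure, the toy
corpus, and the names bigram_counts, cosine and similarity_estimate are all
assumptions made for illustration. The estimate of P(w2 | w1) is a
similarity-weighted average of P(w2 | w1') over other words w1' that share
contexts with w1, computed only when it is needed.</span></font></p>

<pre>
```python
import math
from collections import Counter, defaultdict


def bigram_counts(sentences):
    """Count how often each word w2 immediately follows each word w1."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for w1, w2 in zip(sent, sent[1:]):
            counts[w1][w2] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_estimate(w1, w2, counts):
    """Estimate P(w2 | w1) from the words most similar to w1, on demand."""
    target = counts.get(w1, Counter())
    total_weight = 0.0
    weighted = 0.0
    for other, seen in counts.items():
        if other == w1:
            continue
        weight = cosine(target, seen)
        if weight == 0.0:
            continue  # no shared contexts, so no analogy to draw on
        total_weight += weight
        weighted += weight * seen[w2] / sum(seen.values())
    return weighted / total_weight if total_weight else 0.0


# Toy corpus (an assumption): "powerful tea" never occurs.
sentences = [
    ["strong", "coffee"], ["strong", "tea"],
    ["powerful", "coffee"], ["powerful", "engine"],
    ["dark", "cloud"],
]
counts = bigram_counts(sentences)

print(similarity_estimate("powerful", "tea", counts))
print(similarity_estimate("powerful", "cloud", counts))
```
</pre>

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'>Here the unseen bigram "powerful tea" gets a non-zero estimate
because "strong" shares the context "coffee" with "powerful", whereas
"powerful cloud" stays at zero since no similar word precedes
"cloud".</span></font></p>

</div>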

<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:

12.0pt'><br>

I don't think they put this "similarity modeling" in a theoretical

context with Hopper at all. And as I say, they only applied this to the

estimation of bigrams. But as a crude example of ad-hoc estimation of

grammatical parameters it goes in the right direction. <br>

<br>

To go further you need to generalize the representation of non-terminal

elements so they too can be vectors of examples. I don't think that is

difficult. I've used a kind of "cross-product".<br>

<br>

I had a parser on-line for a while which worked quite well doing this.<br>

<br>

As I say, the principles are very much like what has been done already with

machine learning/grammatical induction. We can use a lot of that.<br>

<br>

I'm sure the main thing we need to change is only the assumed goal of a single

complete grammar. <br>

<br>

-Rob<o:p></o:p></span></font></p>

</div>

</div>

</body>

</html>