analysis: unhappiness

Daniel Everett dan at daneverett.org
Sun Sep 12 01:35:54 UTC 2010


To clarify, I actually *do* buy the distinction, not as a false dichotomy, nor as something that every linguist/psychologist should be concerned with, but in the way I just outlined.

Dan

On 11 Sep 2010, at 21:32, Daniel Everett wrote:

> Sorry, Matt, for being careless in my attribution of that distinction. I do believe that there is such a distinction, however, and that it is very important. If you are after careful and deep understanding of the nature of representations, then the 'in the head' focus is important. If you are more interested (always, occasionally, etc.) in how discourses, sentences, words, phrases, and so on are underwritten by, structured by, or merely interact with cultural values, then there is a sense in which the focus is on language outside the head, in the community.
> 
> Even cognition works this way, I think. For example, I know some things about trees. But even greater knowledge about trees, knowledge I can access, is found in my culture. In studying knowledge, I might want to know what people know individually, or I might want to know about the knowledge of cultures, some of which will not be known/mastered by any one member of the culture, and how that cultural knowledge affects, and is accessed by, individuals. Values are other things that can have both a psychological and a cultural existence.
> 
> Dan
> 
> 
> 
> 
> On 11 Sep 2010, at 21:26, Matthew S. Dryer wrote:
> 
>> 
>> Dan etc,
>> 
>> There have unfortunately been two sub-threads with the same subject heading "Re: [FUNKNET] analysis: unhappiness". Even I have found the Gibson-MacWhinney-Everett-Hudson subthread the more interesting one. I have pursued the other one only because it seemed that others might have misunderstood what I was trying to say.
>> 
>> Unfortunately, the following comment from Dan seems to illustrate further that I
>> haven't made myself clear:
>> 
>> "On another note, I don't buy the 'in my head' 'out of my head' distinction
>> either (that Matt seems to be urging upon us)."
>> 
>> BUT, it was Lise who urged that distinction on us.  The whole point of my emails
>> has been to deny such a distinction, to argue that the only reality is the "in
>> the head" one.  In fact, the gradual convergence of thinking in the Gibson-etc
>> subthread seems to reflect the idea that despite apparent differences, there is a
>> common underlying goal.
>> 
>> Matthew
>> 
>> On Fri 09/10/10 9:05 PM, Daniel Everett dan at daneverett.org sent:
>>> I think that Brian and Dick make excellent points. There are very good grammars written that could be improved by psycholinguistic experimentation and more quantitative approaches. But large sections of those grammars aren't going to change one bit (go-went) with quantitative tests, and such tests would be completely counterproductive given the shortness of life and the vastness of the field linguist's tasks.
>>> 
>>> Part of the problem is that linguistics is not simply a subdiscipline of psychology. Linguistics has its own objectives, and those only occasionally overlap with psychology. The same goes for methods.
>>> 
>>> On another note, I don't buy the 'in my head' 'out of my head' distinction either (that Matt seems to be urging upon us). We study different things and have different reasons for being satisfied with the results we achieve.
>>> 
>>> I believe that we linguists are often complacent and fail to apply better methods. But of course that applies to all disciplines.
>>> 
>>> In the meantime, checking corpora, collecting data through careful interviews with native speakers, and the other aspects of fieldwork are vital parts of the linguist's task, and much of this won't be improved by quantitative methods as we currently understand them. Maybe sometime.
>>> 
>>> Dan
>>> 
>>> P.S. In my original reference to Ted and Ev's paper, I said that they showed the danger of using intuitions. What I meant was the danger of using intuitions as standardly used by linguists. They convinced me that there is a lot to learn from quantitative methods.
>>> On 10 Sep 2010, at 19:40, Richard Hudson wrote:
>>> 
>>>> Dear Ted and Ev,
>>>> Yes, I understand your view, but I think it's a psycholinguist's view. Your goal is to find general processes and principles that apply uniformly across individuals, so you have to use methods to check for generality. And (as you know) I admire the way you pursue that goal. But my goal, as a linguist, is different. I want to explore the structure of a language so that I can understand how all the bits fit together. Like you, I'm aiming to model cognition, but my focus is on items and structures, and I start from the assumption that these can and do vary across speakers.
>>>> 
>>>> However, having said all that I do agree with you that linguists should all get used to collecting and using quantitative data; and with the help of Brian MacWhinney's typology we'd know what methods to use when. And I do agree with your points about bid/bidded: in cases like that, quantitative data would be at least a very good starting point for a proper investigation.
>>>> 
>>>> Best wishes, Dick
>>>> 
>>>> Richard Hudson www.phon.ucl.ac.uk/home/dick/home.htm
>>>> 
>>>> On 10/09/2010 19:30, Ted Gibson wrote:
>>>>> Dear Dick:
>>>>> 
>>>>> Perhaps we are talking at cross purposes. I don't understand what is confusing about what Ev Fedorenko and I are claiming. All we are saying is that if you have some testable claim involving a general hypothesis about a language, then you need to get quantitative data from unbiased sources to evaluate that claim. If you are interested in English past tense morphology, then depending on the claims that you might want to investigate, there are lots of ways to get relevant quantitative evidence. Corpus data will probably be useful. For very low frequency words, you can run experiments to test behavior with respect to such words.
>>>>> 
>>>>> Your example of the past tense of "bid" is a good one. You can run an experiment like the one you suggested to find out what people think the past tense is. If you then found that 20/50 people responded "bidded" and 30/50 responded "bid", that is a lot of useful information. As you suggest in your discussion, this result wouldn't answer the question of how past tense is stored in each individual. This result would be ambiguous among several possible explanations. One possibility is that the probability distribution that is being discovered reflects different dialects, such that 2/5 of the population has one past tense, and 3/5 has another. Another possibility is that each person has a similar probability distribution in their head, such that 2/5 of the time I respond one way, and 3/5 of the time I respond another. Further experiments would be necessary to decide between these and other possible theories (e.g., with repeated trials from the same person, carefully planned so that the participants don't notice that they are being asked multiple times). Without the quantitative evidence in the first place, there is no way to answer these kinds of questions.
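A minimal simulation of the repeated-trials logic described above, assuming Python; the 2/5 and 3/5 figures come from the example in the message, and all function names and parameters are illustrative, not from the thread:

    # Two hypotheses that produce the same ~20/50 split on a single trial,
    # but come apart under repeated trials from the same person.
    import random

    def dialect_speaker():
        # Hypothesis A: each speaker has one fixed form (2/5 say "bidded").
        form = "bidded" if random.random() < 0.4 else "bid"
        return lambda: form

    def stochastic_speaker():
        # Hypothesis B: every speaker says "bidded" with probability 2/5.
        return lambda: "bidded" if random.random() < 0.4 else "bid"

    def run(make_speaker, n_speakers=50, n_trials=10):
        speakers = [make_speaker() for _ in range(n_speakers)]
        responses = [[s() for _ in range(n_trials)] for s in speakers]
        # One trial per speaker: both hypotheses give roughly 40% "bidded".
        first = sum(r[0] == "bidded" for r in responses)
        # Repeated trials: only Hypothesis A yields fully consistent speakers.
        consistent = sum(len(set(r)) == 1 for r in responses)
        print(f"'bidded' on trial 1: {first}/{n_speakers}; "
              f"internally consistent speakers: {consistent}/{n_speakers}")

    random.seed(1)
    run(dialect_speaker)     # e.g. ~20/50 "bidded"; every speaker consistent
    run(stochastic_speaker)  # e.g. ~20/50 "bidded"; almost none consistent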
>>>>> Regarding the past tense of "go", this would be useful as a baseline in an experiment involving the less frequent ones. So, yes, it would be useful to gather quantitative evidence in such a case also, as a baseline with respect to the more interesting cases for theories.
>>>>> 
>>>>> The bottom line: if you have a generalization about a language that you wish to evaluate (such that you hypothesize that it is true across the speakers of the language), then you need quantitative evidence from multiple individuals, using an unbiased data collection method, to evaluate such a claim. The point about Mechanical Turk is that it is really *easy* to do this now, at least for languages like English.
>>>>> Best wishes,
>>>>> 
>>>>> Ted Gibson & Ev Fedorenko
>>>>> 
>>>>> On Sep 10, 2010, at 1:59 PM, Richard Hudson wrote:
>>>>>> Dear Ted,
>>>>>> Thanks for the very interesting comment, but are you REALLY saying that I shouldn't claim, for example, that the past tense of GO is "went" without first cross-checking with 50 native speakers?
>>>>>> 
>>>>>> Isn't there a danger of missing the point that we all, as native speakers, spend our whole lives scanning other people's linguistic behaviour (language 'out there', E-language) and trying to explain it to ourselves in terms of a language system (language 'in here', I-language)? So every judgement we make is based on thousands or millions of observed exemplars, and reflects a unique experience of E-language filtered through a unique I-language.
>>>>>> 
>>>>>> Given that view of language development, I don't see how quantitative data will help. Let's take a real uncertainty, such as the past tense of BID. If I want to say I did it, do I say "I bidded" or "I bid"? My judgement: I don't know. Ok, you get 50 people to oblige on Mechanical Turk, and 20 of them give "bidded" and 30 "bid". So what? Does that mean that the correct answer is "bidded"? Surely not. How is it better than my judgement? I agree you could record my speech and find how often I use each alternative; but the reason I don't know is precisely because it's a rare word, so in a sense quantitative data are irrelevant even there. What would solve the problem of subjectivity, of course, would be a machine for probing the bit of my mind (or even brain) that holds BID and its details; but I suspect that even that wouldn't move us much further forward than my original "don't know". (Incidentally I write as a fan of quantitative sociolinguistics, so I do accept that quantitative data are relevant to linguistic analysis in some areas, where the I-language phenomenon is frequent enough to produce usable data.)
>>>>>> 
>>>>>> It seems to me that this discussion raises the really fundamental question of what kind of thing we think language is: social or individual. The problem isn't unique to linguistics of course; it's the same throughout the social sciences. But what's special about linguistics is that we deal in very fine details of culture (e.g. details of how a particular word is used or pronounced) so the differences between individuals really matter. I don't see that we're ever going to have anything better than judgements to go on, so what we need is a way to ensure that judgements are accurate reports of individual I-language. A rotten situation for a science, but I don't see how it can get better.
>>>>>> 
>>>>>> Dick
>>>>>> 
>>>>>> Richard Hudson www.phon.ucl.ac.uk/home/dick/home.htm
>>>>>> On 10/09/2010 14:03, Ted Gibson wrote:
>>>>>>> Dear Dan, Dick:
>>>>>>> 
>>>>>>> I would like to clarify some points that Dan Everett makes, in response to Dick Hudson.
>>>>>>> 
>>>>>>> Ev Fedorenko and I have written a couple of papers recently (Gibson & Fedorenko, 2010, in press; see references and links below) on what we think are weak methodological standards in syntax and semantics research over the past many years. The issue that we address is the prevalent method in syntax and semantics research, which involves obtaining a judgment of the acceptability of a sentence / meaning pair, typically by just the author of the paper, sometimes with feedback from colleagues. As we address in our papers, this methodology does not allow proper testing of scientific hypotheses because of (a) the small number of experimental participants (typically one); (b) the small number of experimental stimuli (typically one); (c) cognitive biases on the part of the researcher and participants; and (d) the effect of the preceding context (e.g., other constructions the researcher may have been recently considering). (As Dan said, see Schutze, 1996; Cowart, 1997; and several others cited in Gibson & Fedorenko, in press, for similar points, though with not as strong a conclusion as ours.)
>>>>>>> 
>>>>>>> Three issues need to be separated here: (1) the use of intuitive judgments as a dependent measure in a language experiment; (2) potential cognitive biases on the part of experimental subjects and experimenters in language experiments; and (3) the need for obtaining quantitative evidence, whatever the dependent measure might be. The paper that Ev and I wrote addresses the last two issues, but does not go into depth on the first issue (the use of intuitions as a dependent measure in language experiments). Regarding this issue, we don't think that there is anything wrong with gathering intuitive judgments as a dependent measure, as long as the task is clear to the experimental participants.
>>>>>>> 
>>>>>>> In the longer paper (Gibson & Fedorenko, in press) we respond to some arguments that have been given in support of continuing to use the traditional non-quantitative method in syntax / semantics research. One recent defense of the traditional method comes from Phillips (2008), who argues that no harm has come from the non-quantitative approach in syntax research thus far. Phillips argues that there are no cases in the literature where an incorrect intuitive judgment has become the basis for a widely accepted generalization or an important theoretical claim. He therefore concludes that there is no reason to adopt more rigorous data collection standards. We challenge Phillips' conclusion by presenting three cases from the literature where a faulty intuition has led to incorrect generalizations and mistaken theorizing, plausibly due to cognitive biases on the part of the researchers.
>>>>>>> 
>>>>>>> A second argument that is sometimes presented for the continued use of the traditional non-quantitative method is that it would be too inefficient to evaluate every syntactic / semantic hypothesis or phenomenon quantitatively. For example, Culicover & Jackendoff (2010) make this argument explicitly in their response to Gibson & Fedorenko (2010): "It would cripple linguistic investigation if it were required that all judgments of ambiguity and grammaticality be subject to statistically rigorous experiments on naive subjects, especially when investigating languages whose speakers are hard to access" (Culicover & Jackendoff, 2010, p. 234). (Dick Hudson makes a similar point earlier in the discussion here.) Whereas we agree that in circumstances where gathering data is difficult, some evidence is better than no evidence, we do not agree that research would be slowed with respect to languages where experimental participants are easy to access, such as English. On the contrary, we think that the opposite is true: the field's progress is probably slowed by not doing quantitative research.
>>>>>>> Suppose that a typical syntax / semantics paper that lacks quantitative evidence includes judgments for 50 or more sentence / meaning pairs, corresponding to 50 or more empirical claims. Even if most of the judgments from such a paper are correct or are on the right track, the problem is in knowing which judgments are correct. For example, suppose that 90% of the judgments from an arbitrary paper are correct (which is probably a high estimate). (Colin Phillips and some of his former students / postdocs have commented to us that, in their experience, quantitative acceptability judgment studies almost always validate the claim(s) in the literature. This is not our experience, however. Most experiments that we have run which attempt to test some syntactic / semantic hypothesis in the literature end up providing us with a pattern of data that had not been known before the experiment (e.g., Breen et al., in press; Fedorenko & Gibson, in press; Patel et al., 2009; Scontras & Gibson, submitted).) This means that in a paper with 50 empirical claims, 45 are correct. But which 45? There are 2,118,760 ways to choose 45 items from 50. That's over two million different theories. By quantitatively evaluating the empirical claims, we reduce the uncertainty a great deal. To make progress, it is better to have theoretical claims supported by solid quantitative evidence, so that even if the interpretation of the data changes over time as new evidence becomes available, as is often the case in any field of science, the empirical pattern can be used as a basis for further theorizing.
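As a check on the arithmetic above: the number of ways to pick which 45 of the 50 claims are the correct ones is C(50,45) = C(50,5) = 2,118,760. A one-line verification, assuming Python 3.8+ for math.comb:

    import math
    # Ways to choose which 45 of 50 judgments are the correct ones.
    print(math.comb(50, 45))  # 2118760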
>>>>>>> Furthermore, it is no longer expensive to run behavioral experiments, at least in English and other widely spoken languages. There now exists a marketplace interface, Amazon.com's Mechanical Turk, which can be used for collecting behavioral data over the internet quickly and inexpensively. The cost of using an interface like this is minimal, and the time that it takes for the results to be returned is short. For example, currently on Mechanical Turk, a survey of approximately 50 items will be answered by 50 or more participants within a couple of hours, at a cost of approximately $1 per participant. Thus a survey can be completed within a day, at a cost of less than $50. (The hard work of designing the experiment and constructing controlled materials remains, of course.)
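A back-of-the-envelope version of that budget, with the message's figures made explicit as assumptions (both numbers come from the text above, not from any Mechanical Turk API):

    # Rough survey budget using the quoted figures: 50 participants at ~$1 each.
    participants = 50
    rate_per_participant = 1.00  # dollars, an assumption from the message
    print(f"estimated cost: ${participants * rate_per_participant:.2f}")  # $50.00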
>>>>>>> Sorry to be so verbose. But I think that these methodological points are very important.
>>>>>>> 
>>>>>>> Best wishes,
>>>>>>> 
>>>>>>> Ted Gibson
>>>>>>> 
>>>>>>> Gibson, E. & Fedorenko, E. (In press). The need for quantitative methods in syntax and semantics research. Language and Cognitive Processes. http://tedlab.mit.edu/tedlab_website/researchpapers/Gibson & Fedorenko InPress LCP.pdf
>>>>>>> 
>>>>>>> Gibson, E. & Fedorenko, E. (2010). Weak quantitative standards in linguistics research. Trends in Cognitive Science, 14, 233-234. http://tedlab.mit.edu/tedlab_website/researchpapers/Gibson & Fedorenko 2010 TICS.pdf
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> Dick,
>>>>>>>> 
>>>>>>>> You raise an important issue here about methodology. I believe that intuitions are a fine way to generate hypotheses and even to test them - to a degree. But while it might not have been feasible for Huddleston, Pullum, and the other contributors to the Cambridge Grammar to conduct experiments on every point of the grammar, experiments could only have made the grammar better. The use of intuitions, corpora, and standard psycholinguistic experimentation (indeed, Standard Social Science Methodology) is vital for taking the field forward and for providing the best support for different analyses. Ted Gibson and Ev Fedorenko have written a very useful new paper showing serious shortcomings with intuitions as the sole source of evidence: "The need for quantitative methods in syntax and semantics research".
>>>>>>>> 
>>>>>>>> Carson Schutze and Wayne Cowart, among others, have also written convincingly on this.
>>>>>>>> 
>>>>>>>> It is one reason that a team from Stanford, MIT (Brain and Cognitive Science), and researchers from Brazil are beginning a third round of experimental work among the Pirahas, since my own work on the syntax was, like almost every other field researcher's, based on native speaker intuitions and corpora.
>>>>>>>> 
>>>>>>>> The discussion of methodologies reminds me of the initial reactions to Greenberg's work on classifying the languages of the Americas. His methods were strongly (and justifiably) criticized. However, I always thought that his methods were a great way of generating hypotheses, so long as they were ultimately put to the test of standard historical linguistics methods. And the same seems true for the use of native-speaker intuitions.
>>>>>>>> -- Dan
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>>> We linguists can add a further layer of explanation to the judgements, but some judgements do seem to be more reliable than others. And if we have to wait for psycholinguistic evidence for every detailed analysis we make, our whole discipline will immediately grind to a halt. Like it or not, native speaker judgements are what put us linguists ahead of the rest in handling fine detail. Imagine writing the Cambridge Grammar of the English Language (or the OED) without using native speaker judgements.
>>>>>>>>> 
>>>>>>>>> Best wishes, Dick Hudson


