form versus meaning

Elizabeth Bates bates at CRL.UCSD.EDU
Mon Jan 13 18:41:54 UTC 1997


My computer went down this morning in the middle of a message, which seems
to have gone out anyway.  On the questionable assumption that
Half a Harangue is not better than none, I append the full text,
with apologies for troubling everyone 1.5 times to make the same
points. -liz

In response to Dan Everett's message about the priority of linguistics:
my research as a psycholinguist and developmentalist focuses primarily
on morphology, syntax, and the lexicon.  In the psycholinguistic work,
I am asking how these form-meaning mappings (and form-form mappings)
are processed in real time, and the results have what I believe to be
crucial implications for our understanding of HOW THEY ARE REPRESENTED
IN THE MIND.  How is that different, other than in methodology, from
the work that is conducted in "linguistics proper"?  I simply reject
Dan's premise that linguists are looking directly at language while
the rest of us are squinting sideways.

And now let's talk for just a moment more about the use of grammaticality
judgment as a way of staring directly at language....I'd like to make
four quick points.  The first two are NOT examples of psychology trying
to have hegemony over linguistics.  Rather, they pertain to strictures
on methodology that hold (or should hold) in every social science, as
well as in agriculture, industry, and anywhere else the investigator
wants to draw inferences that generalize beyond the sample in question.
The last two are more psychological in nature, but I believe that they
have implications for the most central questions about structure.

1. Representativeness of the data base.  If you want to know how your
corn crop is going to fare, it is widely acknowledged in agriculture
that it would be unwise to look at the four plants right outside your
window (assuming this isn't your whole crop....).  A truism of all
empirical science is that the data base from which we draw our
generalizations should be representative (either through random
sampling, or careful construction of the data base a priori) of
the population to which we hope to generalize.  In research
on language, this constraint holds at two levels: in the human subjects
that we select (e.g. the people who are giving the grammaticality
judgments) and in the linguistic materials we choose to look at (e.g.
the sentences selected/constructed to elicit grammaticality judgments).

These strictures are typically ignored, as best I can tell, in the
day-to-day work of theoretical linguists who rely on grammaticality
judgments as their primary data base.
In fact, we have known since the 1970s that the grammaticality judgments
made by linguists do not correlate very well with the judgments made
by naive native speakers.  These differences have been explained away
by claiming that naive native speakers don't really know what they are
doing, and that only linguists know how to strip away the irrelevant
semantic, pragmatic or performance facts and focus their judgments on
the structures they really care about.  That defense, in turn,
presupposes a theory of the boundary conditions on those facts --
introducing, I should think, a certain circularity into the
relationship between theory and data.  In any case, when the judges
are a very restricted set, the assumption that one can generalize to
"native speaker competence" may be at risk.  Instead of a theory of
grammar, we
may have a theory of grammar in Building 10.  At this point I should
stress that ALL the sciences studying language have problems of
generalizability.  In psycholinguistics, we want to generalize to
all normal adult native speakers of the language in question, but
most of our data come from middle class college sophomores.  In
developmental psycholinguistics, we want to generalize to all normal
children who are acquiring this language, but are usually stuck with
data from those middle class children willing to sit through our
experiments, which means that we may have a theory of language in
the docile child....In short, I am not proposing that only linguists
have this problem, but I think the problem of generalizability may be
more severe if grammaticality judgments come from only a handful
of experts.
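
To make the worry concrete, here is a minimal sketch, in Python and
with entirely invented numbers (not data from any actual study), of
the kind of check one could run: compare per-sentence acceptance rates
from a handful of expert judges against a larger sample of naive
speakers, and ask how well the two groups agree.

# Hypothetical acceptability judgments (1 = acceptable, 0 = not) for
# ten test sentences.  "experts" is a small panel of linguists;
# "naive" is a larger sample of untrained native speakers.
# All numbers are invented for illustration.
experts = [
    [1, 1, 0, 0, 1, 0, 1, 0, 1, 0],   # judge 1
    [1, 1, 0, 0, 1, 0, 1, 0, 1, 0],   # judge 2
    [1, 0, 0, 0, 1, 0, 1, 0, 1, 0],   # judge 3
]
naive = [
    [1, 1, 1, 0, 1, 0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0, 0, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1, 0, 1, 1, 1, 0],
]

def acceptance_rate(judges):
    """Per-sentence proportion of judges who accepted the sentence."""
    n = len(judges)
    return [sum(col) / n for col in zip(*judges)]

def pearson_r(x, y):
    """Plain Pearson correlation, no external libraries needed."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

expert_rates = acceptance_rate(experts)
naive_rates = acceptance_rate(naive)

print("expert vs. naive acceptance rates, per sentence:")
for i, (e, v) in enumerate(zip(expert_rates, naive_rates), 1):
    print(f"  sentence {i}: experts {e:.2f}  naive {v:.2f}")
print("correlation between groups:", round(pearson_r(expert_rates, naive_rates), 2))

If the correlation is high, the restricted panel may be a reasonable
stand-in for the larger population; if it is low, the generalization
to "native speaker competence" is exactly what is at risk.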

2. Reliability of the data base.  If you weigh your child on your
bathroom scales twice in a row, and get a reading of 50 pounds
on one measurement and 55 pounds on the next, you need to worry
about the reliability of your instrument, i.e. the extent to which
it correlates with itself in repeated measurements.  Reliability is
a serious problem in every science, and is often the culprit when
results don't replicate from one laboratory to another (or from
one experiment to another in the same laboratory).  My experience
in graduate courses in syntax and other limited exposure to
theoretical linguistics suggests to me that there may be a
reliability problem in the use of grammaticality judgments.
Even with the same restricted set of judges and similar sentence
materials (see above), a sentence that is ungrammatical at 4 p.m.
may become grammatical by 6 o'clock, at the end of a hard day.
To be sure, there are many kinds of errors that EVERYONE agrees
about, EVERY time they are presented.  But these clear cases are
not the ones that drive the differences between formal theories, as
best I can tell.  Theoretical shifts often seem to depend on the
more subtle cases -- the very ones that are most subject to the
reliability problem.  And of course, reliability interacts
extensively with the representativeness problem described above
(i.e. performance on one half of the target structures in a given
category may not correlate very highly with performance on the
other half, even though they are all supposed to be about the
same thing...).
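
For anyone who wants the reliability notion pinned down, here is a
minimal sketch, again in Python and again with invented ratings, of a
simple test-retest check: the same judges rate the same sentences on
two occasions, and reliability is just the correlation between the
two sessions.

# Invented 7-point acceptability ratings for the same 8 sentences,
# collected from the same judges in two separate sessions.
# Test-retest reliability is the correlation between the two columns.
from statistics import correlation   # requires Python 3.10+

session_1 = [7, 6, 2, 5, 1, 4, 6, 3]
session_2 = [7, 4, 3, 6, 1, 2, 7, 5]

r = correlation(session_1, session_2)
print(f"test-retest reliability (Pearson r): {r:.2f}")

# A common rule of thumb in psychometrics is to want r around .8 or
# better before trusting an instrument; with subtle judgments near
# the boundary of acceptability, correlations can fall well below that.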

3. Timing.  In a recent paper in Language and Cognitive Processes,
Blackwell and Bates looked at the time course of grammaticality
judgment, i.e. the point at which a sentence BECOMES ungrammatical
for naive native speakers.  The punchline is that there is tremendous
variability over sentences and over subjects in the point at which
a sentence becomes "bad", even for sentences on which (eventually)
everyone agrees that a violation exists.  For some error types, it is
more appropriate to talk about a "region" in which decisions are made,
a region that may span a lot of accumulating structure.  This is
relevant not only to our understanding of grammaticality judgment
as a psychological phenomenon, but also to our understanding of
the representations that support such judgments: if two individuals
decide that a sentence is "bad" at completely different points (early
vs. late), then it follows that they are using very different
information to make their decision, a fact that is surely relevant
for anyone's theory of well-formedness.
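
The idea of a decision "region" can also be made concrete.  The
following minimal sketch (Python, invented numbers, not Blackwell and
Bates' actual data or procedure) summarizes, for each sentence, the
spread in the word position at which different listeners first call
it ungrammatical.

# For each sentence, the word position at which each of five listeners
# first judged it ungrammatical (None = never rejected it).
# All values are invented for illustration.
rejection_points = {
    "sentence A": [3, 3, 4, 3, 5],        # tight agreement
    "sentence B": [2, 6, 7, 2, 9],        # wide "region" of decision
    "sentence C": [5, None, 6, 8, None],  # some listeners never reject
}

for sent, points in rejection_points.items():
    hits = [p for p in points if p is not None]
    if not hits:
        print(f"{sent}: never rejected by any listener")
        continue
    print(f"{sent}: rejected by {len(hits)}/{len(points)} listeners, "
          f"earliest at word {min(hits)}, latest at word {max(hits)}, "
          f"span {max(hits) - min(hits)} words")

A wide span for a sentence that everyone eventually rejects is exactly
the situation described above: agreement on the verdict, but
disagreement about the information that triggered it.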

4. Context effects.  Finally, there are multiple studies showing that
violations interact, with each other and with the rest of the sentence
and discourse context.  A sentence that is "bad" in one context may be
"good" in another, and a sentence that is "bad" with one set of
lexical items may become "good" with a slightly different set, even
though those two sets do not differ along what are supposed to be the
grammatically relevant dimensions (i.e. we substitute a transitive
verb for a transitive verb, an animate noun for an animate noun,
and so forth).

My point is NOT to denigrate linguistic methodology, because I have
nothing to offer that is better.  But I think the above problems
should make us worry a lot about a "core" theory that is built
exclusively out of one kind of data.  To go back to the point I made
in my first volley in this discussion (which should probably be my
last, to round things out): we need all the constraints we can get,
all the data we can get, all the methods we can find, and it is not
yet the moment to declare that any of these methods or fields has
priority over the others. -liz bates


