Hypothesis formation vs. testing

Tue Aug 24 15:03:28 UTC 1999

On Thu, 19 Aug 1999 ECOLING at aol.com wrote:

> Back to Trask:

>> Now, in order to go about this, I maintain, [Ruhlen] should start with the
>> negation of this statement as his null hypothesis, and then go on to
>> show that there is so much evidence against this null hypothesis that it
>> is untenable and must be rejected.  But that's not what he does.

> The last paragraph above is in complete contradiction to what Larry
> Trask says he agrees with ("I fully agree"...).  If one believes it
> is not possible to test a proposition, then it is NOT REASONABLE to
> ask anyone else to test it.  One cannot have this both ways.

Not at all.

I myself do not believe that monogenesis can possibly be investigated by
purely linguistic methods, and so I have no interest at all in trying to
investigate it.

Ruhlen, in great contrast, *does* believe that monogenesis can be
investigated by linguistic methods.  OK, fine: he's entitled to believe
this if he wants.  But, then, if he wants to conduct such an
investigation, he must go about this in a principled and rigorous
manner.  However, he doesn't -- not at all.  So I have every right to
criticize his work, not because his beliefs are different from mine, but
because his procedures are unacceptable.

I have *not* asked Ruhlen to test monogenesis.  I have merely said that
(a) I don't believe it can be tested, and (b) Ruhlen's procedures are
wholly inadequate for conducting a test of anything.  No contradiction
there.

[LT]

>> Instead, he *starts* with the hypothesis `All languages are related',
>> and then proceeds to assemble what he sees as evidence in support of
>> this last hypothesis.  Amazingly enough [;-)], he is able to find such
>> evidence.

> So far, this is legitimate in principle [but in practice, see below]
> IF the purpose is to establish the plausibility of a hypothesis
> (as distinct from testing it, NOTICE!).

But this is absolutely *not* what Ruhlen believes he is doing.  Read
what he has written.  Ruhlen plainly believes that he has not merely
tested monogenesis but proved it.  He says so in plain English.

> This is how almost all hypotheses are first established as hypotheses,
> simply by accumulating suggestive, anecdotal, case-study evidence,
> in contexts in which we do not even know how to estimate chance
> very well.

But we can certainly estimate chance well enough to include estimates of
chance in our investigations, crude though these may be at present.
Ruhlen, however, simply excludes chance altogether in his work.  As far
as he is concerned, chance resemblances arise so rarely that they can be
*completely* discounted, and therefore *all* resemblances must be
cognates.  This is not a secret: look at what he does, or merely read
pp. 12-14 of his book The Origin of Language (Wiley).
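
To make concrete what a crude estimate of chance might look like, here
is a minimal sketch of a permutation test.  It is an illustration only,
not a procedure taken from Ruhlen or from any published study.  The two
word lists are invented and belong to no real language, and the
`looks_alike' criterion (same first letter) is a deliberately naive
assumption.  The null hypothesis is that matches between words of the
same meaning are no commoner than matches between words of unrelated
meanings.

```python
import random

# Invented toy data: position i in each list stands for the same meaning.
# These forms are made up for illustration and come from no real language.
lang_a = ["tak", "mun", "pel", "sor", "kin", "nal", "tup", "mir"]
lang_b = ["tek", "nor", "pal", "kus", "sin", "mol", "rap", "tim"]


def looks_alike(a: str, b: str) -> bool:
    """Hypothetical, deliberately crude criterion: same first letter."""
    return a[0] == b[0]


def count_matches(words_a, words_b):
    """Count positions at which the paired words look alike."""
    return sum(looks_alike(a, b) for a, b in zip(words_a, words_b))


def permutation_p_value(words_a, words_b, trials=2000, seed=0):
    """Estimate how often chance alone does at least as well.

    Shuffling words_b breaks the link between form and meaning, so the
    shuffled match counts approximate the null (chance) distribution.
    """
    rng = random.Random(seed)
    observed = count_matches(words_a, words_b)
    shuffled = list(words_b)
    at_least = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if count_matches(words_a, shuffled) >= observed:
            at_least += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least + 1) / (trials + 1)


observed, p = permutation_p_value(lang_a, lang_b)
print(f"observed matches: {observed}, estimated p-value: {p:.3f}")
```

Only if the estimated p-value is small would the null hypothesis of
chance be rejected.  A procedure that never computes anything of this
kind has no way of telling cognates from accidents.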

> The contradictory of the strong claim (all related) is that there
> are at least two languages which are not related to each other
> genetically. I would doubt that Ruhlen had evidence to exclude this
> possibility, or that if asked clearly, he would say so.

You don't have to wonder about Ruhlen's position.  Read what he says on
p. 213 of his book, for example.  Ruhlen asserts that it is *proved*
that all languages are related.

> After all (trivially) there are languages for which there are only
> one or two words attested, and one can go on from there with very
> little work to find other cases where I think Ruhlen would grant
> there is not even a loose probability based on the data itself to
> establish any relationship.

I'm afraid you are putting words into Ruhlen's mouth.  Ruhlen, so far as
I am aware, has *never* admitted in print that there exist any languages
at all which are certainly, probably or even possibly not related to all
the others.  If anybody can locate such a passage in R's published work,
I will be glad to hear about it.

[LT]

>> This fundamental failure to understand proper methodology is enough to
>> render Ruhlen's work vacuous,

> Not so, since Ruhlen can be treated as involved in hypothesis
> FORMATION not hypothesis testing.

This is plainly *not* what Ruhlen sees himself as doing.

[LT]

>> quite apart from the vast number of
>> egregious errors in the material he cites as evidence,

> Now THAT is quite another matter, and when present in very large
> quantity, not merely slight differences from the analysis an expert in
> a particular language would offer but more serious, complete
> misunderstandings vitiating completely any use of particular data...
> it does discredit the work as a whole, and can quite legitimately,
> even without absolute proof of its wrong-headedness,
> lead reasonable people to pay no more attention to it.
> But note carefully the caveat above.  It is NOT sufficient merely to
> provide minor improvements of detail to the presentation,
> to discredit the work.  An expert can ALWAYS provide minor
> improvements.  That itself shows nothing at all.

We are not talking about minor improvements.  At least as far as Basque
is concerned, the errors in Ruhlen's data are so awful as to be beyond
salvation.  Anyway, he never pays any attention to my corrections: he
just tells me I must be wrong because his comparisons are so compelling.

[LT]

>> and quite apart
>> from his failure to realize that lookalikes do not constitute evidence
>> of any kind.

> Disagree flatly, unless defined circularly so that "lookalikes"
> means more than it says, namely so that it means "lookalikes which
> are known to be unrelated as cognates".

No.  By `lookalikes', I mean words or morphemes which, in the opinion of
some investigator, are similar in form and meaning.  This is in no way
circular, though it is certainly highly subjective, at least until
tightened up by the provision of fully explicit criteria for adjudging
similarity.

> If it actually means "items which look alike in sound and meaning",
> then of course such comparisons DO constitute PRELIMINARY evidence.

Sure, but this isn't very interesting.  We can always find lookalikes
between any arbitrary languages.
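
The arithmetic behind this can be sketched as follows.  Both figures
are assumptions chosen purely for illustration, not measurements: a
chance of 0.01 that any one pair of words passes as a lookalike, and
500 pairs compared.  The sketch also assumes the comparisons are
independent, which real data will not strictly satisfy.

```python
def prob_at_least_one(p: float, n: int) -> float:
    """Probability of one or more chance lookalikes in n independent
    comparisons, each succeeding by chance with probability p."""
    return 1.0 - (1.0 - p) ** n


def expected_lookalikes(p: float, n: int) -> float:
    """Expected number of chance lookalikes in n comparisons."""
    return n * p


# Assumed, illustrative figures only.
p_single = 0.01
comparisons = 500

print(prob_at_least_one(p_single, comparisons))
print(expected_lookalikes(p_single, comparisons))
```

Under these assumed figures a handful of lookalikes is the expected
outcome of chance alone, and loosening the criteria for similarity only
raises the number.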

> Any such preliminary evidence can be discounted by showing
> that the resemblances are secondary and late,
> or that they manifest a type of sound symbolism,
> or in other ways.

All of which I have in fact done, in Ruhlen's case, to no avail.

> It was lookalikes in grammar and vocabulary which led to
> the original hypothesis of the relatedness of the Indo-European
> languages.  Some of these turned out to be true cognates,
> some turned out not to be cognates, merely chance lookalikes.
> But the IE hypothesis thus preliminarily established withstood the
> discounting of some of the lookalikes as non-cognates
> and the reaffirmation of others as true cognates (whatever the
> terminology used at the time).

Sorry, but I can't see this as a fair characterization of the discovery
of IE.  What did the trick was not just miscellaneous lookalikes but the
observation of shared morphological paradigms.

> Once again, I wish to urge us back to the FACTS.
> And those FACTS include whatever we can establish about how
> each of our tools works, where it works well and where it fails,
> how deep historically each tool can push us with languages
> of certain types or with language changes of certain types,
> and whatever we can establish about new tools we have not yet
> systematically used (such as explicit paths of historical
> change in sound systems and in semantic spaces, and metrics
> of distances along such paths of change...).

> We get nowhere by repeating the discrediting of STRAW MAN
> claims, by holding hypothesis formation to standards of absolute
> hypothesis testing, by counting minor corrections and improvements
> to data as completely discrediting use of the data when they do not,
> etc. etc. and so forth.

> The field is at an impasse in these discussions,
> until we return the discussion to an empirical basis.
> Pure philosophy will not get us much progress.

And neither will the accumulation of lookalikes.

Larry Trask
COGS
University of Sussex
Brighton BN1 9QH
UK

larryt at cogs.susx.ac.uk