# STATISTICS IN LINGUISTICS

Patrick C. Ryan proto-language at email.msn.com
Wed Feb 3 01:33:57 UTC 1999

```[ moderator re-formatted ]

Dear Steve and IEists:

-----Original Message-----
From: X99Lynx at aol.com <X99Lynx at aol.com>
Date: Tuesday, February 02, 1999 2:24 AM

>I wrote:

><<Predictive power is based on an effective understanding of cause and effect
>relationships.>>

>Patrick C. Ryan replied:

><<Why does it seem to be so hard to understand that the relationship of cause
>and effect is statistical?>>

>Because it is not true.

>The relationship of cause and effect is physical or chemical or cultural, etc.
>Statistics can be extremely useful in establishing such relationships.  But
>I'm afraid statistics do not equal cause and effect.

You have said what I am saying in other words:

1) I said: "the RELATIONSHIP of cause and effect is statistical";

2) you said: "statistics can be extremely useful in establishing such
relationships".

I have NOT said that "statistics (. . .) equal cause and effect". I am not
even sure what that is supposed to mean in English.

><<If you have Cause A, and you predict successfully Effect B, every time, then
>the relationship is <1> or 100% PROBABILITY.>>

>You are definitely jumping the gun here.  You are already telling me there is
>a cause and effect relationship BEFORE YOU'VE PROVEN IT.  You are presuming
>cause and effect before you have statistically shown it.

Above, I said that the coincidence of A and B was 100%. That 100% defines
causality.

>The best you can say here is that if A occurs and then B occurs, every time,
>there is some probability that A causes B.

Wrong. If A then B, every time, and there is no reason to think that will
ever change, then, whether we ever correctly understand the modality of the
causation, there is a causal relationship between A and B.

>HOWEVER, if your assumptions are flawed, you are not proving cause and effect
>with this.  All that this demonstrates is a 100% CORRELATION. But NO cause and
>effect relationship has been established.  And this should not be hard to
>understand.

What I find amazing is that you would think a "100% CORRELATION" does not
establish a cause and effect relationship.

>The classic classroom example is: EVERY TIME you see people carrying umbrellas,
>it ends up raining.  Based on that, you conclude that umbrellas cause rain.
>(Every time equals "100% probability.")

Sophomoric! The correct causal relationship is:

1. Whenever there is a perceived prospect of rain (A), people carry
umbrellas (B).

>Even a very high correlation does not equal causation.

I spoke only of a 100% correlation.

>This is very important in a field like historical linguistics, where you do
>not have an independent variable to manipulate and therefore don't have the
>hard experimental controls you get in a lab.

Why is the prospect of analyzing linguistic data rigorously, employing
mathematical models, so frightening to you?

>With improper analysis, statistics are not just worthless.
>They are damaging.

Statistics are never worthless. But like anything in this world, they can be
poorly interpreted, and improperly applied.

>And of course the other thing that is inaccurate is "100% probability".  Until
>there is an end of time, there is no such thing.  Because no matter how many
>"n" times A leads to B, there is always "n + 1."  If you want to claim it, the
>best you get is 99% in this world.

I can see why you prefer not to deal with mathematical models. If you have
100 trials, and the same cause has the same effect, the probability of the
cause creating the same effect again is 100%. Not 99%. Not 98%. Infinity is
not a factor in this equation.

>As far as historical linguistics goes, statistical analysis could be a very
>powerful tool.

Yes. Why not use it?

>But all it is is a tool.

So?

>And if its limitations are misunderstood, it can be and has been used to prove
>all kinds of nonsense.

Ah, there is the real crux. Someone like Ringe comes up with properly
conceived math, and strange conclusions.

There are no limitations to statistics. There are only limitations of the
abilities of the people who employ statistics. These same limitations will
appear to affect results adversely no matter what "tools" are used.

GCOG: Garbage trucks carry only garbage.

Pat

PATRICK C. RYAN <PROTO-LANGUAGE at email.msn.com>
(501) 227-9947; FAX/DATA (501)312-9947
9115 W. 34th St. Little Rock, AR 72204-4441 USA
WEBPAGES: <http://www.geocities.com/Athens/Forum/2803>
and PROTO-RELIGION:
<http://www.geocities.com/Athens/Forum/2803/proto-religion/indexR.html>
"Veit ek, at ek hekk, vindga meiði, nætr allar níu,
geiri undaðr . . . á þeim meiði er mangi veit
hvers hann af rótum renn." (Hávamál 138)

```
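The exchange about 100 trials and "n + 1" can be put in numbers. The sketch below is an illustration only, not something either correspondent proposed: it compares the observed frequency after 100 successes in 100 trials with Laplace's rule of succession, which assumes a uniform prior over the unknown success probability. The function names `observed_frequency` and `rule_of_succession` are hypothetical, chosen for this example.

```python
from fractions import Fraction


def observed_frequency(successes: int, trials: int) -> Fraction:
    """Share of trials in which the effect followed the cause."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    return Fraction(successes, trials)


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's estimate that the NEXT trial succeeds.

    Assumption: a uniform prior over the unknown success probability,
    which gives (successes + 1) / (trials + 2).
    """
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need trials >= 0 and 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


if __name__ == "__main__":
    n = 100
    # Frequency seen so far: every one of the 100 trials succeeded.
    print("observed frequency:", observed_frequency(n, n))
    # Estimate for trial n + 1 under the uniform-prior assumption.
    print("rule of succession:", rule_of_succession(n, n))
```

Under that assumption the observed frequency is exactly 1, while the estimate for the next trial is 101/102, which is close to 1 but not equal to it. Which of the two figures deserves to be called "the probability" is the point in dispute above; a different prior would give a different second number.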