[Corpora-List] Help in Applying Appropriate Statistical Test and Its Interpretation

Hardie, Andrew a.hardie at lancaster.ac.uk
Mon Jun 28 16:10:19 UTC 2010


Hi,

There are a couple of things to be aware of in what you’re doing here.

First, you are not interpreting the outcome of the test appropriately when you say

>> the difference between the two variables (Double Object and To Dative) is significant

Actually that’s not what you’ve shown. Rather, the test shows that there is sufficient evidence to reject the hypothesis that there is no difference among the different genres in your table in terms of their distribution of Double Obj and To-Dative across ditransitives. That’s not the same thing! Plus, showing that the null hypothesis can be rejected, and going from that to a meaningful interpretation that tells you something about what you are trying to investigate, are two separate things, for reasons outlined in Adam’s paper.

If you are actually trying to test whether or not there is a difference between the frequency of D.O and to-dative, then there is no reason to test this on a table with frequencies broken down by genre. Indeed, if the only thing you’re interested in is the difference between (total n. Double Object) and (total n. to-Dative), then chi-square and related tests are very likely not what you want.

Moving along, you have applied chi-square to a table with very low frequencies – this is generally considered a Bad Idea; the usual rule of thumb is that chi-square is to be avoided when expected frequencies are less than 5. But you have about a dozen rows with row-totals less than 10, so it is very likely that you have quite a few cells with expected frequencies less than 5 (I’ve not done the sums but e.g. the expected values for ALT and MNU must obviously be less than 5). If you really want to test this table, you probably need the Fisher Exact test.
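The rule of thumb can be checked directly. What follows is a minimal sketch in Python (not the R used in the original question), assuming the counts are exactly those in the table quoted from the original message. Each expected count is computed as row total × column total / grand total; the names `observed`, `expected_counts` and `low_cells` are chosen here for illustration only.

```python
# Observed counts per genre, copied from the table in the original question:
# genre -> (double object, to-dative)
observed = {
    "ALT": (0, 4), "ART": (210, 344), "BKS": (335, 308), "BLT": (2, 7),
    "BRU": (4, 2), "CLM": (108, 303), "CST": (0, 7), "DIR": (1, 7),
    "EDT": (8, 32), "FTW": (23, 14), "INT": (38, 44), "LDS": (7, 53),
    "LTR": (35, 92), "MGP": (2, 5), "MNF": (3, 6), "MNU": (0, 1),
    "NLT": (7, 23), "NVL": (5, 3), "NWS": (24, 108), "OLT": (44, 9),
    "PLC": (0, 1), "PRS": (11, 22), "RPR": (19, 60), "RPT": (4, 17),
    "SRY": (0, 7), "STR": (76, 36), "THS": (20, 36), "TRN": (30, 19),
    "WWW": (16, 30),
}


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / N."""
    col_totals = [sum(row[j] for row in table.values()) for j in range(2)]
    grand_total = sum(col_totals)
    return {
        genre: tuple(sum(row) * c / grand_total for c in col_totals)
        for genre, row in table.items()
    }


expected = expected_counts(observed)

# Every (genre, column) cell whose expected count falls below the threshold of 5.
low_cells = [
    (genre, j)
    for genre, row in expected.items()
    for j, value in enumerate(row)
    if value < 5
]

print("cells with expected count below 5:", len(low_cells), "of", 2 * len(observed))
for genre in ("ALT", "MNU"):
    print(genre, [round(value, 2) for value in expected[genre]])
```

In R, the same values should be available as the `expected` component of the object returned by `chisq.test`, and `fisher.test` is the function for the exact test.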

Finally,

>> I applied the test on normalized frequencies (which were calculated by dividing the frequency of each genre by the number of words it has, and then multiplying it by 100,000, i.e. 0.1 million) but the chi-square result was the same (same p-value).

Chi-square (and similar) does not work with normalised frequencies – although you will occasionally see this error in the published literature, alas!

Remember what you’re looking at is whether there is enough evidence to reject the null hypothesis. But normalised frequencies are designed to conceal how much evidence there is by putting all frequencies on the same basis. Normalised frequencies conserve ratios, but ratios are not the same thing as amount of evidence. For instance, you’d be a lot more impressed if I said A was 3x more common than B if the underlying figures were 9,000 and 3,000, rather than if they were 90 and 30. Using normalised frequencies in a chi-square test is equivalent to me lying to you about the actual numbers underlying my three-times-more-common claim. 

In particular, if you use words-per-million on any figures that come from a less-than-1 million word corpus, then you are over-claiming (to the test, and to anyone who reads the result of the test) how much evidence you have.  

(You can demonstrate this quite easily using any of the online chi-square calculators: stick in any four numbers and get the chi-square, then redo with an extra 0 on the end of each number. You’ll find the chi-square that results is much higher (lower p) – but those two sets of figures are directly equivalent in terms of normalised frequencies).
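The same demonstration can be run offline. This is a minimal sketch in Python using only the standard library; the four counts are made up for illustration, and the p-value helper is valid only for a 2×2 table (1 degree of freedom).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


counts = [[90, 30], [60, 60]]                             # made-up counts
scaled = [[10 * cell for cell in row] for row in counts]  # "extra 0 on the end"

for label, table in (("original", counts), ("times 10", scaled)):
    stat = chi_square_2x2(table)
    print(label, "chi-square =", round(stat, 2), "p =", p_value_df1(stat))
```

Both tables have identical ratios, and so identical normalised frequencies, yet the statistic for the second is ten times that of the first.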

Quite apart from that, for purposes where normalised frequencies are appropriate, you should consider whether N words is a suitable basis for normalisation, as opposed to (say) total N ditransitives in either form.
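As an illustration of the second basis, here is a minimal Python sketch that expresses each genre's double-object count as a share of that genre's ditransitives. It uses only a few rows copied from the table in the original question; the function name `double_object_share` is made up for this example.

```python
# A few rows from the table in the original question:
# genre -> (double object, to-dative)
observed = {
    "ART": (210, 344),
    "BKS": (335, 308),
    "CLM": (108, 303),
    "OLT": (44, 9),
    "STR": (76, 36),
}


def double_object_share(double_object, to_dative):
    """Proportion of a genre's ditransitives that use the double-object form."""
    return double_object / (double_object + to_dative)


for genre, (dobj, to_dat) in observed.items():
    share = double_object_share(dobj, to_dat)
    print(f"{genre}: {share:.1%} double object out of {dobj + to_dat} ditransitives")
```

These shares are for description only; any significance test would still be run on the raw counts.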

best

Andrew.

From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of True Friend
Sent: 28 June 2010 05:26
To: corpora
Subject: [Corpora-List] Help in Applying Appropriate Statistical Test and Its Interpretation

Good Day to All Corpora Members
I am a master's student in applied linguistics, currently working on my thesis. The topic of my research is the use of ditransitive constructions. To validate the results I want to apply statistical techniques to the research. For example, I am trying to see whether there is a significant difference in the usage of two alternative ditransitive patterns in PWE (Pakistani Written English, the corpus I am working on for the research). The alternative ditransitive patterns here are Double Object (He gave me a pen) and To Dative (He gave a pen to me). I am pasting the table here, which contains genre names and the frequencies of all verbs (used ditransitively) in each genre.


Genre   D. Object   To Dative
ALT       0           4
ART     210         344
BKS     335         308
BLT       2           7
BRU       4           2
CLM     108         303
CST       0           7
DIR       1           7
EDT       8          32
FTW      23          14
INT      38          44
LDS       7          53
LTR      35          92
MGP       2           5
MNF       3           6
MNU       0           1
NLT       7          23
NVL       5           3
NWS      24         108
OLT      44           9
PLC       0           1
PRS      11          22
RPR      19          60
RPT       4          17
SRY       0           7
STR      76          36
THS      20          36
TRN      30          19
WWW      16          30

Some facts about the data are as follows:
The genres are not equal in length (number of words), so one genre such as ALT may have only a few hundred words, while another such as ART has 0.5 million words.
The frequencies here were calculated by adding up the occurrences of all the verbs that occurred in the given genre in a given pattern.
I applied the chi-square test in R with the command "cxx = chisq.test(x, correct = FALSE)" (where 'x' and 'cxx' are R objects), and the result was as follows.
Pearson's Chi-squared test

data:  x 
X-squared = 268.2688, df = 28, p-value < 2.2e-16

Going through the R help manuals, I learned that a p-value of '2.2e-16' is a very small number, so it means that the difference between the two variables (Double Object and To Dative) is significant, as the p-value threshold for social sciences is considered p<0.005. Please correct me if I am misunderstanding the test or its results, or applying it incorrectly. And if this test is not suitable for this kind of analysis, which kind of test should I apply instead? And one last thing: I applied the test on normalized frequencies (which were calculated by dividing the frequency of each genre by the number of words it has, and then multiplying it by 100,000, i.e. 0.1 million) but the chi-square result was the same (same p-value).
Any help and comments would be highly appreciated.
Best Regards 

-- 
Muhammad Shakir Aziz محمد شاکر عزیز
Masters in Applied Linguistics (last semester student)
Translator, Course Developer, Linguist for Urdu, Punjabi and English
Urdu:- http://awaz-e-dost.blogspot.com/
English:- http://linguisticslearner.blogspot.com/
Facebook:- http://www.facebook.com/truefriend2004
Skype:- true_friend2004
