[Corpora-List] Help in Applying Appropriate Statistical Test and Its Interpretation

Angus B. Grieve-Smith grvsmth at panix.com
Mon Jun 28 13:15:45 UTC 2010


True Friend wrote:
> Good Day to All Corpora Members
> I am a masters in applied linguistics student, currently working on my 
> thesis. The topic of research is the use of ditransitive 
> constructions. To authenticate the results I want to apply statistical 
> techniques on the research. For example I am trying to see whether 
> there is a significant difference in the usage of two alternative 
> ditransitive patterns in PWE (Pakistani Written English, the corpus I 
> am working on for the research). The alternative ditransitive patterns 
> here mean Double Object (He gave me a pen) and To Dative (He gave a 
> pen to me). I am pasting the table here, which contains genre names 
> and frequencies of all verbs (used ditransitively) in that genre.
> Genre 	D. Object 	To Dative
> ALT 	0 	4
> ART 	210 	344
>
    First of all, let me applaud your question.  I think too many 
linguists are reluctant to ask about their statistics.  It's important 
for us to know what these things mean and how they work.  At UNM we were 
required to take at least a semester of statistics, and it helped 
tremendously, but I can tell that we just scratched the surface.  I try 
to check all my tests with a statistician to make sure they're 
appropriate.  If your university has a statistics clinic, I strongly 
recommend a visit.

    I agree with what Adam and Thomas wrote, but I'm going to focus on a 
different aspect, relating to the envelope of variation.  Here's a paper 
I wrote about it!

The Envelope of Variation in Multidimensional Register and Genre Analyses
Author: Grieve-Smith, Angus B.
Source: Language and Computers 
<http://www.ingentaconnect.com/content/rodopi/lang;jsessionid=3gf6v67o36w4f.alexandra>, 
Corpus Linguistics Beyond the Word: Corpus Research from Phrase to 
Discourse. Edited by Eileen Fitzpatrick, pp. 21-42(22)
Publisher: Rodopi 
<http://www.ingentaconnect.com/content/rodopi;jsessionid=3gf6v67o36w4f.alexandra>
http://www.ingentaconnect.com/content/rodopi/lang/2006/00000060/00000001/art00003?crawler=true
http://www.grieve-smith.com/Academic/AAACL-grvsmth.060225.pdf

    In this case, correlation tests are not appropriate, because you 
would expect the number of tokens to vary with the total number of words 
in each genre.  Running a correlation test on per-word frequency counts 
is also not appropriate, because the two constructions are alternative 
strategies for doing the same thing, and you would expect them to vary 
inversely with one another.  The writers are describing events where a 
thing is being given to a person (or similar).  The two constructions 
have the same envelope of variation.
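
    As a rough illustration of what counting within the envelope of 
variation could look like, here is a minimal Python sketch.  It uses 
only the two rows of the table quoted above; the names `counts` and 
`double_object_share` are made up for this example.  It computes the 
share of double object tokens among all alternating tokens in each 
genre, which is a descriptive summary and not a significance test.

```python
# Counts copied from the two table rows quoted above,
# stored as (double object, to-dative) per genre.
counts = {
    "ALT": (0, 4),
    "ART": (210, 344),
}


def double_object_share(double_object, to_dative):
    """Share of double object tokens among all alternating tokens.

    The denominator is the envelope of variation (both constructions
    together), not the total number of words in the genre.
    """
    total = double_object + to_dative
    if total == 0:
        return None  # no alternating tokens, so no share is defined
    return double_object / total


shares = {genre: double_object_share(*pair) for genre, pair in counts.items()}

for genre, (double_object, to_dative) in counts.items():
    share = shares[genre]
    label = "n/a" if share is None else f"{share:.3f}"
    print(f"{genre}: double object share = {label} "
          f"(n = {double_object + to_dative})")
```

    Because the to-dative share is simply one minus the double object 
share, the two shares necessarily move in opposite directions, which is 
the inverse relationship described above.  A genre with only four 
alternating tokens, like ALT here, says very little on its own.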

    I think you need a better hypothesis.  It is unlikely that any two 
constructions will occur with comparable frequencies, especially 
constructions that have the same conceptual meaning, so if you find a 
significant difference between them, it doesn't tell you much.  If you 
are breaking it out by 
genre, does that mean that you expect the percentage of ditransitives to 
vary with genre?  In that case, I think you need to figure out which 
genres you would expect to do what, and why.  Then you will have a good 
hypothesis, and you can find a statistical test based on that.

    I hope this helps.

-- 
				-Angus B. Grieve-Smith
				grvsmth at panix.com
