[Corpora-List] Help in Applying Appropriate Statistical Test and Its Interpretation
Angus B. Grieve-Smith
grvsmth at panix.com
Mon Jun 28 13:15:45 UTC 2010
True Friend wrote:
> Good Day to All Corpora Members
> I am a master's student in applied linguistics, currently working on my
> thesis. The topic of research is the use of ditransitive
> constructions. To authenticate the results I want to apply statistical
> techniques on the research. For example I am trying to see whether
> there is a significant difference in the usage of two alternative
> ditransitive patterns in PWE (Pakistani Written English, the corpus I
> am working on for the research). The alternative ditransitive patterns
> here mean Double Object (He gave me a pen) and To Dative (He gave a
> pen to me). I am pasting the table here, which contains genre names
> and frequencies of all verbs (used ditransitively) in that genre.
> Genre   Double Object   To Dative
> ALT     0               4
> ART     210             344
>
First of all, let me applaud your question. I think too many
linguists are reluctant to ask about their statistics. It's important
for us to know what these things mean and how they work. At UNM we were
required to take at least a semester of statistics, and it helped
tremendously, but I can tell that we just scratched the surface. I try
to check all my tests with a statistician to make sure they're
appropriate. If your university has a statistics clinic, I strongly
recommend a visit.
I agree with what Adam and Thomas wrote, but I'm going to focus on a
different aspect, relating to the envelope of variation. Here's a paper
I wrote about it!
The Envelope of Variation in Multidimensional Register and Genre Analyses
Author: Grieve-Smith, Angus B.
Source: Language and Computers
<http://www.ingentaconnect.com/content/rodopi/lang;jsessionid=3gf6v67o36w4f.alexandra>,
Corpus Linguistics Beyond the Word: Corpus Research from Phrase to
Discourse. Edited by Eileen Fitzpatrick, pp. 21-42(22)
Publisher: Rodopi
<http://www.ingentaconnect.com/content/rodopi;jsessionid=3gf6v67o36w4f.alexandra>
http://www.ingentaconnect.com/content/rodopi/lang/2006/00000060/00000001/art00003?crawler=true
http://www.grieve-smith.com/Academic/AAACL-grvsmth.060225.pdf
In this case, correlation tests on the raw counts are not appropriate,
because you would expect the number of tokens of each construction to
rise with the total number of words in each genre, so the two counts
would correlate for that reason alone. Running a correlation test on
per-word frequency counts is also not appropriate, because these are two
different strategies for doing the same thing, and you would expect them
to vary inversely with one another. In both cases the writers are
describing events where a thing is being given to a person (or similar),
so the two constructions have the same envelope of variation.
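To make the envelope of variation concrete, here is a minimal sketch in
Python that expresses each genre's double-object count as a share of all
ditransitive tokens in that genre, rather than as a raw count or a
per-word frequency. The counts are the two rows from the quoted table;
the names counts and double_object_share are only illustrative.

```python
# Counts copied from the quoted table: (double object, to-dative) per genre.
counts = {
    "ALT": (0, 4),
    "ART": (210, 344),
}


def double_object_share(double_object, to_dative):
    """Share of double-object tokens among all ditransitive tokens."""
    total = double_object + to_dative
    if total == 0:
        return None  # the genre has no ditransitive tokens at all
    return double_object / total


shares = {genre: double_object_share(*pair) for genre, pair in counts.items()}

for genre, share in shares.items():
    label = f"{share:.3f}" if share is not None else "n/a"
    print(f"{genre}: double-object share = {label}")
```

Within this envelope the to-dative share is simply one minus the
double-object share, which is why the two constructions are bound to
vary inversely with one another.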
I think you need a better hypothesis. It is unlikely that any two
constructions will occur with comparable frequencies, especially
constructions that have the same conceptual meaning, so finding a
significant difference between them doesn't tell you much. If you are
breaking it out by genre, does that mean that you expect the percentage
of ditransitives of each type to vary with genre? In that case, I think
you need to figure out which genres you would expect to do what, and
why. Then you will have a good hypothesis, and you can find a
statistical test based on that.
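Purely as an illustration of what a test tied to a hypothesis could look
like, suppose the hypothesis were that the double-object share differs
between two particular genres. The sketch below runs a two-sided Fisher
exact test on the two rows of the quoted table. The choice of Fisher's
exact test is an assumption made here for illustration (it tolerates
the very small ALT counts); it is not a recommendation from this thread,
and the function name is hypothetical. It needs Python 3.8 or later for
math.comb.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are genres; columns are (double object, to-dative) counts.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)

    def weight(x):
        # Number of tables with x in the top-left cell, margins held fixed.
        return comb(col1, x) * comb(n - col1, row1 - x)

    observed = weight(a)
    # Sum every table that is no more probable than the observed one.
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(n, row1)


# ALT: 0 double object, 4 to-dative; ART: 210 double object, 344 to-dative.
p_value = fisher_exact_two_sided(0, 4, 210, 344)
print(f"p = {p_value:.3f}")
```

With only four ditransitive tokens in ALT, a test like this has very
little power, which is one more reason to decide in advance which genres
you expect to differ and why.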
I hope this helps.
--
-Angus B. Grieve-Smith
grvsmth at panix.com