[Corpora-List] Help with Biber's MD analysis

Angus B. Grieve-Smith grvsmth at panix.com
Sat Feb 5 19:10:53 UTC 2011


On 2/5/2011 5:14 AM, Andrea Nini wrote:
> 1) is it possible to calculate the factor scores of a new text using 
> the factors that Biber used for his study?

     Absolutely, but see below.

> 2) would it affect the results the fact that my texts have to be 
> normalised to 100 words whereas Biber's texts were normalised to 1000 
> words?

     In principle yes, but see below.

> 3) when calculating the factor scores for my texts, what means should 
> I consider? The ones taken from my dataset or the ones taken from 
> Biber's study?

     The ones from your dataset, absolutely, but...

     Multidimensional analysis is exciting, but it has significant 
problems.  The main one I found is that Biber did not use per-choice 
frequencies (how often a form is chosen out of the contexts where it 
could have occurred); his counts were normalised per 1000 words of 
text, so the co-occurrences he identified could simply be due to 
grammar.  In fact, you could interpret his Dimension 1 as simply "nouns 
vs. verbs" and Dimension 2 as "past vs. present."  I tried to use the 
envelope of variation to counteract this, but I was not successful.  I 
discuss this in more detail in a paper presented at the 2005 AACL:

http://www.grieve-smith.com/Academic/AAACL-grvsmth.060225.pdf
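
     For the factor score questions above, here is a rough sketch of how 
the calculation is usually described: normalise each feature count to a 
rate, standardise each rate against the means and standard deviations of 
your own dataset, then add the z-scores of the features that load 
positively on a factor and subtract those that load negatively.  The 
feature names, counts and the positive/negative split below are made up 
for illustration; they are not Biber's actual feature list or loadings, 
and the function names are my own.

```python
from statistics import mean, stdev


def normalise(count, n_words, per=1000):
    """Rate of a feature per `per` words of running text."""
    return count * per / n_words


def factor_scores(rates, positive, negative):
    """Sum of z-scores for positive-loading features minus the sum for
    negative-loading ones, standardised against this dataset's own means
    and standard deviations.

    rates: {text_id: {feature: normalised rate}}
    """
    features = list(positive) + list(negative)
    mu = {f: mean(r[f] for r in rates.values()) for f in features}
    sd = {f: stdev(r[f] for r in rates.values()) for f in features}

    scores = {}
    for text_id, r in rates.items():
        # A feature with no variation carries no information: z = 0.
        z = {f: (r[f] - mu[f]) / sd[f] if sd[f] else 0.0 for f in features}
        scores[text_id] = (sum(z[f] for f in positive)
                           - sum(z[f] for f in negative))
    return scores


# Hypothetical raw counts for three short texts (illustrative only).
counts = {
    "text_a": {"words": 100, "private_verbs": 4, "contractions": 5,
               "nouns": 15, "prepositions": 8},
    "text_b": {"words": 100, "private_verbs": 2, "contractions": 2,
               "nouns": 22, "prepositions": 11},
    "text_c": {"words": 200, "private_verbs": 1, "contractions": 0,
               "nouns": 60, "prepositions": 30},
}

# Assumed split of features for one factor (not taken from Biber).
positive = ["private_verbs", "contractions"]
negative = ["nouns", "prepositions"]


def rates_per(per):
    """Normalised rates for every text and feature, per `per` words."""
    return {
        text_id: {f: normalise(c[f], c["words"], per)
                  for f in positive + negative}
        for text_id, c in counts.items()
    }


scores = factor_scores(rates_per(100), positive, negative)
for text_id, score in scores.items():
    print(text_id, round(score, 2))
```

     Note that this only covers the arithmetic.  It does nothing about 
the per-choice problem described above, since the rates are still 
counted per word of text rather than per opportunity.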

-- 
				-Angus B. Grieve-Smith
				Saint John's University
				grvsmth at panix.com




