[Corpora-List] Measuring relative collocational strength

Adam Kilgarriff adam at lexmasterclass.com
Thu Oct 14 08:22:37 UTC 2010


Alon,

The problem is that the differences are extremely likely to be statistically
significant, but that does not mean they are linguistically interesting. For
the full explanation see "Language is never ever ever random"
<http://kilgarriff.co.uk/Publications/2005-K-lineer.pdf>,
*Corpus Linguistics and Linguistic Theory* 1 (2): 263-276.
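
To see why significance is close to guaranteed, here is a minimal sketch (not
taken from the paper) using the two-term log-likelihood statistic that is
common in corpus comparison. All counts are made up: a collocate that occurs
110 times per 10,000 occurrences of one node word and 100 times per 10,000 of
the other. Keeping those proportions fixed and only enlarging the sample
multiplies the statistic by the same factor, so a 10% difference that is far
from significant in a small sample passes the conventional 3.84 threshold
(p < 0.05, 1 d.f.) once there is enough data.

```python
import math


def log_likelihood(a, b, c, d):
    """Two-term log-likelihood (G2) for one collocate.

    a, b: hypothetical co-occurrence counts with node word 1 and node word 2
    c, d: total occurrences of node word 1 and node word 2
    """
    expected_a = c * (a + b) / (c + d)
    expected_b = d * (a + b) / (c + d)
    g2 = 0.0
    # A zero observed count contributes nothing (0 * ln 0 is taken as 0).
    if a > 0:
        g2 += a * math.log(a / expected_a)
    if b > 0:
        g2 += b * math.log(b / expected_b)
    return 2 * g2


CRITICAL_05 = 3.84  # chi-squared critical value, 1 degree of freedom, p < 0.05

# Same proportions every time; only the amount of data grows.
for scale in (1, 10, 100):
    g2 = log_likelihood(110 * scale, 100 * scale, 10_000 * scale, 10_000 * scale)
    print(f"sample x{scale:>3}: G2 = {g2:.2f}, significant = {g2 > CRITICAL_05}")
```

The size of the difference never changes in this loop; only the verdict of the
test does, which is why a significance test alone cannot say whether a
difference is noteworthy.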

So you can't get an objective answer to the question 'is the difference
noteworthy' (at least not until we have a far better theory of corpora), but
there are some suggestions for the maths to support your analysis in "Simple
Maths for Keywords"
<http://kilgarriff.co.uk/Publications/2009-K-CLLiverpool-SimpleMaths.doc>
(Proc. Corpus Linguistics, Liverpool 2009).
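
A rough sketch of the kind of calculation that paper suggests: normalise each
frequency to a per-million figure, add a constant n to both, and take the
ratio. Applying it to the collocates of two node words, with each collocate's
frequency normalised by the number of occurrences of its node word, is an
adaptation assumed here for the "potato"/"spud" question rather than something
the paper prescribes. The function names and all counts below are hypothetical.

```python
def per_million(count, total):
    """Normalise a raw count to a frequency per million."""
    return count * 1_000_000 / total


def simple_maths_score(count_a, total_a, count_b, total_b, n=1.0):
    """Ratio of normalised frequencies, with n added to both sides.

    Adding n avoids division by zero and damps ratios that rest on tiny counts.
    """
    return (per_million(count_a, total_a) + n) / (per_million(count_b, total_b) + n)


def rank_collocates(collocates, total_a, total_b, n=1.0):
    """Sort collocates from 'prefers node word A' to 'prefers node word B'."""
    scored = {
        word: simple_maths_score(count_a, total_a, count_b, total_b, n)
        for word, (count_a, count_b) in collocates.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: occurrences of each node word, then per-collocate
# co-occurrence counts as (with "potato", with "spud").
POTATO_TOTAL = 50_000
SPUD_TOTAL = 2_000
COLLOCATES = {
    "baked": (1_200, 30),
    "mashed": (900, 10),
    "peel": (300, 25),
    "humble": (40, 45),
}

for word, score in rank_collocates(COLLOCATES, POTATO_TOTAL, SPUD_TOTAL, n=1000.0):
    print(f"{word:<8} {score:.2f}")
```

Scores well above 1 lean towards the first node word and scores well below 1
towards the second. The choice of n is a judgment call: larger values give less
weight to collocates with very low counts.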

Best

Adam

On 13 October 2010 15:01, Alon Lischinsky <alon.lischinsky at kultmed.umu.se> wrote:

> Hi.
>
> I am looking for help with a kind of statistical measure that has
> probably been described in the literature, but which I don't know what
> to call. I should point out that I'm relatively new to corpus studies,
> having a background in qualitative discourse studies, and am still
> coming to terms with some of the technical lexis.
>
> Simply put, I want to find out, given two terms that are seemingly
> synonymous but different in absolute frequency (say, "potato" and
> "spud"), which (lexical) terms have statistically significant
> differences in their collocation with either. I suppose I could simply
> look at the full list of collocates for each term ordered by t-score
> or MI and spot differences, but since one of the terms is much rarer
> and MI scores are affected by absolute frequency, I guess this might
> lead to quite a few artifacts.
>
> I don't know of any piece of software that can do that, so I would
> appreciate any pointers, or even suggestions as to how to go about
> doing it in R or any other statistical software (my programming skills
> aren't great, but I trust I could manage with a little guidance).
>
> Best,
>
> Alon Lischinsky
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>



-- 
================================================
Adam Kilgarriff
http://www.kilgarriff.co.uk
Lexical Computing Ltd                   http://www.sketchengine.co.uk
Lexicography MasterClass Ltd      http://www.lexmasterclass.com
Universities of Leeds and Sussex       adam at lexmasterclass.com
================================================