[Corpora-List] Distribution of mutual information?

Linas Vepstas linasvepstas at gmail.com
Fri Mar 13 18:28:27 UTC 2009


Hi,

2009/3/12 J Washtell <lec3jrw at leeds.ac.uk>:

> The weight of your plot is slightly to the right of neutral (MI=0)
> because language exhibits positive associative structure,

Yes, of course.

> I have not been able to reproduce your plot

Possibly because I described it incorrectly. This is *not* a scatterplot
of MI(x,y) vs. P(x,y), as I may have suggested earlier. This is a
distribution of MI(x,y) -- i.e. a graph of how likely one is to observe
a particular value of mutual information.
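
To make the distinction concrete, here is a rough sketch of what such a
distribution could look like in code. It is not the code behind the plot;
it assumes pairs of adjacent words, MI(x,y) = log2( P(x,y) / (P(x) P(y)) )
with the marginals taken from the pair counts, and a histogram that counts
each distinct pair once. The names pair_mi, mi_distribution and bin_width
are made up for illustration, and the toy sentence stands in for a corpus.

```python
import math
from collections import Counter

def pair_mi(pairs):
    """Return {(x, y): MI(x, y)} for a list of (left, right) word pairs.

    Assumption: P(x) and P(y) are the left and right marginals of the
    pair counts, and the logarithm is base 2.
    """
    pair_counts = Counter(pairs)
    left_counts = Counter(x for x, _ in pairs)
    right_counts = Counter(y for _, y in pairs)
    total = len(pairs)
    mi = {}
    for (x, y), n_xy in pair_counts.items():
        p_xy = n_xy / total
        p_x = left_counts[x] / total
        p_y = right_counts[y] / total
        mi[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return mi

def mi_distribution(mi_values, bin_width=0.5):
    """Bin MI values; return {bin lower edge: fraction of distinct pairs}."""
    bins = Counter(math.floor(v / bin_width) * bin_width for v in mi_values)
    n = sum(bins.values())
    return {edge: count / n for edge, count in sorted(bins.items())}

# Toy data standing in for a real corpus.
words = "the cat sat on the mat and the dog sat on the log".split()
pairs = list(zip(words, words[1:]))
mi = pair_mi(pairs)
dist = mi_distribution(mi.values())
for edge, frac in dist.items():
    print(f"{edge:5.1f}  {frac:.3f}")
```

The x axis is the MI bin and the y axis is the fraction of pairs landing
in that bin; P(x,y) enters only through the MI value itself. Weighting
each pair by its frequency instead would give a different curve.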

I am preparing scatterplots now, and will update the blog post when
these are ready.

> my best guess is [...]

Pondering.

One guess is that the high-MI peaks off to the right are word pairs that
are heavily used in one text but don't occur in any of the others.
Dunno.

My question was at least partly a statistics question: flipping
through textbooks, I simply can't find a distribution with log-linear
slopes.

--linas

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


