[Corpora-List] SVD on high-dimension data
Christopher Manning
manning at stanford.edu
Tue Mar 6 16:00:25 UTC 2007
Actually, InfoMap itself uses SVDPACKC internally.
The short answer is that you restrict the context vectors to a much
smaller set of dimensions, so in practice you run the SVD on something
like a 1 million by 5000 matrix rather than 1 million by 1 million.
Chris.
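
A minimal sketch of the restriction described above, on toy data. The
function names (restrict_contexts, top_singular_value), the dict-of-dicts
layout, and the choice of keeping the k context terms with the highest
total count are all assumptions made for illustration; the message does
not say how the restricted space is chosen. Power iteration on the small
k-by-k Gram matrix stands in for a real sparse solver such as SVDPACKC,
and it recovers only the leading singular value.

```python
import math
from collections import Counter


def restrict_contexts(cooc, k):
    """Keep only the k context columns with the largest total count.

    cooc maps term -> {context_term: count} (hypothetical layout).
    Returns (rows, kept): rows maps term -> list of k counts,
    kept lists the retained context terms in column order.
    """
    totals = Counter()
    for contexts in cooc.values():
        totals.update(contexts)  # adds the counts per context term
    # Sort by descending total, breaking ties alphabetically.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = [term for term, _ in ranked[:k]]
    rows = {t: [ctx.get(c, 0) for c in kept] for t, ctx in cooc.items()}
    return rows, kept


def top_singular_value(rows, k, iters=200):
    """Leading singular value of the restricted matrix A (n rows, k columns).

    Works on the Gram matrix G = A^T A, which is only k by k no matter
    how many rows A has. Illustration only, not a production method.
    """
    gram = [[0.0] * k for _ in range(k)]
    for vec in rows.values():
        for i, vi in enumerate(vec):
            if vi:
                for j, vj in enumerate(vec):
                    gram[i][j] += vi * vj

    def apply(v):
        return [sum(gram[i][j] * v[j] for j in range(k)) for i in range(k)]

    v = [1.0] * k
    for _ in range(iters):
        w = apply(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0  # all-zero matrix
        v = [x / norm for x in w]
    # Rayleigh quotient gives the top eigenvalue of G; its square root
    # is the top singular value of A.
    w = apply(v)
    return math.sqrt(sum(vi * wi for vi, wi in zip(v, w)))


# Toy term-term counts: three context terms, restricted to one.
cooc = {"a": {"x": 3, "z": 1}, "b": {"x": 4}}
rows, kept = restrict_contexts(cooc, 1)
sigma = top_singular_value(rows, len(kept))
```

At real scale the restricted matrix would stay in a sparse format and be
handed to a Lanczos-type solver; the point here is only that the cost of
the decomposition is driven by the few thousand retained columns, not by
the full vocabulary.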
On Mar 6, 2007, at 7:38 AM, David Reitter wrote:
> Jamie,
>
> On 6 Mar 2007, at 14:59, Jamie Smith wrote:
>
>> I have large (1 million by 1 million) term-term matrices. What SVD
>> packages work with such massive datasets? I have tried Matlab and
>> SVDPACKC without much success.
>
> Have a look at Infomap,
>
> http://infomap-nlp.sourceforge.net/
> http://infomap.stanford.edu/
>
> we've used it successfully on the Aquaint and DUC2005 data (100+
> million words).
>
>
> --
> David Reitter
> ICCS/HCRC, Informatics, University of Edinburgh
> http://www.david-reitter.com