[Corpora-List] SVD on high-dimension data

Christopher Manning manning at stanford.edu
Tue Mar 6 16:00:25 UTC 2007


But actually Infomap uses SVDPACKC internally...

The short answer is that you restrict the context vectors to a much
smaller set of columns, so in practice you run the SVD on something
like a 1 million by 5000 matrix.

Chris.
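
[Editorial sketch, not part of the original message.] The idea above can be illustrated on a toy scale: keep only a limited set of context columns, then compute a truncated SVD of the resulting tall sparse matrix. The message does not say how the columns are chosen, so picking the highest-count columns is purely an assumption here, as are the helper names `restrict_columns` and `reduced_svd`, the toy sizes, and the use of SciPy's `svds` in place of SVDPACKC.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds


def restrict_columns(cooc, n_columns):
    """Keep the n_columns context columns with the largest total count.

    Assumption: selection by column total is only one plausible way to
    restrict the context space; the original message does not specify one.
    """
    totals = np.asarray(cooc.sum(axis=0)).ravel()
    keep = np.argsort(totals)[::-1][:n_columns]
    return cooc.tocsc()[:, keep]


def reduced_svd(matrix, k):
    """Truncated SVD of a sparse matrix, singular values sorted descending."""
    u, s, vt = svds(matrix, k=k)
    order = np.argsort(s)[::-1]  # svds does not guarantee descending order
    return u[:, order], s[order], vt[order, :]


# Toy stand-in for a 1 million by 1 million term-term matrix.
cooc = sparse_random(2000, 2000, density=0.01, format="csr",
                     random_state=0, dtype=np.float64)

# Toy stand-in for the 1 million by 5000 restricted matrix.
restricted = restrict_columns(cooc, n_columns=50)

u, s, vt = reduced_svd(restricted, k=10)
term_vectors = u * s  # one 10-dimensional vector per term (row)
print(restricted.shape, term_vectors.shape)
```

Because only the k largest singular triplets are computed and the matrix stays sparse, memory use is driven by the number of nonzeros and by k rather than by the full dense dimensions.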


On Mar 6, 2007, at 7:38 AM, David Reitter wrote:

> Jamie,
>
> On 6 Mar 2007, at 14:59, Jamie Smith wrote:
>
>> I have large (1 million by 1 million) term-term matrices. What SVD
>> packages work with such massive datasets? I have tried Matlab and
>> SVDPACKC without much success.
>
> Have a look at Infomap,
>
> http://infomap-nlp.sourceforge.net/
> http://infomap.stanford.edu/
>
> we've used it successfully on the Aquaint and DUC2005 data (100+
> million words).
>
>
> --
> David Reitter
> ICCS/HCRC, Informatics, University of Edinburgh
> http://www.david-reitter.com


