[Corpora-List] spectral clustering

Yannick Versley versley at sfs.uni-tuebingen.de
Mon Sep 3 14:04:38 UTC 2007


Hi,

I've got code sitting around that wraps SVDPACKC's las2 routine (which computes 
the SVD using the single-vector Lanczos method) in Python; I have only debugged 
it on small datasets, but if this is something you could use and you want to 
try it out, just drop me a line.
The advantage over NumPy's SVD/eigenvector functions is that the code can work 
with sparse matrices: you just need to pass functions that return
y=A^t*A*x (opa) and y=A*x (opb), respectively.
For this, you could use SciPy's sparse matrix support, or PySparse, or any 
other representation you want.
(It should be possible to remove the code that computes the vectors of the U 
matrix entirely, leaving simpler code that just computes eigenvectors of a 
symmetric matrix, but that would be a bit of work).
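
To make the callback idea concrete, here is a minimal sketch of the two 
matrix-vector products on a SciPy sparse matrix. It is not the las2 wrapper 
itself: as a stand-in solver it assumes SciPy's ARPACK-based eigsh, and the 
names make_callbacks, a_times, ata_times and top_singular_triplets are 
made up for illustration. It also shows the symmetric shortcut: the right 
singular vectors are eigenvectors of A^t*A, and U can be recovered from them 
afterwards if it is needed at all.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, eigsh


def make_callbacks(A):
    """Return functions computing y = A*x and y = A^t*A*x for a sparse A.

    Hypothetical helper: the names are illustrative, not the wrapper's API.
    """
    A = sp.csr_matrix(A, dtype=float)
    At = A.T.tocsr()

    def a_times(x):
        # y = A * x
        return A @ x

    def ata_times(x):
        # y = A^t * (A * x); the product A^t * A is never formed explicitly
        return At @ (A @ x)

    return a_times, ata_times


def top_singular_triplets(A, k):
    """Approximate the k largest singular triplets using only the callbacks.

    Assumes k < number of columns and that the k largest singular values
    are nonzero (U is obtained by dividing by them).
    """
    a_times, ata_times = make_callbacks(A)
    n = A.shape[1]
    # Present A^t*A to the eigensolver as a symmetric n x n operator.
    op = LinearOperator((n, n), matvec=ata_times, dtype=float)
    eigvals, V = eigsh(op, k=k, which="LM")
    # Sort eigenpairs so the largest eigenvalue comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]
    # Singular values are the square roots of the eigenvalues of A^t*A.
    s = np.sqrt(np.clip(eigvals, 0.0, None))
    # Left singular vectors: u_i = A * v_i / s_i.
    U = np.column_stack([a_times(V[:, i]) / s[i] for i in range(k)])
    return U, s, V
```

For spectral clustering on a symmetric affinity matrix you would only need 
the eigsh step, with a callback that multiplies by the affinity matrix 
directly.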

Best regards,
Yannick

> We just used the Heigenvectors function from
> 
> http://numpy.scipy.org/numpydoc/numpy-18.html#pgfId-306314
> 
> This particular array interface for Python is getting a bit aged. We found
> it adequate for smallish (50-100 item) datasets, but had less success
> with larger collections.
> 
> There is a huge body of work on numerical linear algebra. I'd be interested
> in hearing how you do with this technology and what you finish up doing.
> 
> Chris
> 
> 
> On 31/08/2007, Marco Baroni <marco.baroni at unitn.it> wrote:
> >
> > Dear All,
> >
> > Does anybody know of existing tools to perform spectral clustering (as
> > described, e.g., in Brew / Schulte im Walde: Spectral clustering for
> > German verbs, EMNLP 2002)? [I guess generating the affinity matrix and
> > using a standard clustering algorithm on the eigenvector matrix is
> > easy, so what I'm really asking for is a tool to perform the spectral
> > decomposition...]
> >
> > Thanks.
> >
> > Regards,
> >
> > Marco
> >
> > --
> > Marco Baroni
> > CIMeC, University of Trento
> > http://www.form.unitn.it/~baroni
> >
> >
> > _______________________________________________
> > Corpora mailing list
> > Corpora at uib.no
> > http://mailman.uib.no/listinfo/corpora
> >
>
-- 
Yannick Versley
Seminar für Sprachwissenschaft, Abt. Computerlinguistik
Wilhelmstr. 19, 72074 Tübingen
Tel.: (07071) 29 77352



