[Corpora-List] Request for help concerning a LSA problem
Evgeniy Gabrilovich
gabr at cs.technion.ac.il
Fri May 5 14:59:08 UTC 2006
Dear Cecilie D. Widsteen,
I'm not familiar with the Jama Matrix Package, but recently
I conducted a search for existing implementations of LSA,
so I thought you might find these useful:
1) Text to Matrix Generator (TMG) - Matlab toolbox
http://scgroup.hpclab.ceid.upatras.gr/scgroup/Projects/TMG/
2) A package for the R Project for Statistical Computing
http://cran.r-project.org/src/contrib/Descriptions/lsa.html
3) General Text Parser (GTP) - C++ code
http://www.cs.utk.edu/~lsi/gtp-request.html
4) Links to additional LSA-related software are available at
http://www.cs.utk.edu/~lsi/soft.html
Regards,
Evgeniy.
--
Evgeniy Gabrilovich
Ph.D. student in Computer Science
Department of Computer Science, Technion - Israel Institute of Technology
Technion City, Haifa 32000, Israel
Email: gabr at cs.technion.ac.il WWW: http://www.cs.technion.ac.il/~gabr
Phone: +972-4-8294948
> -----Original Message-----
> From: owner-corpora at lists.uib.no
> [mailto:owner-corpora at lists.uib.no] On Behalf Of Cecilie
> Desiree Widsteen
> Sent: Thursday, May 04, 2006 10:29
> To: corpora at uib.no
> Subject: [Corpora-List] Request for help concerning a LSA problem
>
> Hello all,
>
> I'm currently trying to implement Latent Semantic Analysis as part of
> an automatic classification system. I'm programming in Java, and using
> the Jama Matrix package for the matrix stuff. I have stumbled over some
> strange problems, and would be grateful if anyone on this list could
> offer some help.
> My problem is this: I have implemented a class which builds a matrix
> representation of a corpus and performs SVD on the term-by-document
> matrix. Most of the operations are done by the Jama class "Matrix".
> This works fine, except that when I ran the program over various small
> test corpora (for instance, the one from Chapter 15 of Schütze and
> Manning's book Foundations of Statistical NLP), most of the right and
> left singular vectors contained the correct values but with
> wrong/reversed sign?! E.g. a vector that should have the values
> [-0.75,-0.28,-0.20, ...] is assigned the values [0.75,0.28, ...].
> Unfortunately, I have limited experience with linear algebra and the
> like, so now I find myself completely at a loss in debugging this...
> As far as I can understand, this means that my vectors are pointing in
> the opposite direction from the one they should, but why this is
> escapes my understanding :)
> Any help, hints, tricks and the like are extremely welcome! I can also
> send over the source code on request.
>
> Regards,
> --
> Cecilie D. Widsteen
> Department of Linguistics
> University of Oslo
>
>
>
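On the sign question quoted above: the singular vectors of an SVD are
only determined up to sign. If A = U S V^T, negating the k-th column of U
together with the k-th column of V gives another equally valid
decomposition, so a library and a textbook can both be correct and still
report the same vector with opposite signs. The sketch below, in plain
Java without Jama, illustrates this on a small hand-made example and
shows one possible way to make signs comparable. The class and method
names (SvdSignDemo, flipPair, normalizeSigns) and the 2x2 values are
hypothetical, chosen only for illustration, and the "largest entry
positive" rule is an assumed convention, not one taken from Jama or from
the book.

```java
final class SvdSignDemo {

    // Rebuilds A = U * diag(s) * V^T. Columns of u and v hold the
    // left and right singular vectors respectively.
    static double[][] reconstruct(double[][] u, double[] s, double[][] v) {
        int m = u.length, n = v.length;
        double[][] a = new double[m][n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < s.length; k++)
                    a[i][j] += u[i][k] * s[k] * v[j][k];
        return a;
    }

    // Negates the k-th left and right singular vectors together.
    // The product u_k * s_k * v_k^T is unchanged, so A is unchanged.
    static void flipPair(double[][] u, double[][] v, int k) {
        for (double[] row : u) row[k] = -row[k];
        for (double[] row : v) row[k] = -row[k];
    }

    // Assumed convention: make the largest-magnitude entry of each
    // left singular vector positive, flipping the paired right vector too.
    static void normalizeSigns(double[][] u, double[][] v) {
        int rank = u[0].length;
        for (int k = 0; k < rank; k++) {
            int best = 0;
            for (int i = 1; i < u.length; i++)
                if (Math.abs(u[i][k]) > Math.abs(u[best][k])) best = i;
            if (u[best][k] < 0) flipPair(u, v, k);
        }
    }

    // Largest absolute element-wise difference between two matrices.
    static double maxAbsDiff(double[][] a, double[][] b) {
        double max = 0.0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[i].length; j++)
                max = Math.max(max, Math.abs(a[i][j] - b[i][j]));
        return max;
    }

    public static void main(String[] args) {
        // Hand-made orthonormal factors (illustrative values only).
        double[][] u = {{0.6, -0.8}, {0.8, 0.6}};
        double[] s = {2.0, 1.0};
        double[][] v = {{0.8, 0.6}, {-0.6, 0.8}};

        double[][] before = reconstruct(u, s, v);
        flipPair(u, v, 0); // same kind of sign reversal as in the question
        double[][] after = reconstruct(u, s, v);
        System.out.println("max difference after flip: " + maxAbsDiff(before, after));
    }
}
```

When comparing against published values, it is usually enough to apply
the same normalization to both sets of vectors, or to compare them up to
sign. With Jama, the arrays could presumably be taken from the
decomposition via getU().getArray() and getV().getArray().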