[Corpora-List] SuperMatrix -- a General Tool for Distributional Semantic Analysis of Corpora

Bartosz Broda bartosz.broda at pwr.wroc.pl
Wed Sep 19 14:35:56 UTC 2012


Dear Corpora Members,

We are happy to announce the release of SuperMatrix under the GNU GPL
licence. SuperMatrix is a collection of tools and scripts designed for
measuring semantic relatedness between words. It implements several
algorithms for collecting corpus frequencies, transforming feature
values and calculating the semantic relatedness of words on the basis
of their feature vectors. SuperMatrix is a language-independent tool
that has so far been applied to Polish, English and Slovene. It can be
linked relatively easily with other tools, e.g. clustering tools. A
detailed description of the tool can be found in the following papers:

Broda, Bartosz, Maciej Piasecki. 2008. SuperMatrix: a General Tool for
Lexical Semantic Knowledge Acquisition. In Speech and Language
Technology, 239-254. Polish Phonetics Association.
http://nlp.pwr.wroc.pl/en/bartosz-broda/64/show/publication

Broda, Bartosz, Maciej Piasecki. 2011. Parallel, Massive Processing in
SuperMatrix -- a General Tool for Distributional Semantic Analysis of
Corpora. International Journal of Data Mining, Modelling and
Management.
http://nlp.pwr.wroc.pl/en/bartosz-broda/80/show/publication
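
For readers new to the approach, the three stages named above
(collecting corpus frequencies, transforming feature values,
calculating relatedness from feature vectors) can be illustrated with a
minimal Python sketch. This is not SuperMatrix code and does not use
its interface: the function names, the window-based context, the
positive PMI weighting and the cosine measure are all assumptions
chosen for illustration, and the corpus is a toy one.

```python
import math
from collections import Counter, defaultdict


def collect_frequencies(sentences, window=2):
    """Count co-occurrences of each word with words up to `window` tokens away."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


def ppmi_transform(counts):
    """Replace raw counts with positive pointwise mutual information (base 2)."""
    total = sum(sum(row.values()) for row in counts.values())
    row_sums = {w: sum(row.values()) for w, row in counts.items()}
    col_sums = Counter()
    for row in counts.values():
        col_sums.update(row)
    weighted = {}
    for w, row in counts.items():
        weighted[w] = {}
        for f, n in row.items():
            pmi = math.log2(n * total / (row_sums[w] * col_sums[f]))
            if pmi > 0:  # keep only positive associations
                weighted[w][f] = pmi
    return weighted


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x * v[f] for f, x in u.items() if f in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical data, for illustration only).
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]
vectors = ppmi_transform(collect_frequencies(corpus, window=2))
print(cosine(vectors["cat"], vectors["dog"]))
```

Sparse dictionaries are used here only to keep the sketch short; a
real matrix over a large corpus needs a more compact representation.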

You can download the tool with:
git clone http://nlp.pwr.wroc.pl/supermatrix.git

After cloning the repository you will find a brief usage manual in
doc/manual (in LaTeX). It should help with compiling the system and
running it for the first time.

Best regards,
  Bartosz Broda

G4.19 Research Group
Institute of Informatics
Faculty of Computer Science and Management
Wroclaw University of Technology, Poland
