[Corpora-List] SEMILAR Corpus

Vasile Rus (vrus) vrus at memphis.edu
Thu Sep 13 13:21:42 UTC 2012


Dear Colleagues,

We are pleased to announce the release of the SEMILAR corpus, which is part of the larger SEMILAR Toolkit project. The corpus can be downloaded from http://www.semanticsimilarity.org.

The SEMILAR corpus provides qualitative judgments of similarity between words in two texts. The semantic relationship between the larger texts is also annotated. Word-level judgments were made based on two types of word pairings, greedy and optimal. The corpus could be used to further the understanding of word-to-word semantic similarity methods and their role in assessing the semantic relationships between larger texts. It can also be used as an alignment corpus or for other purposes such as the evaluation of Machine Translation performance metrics that compare automatically translated texts to human translations.
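To illustrate the difference between the two pairing types, here is a minimal Python sketch. It is not the procedure used to build the corpus. It assumes that "greedy" means each word in the first text is paired with its best-scoring word in the second text (so a word may be reused), and that "optimal" means a one-to-one pairing that maximizes the total similarity. The function names and the character-overlap similarity measure are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def toy_similarity(a, b):
    """Hypothetical word similarity: Jaccard overlap of the words' character sets."""
    sa, sb = set(a.lower()), set(b.lower())
    return len(sa & sb) / len(sa | sb)


def greedy_pairing(words_a, words_b, sim=toy_similarity):
    """Pair each word in words_a with its best match in words_b.

    Assumption: a word in words_b may be chosen more than once.
    """
    return [(a, max(words_b, key=lambda b: sim(a, b))) for a in words_a]


def optimal_pairing(words_a, words_b, sim=toy_similarity):
    """One-to-one pairing that maximizes the summed similarity.

    Brute force over all assignments, so only suitable for short texts.
    Assumes len(words_a) <= len(words_b).
    """
    best = max(
        permutations(words_b, len(words_a)),
        key=lambda perm: sum(sim(a, b) for a, b in zip(words_a, perm)),
    )
    return list(zip(words_a, best))


text_a = ["cat", "cart"]
text_b = ["cat", "dog"]
print(greedy_pairing(text_a, text_b))   # both words pair with "cat"
print(optimal_pairing(text_a, text_b))  # "cart" is forced onto "dog"
```

For longer texts, the brute-force search would normally be replaced by an assignment solver such as the Hungarian algorithm (for example, scipy.optimize.linear_sum_assignment).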

We are also making available the GUI-based tool that we used to annotate the corpus. SEMILAT (SEMantic simILarity Annotation Tool) can be downloaded from the SEMILAR Toolkit project site: http://www.semanticsimilarity.org.

The goal of the larger SEMantic simILARity (SEMILAR; pronounced the same way as the word 'similar') software toolkit project is to promote productive, systematic, rigorous, and fair research advancements in the area of semantic similarity. The SEMILAR software toolkit offers users, researchers, and developers easy access to fully implemented semantic similarity methods through both a GUI and a library. The SEMILAR Toolkit will be released publicly for research purposes in the near future. More details about the SEMILAR project can be found at http://www.semanticsimilarity.org.

Truly,

Vasile Rus, PhD
Associate Professor (CS/IIS)
Department of Computer Science (CS)
Institute for Intelligent Systems (IIS)
Systems Testing Research Fellow of The Fedex Institute of Technology

The University of Memphis
Memphis, TN 38152
USA

WWW: http://www.cs.memphis.edu/~vrus/

