[Corpora-List] release of sar-graph 1.0
Feiyu Xu
feiyu at dfki.de
Sun Jul 13 12:11:22 UTC 2014
=============================
RESOURCE ANNOUNCEMENT
RELEASE OF sar-graph 1.0
http://sargraph.dfki.de
=============================
The following resource will be released on META-SHARE and is already available as a pre-release at http://sargraph.dfki.de.
A sar-graph is a graph containing linguistic knowledge at the syntactic and lexical-semantic levels for a given language and target relation. A sar-graph for a target relation assembles many linguistic patterns that are used in texts to mention this relation. The term "semantically associated relations graph" was chosen because the patterns may express the target relation either directly or by expressing a semantically associated relation. The nodes in a sar-graph are either semantic arguments of the target relation or content words (more exactly, their word senses) needed to express or recognize an instance of the target relation. The nodes are connected by two kinds of edges: syntactic dependency relations and lexical-semantic relations. Accordingly, the edges are labelled either with dependency tags provided by a parser or with lexical-semantic relation tags. A definition can be found in (Uszkoreit and Xu, 2013). The individual patterns are assembled into one graph per target relation so that mentions gathered across sentences can be combined more easily, but all patterns can also be employed individually.
From Strings to Things
SAR-Graphs: A New Type of Resource for Connecting Knowledge and Language
Hans Uszkoreit and Feiyu Xu (2013)
In Proceedings of the 1st International Workshop on NLP and DBpedia (NLP&DBpedia), CEUR Workshop Proceedings, volume 1064, Sydney, NSW, Australia, October 2013
The current sar-graph version 1.0 contains syntactic dependency relations between content words. In future versions, we will integrate lexical semantic relations between word senses.
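The general structure defined above, with two kinds of nodes and two kinds of edges, can be sketched as a small data model. This is only an illustration in Python: all class and field names are hypothetical and are not taken from the released Java API or XML schema, and the example relation, the dependency tags and the "synonym" edge are invented for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    ARGUMENT = "argument"  # semantic argument slot of the target relation
    WORD = "word"          # content word (ideally a word sense)


class EdgeKind(Enum):
    DEPENDENCY = "dependency"              # tag assigned by a dependency parser
    LEXICAL_SEMANTIC = "lexical_semantic"  # planned for future versions


@dataclass(frozen=True)
class Node:
    kind: NodeKind
    label: str


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    kind: EdgeKind
    label: str


@dataclass
class SarGraph:
    """Hypothetical container: one graph per language and target relation."""
    relation: str
    language: str
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add_edge(self, source, target, kind, label):
        # Adding an edge also registers both of its end points as nodes.
        self.nodes.update((source, target))
        self.edges.add(Edge(source, target, kind, label))

    def edges_of_kind(self, kind):
        return [e for e in self.edges if e.kind == kind]


# Invented example data (not taken from the released resource).
graph = SarGraph(relation="marriage", language="en")
person_1 = Node(NodeKind.ARGUMENT, "Person_1")
person_2 = Node(NodeKind.ARGUMENT, "Person_2")
marry = Node(NodeKind.WORD, "marry")
wed = Node(NodeKind.WORD, "wed")
graph.add_edge(marry, person_1, EdgeKind.DEPENDENCY, "nsubj")
graph.add_edge(marry, person_2, EdgeKind.DEPENDENCY, "dobj")
graph.add_edge(marry, wed, EdgeKind.LEXICAL_SEMANTIC, "synonym")
```

A graph restricted to what version 1.0 contains would correspond to graph.edges_of_kind(EdgeKind.DEPENDENCY) in this sketch.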
In the current version, the patterns have been learned automatically by Web-DARE (Krause et al., 2012), the web-scale version of the relation extraction system DARE (Xu et al., 2007), from dependency structures obtained by parsing sentential mentions of the target relation. DARE patterns contain the content words that signal the mentioned (semantically associated) relation and the syntactic dependencies that combine these words and link them with the phrases representing the arguments of the target relation. Thus, a sar-graph is composed of syntactic dependency graphs. Their edges denote dependency relations, and each edge is labelled with the tag the parser has assigned to the dependency. Vertices come in two flavors. Vertices of the first type denote regular nodes in a dependency structure and are labelled with a word. Vertices of the second type represent the slots for the arguments of the target relation; instead of a word, they are labelled with the name of the argument, e.g. Person_1. Several dependency parsers have been employed, but the current set of sar-graphs is built from the parsing results of the MALT parser.
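To make the assembly of individual patterns into one graph more concrete, here is a minimal Python sketch. It assumes that each pattern is given as a list of (head, dependency tag, dependent) triples and that vertices with identical labels are merged; this is a simplification for illustration, not the actual Web-DARE procedure. The pattern identifiers, words and dependency tags are invented and do not necessarily match the MALT tag set.

```python
from collections import defaultdict

# Hypothetical patterns: lists of (head, dependency_tag, dependent) triples.
patterns = {
    "p1": [("marry", "nsubj", "Person_1"), ("marry", "dobj", "Person_2")],
    "p2": [("wife", "poss", "Person_1"), ("wife", "appos", "Person_2")],
    "p3": [("marry", "nsubj", "Person_1"), ("marry", "dobj", "Person_2"),
           ("marry", "prep_in", "Date")],
}

# Labels that stand for argument slots rather than words (assumed names).
ARGUMENT_SLOTS = {"Person_1", "Person_2", "Date"}


def assemble(patterns):
    """Merge all patterns into one edge set and record, for every edge,
    which patterns contain it, so each pattern stays recoverable."""
    edge_support = defaultdict(set)
    for pattern_id, triples in patterns.items():
        for triple in triples:
            edge_support[triple].add(pattern_id)
    return edge_support


def vertices(edge_support):
    """Split vertex labels into word vertices and argument-slot vertices."""
    word_vertices, argument_vertices = set(), set()
    for head, _tag, dependent in edge_support:
        for label in (head, dependent):
            if label in ARGUMENT_SLOTS:
                argument_vertices.add(label)
            else:
                word_vertices.add(label)
    return word_vertices, argument_vertices


graph = assemble(patterns)
words, arguments = vertices(graph)
```

Keeping the set of supporting pattern identifiers on every edge reflects the point made above that the patterns are assembled in one graph but can still be employed individually.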
Applications of sar-graphs include information extraction, question answering and summarisation.
The resource might also be useful for research on paraphrases, textual entailment and syntactic variation within a language.
Release 1.0 has the following properties:
Language: English
Number of target relations: 25
Arity of relations: n-ary relations (2≤n≤5)
Domains of relations: biographic information, corporations, awards
Format of patterns: DARE patterns in lemon format and in a specific XML schema (DTD provided)
Format of sar-graphs: specific XML schema (DTD provided)
APIs: Java API for reading and storing patterns and sar-graphs,
Java API for various use cases: retrieving and searching vertex and edge information of a DARE pattern or a sar-graph,
Java API for pattern visualization
Download is available at: http://sargraph.dfki.de/download.html
More statistics are available at: http://sargraph.dfki.de/statistics.html
More references can be found at: http://sargraph.dfki.de/publications.html
Feedback via email: sargraph at dfki.de
Sar-graphs were conceived and defined at DFKI LT-Lab Berlin and then realized in a collaboration between DFKI LT-Lab and the BabelNet group at Sapienza University of Rome.
The development of sar-graphs is partially supported by
• the German Federal Ministry of Education and Research (BMBF) through the project Deependance (contract 01IW11003)
• the project LUcKY, a Google Focused Research Award in the area of Natural Language Understanding.
Enjoy!
Feiyu Xu
----------------------------------
Dr. Feiyu Xu
Senior Researcher
Project Leader
DFKI Projektbüro Berlin
Alt Moabit 91c
D-10559 Berlin
Germany
Phone +49-30-23895-1812
Sek +49-30-23895-1800
Fax +49-30-23895-1810
E-mail: feiyu at dfki.de
homepage: http://www.dfki.de/~feiyu
------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
------------------------------------------------------------