[Corpora-List] ACL Anthology Searchbench: new release
Ulrich Schaefer
ulrich.schaefer at dfki.de
Thu Feb 9 19:27:52 UTC 2012
Dear colleagues,
we are happy to announce a new release of the ACL Anthology Searchbench
<http://aclasb.dfki.de>, a public service that combines
sentence-semantic, full-text and bibliographic search in the ACL
Anthology (http://take.dfki.de/#Systems).
*New highlights* (your feedback via the button at the bottom left of the
Searchbench start page is appreciated!):
The Searchbench now indexes over 22,500 CL & LT papers, including the
previously missing journal articles and conferences from 2011, past LREC
proceedings from 2000--2010, and many more.
From now on, we'll be able to update the index shortly after new papers
have been added to the ACL Anthology.
*Graphical citation browser*.
In the Citations tab in the Searchbench's document view, there is now a
graphical citation browser (sample link
<http://aclasb.dfki.de/CitationBrowser.html#id=W11-2927>, full HD screen
recommended ;-) ). It uses ParsCit, ACL Anthology Network data and
sentence information from the Searchbench. You can click on the labeled
edges or right-click the document nodes to see the citation
sentences in context and highlight them in the PDF. A tentative link to
external public scientific search services is generated in case a cited
paper or book is not in the Anthology.
*Bibliographic metadata*.
At the same place (from the Citations tab), you can also inspect and
copy bibliographic metadata for each Anthology paper
- in rich text (roughly ACL citation style), and
- as BibTeX with mostly correct LaTeX character encoding
(example <http://aclasb.dfki.de/nlp/bib/J11-3002>). Because BibTeX is
missing for many papers in the Anthology, we generated it from the
Anthology index files.
Page numbers were taken automatically from the paper layout where
possible, e.g. for many CL journal articles.
We are collaborating with the other groups working on the Anthology and
hope to be able to provide even more complete and corrected metadata later.
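The two sample links above suggest a simple per-paper URL scheme. The following is only a minimal sketch that assumes the pattern generalizes from those two samples to other papers; the function names and the ID check (one letter, two digits, a hyphen, four digits, as in J11-3002 and W11-2927) are illustrative assumptions, not a documented Searchbench interface.

```python
import re

BASE_URL = "http://aclasb.dfki.de"

# Assumption: IDs follow the shape of the two samples in this message,
# e.g. "J11-3002" or "W11-2927".
ANTHOLOGY_ID = re.compile(r"^[A-Z]\d{2}-\d{4}$")


def _check_id(paper_id: str) -> str:
    """Return the ID unchanged, or raise ValueError if it has another shape."""
    if not ANTHOLOGY_ID.match(paper_id):
        raise ValueError(f"unexpected Anthology ID: {paper_id!r}")
    return paper_id


def bibtex_url(paper_id: str) -> str:
    """Build a BibTeX link following the sample link for J11-3002."""
    return f"{BASE_URL}/nlp/bib/{_check_id(paper_id)}"


def citation_browser_url(paper_id: str) -> str:
    """Build a citation browser link following the sample link for W11-2927."""
    return f"{BASE_URL}/CitationBrowser.html#id={_check_id(paper_id)}"


if __name__ == "__main__":
    print(bibtex_url("J11-3002"))
    print(citation_browser_url("W11-2927"))
```

For the two sample IDs this reproduces exactly the links given above; whether every indexed paper is reachable this way is not confirmed here.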
*Online glossary extraction*.
You can use the Searchbench as an online *glossary extraction tool*.
Simply try a semantic statements query of the form s:<term> p:is
(example: dependency parsing
<http://aclasb.dfki.de/#stm%7EsNC%7Cs%3Adependency%20parsing%20p%3Ais%2A>).
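To try the same query for other terms, a link like the sample above can be assembled programmatically. This is a rough sketch based only on that one sample link: the decoded fragment reads "stm~sNC|s:dependency parsing p:is*", so the "stm~sNC|" prefix and the trailing "*" are copied from it without knowing their exact meaning, and the function name is hypothetical.

```python
from urllib.parse import quote, unquote

BASE_URL = "http://aclasb.dfki.de/"

# Copied verbatim from the decoded sample link; its semantics are an assumption.
FRAGMENT_PREFIX = "stm~sNC|"


def glossary_query_url(term: str) -> str:
    """Build a Searchbench link for the statement query s:<term> p:is*."""
    statement = f"s:{term} p:is*"
    # Percent-encode everything that is not an unreserved character.
    return BASE_URL + "#" + quote(FRAGMENT_PREFIX + statement, safe="")


if __name__ == "__main__":
    url = glossary_query_url("dependency parsing")
    print(url)
    # The decoded fragment shows the query as a person would read it.
    print(unquote(url.split("#", 1)[1]))
```

Depending on the Python version, "~" may be left unencoded rather than written as %7E; both forms decode to the same query string.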
*ACL-2012 Contributed Task*.
Finally, let us draw your attention to the *Contributed Task*
<http://translit.i2r.a-star.edu.sg/r50/taskintro/> that is part of the
ACL-2012 Special Workshop <http://translit.i2r.a-star.edu.sg/r50/>.
We provide the Searchbench's paperxml data for this. The goal of the
Contributed Task is to generate improved, high quality rich text (XML)
versions of all Anthology papers as a free corpus for further research,
e.g. in summarization, parsing, citation analysis, etc.
Cheers,
Ulrich and Christian
--
Ulrich Schaefer http://www.dfki.de/~uschaefer
Christian Spurk http://www.dfki.de/~cspurk/
DFKI Language Technology Lab, D-66123 Saarbruecken, Germany
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Management: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Chairman), Dr. Walter Olthoff. Chairman of the Supervisory Board:
Prof. Dr. h.c. Hans A. Aukes. Amtsgericht Kaiserslautern, HRB 2313