[Corpora-List] Merging verb phrase ellipsis annotations with the WSJ treebank
E. Alan Hogue
eahogue at email.arizona.edu
Tue Jun 11 21:04:40 UTC 2013
Hello Corpora List,
As you may know, not long ago this article was published:
Bos, J., & Spenader, J. (2011). An annotated corpus for the analysis of VP
ellipsis. Language Resources and Evaluation, 45(4), 463–494.
doi:10.1007/s10579-011-9142-3
Along with this, the authors made available a file of standoff annotation
based on the raw version (non-parsed, non-tagged) of the WSJ in the Penn
Treebank.
http://www.let.rug.nl/bos/vpe/annotations.html
I am currently trying to figure out the best way to merge or align this
with the _parsed_ version of the WSJ, and this is turning out to be
trickier than I expected. It occurs to me that this is probably a general
problem that someone else has already solved.
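To make the problem concrete, here is a minimal sketch of the naive approach
in Python. It assumes the standoff annotations can be reduced to character
offsets into the raw text and that the parsed file has been flattened to
(word, tag) leaves. The names align_leaves and leaves_in_span are
hypothetical, the escape table covers only the common PTB bracket
conventions, and differences in quote handling are not addressed.

```python
# Hypothetical helpers for illustration; not taken from any existing package.
# Only the common PTB bracket escapes are handled here.
PTB_UNESCAPE = {
    "-LRB-": "(", "-RRB-": ")",
    "-LCB-": "{", "-RCB-": "}",
    "-LSB-": "[", "-RSB-": "]",
}


def align_leaves(raw_text, leaves):
    """Map each (word, tag) treebank leaf to a (start, end) span in raw_text.

    Leaves tagged -NONE- (traces and other null elements) have no surface
    text, so they are mapped to None. Raises ValueError when a leaf cannot
    be found at or after the current position.
    """
    spans = []
    pos = 0
    for word, tag in leaves:
        if tag == "-NONE-":
            spans.append(None)
            continue
        # Undo bracket escapes and the escaped slash used in the treebank.
        surface = PTB_UNESCAPE.get(word, word).replace("\\/", "/")
        start = raw_text.find(surface, pos)
        if start == -1:
            raise ValueError(f"cannot align {word!r} after offset {pos}")
        end = start + len(surface)
        spans.append((start, end))
        pos = end
    return spans


def leaves_in_span(spans, start, end):
    """Return indices of leaves whose span overlaps the range [start, end)."""
    return [
        i for i, span in enumerate(spans)
        if span is not None and span[0] < end and span[1] > start
    ]


raw = "He didn't (really)."
leaves = [
    ("He", "PRP"), ("did", "VBD"), ("n't", "RB"), ("*-1", "-NONE-"),
    ("-LRB-", "-LRB-"), ("really", "RB"), ("-RRB-", "-RRB-"), (".", "."),
]
spans = align_leaves(raw, leaves)
print(leaves_in_span(spans, 3, 9))  # leaves covering "didn't"
```

The greedy left-to-right search is the weak point: it will happily skip over
material that appears only in the raw file, so a real solution would need to
detect and report such mismatches rather than silently jumping ahead.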
Does anyone know of any code, modules, packages, algorithms, tricks, etc.
that already do a good job of this type of thing, and that I might adapt
for this particular task? If it happens to be in Python that is a plus, but
just about any language/platform will do.
Thank you!
Alan Hogue
University of Arizona