[Corpora-List] Merging verb phrase ellipsis annotations with the WSJ treebank

Alan Hogue eahogue at gmail.com
Tue Jun 11 20:37:59 UTC 2013


Hello Corpora List,

As you probably know, not long ago this article was published:

Bos, J., & Spenader, J. (2011). An annotated corpus for the analysis of VP
ellipsis. Language Resources and Evaluation, 45(4), 463–494.
doi:10.1007/s10579-011-9142-3

Along with this, the authors made available a file of standoff annotation
based on the raw version (non-parsed, non-tagged) of the WSJ in the Penn
Treebank.

http://www.let.rug.nl/bos/vpe/annotations.html

I am currently trying to figure out the best way to merge or align these
standoff annotations with the _parsed_ version of the WSJ, and this is
turning out to be trickier than I expected. It occurs to me that this is
probably a general problem that someone else has solved before.

Does anyone know of any code, modules, packages, algorithms, tricks, etc.
that already do a good job of this type of thing, and which I might modify
for this particular task? If it happens to be in Python that is a plus, but
just about any language/platform will do.
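
To make the problem concrete, here is a minimal sketch of the kind of
alignment in question. It is only an illustration under several assumptions:
that each annotation can be reduced to a (start, end) character span in the
raw text, that the parsed sentence is available as a list of leaf tokens with
empty elements (-NONE-) already removed, and that the only escaping to undo is
the bracket escapes. The names align_tokens, tokens_in_span and PTB_UNESCAPE
are hypothetical, not part of any existing package or of the annotation
release.

```python
# Assumed (non-exhaustive) mapping from treebank escapes back to raw text.
PTB_UNESCAPE = {"-LRB-": "(", "-RRB-": ")", "-LCB-": "{", "-RCB-": "}"}


def align_tokens(raw, tokens):
    """Return one (start, end) character span in `raw` per token.

    Tokens are located greedily, left to right; a token that cannot be
    found after the current position gets None instead of a span.
    """
    spans = []
    cursor = 0
    for tok in tokens:
        surface = PTB_UNESCAPE.get(tok, tok)
        start = raw.find(surface, cursor)
        if start == -1:
            spans.append(None)  # leave a gap rather than guess
            continue
        end = start + len(surface)
        spans.append((start, end))
        cursor = end
    return spans


def tokens_in_span(spans, start, end):
    """Indices of tokens whose character span overlaps [start, end)."""
    return [
        i
        for i, s in enumerate(spans)
        if s is not None and s[0] < end and start < s[1]
    ]


# Toy example (invented sentence, not taken from the WSJ).
raw = "He did (too)."
tokens = ["He", "did", "-LRB-", "too", "-RRB-", "."]
spans = align_tokens(raw, tokens)
vpe_tokens = tokens_in_span(spans, 3, 6)  # character span covering "did"
```

The greedy search is the weak point: if a token is missing from the raw text
but happens to occur later, the cursor jumps ahead and the following tokens
are misaligned, so a real solution would presumably need some bounded
look-ahead or a proper sequence alignment.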

Thank you!

Alan Hogue
University of Arizona