<div dir="ltr">Hello Corpora List,<div><br></div><div style>As you probably know, not long ago this article was published:</div><div style><br></div><div style>Bos, J., & Spenader, J. (2011). An annotated corpus for the analysis of VP ellipsis. Language Resources and Evaluation, 45(4), 463–494. doi:10.1007/s10579-011-9142-3<br>
</div><div style><br></div><div style>Along with the article, the authors made available a file of standoff annotations based on the raw (unparsed, untagged) version of the WSJ section of the Penn Treebank.</div><div style><br></div>
<div style><a href="http://www.let.rug.nl/bos/vpe/annotations.html">http://www.let.rug.nl/bos/vpe/annotations.html</a><br></div><div style><br></div><div style>I am currently trying to figure out the best way to merge or align these annotations with the _parsed_ version of the WSJ, and this is turning out to be trickier than I expected. It occurs to me that this may be a general problem that someone else has already solved. </div>
<div style><br></div><div style>Does anyone know of any code, modules, packages, algorithms, tricks, etc. that already do a good job of this kind of alignment and that I might adapt for this particular task? Python would be a plus, but just about any language or platform will do. </div>
<div style><br></div><div style>Thank you!</div><div style><br></div><div style>Alan Hogue</div><div style>University of Arizona</div><div style><br></div><div style><br></div></div>