[Corpora-List] Copy of the Hewlett-Packard test suite?

Stephan Oepen oe at ifi.uio.no
Thu Nov 27 22:48:00 UTC 2008


hi kevin, my apologies for the late reply to your query!

> Can anyone point me towards a copy of the old Hewlett-Packard
> syntactic test suite?

dan flickinger has been maintaining the original HP test suite as part
of his work on the English Resource Grammar.  the HP data was imported
into the TSNLP annotation scheme in the mid-1990s and annotated using
the relatively shallow TSNLP phenomenon classification.  under the name
'CSLI test suite', it has been part of my [incr tsdb()] distribution in
recent years.

you can browse a treebanked version (in LinGO Redwoods style) on-line:

  http://erg.emmtee.net/compare?data=gold/erg/csli

the full test suite (in TSNLP format) is available for download too:

  http://svn.emmtee.net/tags/handon/lingo/lkb/src/tsdb/skeletons/english/csli

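to extract just the actual test items plus grammaticality judgements
(ungrammatical items come out prefixed with '*'), the following should
work when run in the directory containing the 'item' file: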

  awk -F@ '{printf("[%d] %s%s\n", $1, $8 ? "" : "*", $7)}' item
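
for anyone without awk at hand, here is a rough python sketch of the same
extraction.  it assumes only what the awk command above implies: fields are
separated by '@', field 1 is the item identifier, field 7 the test sentence,
and field 8 the well-formedness flag (empty or 0 meaning ungrammatical).  the
function name format_items is made up for illustration and is not part of
[incr tsdb()].

```python
def format_items(lines):
    """Yield '[id] sentence' strings, starring ungrammatical items.

    Assumed layout (taken from the awk one-liner, not from a format spec):
    '@'-separated fields; field 1 = id, field 7 = sentence, field 8 = flag.
    """
    for line in lines:
        fields = line.rstrip("\n").split("@")
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        item_id, text, wellformed = fields[0], fields[6], fields[7]
        # like awk's `$8 ? "" : "*"`: an empty or zero flag gets a star
        star = "*" if wellformed.strip() in ("", "0") else ""
        yield f"[{item_id.strip()}] {star}{text}"
```

to use it on the downloaded file, something like
`for s in format_items(open("item")): print(s)` should do.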

as far as i recall, dan may have made a tiny number of adjustments since
the original HP release, but if so, these changes would be trivial in
nature, i believe.  maybe dan or someone else still has a copy of the
original HP file?  i think i do too, only i cannot find it :-).

                                                     all best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at ifi.uio.no; oe at csli.stanford.edu; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

