[Corpora-List] English treebank
Ulrik Sandborg-Petersen
ulrikp at hum.aau.dk
Thu Apr 3 09:11:37 UTC 2008
Hi Rich,
A subset of the Penn Treebank is freely available under a Creative
Commons license as part of the NLTK corpus set:
http://nltk.org/index.php/Corpora
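For small experiments, it may help to know that Penn Treebank-style parses are stored as bracketed strings. Here is a minimal sketch of reading one with only the Python standard library. The function names and the sample sentence are made up for illustration (the sentence is not taken from the corpus), and the sketch assumes every bracket has a label, which is not true of the empty outer bracket found in some distributed files.

```python
def parse_bracketed(text):
    """Parse one bracketed tree into nested (label, children) tuples."""
    # Pad the brackets with spaces so a plain split() yields clean tokens.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]  # assumption: every node carries a label
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())  # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the tree")
    return tree


def leaves(tree):
    """Return the words of a parsed tree in left-to-right order."""
    _label, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# Hypothetical example sentence, for illustration only.
SAMPLE = "(S (NP (DT The) (NN market)) (VP (VBD rose)) (. .))"

if __name__ == "__main__":
    tree = parse_bracketed(SAMPLE)
    print(tree[0])
    print(" ".join(leaves(tree)))
```

For the real files, NLTK's own reader (nltk.corpus.treebank, once the sample has been downloaded) is probably the better choice; the sketch above is only meant to show the shape of the data.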
You might also consider purchasing the BLLIP corpus from the LDC. It is
cheaper than the Penn Treebank and consists of automatic parses of the
1987-1989 Wall Street Journal stories, the same source used for the Penn
Treebank. It might suit your needs.
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2000T43
Ulrik Sandborg-Petersen
Rich Cooper Elk wrote:
>
> Hi Linguisticians,
>
> I'm looking for a free English tree bank to perform some small
> experiments on. The Penn Treebank looks like a great one, but it costs
> $1,000 from the LDC. I’m just experimenting, so I don’t want to fork
> over that much cash just yet.
>
> Does anyone know of a free annotated Treebank of English text derived
> from an edited journal or equivalent?
>
> -Rich
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>