[Corpora-List] Hierarchically classified corpora?
Tony Abou-Assaleh
taa at acm.org
Tue Jan 16 16:14:59 UTC 2007
Hi Daniel,
Some datasets that come to mind are the ACM Digital Library for CS-related
publications (but you need to be careful about licensing issues), and
dmoz.org for Web pages. The Open Directory at dmoz.org is available in
several languages.
Cheers,
TAA
-----------------------------------------------------
Tony Abou-Assaleh
Email: taa at acm.org
Web site: http://tony.abou-assaleh.net
----------------------[THE END]----------------------
On Tue, 16 Jan 2007, Daniel Beck wrote:
> Hello corpora mailing list,
>
> I'm working on my master's thesis, "Accurate Hierarchical Classification
> using NLP Techniques". I hope to improve the accuracy of hierarchical
> classification on English and German corpora by using additional
> information extracted with the aid of linguistic tools.
>
> I would like to ask where I can obtain corpora which are already
> classified in a hierarchy. I need several English and German corpora. I
> would prefer the topics of the corpora to be linguistics or
> computer science.
>
> Regards & Thanks,
>
> Daniel
>
>
>