[Corpora-List] Hierarchically classified corpora?
Pierre Zweigenbaum
pz at limsi.fr
Thu Jan 18 14:39:26 UTC 2007
Dear Daniel,
> I'm working on my master's thesis "Accurate Hierarchical Classification
> using NLP Techniques". I hope to improve the accuracy of hierarchical
> classification on English and German corpora by using additional
> information extracted with the aid of linguistic tools.
>
> I would like to ask where I can obtain corpora which are already
> classified in a hierarchy. I need several English and German corpora. I
> would prefer if the topics of the corpora were linguistics or
> computer science.
>
> Regards & Thanks,
>
> Daniel
The Medline database of scientific publications in the
biomedical domain contains article abstracts which are
indexed using the hierarchically organized MeSH thesaurus.
It can be obtained for free through a license with the US National
Library of Medicine. It currently contains over 16 million records,
a majority of which have English abstracts.
http://www.nlm.nih.gov/bsd/licensee/2007_stats/baseline_doc.html
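For hierarchical classification, the useful property of MeSH is that each descriptor carries one or more dot-separated tree numbers, and every prefix of a tree number names an ancestor node. The following is only a minimal sketch of how class labels at a chosen depth could be derived from such tree numbers. The function names are hypothetical, the tree number is used purely as an illustrative value, and extracting tree numbers from the licensed records is assumed to have been done separately.

```python
def ancestors(tree_number: str) -> list[str]:
    """Return the proper ancestors of a MeSH-style tree number, root first."""
    parts = tree_number.split(".")
    # Every proper prefix of the dot-separated path is an ancestor node.
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def label_at_depth(tree_number: str, depth: int) -> str:
    """Truncate a tree number to at most `depth` levels.

    This gives a coarser class label for training a classifier
    at a fixed level of the hierarchy.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return ".".join(tree_number.split(".")[:depth])


if __name__ == "__main__":
    example = "C04.557.337"  # illustrative value only
    print(ancestors(example))
    print(label_at_depth(example, 2))
```

Since a descriptor can appear at several places in the hierarchy, an abstract indexed with it may receive several labels at a given depth, which makes the task multi-label rather than single-label.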
Greetings from the other side of the Rhine.
Pierre.
--
Pierre Zweigenbaum
----
LIMSI - CNRS
Groupe LIR / Dépt. Communication Homme-Machine
Tel: (+33) (0)1 69 85 80 04 ; Fax: (+33) (0)1 69 85 80 88
Email: pz at limsi.fr ; Web: http://www.limsi.fr/~pz/
Location: Bâtiment 508, Université Paris XI,
Mail: LIMSI, BP 133, 91403 ORSAY Cedex, France
----
CRIM, Institut National des Langues et Civilisations Orientales
----