[Corpora-List] Classified corpora

Ralf Steinberger ralf.steinberger at jrc.ec.europa.eu
Sat Dec 18 22:15:50 UTC 2010


Hello Seid,

the JRC-Acquis multilingual parallel corpus (22 languages) has been
manually classified (multi-label) with the Eurovoc thesaurus. You can
find it at http://langtech.jrc.ec.europa.eu/JRC-Acquis.html.

Greetings,

Ralf

On 18/12/2010 21:47, Seid Muhie wrote:
> Dear All
> Greetings
> I am a student and need a classified corpus for my ML project. It
> could be, for example, classified news (sport, economy,
> politics, ...), email data (spam, non-spam), financial data, or
> anything else. I just need preclassified data.
> Thank you very much.
>
> -- 
> Seid M.
> University of Trento, Italy
> Human Language Technology and Interfaces 
> <http://old.disi.unitn.it/edu/hlti/courses.xml?lang=en>
> Via Brennero 150, Room 105
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora

-- 
Ralf Steinberger (Ralf.Steinberger at jrc.ec.europa.eu)
European Commission - Joint Research Centre (JRC)
IPSC - GlobeSec - OPTIMA (OPensource Text Information Mining and Analysis)
URL - Applications: http://emm.jrc.it/overview.html
URL - The science behind them: http://langtech.jrc.it
T.P. 267, Via Fermi 2749
21027 Ispra (VA), Italy
Tel: +39 0332 78-6271
Fax: +39 0332 78-5154
Secretary: +39 0332 78-5648 or 9478