[Corpora-List] Resources concerning multilabel problem
radev at umich.edu
Fri Aug 18 13:42:58 UTC 2006
Look at this paper:
http://citeseer.ist.psu.edu/8956.html
Error-Correcting Output Coding for Text Classification (1999)
Adam Berger
See also:
http://citeseer.ist.psu.edu/19268.html
> -----Original Message-----
> From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
> Behalf Of Cecilie Desiree Widsteen
> Sent: 18 August 2006 11:09
> To: Corpora list
> Subject: [Corpora-List] Resources concerning multilabel problem
>
> Hello all!
>
> I am looking for resources (articles, books, webpages) concerning the
> multilabel (multiclass?) problem in the context of text classification.
> By this I mean the fact that a document can be classified into more than
> one category. Especially w.r.t. supervised learning algorithms, where
> the documents in the training set may belong to multiple classes.
>
> Regards,
> --
> Cecilie Widsteen
> Institute for Informatics,
> University of Oslo
>
> Dear Cecilie,
>
> We have recently made available the JRC-Acquis corpus, which is a
> multilingual (21 languages) document collection multi-labelled according
> to the Eurovoc thesaurus and aligned at paragraph level for each of the
> 210 language pairs. You find it for download at:
>
> http://langtech.jrc.it/JRC-Acquis.html
>
> Furthermore, in the 'Publications' section of our web site
> (http://langtech.jrc.it/#Publications), you find a number of papers on
> (typically multilingual) multi-label text categorisation applications
> (look mainly around the years 2002-2004), including the following:
>
>     Pouliquen Bruno, Ralf Steinberger & Camelia Ignat (2003). Automatic
>     Annotation of Multilingual Text Collections with a Conceptual
>     Thesaurus. In: Proceedings of the Workshop Ontologies and
>     Information Extraction at the Summer School The Semantic Web and
>     Language Technology - Its Potential and Practicalities
>     (EUROLAN'2003). Bucharest, Romania, 28 July - 8 August 2003.
>     http://langtech.jrc.it/Documents/EuroLan-03_Pouliquen-Steinberger-et-al.pdf
>
> The text categorisation approach described in that paper is used as the
> major ingredient in our daily news analysis system NewsExplorer (freely
> accessible at http://press.jrc.it/NewsExplorer) to link related news
> across languages.
>
> I hope this helps. All the best,
>
> Ralf
>
> Ralf Steinberger (Ralf.Steinberger at jrc.it)
> European Commission - Joint Research Centre (JRC)
> IPSC - SeS - Language Technology
> (http://langtech.jrc.it, http://press.jrc.it/NewsExplorer)
> T.P. 267, Via Fermi 1
> 21020 Ispra (VA), Italy
> Tel: +39 0332 78-6271
> Fax: +39 0332 78-5154
> Secretary: +39 0332 78-5648 or 9478
>
> New URL: http://langtech.jrc.it. The previous address
> http://www.jrc.it/langtech will only be valid for a few more months.
>
--
Dragomir R. Radev radev at umich.edu
Associate Professor of Information, Electrical Engineering and
Computer Science, and Linguistics, the University of Michigan, Ann Arbor
Phone: 734-615-5225 Fax: 734-764-2475 http://www.si.umich.edu/~radev