<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:162622429;
mso-list-type:hybrid;
mso-list-template-ids:1942880318 -1244626956 994857844 382473846 -225818596 -1440051108 -1761042036 198366820 1132376002 1475656650;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:36.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
@list l0:level2
{mso-level-start-at:1251;
mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:72.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Arial","sans-serif";
mso-bidi-font-family:"Times New Roman";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:108.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:144.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:180.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:216.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:252.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:288.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:•;
mso-level-tab-stop:324.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Times New Roman","serif";}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-GB link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><b><u><span style='color:#002060'>WHAT IS JEX<o:p></o:p></span></u></b></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#002060'>The <b><a href="http://langtech.jrc.ec.europa.eu/Eurovoc.html"><span style='color:red'>J</span>RC <span style='color:red'>E</span>uroVoc Inde<span style='color:red'>x</span>er JEX</a></b> is ready-trained multi-label categorisation software that assigns categories from the large-scale, wide-coverage <a href="http://eurovoc.europa.eu/" target="_blank">EuroVoc Thesaurus</a> (consisting of thousands of categories). JEX is distributed together with its training data (twenty to forty thousand documents per language). It has been trained for 22 languages on mostly parallel text (texts and their professionally produced translations). You can re-train JEX with your own documents, and even with your own categorisation scheme. 
JEX provides a graphical user interface (GUI), a command line option for batch processing, as well as an API.<o:p></o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><b><u><span style='color:#002060'>DOWNLOAD JEX – LANGUAGE COVERAGE<o:p></o:p></span></u></b></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'>Languages: Readily trained for </span></b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'>22 languages, but trainable for many more: <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'> Bulgarian, Czech, Danish, Dutch, English, Estonian, German, Greek,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'> Finnish, French, Hungarian, Italian, Latvian, Lithuanian, Maltese, <o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'> Polish, Portuguese, Romanian, Slovak, Slovene, Spanish and Swedish.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'> <o:p></o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'>Language families: </span></b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'>Germanic, Romance, Slavic, Hellenic, Finno-Ugric, Baltic and Semitic.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'>URL: 
</span></b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'><a href="http://langtech.jrc.ec.europa.eu/Eurovoc.html">http://langtech.jrc.ec.europa.eu/Eurovoc.html</a><o:p></o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'><o:p> </o:p></span></b></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'>Creator: </span></b><span style='font-size:10.0pt;font-family:"Courier New";color:#632523'>European Commission – Joint Research Centre (<a href="http://langtech.jrc.ec.europa.eu/">JRC</a>)<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><b><u><span style='color:#002060'>WHAT JEX CAN BE USED FOR<o:p></o:p></span></u></b></p><p class=MsoNormal><span style='color:#002060'> <o:p></o:p></span></p><p class=MsoNormal><span style='color:#002060'>JEX can be used fully automatically or as an interactive tool to support professional librarians in their work. <o:p></o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#002060'>JEX also has many potential uses in the field of <b>Computational Linguistics</b> because it is highly multilingual and lends itself to cross-lingual tasks:<o:p></o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-family:"Times New Roman","serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>       </span></span></span><![endif]><span style='color:#002060'>Use for <b>multilingual classification experiments</b>, e.g. 
to test the impact of different document representations (n-grams, lemmas, POS, word-sense disambiguation, …) across different languages and language families;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-family:"Times New Roman","serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>       </span></span></span><![endif]><span style='color:#002060'>Use as <b>input to other text mining applications</b>, e.g.<o:p></o:p></span></p><p class=MsoNormal style='margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0 level2 lfo1'><![if !supportLists]><span lang=FR style='font-family:"Arial","sans-serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>         </span></span></span><![endif]><span style='color:#002060'>Detect</span><span lang=FR style='color:#002060'> document translations (Pouliquen et al. 2004);<o:p></o:p></span></p><p class=MsoNormal style='margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0 level2 lfo1'><![if !supportLists]><span style='font-family:"Arial","sans-serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>         </span></span></span><![endif]><span style='color:#002060'>Cross-lingual plagiarism<b> </b></span><span style='color:#002060'>detection (Potthast et al. 2010);<o:p></o:p></span></p><p class=MsoNormal style='margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0 level2 lfo1'><![if !supportLists]><span style='font-family:"Arial","sans-serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>         </span></span></span><![endif]><span style='color:#002060'>Link related documents across languages<b> </b></span><span style='color:#002060'>(Pouliquen et al. 
2008);<o:p></o:p></span></p><p class=MsoNormal style='margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0 level2 lfo1'><![if !supportLists]><span style='font-family:"Arial","sans-serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>         </span></span></span><![endif]><span style='color:#002060'>Support the lexical choice in Machine Translation;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0 level2 lfo1'><![if !supportLists]><span style='font-family:"Arial","sans-serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>         </span></span></span><![endif]><span style='color:#002060'>Rank sentences in topic-specific summarisation;<o:p></o:p></span></p><p class=MsoNormal style='margin-left:72.0pt;text-indent:-18.0pt;mso-list:l0 level2 lfo1'><![if !supportLists]><span style='font-family:"Arial","sans-serif";color:#002060'><span style='mso-list:Ignore'>•<span style='font:7.0pt "Times New Roman"'>         </span></span></span><![endif]><span style='color:#002060'>…<o:p></o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><u><span style='color:#002060'>MORE INFORMATION<o:p></o:p></span></u></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#002060'>At <a href="http://langtech.jrc.ec.europa.eu/">http://langtech.jrc.ec.europa.eu/</a>, you will find more information on the JRC’s multilingual language technology activity, download links for the <i>JRC EuroVoc Indexer JEX</i>, as well as a page pointing to further freely available multilingual resources. 
For details on JEX and its performance, you can read the following publication, which you might also want to use for scientific references:<o:p></o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal style='margin-left:36.0pt'><span style='color:#002060'>Ralf Steinberger, Mohamed Ebrahim & Marco Turchi (2012). <br><strong><span style='font-family:"Calibri","sans-serif"'><a href="http://langtech.jrc.ec.europa.eu/Documents/2012_LREC-JEX-final.pdf" target="_blank" title="Reference publication explaining JEX, to be used in the bibliography of scientific publications"><span style='color:#002060'>JRC EuroVoc Indexer JEX - A freely available multi-label categorisation tool</span></a></span></strong>. <br>Proceedings of the 8<sup>th</sup> International Conference on Language Resources and Evaluation <br>(LREC'2012), Istanbul, 21-27 May 2012. <br>Available at: <a href="http://langtech.jrc.ec.europa.eu/Documents/2012_LREC-JEX-final.pdf"><span style='color:#002060'>http://langtech.jrc.ec.europa.eu/Documents/2012_LREC-JEX-final.pdf</span></a> <o:p></o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><span style='color:#002060'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:9.0pt;color:gray'>Ralf Steinberger, Mohamed Ebrahim & Marco Turchi<br></span></b><span style='font-size:9.0pt;color:gray'>European Commission - Joint Research Centre (JRC)<br>21027 Ispra (VA), Italy<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;color:gray'>URL – Applications: <a href="http://emm.newsbrief.eu/overview.html"><span style='color:gray'>http://emm.newsbrief.eu/overview.html</span></a><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:9.0pt;color:gray'>URL – The science behind them: <a href="http://langtech.jrc.ec.europa.eu/"><span style='color:gray'>http://langtech.jrc.ec.europa.eu/</span></a> <o:p></o:p></span></p><p class=MsoNormal><span 
style='font-size:9.0pt;color:gray'><br><br><o:p></o:p></span></p></div></body></html>