<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Another example of applying LCS algorithms to corpora:<br>
We used the detection of repeated strings in press articles to identify
texts relevant to epidemic surveillance and to detect which disease
is spreading where.<br>
It proved particularly useful for articles written in
morphologically rich languages (Greek, Polish, Russian...) or in
languages with different writing systems (Arabic, Chinese).<br>
<br>
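To give a rough idea of why character-level matching helps with inflected forms, here is a minimal Python sketch. It is not the actual implementation from the paper: the function names, the 0.75 threshold and the two Polish sentences are only assumptions for illustration, and LCS is taken here to mean longest common substring.<br>

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest string occurring contiguously in both a and b."""
    best_len = 0
    best_end = 0  # end index (exclusive) of the best match inside a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def find_disease_mentions(head, body, disease_names, min_ratio=0.75):
    """Hypothetical helper: keep a disease name when a long enough part of it
    is repeated in both the head and the body of an article.

    min_ratio is an assumed threshold, not a value taken from the paper.
    """
    hits = {}
    for name in disease_names:
        in_head = longest_common_substring(name.lower(), head.lower())
        in_body = longest_common_substring(name.lower(), body.lower())
        shared = in_head if len(in_head) <= len(in_body) else in_body
        if name and len(shared) / len(name) >= min_ratio:
            hits[name] = shared
    return hits


# Made-up Polish example: "grypa" (flu) appears only in the inflected form "grypy".
head = "Epidemia grypy w Polsce"
body = "Wirus grypy atakuje dzieci"
print(find_disease_mentions(head, body, ["grypa", "cholera"]))
```

An exact word lookup for "grypa" would miss both sentences, while the shared substring "gryp" still links the dictionary entry to the article, with no stemmer or tokenizer needed for the language.<br>
<br>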
Some examples are shown here:<br>
<a class="moz-txt-link-freetext" href="https://daniel.greyc.fr/">https://daniel.greyc.fr/</a><br>
<br>
More details can be found in this paper:<br>
<a class="moz-txt-link-freetext" href="http://www.cs.helsinki.fi/u/doucet/papers/JapTAL2012.pdf">http://www.cs.helsinki.fi/u/doucet/papers/JapTAL2012.pdf</a><br>
<br>
Gaël<br>
<br>
<pre class="moz-signature" cols="72">
--
----------------------------------------
PhD Student, HUman Language TECHnologies (HULTECH)
Caen Campus 2, Bureau S3-365,
Boulevard du Maréchal Juin
14000 Caen
Tél: 02 31 56 73 98
<a class="moz-txt-link-freetext" href="http://lejeuneg.users.greyc.fr/">http://lejeuneg.users.greyc.fr/</a>
---------------------------------------- </pre>
</body>
</html>