[Corpora-List] Korean and Japanese stemming

Sebastian Hellmann hellmann at informatik.uni-leipzig.de
Fri Mar 2 11:11:48 UTC 2012


Hello Stefan,

I know this is not exactly what you are looking for, but it is open source:
http://semanticweb.kaist.ac.kr/home/index.php/HanNanum

Maybe you can ask the developers directly? (I cc'ed DongHyun Choi.)

It would also be nice to migrate anything you find to a more prominent
place such as Lucene [1], although I am not sure what the process for
this would look like. (A mail to the Lucene project might be enough.)

All the best,
Sebastian


[1] http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/api/contrib-snowball/index.html



On 03/02/2012 10:16 AM, Stefan Bordag wrote:
> Dear all,
>
> Does anyone know whether someone has written a simple Porter stemmer or
> a similar set of rules for stemming Korean texts? The same goes for
> Japanese texts. It doesn't need to be anything fancy. Using Google
> Translate and search engine results has not led anywhere so far, or
> I am looking in the wrong places.
>
> Thank you very much in advance,
> Stefan Bordag
>


-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org




