[Corpora-List] Q: How to identify duplicates in a large document collection

Marc Kupietz kupietz at ids-mannheim.de
Wed Jan 5 15:54:47 UTC 2005


Hi Bill,

I'm currently preparing some platform-portable code to share.

Regards,
Marc

On Wednesday, 05.01.2005, at 06:33 -0500, William Fletcher wrote:
> Hi Marc and Normand,
> 
> How about sharing your code scripts? I am sure everyone would be grateful for an off-the-shelf solution that could be easily adapted to one's own needs or serve as inspiration for other applications.
> 
> Regards,
> Bill
> 

-- 
Marc Kupietz                                      Tel. (+49) 621/1581-409
Institut für Deutsche Sprache, Dept. of Lexical Studies/Corpus Technology
PO Box 101621, 68016 Mannheim, Germany        http://www.ids-mannheim.de/


