[Corpora-List] Q: How to identify duplicates in a large document collection
Marc Kupietz
kupietz at ids-mannheim.de
Wed Jan 5 15:54:47 UTC 2005
Hi Bill,
I'm currently preparing some platform-portable code to share.
Regards,
Marc
On Wednesday, 05.01.2005, at 06:33 -0500, William Fletcher wrote:
> Hi Marc and Normand,
>
> How about sharing your code scripts? I am sure everyone would be grateful for an off-the-shelf solution that could be easily adapted to one's own needs or serve as inspiration for other applications.
>
> Regards,
> Bill
>
--
Marc Kupietz Tel. (+49) 621/1581-409
Institut für Deutsche Sprache, Dept. of Lexical Studies/Corpus Technology
PO Box 101621, 68016 Mannheim, Germany http://www.ids-mannheim.de/