[Corpora-List] Comparing files

Miles Osborne miles at inf.ed.ac.uk
Sat Nov 15 22:04:17 UTC 2003


that's far too slow - use a hash table instead.
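
A minimal sketch of the hash-table idea, written in Python rather than Perl. The function names and the file names in the comment are illustrative assumptions, not anything from this thread. Python's set is hash-based, so each membership test takes constant time on average and the whole comparison is roughly linear in the size of the two lists.

```python
def load_words(path):
    """Read one word per line into a set (a hash-based collection)."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def compare_word_lists(words_a, words_b):
    """Return (words only in A, words only in B) as sorted lists."""
    set_a, set_b = set(words_a), set(words_b)
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Tiny illustrative data; real lists would come from something like
# load_words("list_a.txt") and load_words("list_b.txt") (hypothetical paths).
only_a, only_b = compare_word_lists(["cat", "dog", "fish"], ["dog", "bird"])
print(only_a)  # ['cat', 'fish']
print(only_b)  # ['bird']
```

The same approach carries over to Perl by using the words of one list as hash keys and testing each word of the other list with exists.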

now, this wouldn't be homework, would it?

Miles

Quoting Otto Lassen <otto at lassen.mail.dk>:

> Hi
> That could be done in any language:
> 1. sort the two lists
> 2. compare them word for word
> 3. output words which are not in both lists
> Regards
> Otto Lassen
>
> At 21:54 15-11-2003 +0100, you wrote:
> >Hi,
> >
> >I'm doing a project that involves comparing two very large word lists
> >(~40.000 and 70.000 words). What I need to find out, is which words are on
> >one list and not on the other (and/or vice versa).
> >Can anyone give me a hint as to how to do this? (I was thinking; maybe a
> >perl script?)
> >
> >Any help will be greatly appreciated.
> >Best,
> >Tine Lassen
>
>


