Corpora: FW: help - comparing word lists

Vlado Keselj vkeselj at cs.uwaterloo.ca
Fri Apr 20 15:04:03 UTC 2001


On Unix, Linux, and similar systems, you can sort both lists and then use comm, e.g.:
sort -u < list1 > list1.sorted
sort -u < list2 > list2.sorted
comm -23 list1.sorted list2.sorted

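This outputs the words that are in list1 but not in list2: -2 suppresses
the lines found only in list2, and -3 suppresses the lines common to both.
Both commands are pretty efficient, even on very long lists: comm needs only
a single pass over the two sorted files.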
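
The same approach also gives the other direction and the overlap. Below is a
minimal sketch, not a definitive recipe. The sample words, the temporary
working directory, and the output file names (only_in_list1, only_in_list2,
in_both) are made up for illustration. Setting LC_ALL=C is an assumption here,
made so that sort and comm agree on the ordering; comm expects its input
sorted in the collation order it is running under.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data, only so the commands can be run as-is.
printf 'cherry\napple\nbanana\napple\n' > list1
printf 'date\nbanana\ncherry\n' > list2

# Use one fixed collation for both sort and comm (assumption, see above).
export LC_ALL=C

# Sort each list and drop duplicate words.
sort -u list1 > list1.sorted
sort -u list2 > list2.sorted

# comm prints three columns: only in file 1, only in file 2, in both.
# Each digit after the dash suppresses the corresponding column.
comm -23 list1.sorted list2.sorted > only_in_list1   # words only in list1
comm -13 list1.sorted list2.sorted > only_in_list2   # words only in list2
comm -12 list1.sorted list2.sorted > in_both         # words in both lists
```

With the sample data above, only_in_list1 should contain "apple",
only_in_list2 should contain "date", and in_both should contain "banana"
and "cherry".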

Vlado


On Fri, 20 Apr 2001, Wiesheu, Martin wrote:

>
>
> hello out there,
>
> could anyone help me on the following question?:
>
> is there any tool or method to efficiently compare two very long word lists
> to see what words are on one list and not on the other?
>
> thanks,
>
> martin
>
>
> Martin Wiesheu
> ZGS Research
> COMMERZBANK Securities
>
> Tel. + 49 - 69 - 136 43730
> Fax. + 49 - 69 - 136 27445
>


