>I'm working on a project that involves comparing two very large word lists (~40,000 and ~70,000 words). I need to find out which words are on one list but not on the other (and vice versa).
>Can anyone give me a hint on how to do this? (I was thinking of maybe a Perl script.)
>
>
sort list1 > list1.sorted
sort list2 > list2.sorted
join -v1 list1.sorted list2.sorted
(if you use -v2 instead, you'll get words in list2 and not in list1)
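
For anyone who wants to try this on a small case first, here is a rough sketch of the same steps on two made-up lists. The words, the scratch directory and the variable names are only assumptions for illustration. It also pins the locale with LC_ALL=C, since join expects its input sorted in the same collation order it uses itself, and passes -u to sort so that repeated words are listed once.

```shell
# Scratch directory so no existing files are overwritten (hypothetical setup).
tmp=$(mktemp -d)

# Two tiny stand-in word lists, one word per line (made-up data).
printf '%s\n' pear apple fig cherry > "$tmp/list1"
printf '%s\n' cherry banana apple > "$tmp/list2"

# Sort both lists under the same locale that join will use.
LC_ALL=C sort -u "$tmp/list1" > "$tmp/list1.sorted"
LC_ALL=C sort -u "$tmp/list2" > "$tmp/list2.sorted"

# -v1: lines of the first file with no match in the second; -v2: the reverse.
only_in_1=$(LC_ALL=C join -v1 "$tmp/list1.sorted" "$tmp/list2.sorted")
only_in_2=$(LC_ALL=C join -v2 "$tmp/list1.sorted" "$tmp/list2.sorted")

printf 'only in list1:\n%s\n' "$only_in_1"
printf 'only in list2:\n%s\n' "$only_in_2"
```

Note that join compares only the first whitespace-separated field of each line, which is fine for one word per line. If an entry can contain spaces, comm -23 (lines only in the first file) and comm -13 (lines only in the second) on the same sorted files compare whole lines instead.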
best
--
Lluís Padró i Cirera
UNIVERSITAT POLITÈCNICA DE CATALUNYA
Departament de Llenguatges i Sistemes Informàtics <http://www.lsi.upc.es>
Centre de Recerca TALP <http://www.talp.upc.es>
Tel: XX-34-934 015 652  Fax: XX-34-934 017 014
padro@lsi.upc.es
http://www.lsi.upc.es/~padro
Mòdul C6 - Campus Nord
Jordi Girona Salgado 1-3
08034 Barcelona
This archive was generated by hypermail 2b29 : Mon Nov 17 2003 - 10:13:13 MET