Re: stop words list

Ted E. Dunning (ted@aptex.com)
Thu, 12 Sep 1996 10:32:53 -0700

A stop word list is a list of common words omitted during a process of
indexing.

righto.

So, do you know where I could find lists of this kind in French, German,
Italian, Spanish, Danish... but not in English?

you effectively already have them.

if you have enough text in these languages to make indexing
interesting, then it is literally just two hours' work to count word
frequencies in that text and work through the thousand most common
words to select however many stop words you want to use.
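
as a rough sketch of the counting step (not anything from the original
thread), the python below lists the most frequent words in a body of
text so that a person can pick stop words from them by hand. the
function name top_words, the letters-only tokenizer, and the default of
1000 are all assumptions made for illustration. note that this simple
tokenizer splits on apostrophes, so french "l'homme" comes out as "l"
and "homme".

```python
import re
from collections import Counter

# Assumed tokenizer: runs of Unicode letters only (no digits, no underscores),
# so accented characters in French, German, Danish, etc. are kept intact.
WORD_RE = re.compile(r"[^\W\d_]+")


def top_words(text, n=1000):
    """Return the n most frequent lowercased words as (word, count) pairs."""
    counts = Counter(word.lower() for word in WORD_RE.findall(text))
    return counts.most_common(n)
```

to use it, read your own corpus into a string (for example with
open(path, encoding="utf-8").read()), call top_words on it, and go
through the resulting list by hand; the choice of which frequent words
actually become stop words is still a human judgment.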

since it is still pretty much early days for seriously multilingual
IR, if you then distribute these lists, you will have effectively set
the standard.