similarity vector of a passage

Yaakov Yaari (yyaari@netvision.net.il)
Sat, 02 Nov 1996 19:41:20 +0200

Could you give me some advice on this: I want to form a similarity vector
(in the Salton sense) for a passage of text, so that I can compare it to
another passage in the same document.

At the moment I am using a simplistic approach in which the tf and idf of
a word are fixed within a document and depend on the document collection.
I have gathered a set of articles to serve as my collection
(15K words total).
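
To make this concrete, here is a minimal Python sketch of that kind of
scheme. It is only an illustration under stated assumptions: whitespace
tokenization, tf counted over the whole document (so it does not vary
between passages), idf taken as log(N/df) over the collection, and cosine
as the similarity measure. All function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; no stemming or stop list.
    return text.lower().split()


def idf_table(collection):
    # collection: list of token lists, one per article.
    # Assumption: idf(w) = log(N / df(w)), one common form among several.
    n_docs = len(collection)
    df = Counter()
    for doc in collection:
        df.update(set(doc))  # count each word once per article
    return {w: math.log(n_docs / df[w]) for w in df}


def passage_vector(passage_tokens, document_tokens, idf):
    # tf is counted over the whole document, so a word gets the same
    # weight in every passage of that document.
    tf = Counter(document_tokens)
    # Words unseen in the collection get weight 0.0 (an assumption).
    return {w: tf[w] * idf.get(w, 0.0) for w in set(passage_tokens)}


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Two passages of the same document would then be compared with
cosine(passage_vector(p1, doc, idf), passage_vector(p2, doc, idf)).
Note that with weights fixed per document, the passages differ only in
which words they contain, not in how often each word occurs in them.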

Any ideas?

Yaakov Yaari