Re: [Corpora-List] How to word presentation for word clustering?

From: Gaël Dias (ddg@di.ubi.pt)
Date: Wed Jul 07 2004 - 17:12:35 MET DST

Next message: Godby,Jean: "[Corpora-List] British and American English"

Previous message: Menno van Zaanen: "[Corpora-List] ICGI04 - ACCEPTED PAPERS, REGISTRATION, GRANTS, DEMO, COMPETITION"
In reply to: Clive De Silva: "Re: [Corpora-List] How to word presentation for word clustering?"
Next in thread: Clive De Silva: "Re: [Corpora-List] How to word presentation for word clustering?"
Reply: Clive De Silva: "Re: [Corpora-List] How to word presentation for word clustering?"

Be careful,

IDF is a property of the word alone and does not depend on the document,
so you have:

vector w = { tf(1)*IDF(w), tf(2)*IDF(w), ..., tf(n)*IDF(w) }
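
As a rough illustration of the formula above, here is a minimal Python sketch.
It assumes that tf(i) is the raw count of w in document i and that
IDF(w) = log(N / df(w)), neither of which is spelled out in this thread. It
also computes IDF from the same toy documents rather than from a separate
large corpus. The names idf, word_vector and docs are hypothetical.

```python
import math

def idf(word, documents):
    """Assumed definition: log(N / df), where df is the number of documents containing `word`."""
    n_docs = len(documents)
    df = sum(1 for doc in documents if word in doc)
    return math.log(n_docs / df) if df else 0.0

def word_vector(word, documents):
    """Component i is tf(i) * IDF(word); the IDF factor is the same for every component."""
    weight = idf(word, documents)  # computed once per word, not once per document
    return [doc.count(word) * weight for doc in documents]

# Toy documents, already tokenized (illustrative data only).
docs = [
    ["corpus", "word", "cluster", "word"],
    ["corpus", "frequency"],
    ["word", "vector", "corpus"],
]

vec = word_vector("word", docs)
```

Under these assumptions IDF(w) is a single scalar per word, so it rescales the
whole vector by the same factor.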

Gaël.

Clive De Silva wrote:
> Dear Chen Wenliang,
>
> I am using TF*IDF values as my representation for words.
> vector w = { tf(1)*IDF(1), tf(2)*IDF(2), ..., tf(n)*IDF(n) } where the IDF is
> computed from a large corpus. This seems to give better results than just
> the raw frequency counts.
> The representations I investigated were TF, TF*IDF, and simple binary values
> (1 if the word is present and 0 if it is not).
>
> Regards,
>
> Clive De Silva
> University of Cambridge

-- 
---------------------------------------------------------
Gaël Harry Dias, PhD            | Assistant Professor
Human Language Technology Group | [www.di.ubi.pt/~ddg]
Computer Science Department     | [ddg@di.ubi.pt]
Beira Interior University       | [Tel: +351 275 319 700]
6201-001 - Covilhã - PORTUGAL   | [Fax: +351 275 319 732]
---------------------------------------------------------


This archive was generated by hypermail 2b29 : Wed Jul 07 2004 - 17:13:49 MET DST