Re: [Corpora-List] How to word presentation for word clustering?

From: Clive De Silva (cd334@cam.ac.uk)
Date: Wed Jul 07 2004 - 16:44:00 MET DST

Next message: Clive De Silva: "Re: [Corpora-List] How to word presentation for word clustering?"

Previous message: Alberto Lavelli: "[Corpora-List] European Master in Language and Communication Technologies in Bozen-Bolzano, Italy"
In reply to: chen wenliang: "[Corpora-List] How to word presentation for word clustering?"
Next in thread: Gaël Dia: "Re: [Corpora-List] How to word presentation for word clustering?"
Next in thread: Clive De Silva: "Re: [Corpora-List] How to word presentation for word clustering?"
Reply: Gaël Dia: "Re: [Corpora-List] How to word presentation for word clustering?"

Dear Chen Wenliang,

I am using TF*IDF values as my representation for words:
vector w = {tf(1)*IDF(1), tf(2)*IDF(2), ..., tf(n)*IDF(n)}, where the IDF is
computed from a large corpus. This seems to give better results than
the raw frequency counts alone.
The representations I investigated were TF, TF*IDF, and simple binary values
(1 if the word is present and 0 if it is not).
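
A minimal sketch of the three representations, for illustration only and not the code used in this work. The function names are hypothetical, and the IDF variant log(N / df) is an assumption, since the message does not say which formula or corpus was used. The weighting follows the formula as written: component i is tf(i) * IDF(i).

```python
import math


def idf(num_docs, doc_freq):
    """Assumed IDF variant: log(N / df). Other variants (smoothed, etc.) exist."""
    return math.log(num_docs / doc_freq)


def tf_vector(counts):
    """Raw frequency representation: w = {tf(1), ..., tf(n)}."""
    return [float(c) for c in counts]


def tfidf_vector(counts, idf_weights):
    """TF*IDF representation: w = {tf(1)*IDF(1), ..., tf(n)*IDF(n)}."""
    return [c * w for c, w in zip(counts, idf_weights)]


def binary_vector(counts):
    """Binary representation: 1 where the count is non-zero, 0 otherwise."""
    return [1 if c > 0 else 0 for c in counts]


# Made-up counts and document frequencies, purely for demonstration.
counts = [3, 0, 1, 5]
idf_weights = [idf(1000, 10), idf(1000, 500), idf(1000, 100), idf(1000, 1000)]

print(tf_vector(counts))
print(tfidf_vector(counts, idf_weights))
print(binary_vector(counts))
```

Note that under this IDF variant a component whose document frequency equals N gets weight 0, so it drops out of the TF*IDF vector entirely.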

Regards,

Clive De Silva
University of Cambridge
----- Original Message -----
From: "chen wenliang" <chenwl@mail.neu.edu.cn>
To: <corpora@hd.uib.no>
Sent: Wednesday, July 07, 2004 10:17 AM
Subject: [Corpora-List] How to word presentation for word clustering?

Dear all,

I am looking for a word representation for word clustering.

I am working on a project on word clustering. At present, each word is
represented as a vector w = {tf(1), tf(2), ..., tf(n)}, where tf(i) is the
frequency of the word in document i. I then use k-means as the clustering
algorithm.
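
The setup described above can be sketched roughly as follows. This is a small pure-Python k-means over made-up tf vectors, under the assumption of Euclidean distance and random initial centroids; the words, counts, and function names are hypothetical and not taken from the project.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans(vectors, k, iterations=100, seed=0):
    """Plain k-means: returns (cluster label per vector, centroids)."""
    rng = random.Random(seed)
    # Assumed initialisation: k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = None
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # no label changed, so the clustering has converged
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return assignment, centroids


# Each row is w = {tf(1), ..., tf(4)}: the word's frequency in documents 1..4.
words = ["cat", "dog", "stock", "bond"]
vectors = [
    [4, 3, 0, 0],
    [5, 2, 0, 1],
    [0, 0, 6, 4],
    [0, 1, 5, 5],
]

labels, _ = kmeans(vectors, k=2)
for word, label in zip(words, labels):
    print(word, label)
```

The numeric cluster labels depend on the initialisation, so only the grouping of words is meaningful, not which cluster is numbered 0 or 1.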

Thanks all.

regards,

Chen Wenliang chenwl@mail.neu.edu.cn

Nlplab, Northeastern University, China.

2004-07-07

