Dear List members,
Can anyone point me to a free text classification system?
(More details of what I want it for are below.)
Thank you in advance for any help.
Measuring corpus homogeneity
My thesis project is to measure corpus homogeneity. As part of that
project, I have developed methods for unsupervised classification of
documents based on text-internal evidence. I now want a supervised
classification system that I can use to evaluate the unsupervised
classification I have developed.
So far, the corpus I have used for the experiments consists of 107
documents from the BNC (about 2 million words). The idea is to use the
BNC Index information and part of the corpus documents to produce a
training sample, and to use the rest of the corpus documents as a test
corpus. I would then like to compare the results of the unsupervised
classification against those from the supervised classification.
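
To make the evaluation idea concrete, here is a minimal Python sketch
of the two steps described above: splitting the documents into a
training and a test set, and then comparing two labelings of the same
documents. It is only an illustration under assumptions of mine. The
function names, the 50/50 split, the toy document ids and labels, and
the choice of purity and the Rand index as agreement measures are all
hypothetical and not part of the actual experiments. Both measures are
used because cluster numbers from an unsupervised method are arbitrary
and cannot be matched directly against category names.

```python
import random
from collections import Counter
from itertools import combinations


def split_documents(doc_ids, train_fraction=0.5, seed=0):
    """Shuffle document ids reproducibly and split them into train/test lists."""
    ids = sorted(doc_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def purity(cluster_of, class_of):
    """Fraction of documents that belong to the majority class of their cluster."""
    by_cluster = {}
    for doc, cluster in cluster_of.items():
        by_cluster.setdefault(cluster, Counter())[class_of[doc]] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(cluster_of)


def rand_index(cluster_of, class_of):
    """Fraction of document pairs that both labelings treat the same way
    (both put the pair in one group, or both put it in different groups)."""
    pairs = list(combinations(sorted(cluster_of), 2))
    agree = sum(
        (cluster_of[a] == cluster_of[b]) == (class_of[a] == class_of[b])
        for a, b in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: six documents instead of 107.
doc_ids = ["d1", "d2", "d3", "d4", "d5", "d6"]
train_ids, test_ids = split_documents(doc_ids, train_fraction=0.5, seed=0)

# Hypothetical outputs of the unsupervised method (cluster numbers) and of
# a supervised classifier (category names) for the same documents.
unsupervised = {"d1": 0, "d2": 0, "d3": 1, "d4": 1, "d5": 1, "d6": 0}
supervised = {"d1": "news", "d2": "news", "d3": "fiction",
              "d4": "fiction", "d5": "news", "d6": "news"}

print("train:", train_ids, "test:", test_ids)
print("purity:", purity(unsupervised, supervised))
print("Rand index:", rand_index(unsupervised, supervised))
```

In a real run, the supervised classifier would be trained on the
documents in train_ids, and the two labelings would be compared only
on the documents in test_ids.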