Re: Corpora: Text Classification System

From: Bruce L. Lambert, Ph.D. (lambertb@uic.edu)
Date: Thu Jan 17 2002 - 18:45:53 MET

    A simple Google search turned up:

    http://www-a2k.is.tokushima-u.ac.jp/member/kita/NLP/nlp_tools.html

    At 05:27 PM 1/17/02 +0000, Gabriela Cavaglia wrote:
    >Dear List members,
    >
    >Can anyone point me to a free Text Classification system?
    >(More details of what I want it for below.)
    >
    >Thank you in advance for any help
    >
    >Gabriela Cavaglia
    >PhD Student
    >ITRI
    >
    >Measuring Corpus homogeneity
    >=====================================================
    >
    >My thesis project is to measure corpus homogeneity. As part of that
    >project, I have developed methods for unsupervised classification of
    >documents based on text internal evidence. I now want a supervised
    >classification system which I can use to evaluate the unsupervised
    >classification I have developed.
    >
    >To date, the corpus I have used for the experiments consists of 107
    >documents from the BNC (about 2 million words). The idea is to use the
    >BNC Index information and part of the corpus documents to produce a
    >training sample, and to use the rest of the corpus documents as a test
    >corpus. I would like to compare the results of the unsupervised
    >classification against those from the supervised classification.
    >=====================================================
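
The evaluation set-up described in the quoted message (split the documents into a training sample and a test corpus, then compare the unsupervised grouping against reference categories) could be sketched roughly as below. This is only an illustration under stated assumptions, not the method used in the thesis: the function names, the 50/50 split, the toy category labels, and the choice of cluster purity as the comparison measure are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def split_train_test(doc_ids, train_fraction=0.5, seed=0):
    """Shuffle document ids and split them into training and test lists.

    The 0.5 default is an assumption; the original message gives no ratio.
    """
    ids = list(doc_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def purity(clusters, labels):
    """Share of documents that carry the majority label of their cluster.

    clusters: dict mapping doc id -> cluster id (unsupervised output)
    labels:   dict mapping doc id -> reference category (e.g. from an index)
    """
    by_cluster = defaultdict(list)
    for doc, cluster in clusters.items():
        by_cluster[cluster].append(labels[doc])
    majority_hits = sum(
        Counter(members).most_common(1)[0][1] for members in by_cluster.values()
    )
    return majority_hits / len(clusters)


# Toy data, invented purely for illustration.
toy_clusters = {"a": 0, "b": 0, "c": 1, "d": 1}
toy_labels = {"a": "spoken", "b": "spoken", "c": "written", "d": "spoken"}

train_ids, test_ids = split_train_test(range(107))
toy_purity = purity(toy_clusters, toy_labels)  # 3 of 4 documents match: 0.75
```

Purity is only one possible agreement measure; any supervised classifier trained on train_ids could be scored on test_ids in the same way and its output compared with the unsupervised clusters.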


