RE: [Corpora-List] Texts with keywords for supervised learning

From: Lee, David (dvdlee@umich.edu)
Date: Thu Jan 16 2003 - 17:43:30 MET


    William's correct. In fact, when working on my BNC Index, I manually copied and pasted keywords from the COPAC library catalogue system into the BNC Index spreadsheet, so they are easily retrievable. Most BNC texts that were taken from published books therefore have library keywords associated with them. All that's needed is a licence for the BNC World Edition.
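
For anyone who wants to pull those keywords out programmatically, here is a minimal sketch, assuming the Index spreadsheet has been exported to CSV. The column names `text_id` and `keywords`, the `;` separator, and the function name `load_keyword_pairs` are all hypothetical; check them against the actual spreadsheet before relying on this.

```python
import csv
import io


def load_keyword_pairs(csv_file, id_col="text_id", kw_col="keywords", sep=";"):
    """Return {text_id: [keyword, ...]} for rows that carry keywords.

    Column names and separator are assumptions, not the real
    BNC Index layout.
    """
    pairs = {}
    for row in csv.DictReader(csv_file):
        raw = (row.get(kw_col) or "").strip()
        if not raw:
            # Texts without library keywords (e.g. unpublished material) are skipped.
            continue
        pairs[row[id_col]] = [k.strip() for k in raw.split(sep) if k.strip()]
    return pairs


# Made-up sample rows for illustration only; not real BNC Index data.
sample = io.StringIO(
    "text_id,keywords\n"
    "X01,linguistics; corpora\n"
    "X02,\n"
)
print(load_keyword_pairs(sample))
```

The resulting mapping pairs each text identifier with its manually assigned keywords, which is roughly the shape of data a supervised keyword-extraction experiment would need once the texts themselves are loaded from the licensed corpus.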

    Hope this helps.

    Dave.
    ___________________________________________________
    David YW Lee
    dvdlee@umich.edu
    Research Fellow, MICASE project
    English Language Institute, University of Michigan
    TCF Building, 401 E. Liberty, Suite 350, Rm 3140
    Ann Arbor, Michigan 48104-2298, USA. Tel: +1 734-615-9638 (O)

    MICASE web site: http://www.lsa.umich.edu/eli/micase/micase.htm
    Corpus-based Linguistics web site: http://devoted.to/corpora
    ___________________________________________________

    > -----Original Message-----
    > From: William Mann [mailto:bill_mann@sil.org]
    > Sent: Thu, January 16, 2003 11:26 AM
    > To: Anette Hulth; corpora@hit.uib.no
    > Subject: Re: [Corpora-List] Texts with keywords for
    > supervised learning
    >
    >
    > My impression is that many library catalogs are really this
    > sort of corpus,
    > except that the texts are on the shelves.
    >
    > Perhaps catalogs of items that are available on line could be
    > converted into
    > being this sort of corpus.
    >
    > Bill Mann
    >
    > ----- Original Message -----
    > From: "Anette Hulth" <hulth@dsv.su.se>
    > To: <corpora@hit.uib.no>
    > Sent: Thursday, January 16, 2003 9:41 AM
    > Subject: [Corpora-List] Texts with keywords for supervised learning
    >
    >
    > > Dear list members,
    > >
    > > I'm currently doing experiments on keyword derivation,
    > > treating it as a supervised learning task. (By keywords
    > > I mean a set of, say, 3-15 words reflecting the content
    > > of the actual text.) I wonder if there is anybody who's
    > > aware of any freely available corpus of text documents
    > > in English, with manually assigned keywords that may
    > > be (automatically) extracted. Any pointers will be much
    > > appreciated!
    > >
    > > Kind regards
    > > /Anette Hulth
    > >
    > > ---------------------------------------------------
    > > Anette Hulth
    > > Dept. of Computer and Systems Sciences
    > > Stockholm University / KTH
    > > Sweden
    > > ---------------------------------------------------


