Corpora: Term extraction. How to implement?

From: Hristo Tanev (htanev@yahoo.co.uk)
Date: Sat Feb 10 2001 - 12:54:10 MET


    Dear All,
    Currently I am working with some students on term
    extraction from IT texts in Bulgarian.
    For Bulgarian we don't have annotated corpora, so we
    intend to use two collections of texts: one of IT
    texts and the other of non-IT texts.

    We intend to extract terms by taking the words that
    appear most frequently in the IT collection but do not
    appear frequently in the non-IT texts, thus skipping
    prepositions, conjunctions and other frequently used
    non-term words.
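
    To make the idea concrete, here is a minimal sketch in Python of
    the comparison described above. It is only an illustration under
    stated assumptions, not a tested implementation: the function
    names (tokenize, candidate_terms), the parameters (top_n,
    min_count), the letters-only tokenizer and the add-one smoothing
    are all my own choices for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of Unicode letters (Cyrillic included)."""
    # [^\W\d_] matches any word character that is not a digit or underscore.
    return re.findall(r"[^\W\d_]+", text.lower())


def candidate_terms(domain_texts, general_texts, top_n=20, min_count=2):
    """Rank words by how much more frequent they are in the domain collection.

    Hypothetical helper: the score is the ratio of add-one-smoothed relative
    frequencies (domain / general). The smoothing is an assumption that keeps
    words absent from the general collection from causing division by zero.
    """
    domain = Counter(tok for text in domain_texts for tok in tokenize(text))
    general = Counter(tok for text in general_texts for tok in tokenize(text))

    domain_total = sum(domain.values())
    general_total = sum(general.values())
    vocab_size = len(set(domain) | set(general))

    scores = {}
    for word, count in domain.items():
        if count < min_count:
            continue  # skip words too rare in the domain texts to judge
        p_domain = (count + 1) / (domain_total + vocab_size)
        p_general = (general[word] + 1) / (general_total + vocab_size)
        scores[word] = p_domain / p_general

    # Highest ratio first: frequent in the domain, infrequent in general text.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

    Function words such as prepositions and conjunctions are frequent
    in both collections, so their ratio stays close to 1 and they
    fall to the bottom of the ranking without a stop-word list.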

    Can someone tell me more about this kind of term
    extraction?
    Also, can someone propose another method for term
    extraction that doesn't require annotated corpora?

    Best wishes,
    Hristo Tanev
