[Corpora-List] Merging info from the BNC and WordNet

From: Mark Davies (Mark_Davies@byu.edu)
Date: Tue Nov 04 2003 - 14:41:30 MET

    Is anyone aware of projects that have created some type of database that merges the semantic information from WordNet with the frequency and distributional information from the BNC?
     
    For example, a user could query the database to look for all lemmas occurring with a particular frequency in certain registers of English or in certain collocations (info from the BNC), but which are also related to a particular hyponym or are member meronyms of a given word (info from WordNet).
     
    I was considering working on such a project -- since I already have both the BNC and WordNet in relational database form (SQL Server) -- but I didn't want to proceed much further if I'd just be re-inventing the wheel. (BTW, the output would not contain actual sentences and paragraphs from the BNC [licensing issues], but would probably just be tables containing info on lemma, frequency, distribution, and semantic relationships).
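
    To make the idea concrete, here is a minimal sketch of the kind of query I have in mind. It is only an illustration: it uses Python's built-in sqlite3 module rather than SQL Server, and the table names (bnc_freq, wn_relation), the column names, and all of the frequency figures are made up for the example. They are not the actual BNC or WordNet schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two toy tables.

    Hypothetical schema, for illustration only:
      bnc_freq    - one row per (lemma, register) with a per-million frequency
      wn_relation - one row per (lemma, relation, target), e.g. dog is a
                    hyponym of animal, tree is a member meronym of forest
    All figures below are invented, not real BNC counts.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE bnc_freq (lemma TEXT, register TEXT, per_million REAL);
        CREATE TABLE wn_relation (lemma TEXT, relation TEXT, target TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO bnc_freq VALUES (?, ?, ?)",
        [
            ("dog", "spoken", 50.0),
            ("dog", "academic", 10.0),
            ("cat", "spoken", 40.0),
            ("poodle", "spoken", 2.0),
            ("tree", "spoken", 30.0),
        ],
    )
    conn.executemany(
        "INSERT INTO wn_relation VALUES (?, ?, ?)",
        [
            ("dog", "hyponym", "animal"),
            ("cat", "hyponym", "animal"),
            ("poodle", "hyponym", "dog"),
            ("tree", "member_meronym", "forest"),
        ],
    )
    return conn


def lemmas_by_relation(conn, target, relation, register, min_per_million):
    """Return (lemma, per_million) pairs that stand in `relation` to `target`
    and reach at least `min_per_million` in the given register,
    most frequent first."""
    return conn.execute(
        """
        SELECT f.lemma, f.per_million
        FROM bnc_freq AS f
        JOIN wn_relation AS r ON r.lemma = f.lemma
        WHERE r.target = ?
          AND r.relation = ?
          AND f.register = ?
          AND f.per_million >= ?
        ORDER BY f.per_million DESC
        """,
        (target, relation, register, min_per_million),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Hyponyms of "animal" with at least 10 occurrences per million in speech.
    print(lemmas_by_relation(conn, "animal", "hyponym", "spoken", 10.0))
```

    The point is simply that the frequency/register filter and the semantic-relation filter become a single join once both resources sit in the same relational database. Collocational information would presumably need an additional table keyed on lemma pairs.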
     
    I'd be happy to summarize the responses, if there is sufficient interest. Thanks in advance.
     
    Mark Davies
     
    =================================================
    Mark Davies
    Assoc. Prof., Linguistics
    Brigham Young University
    (phone) 801-422-9168 / (fax) 801-422-0906
    http://davies-linguistics.byu.edu

    ** Corpus design and use // Web-database scripting **
    ** Historical linguistics // Functional-typological grammar **
    ** Spanish and Portuguese historical and dialectal syntax **
    =================================================


