[Corpora-List] Human Language Technology for Corpus Lexicography

From: Amy Neale (a.neale@itri.brighton.ac.uk)
Date: Tue Jan 28 2003 - 15:40:56 MET

    Eight-Day Short Course in Human Language Technology for Corpus Lexicography
    25 - 28 Feb; 3 - 6 March 2003
    ITRI, University of Brighton,
    UK

    This eight-day course offers those working in linguistic disciplines the chance to discover how language technologies can add to their research capabilities.

    The course teaches the language technologies that can be used to process text corpora. It also surveys existing lexical resources produced by, or for, language technology, and the dominant formalisms in use.

    Course Details:
    On completing this course students will be able to:

       1. Describe the ways in which language corpora can be enriched using
          a variety of language technologies.
       2. Critically evaluate these technologies, and determine their
          usefulness for linguistic research and lexicography.
       3. Work with different algorithms and strategies for lemmatisation,
          part-of-speech tagging, parsing and word sense disambiguation.
       4. Describe and evaluate other computational lexical resources that
          are available.
       5. Interpret data in a variety of leading formalisms for lexical
          representation.

    Course Content:

        * Lemmatisation, for English and for languages with more complex
          morphology
        * Local grammars for proper names, dates, places, etc.
        * Part-of-speech tagging for English and other languages: tagsets
          and training corpora; manual rule-writing approaches
        * Grammars and Parsing: history; context-free grammars; dependency
          grammars; deep and shallow parsing; parser evaluation
        * Word sense disambiguation; word senses, norms and exploitations;
          dictionary-based methods; supervised training methods; senses and
          domains; evaluation
        * Feature structures as a way of holding lexical information
        * Lexical entries in Head-Driven Phrase Structure Grammar
        * Key initiatives in lexical resource development and
          standardisation: EAGLES, SIMPLE, WordNets, FrameNet
        * Machine learning strategies, including Bayesian approaches,
          Markov Models, Maximum Entropy, Transformation-Based Learning, and
          decision trees and decision lists

    Course Dates and Venue:
    Human Language Technology for Corpus Lexicography will run from 25 - 28
    February, and 3 - 6 March, 2003 at the Information Technology Research
    Institute (ITRI) at the University of Brighton, East Sussex, U.K. ITRI
    is an internationally known centre of excellence in the field of Human
    Language Technology. Brighton is a lively, cosmopolitan city on England's
    south coast, one hour from London by train, and 30 minutes from London
    Gatwick Airport.

    Course Fees:
    The full fee for this two-week course is £1645.00 (including VAT) for
    the first delegate. Second and subsequent delegates from the same
    institution qualify for a reduced rate of £1292.50. Places are limited
    and early registration is recommended.

    For more information and details of how to register, please visit
    http://www.itri.bton.ac.uk/courses/CPDLex/modules/LCM07.html or contact
    us at itel@brighton.ac.uk


