Corpora: CFP: Linguistic Database Workshop

From: Steven Bird (sb@unagi.cis.upenn.edu)
Date: Wed Aug 15 2001 - 15:19:49 MET DST

                       IRCS WORKSHOP ON LINGUISTIC DATABASES

                            University of Pennsylvania
                                 Philadelphia, USA
                                11-13 December 2001

                   http://www.ldc.upenn.edu/annotation/database/

                                   Organized by:
                   Steven Bird, Peter Buneman and Mark Liberman
                  Department of Computer and Information Science,
           Department of Linguistics, and the Linguistic Data Consortium
                            University of Pennsylvania

                     Funded by the National Science Foundation

    CALL FOR ABSTRACTS AND PROPOSALS

    Linguistic databases are digital repositories of structured information
    intended to document natural language and natural communicative
    interaction. Over the last decade, linguistic databases have come to stand
    at the center of empirical research in the language sciences, and in the
    development of new human language technologies. Like genomic databases,
    linguistic databases are complex, evolving and richly annotated
    repositories, and pose interesting challenges for efficient representation,
    indexing and query. And like most scientific databases, linguistic
    databases have made little use of standard database technology.

    The goals of the workshop are to take stock of existing research in
    linguistic databases, to identify the key problems, and to explore
    applications of current database research to these problems. More broadly,
    the workshop will help define the research questions of a new "linguistic
    database community" and initiate the ongoing interchange of relevant
    problems and results between this community and the database community at
    large.

    The workshop is expected to attract participants from a range of
    specialties including databases, linguistics, computational linguistics,
    annotation and markup. There will be tutorial-style presentations on
    relevant models in each of these areas.

    The workshop will address a selection of the following topics:

    MODELS
    * models for text databases, speech databases, multimodal databases,
      typological databases, geographical databases (language maps),
      and metadata repositories
    * relational, object-oriented and semi-structured models for
      representing linguistic annotations
    * representations for specific linguistic datatypes (e.g. databases of
      aligned parallel text)
    * modelling temporal and (geo)spatial structure
    * critical analysis of existing linguistic databases
    * special problems for systematic data representation posed by
      linguistic fieldwork

    LANGUAGES
    * query of multilayer annotations
    * linguistic applications/extensions of XML query languages
    * analysis of existing ad hoc query languages
    * queries over temporal and (geo)spatial structure

    OTHER TOPICS
    * database support (e.g. what standard database technology has proven
      worthwhile for linguistic databases?)
    * systematic methods for populating linguistic databases
    * appropriate indexing methods for linguistic strings and structures
    * archiving and preservation
    * metadata standards serving as finding aids for linguistic databases
    * data provenance / data lineage
    * annotation servers

    PROGRAM

    The program will have a varied format, designed to maximize
    cross-fertilization among the various specialties, and to allow
    extended open discussion. Components of the program will include:

    * tutorials on relevant models from linguistics, databases
      or annotation, e.g. the structure of lexical entries,
      semi-structured query languages, models of text and signal annotation
    * panel sessions on annotated text and lexicons (and possibly others),
      with position papers and panel discussion,
      to evaluate competing approaches
    * full papers reporting new research
    * demonstrations of systems for creating and/or managing
      linguistic data

    TIMETABLE

    Expressions of interest are welcome at any time; please see the form on
    the workshop website. If you have any suggestions concerning the workshop,
    please email the organizers.

    FRIDAY 14 SEPTEMBER
      Proposals for tutorials and position papers (please email the organizers)
      Abstracts for papers (400 words) and demonstrations (200 words)
    FRIDAY 30 NOVEMBER
      Final papers (10-page limit)

    Registration will be open in September. Please note that participation
    will be limited by space.

    PROCEEDINGS

    The papers will be published in web and hardcopy form (the latter only
    for workshop attendees). Papers submitted in HTML should be written
    with the hardcopy version in mind: the text that anchors a hyperlink
    should be meaningful on its own, rather than e.g. "visit this link".

    VENUE

    The workshop will be held at the Institute for Research in Cognitive
    Science (IRCS) at the University of Pennsylvania, in Philadelphia,
    USA. Workshop sessions will take place in IRCS conference rooms,
    located on the fourth floor of 3401 Walnut Street, adjacent to the
    university campus, which is two miles west of the city center. The
    main meeting rooms will be equipped with the usual presentation
    facilities, including projection and audio.

    SPONSORSHIP

    The workshop is funded by National Science Foundation (NSF) grants to
    the University of Pennsylvania. There will be no registration fee, and
    hotel accommodation will be covered for presenters.

    USEFUL WEBSITES

    http://db.cis.upenn.edu Database Research at Penn
    http://www.ldc.upenn.edu/annotation/ Linguistic Annotation
    http://www.ldc.upenn.edu/exploration/ Linguistic Exploration
    http://www.cis.upenn.edu/~ircs/ IRCS homepage
    http://www.talkbank.org/ NSF TalkBank Project
    http://www.ldc.upenn.edu/sb/isle.html NSF ISLE Project
    http://www.language-archives.org/ Open Language Archives Community
    http://www.upenn.edu/philadelphia/ Philadelphia
    http://www.facilities.upenn.edu/visitUs/ Getting to Penn

    ORGANIZERS

    Steven Bird http://www.ldc.upenn.edu/sb/
    Peter Buneman http://www.cis.upenn.edu/~peter/
    Mark Liberman http://www.ldc.upenn.edu/myl/


