[Corpora-List] CFP: Special session on treebanks for spoken language, NoDaLiDa05

From: Janne Bondi Johannessen (j.b.johannessen@ilf.uio.no)
Date: Fri Dec 17 2004 - 12:22:17 MET


              !!!!!!!!!!!!!!!!!!!!!!!!!!!
              CALL FOR PAPERS
              !!!!!!!!!!!!!!!!!!!!!!!!!!!

              NODALIDA 2005: http://phon.joensuu.fi/nodalida2005/
              SPECIAL SESSION ON TREEBANKS:
              http://www.hf.uio.no/tekstlab/treebank_workshop

              SPECIAL SESSION ON TREEBANKS FOR SPOKEN LANGUAGE AND DISCOURSE

              JOENSUU, FINLAND, THURSDAY MAY 19, 2005

              ORGANIZED BY THE NORDIC TREEBANK NETWORK

              Treebanks are a language resource that provides annotations of
              natural languages at various levels: at the morpheme level, the
              word level, the phrase level, the discourse level, and the level
              of functor-argument structure. Treebanks have become crucially
              important for the development of data-driven approaches to natural
              language processing, human language technologies, grammar
              extraction and linguistic research in general.

              Existing spoken language treebanks include the Switchboard section
              of the Penn Treebank [1], and the CHRISTINE [2] and ICE-GB [3]
              treebanks for English; the VERBMOBIL [4] treebanks for English,
              German, and Japanese; and the CGN [5] treebank for Dutch. Existing
              discourse treebanks include the English RST Corpus [6] and the
              Penn Discourse Treebank [7]. The DAMSL project [8] and the
              Gothenburg Dialogue Coding Schemas [9] address the problem of
              annotating dialogues with speech act relations between utterances.

              The special NODALIDA session on treebanks aims to provide a forum
              where researchers and advanced students with an interest in
              treebanks can exchange ideas, in particular on how to extend
              treebanks from syntactic annotations of written language to
              treebanks that also include annotations of the structure of spoken
              language with respect to syntax, discourse structure, and/or
              speech acts.

              TOPICS OF INTEREST

              There will be one invited speaker.
              We invite submission of papers on topics relevant to treebanks
              in general, and spoken language and discourse treebanks in
              particular, including but not limited to:

                      * design principles and annotation schemes for annotating
                        spoken language and discourse treebanks with respect to
                        syntax, discourse structure, and/or speech acts;

                      * automatic tools for creating spoken language and discourse
                        treebanks, and how to adapt tools designed for creating
                        written language treebanks to spoken language and
                        discourse;

                      * comparing spoken language and discourse annotations with
                        written language annotations, and identifying the most
                        important challenges in spoken language and discourse
                        annotation.

              While we particularly encourage submissions on spoken language
              and discourse treebanks, we also welcome submissions on other
              treebank topics.

              SUBMISSIONS

              We invite extended abstracts (approximately 1500 words) describing
              existing research connected to the topics of the special session.
              Submissions are non-anonymous and should include: title;
              author(s); affiliation(s); and contact author's e-mail address,
              postal address, telephone and fax numbers.

              Abstracts should be sent to: mtk@id.cbs.dk

              Each presentation at the workshop will be allotted 30 minutes
              (20 minutes for the talk and 10 minutes for questions and
              discussion). The final version of an accepted paper may not
              exceed 12 A4 pages.

              A SAMPLE SPOKEN LANGUAGE AND DISCOURSE TREEBANK

              We strongly encourage participants as well as speakers in the
              special session on spoken language and discourse treebanks to
              contribute a small sample treebank, which should preferably:

                      * be based on a small corpus of spontaneous spoken dialogue
                        consisting of 500-1500 words in any language;
                      * contain English glosses to ensure that the treebank is
                        accessible to a wider audience;
                      * include annotations of discourse relations, speech acts, or
                        similar relations that connect sentences and utterances made
                        by different speakers into larger units;
                      * contain annotated examples of overlapping dialogue,
                        including utterances where one speaker completes an
                        utterance started by another speaker.

              The sample treebank should be submitted by sending the following
              three files to mtk@id.cbs.dk before 20th February 2005:

                      * a plain text abstract of 50-200 words that briefly describes
                        how the sample treebank was created, possibly with
                        hyperlinks to more detailed information about the treebank;
                      * a PDF file containing a human-readable visualization of the
                        treebank;
                      * optionally, the source files for the sample treebank,
                        preferably encoded in TIGER-XML format.

              The sample treebanks will be made publicly available before the
              NODALIDA conference.

              IMPORTANT DATES

                      Deadline for submission of abstracts and treebank
                      samples to the treebank session:
                      February 20, 2005

                      Notification of acceptance:
                      March 25, 2005

                      Special session on treebanks:
                      Thursday, May 19, 2005

                      Final version of paper for proceedings:
                      June 20, 2005

              PROCEEDINGS

              Papers presented at the workshop will be invited to appear in
              the workshop proceedings (after a reviewing process).

              PROGRAM COMMITTEE

              Matthias Trautner Kromann (mtk at id.cbs.dk)
              Peter Juel Henrichsen (pjuel at id.cbs.dk)
              Janne Bondi Johannessen (jannebj at ilf.uio.no)

              IMPORTANT WEBSITES

              SPECIAL TREEBANK SESSION: http://www.hf.uio.no/tekstlab/treebank_workshop
              NORDIC TREEBANK NETWORK: http://w3.msi.vxu.se/~nivre/research/nt.html
              NODALIDA: http://phon.joensuu.fi/nodalida2005/

              LINKS

              [1] http://www.cis.upenn.edu/~treebank/home.html
              [2] http://www.grsampson.net/RChristine.html
              [3] http://www.ucl.ac.uk/english-usage/ice-gb
              [4] http://verbmobil.dfki.de/cgi-bin/verbmobil/htbin/doc-access.cgi
              [5] http://lands.let.kun.nl/cgn/doc_English/topics/version_1.0/annot/syntax/info.htm
              [6] http://www.isi.edu/~marcu/discourse
              [7] http://www.cis.upenn.edu/~pdtb
              [8] http://www.cs.rochester.edu/research/cisd/resources/damsl/
              [9] http://www.ling.gu.se/~jens/publications/docs076-100/093.pdf


