Re: [Corpora-List] error tagging

From: Timothy Baldwin (tbaldwin@csli.stanford.edu)
Date: Fri Sep 26 2003 - 19:49:31 MET DST

    > I am interested in error tagging and I am looking for corpora which are (or are being) error tagged. Do you know of any? And do you know of any available error tagset?

    One more recent effort I know of is the SST Corpus, which is a 1 million-word corpus
    of transcribed English speech by Japanese learners of English. Various errors
    are tagged, although I can't find any online account of the full tagset. There
    are a couple of papers in English on the corpus, notably:

    Tono, Y., Kaneko, T., Isahara, H., Saiga, T. and Izumi, E. The Standard
    Speaking Test (SST) Corpus: A 1 million-word spoken corpus of Japanese
    learners of English and its implications for L2 lexicography. Lee, S. (ed.)
    ASIALEX 2001 Proceedings: Asian Bilingualism and the Dictionary. The Second
    Asialex International Congress, August 8-10, 2001, Yonsei University, Korea,
    pp. 257-262

    There is a web page with some documentation and a copy of this paper at:

    http://leo.meikai.ac.jp/~tono/sst/

    There was also a paper at this year's ACL:

    Emi Izumi, Kiyotaka Uchimoto, Toyomi Saiga, Thepchai Supnithi and Hitoshi
    Isahara (2003) Automatic error detection in the Japanese learners' English
    spoken data. In Companion Volume to the Proceedings of the 41st Annual Meeting
    of the Association for Computational Linguistics (ACL '03), pp. 145-8.

    which is also available online at:

    http://acl.ldc.upenn.edu/acl2003/posterdemo/pdf/Izumi.pdf

    Tim

    *-----------------------------------*

    Timothy Baldwin
    Senior research engineer
    Multiword Expression project
    CSLI LinGO Lab

    Contact details:

    Email: tbaldwin@csli.stanford.edu
    Tel: (+1)-650-723-0515
    Fax: (+1)-650-723-2166

    *-----------------------------------*


