[Corpora-List] Short message service (SMS) corpus publicly available

From: Min-Yen Kan (kanmy@comp.nus.edu.sg)
Date: Mon May 03 2004 - 14:46:50 MET DST


    Dear researchers:

            We are pleased to make publicly available a small corpus of short
    message service (SMS) messages.

    **** National University of Singapore Short Message Service Corpus ****

    These messages were collected and used in a final year undergraduate project
    analyzing the efficiency of SMS input. The corpus contains messages mostly
    in English. The message contributors were mainly university students in
    Singapore.

    Over 10,000 messages were collected, representing over 100 different users.
    The corpus is made available under a modified Open Directory Project
    license. Please see the webpage for the corpus for more details. More
    comprehensive documentation on the (on-going) project will be made available
    as time and demand allow.

    http://www.comp.nus.edu.sg/~rpnlpir/downloads/corpora/smsCorpus/

    We hope the community will find this corpus useful as a small benchmark for
    gauging the efficiency of SMS message entry as well as for SMS / chat log
    language analysis. These messages are provided as an XML file that
    validates against a document-internal DTD.
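
As a rough illustration of how an XML file with a document-internal DTD might be read, here is a minimal Python sketch. The element and attribute names (`corpus`, `message`, `user`) and the sample messages are invented for this example and are assumptions, not the corpus's actual schema; please check the DTD at the top of the distributed file for the real names. Note also that the standard-library parser used here reads the internal DTD but does not validate against it.

```python
import xml.etree.ElementTree as ET

# Invented sample data. The tag and attribute names below are hypothetical
# placeholders, NOT the real schema of the NUS SMS corpus.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE corpus [
  <!ELEMENT corpus (message*)>
  <!ELEMENT message (#PCDATA)>
  <!ATTLIST message user CDATA #REQUIRED>
]>
<corpus>
  <message user="u1">see you at five</message>
  <message user="u2">ok</message>
  <message user="u1">running late</message>
</corpus>
"""


def summarize(xml_text, message_tag="message", user_attr="user"):
    """Return message count, distinct-user count, and mean message length.

    message_tag and user_attr are assumptions; adjust them to match the
    element and attribute names declared in the corpus's own DTD.
    """
    # ElementTree parses the internal DTD subset but does not validate.
    root = ET.fromstring(xml_text)
    records = [
        (m.get(user_attr), (m.text or "").strip())
        for m in root.iter(message_tag)
    ]
    count = len(records)
    users = {user for user, _ in records}
    mean_chars = sum(len(text) for _, text in records) / count if count else 0.0
    return {"messages": count, "users": len(users), "mean_chars": mean_chars}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

For the downloaded file, the same function could be applied to the file's contents after the tag names have been adjusted; a validating parser would be needed to actually check the document against its DTD.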

    Regards,

    Min-Yen KAN
    Assistant Professor
    Department of Computer Science, School of Computing
    National University of Singapore, Singapore 117543
    Office: S15-05-05
    Tel: ++ (65) 6874-1885
    Fax: ++ (65) 6779-4580
    kanmy@comp.nus.edu.sg
    http://www.comp.nus.edu.sg/~kanmy


