Corpora: Diacritics

From: Geoffrey Sampson (geoffs@cogs.susx.ac.uk)
Date: Fri Apr 20 2001 - 15:44:40 MET DST

    I don't know the UNIbet system, but I suspect that in practice it has been
    rendered out of date by the SAM-PA system, which has sufficient "official"
    backing to be accepted as an international standard. The most accessible
    reference on it that I know of is an appendix in D. Gibbon et al., eds.,
    _Handbook of Standards and Resources for Spoken Language Systems_,
    Mouton de Gruyter. SAM-PA is intended for "broad phonetic" (roughly,
    phonemic) transcription; it consists of a mapping of the main IPA symbols
    into the ASCII character set, together with sets of conventions for
    using these elements for the sounds of the various official languages of
    EU member states and a few other languages. For narrow phonetic transcription
    this is insufficiently precise, of course, but for those purposes the IPA
    itself has defined a numerical coding of its entire up-to-date system of
    notations (and if I remember rightly the Gibbon volume reprints this too).
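
    To illustrate the kind of IPA-to-ASCII mapping described above, here is a
    minimal Python sketch. The table is a small illustrative subset of
    well-known SAM-PA correspondences, not the full standard, and the names
    SAMPA_TO_IPA and sampa_to_ipa are hypothetical, chosen for this example.
    Language-specific conventions are not modelled.

```python
# Illustrative subset of SAM-PA symbols and their IPA equivalents.
# This is NOT the complete table; consult the published standard for that.
SAMPA_TO_IPA = {
    "tS": "tʃ",  # voiceless postalveolar affricate
    "dZ": "dʒ",  # voiced postalveolar affricate
    "S": "ʃ",
    "Z": "ʒ",
    "T": "θ",
    "D": "ð",
    "N": "ŋ",
    "I": "ɪ",
    "U": "ʊ",
    "@": "ə",
    "{": "æ",
    ":": "ː",    # length mark
}


def sampa_to_ipa(text):
    """Convert a SAM-PA string to IPA using greedy longest-match lookup.

    Characters with no entry in the table (e.g. plain "p", "t", "k")
    are copied through unchanged, since SAM-PA keeps them as in IPA.
    """
    longest = max(len(key) for key in SAMPA_TO_IPA)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "tS" wins over "t" + "S".
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            if chunk in SAMPA_TO_IPA:
                out.append(SAMPA_TO_IPA[chunk])
                i += len(chunk)
                break
        else:
            # No table entry: pass the character through as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ["TIN", "SIp", "tSIp"]:
        print(word, "->", sampa_to_ipa(word))
```

    The longest-match step matters because some SAM-PA symbols are written
    with more than one ASCII character. Narrow transcription would need a
    richer scheme than this sketch covers.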

    G.R. Sampson, Professor of Natural Language Computing

    School of Cognitive & Computing Sciences
    University of Sussex
    Falmer, Brighton BN1 9QH, GB

    e-mail geoffs@cogs.susx.ac.uk
    tel. +44 1273 678525
    fax +44 1273 671320
    web http://www.grsampson.net


