RE: Corpora: Sound picture of known world languages

From: Van den Heuvel M, Mev
Date: Fri Jan 04 2002 - 14:10:09 MET

    To add to Yuri's comment: from the perspective of automatic speech
    recognition there is, of course, another use for calculating the frequency
    of occurrence of particular phonemes in a language. When collecting speech
    data for data-driven speech services, it is essential to cover the entire
    phonetic inventory of a particular language, in order of frequency of
    occurrence. The data should also reflect the frequency of biphones and
    triphones. Our unit has been doing a lot of these calculations recently
    for the collection of speech data for 2 Germanic (South African English
    and Afrikaans) and 3 Bantu (Xhosa, Zulu and Sesotho) languages.
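
    As a rough illustration of the counts described above, here is a minimal
    Python sketch. It assumes the text has already been transcribed into
    lists of phoneme symbols, one list per utterance; the function name
    ngram_counts and the toy transcriptions are hypothetical and are not
    taken from our unit's actual tools. Biphones and triphones are treated
    here simply as sequences of two and three adjacent phonemes.

```python
from collections import Counter


def ngram_counts(utterances, n):
    """Count phoneme n-grams inside each utterance (never across utterances)."""
    counts = Counter()
    for phones in utterances:
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += 1
    return counts


# Hypothetical toy transcriptions: one list of phoneme symbols per utterance.
utterances = [
    ["k", "a", "t"],
    ["t", "a", "k", "a"],
]

phonemes = ngram_counts(utterances, 1)
biphones = ngram_counts(utterances, 2)
triphones = ngram_counts(utterances, 3)

# List single phonemes from most to least frequent.
for unit, count in phonemes.most_common():
    print(unit[0], count)
```

    Ranking the units with most_common() gives an ordering by frequency, and
    comparing the units found in a candidate prompt list against such a table
    is one possible way to check coverage before recording.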
    Maritza van den Heuvel

    Research Unit for Experimental Phonology
    Department of African Languages
    Stellenbosch University
    South Africa
    Private Bag X1

    Tel: ++27 21 808 3974
    Fax: ++27 21 808 3975


    -----Original Message-----
    From: Yuri Tambovtsev
    Sent: 25 December 2001 15:45
    Subject: Corpora: Sound picture of known world languages

    Dear colleagues, thank you to all who answered me. I would like to answer
    the question that appeared in all of your messages: why is it important
    to compute the phonemic frequencies of occurrence in a language? Every
    language has its own unique sound picture. One can intuitively feel that
    language A is different from language B on hearing the sound picture of a
    language. The phonemic frequencies of occurrence create the particular
    sound mosaic of a language. We can compare world languages with each
    other once we obtain the sound picture of every world language. Linguists
    now believe that there are about 4000 to 5000 languages in the world.
    Unfortunately, I could collect only 120 data sets on phonemic frequency
    of occurrence for world languages. This is why I urge linguists around
    the world to join our group of phoneticians who investigate the sound
    picture of world languages.
    Looking forward to hearing from you soon by email.
    I remain yours most hopefully,
    Yuri Tambovtsev
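
    One possible way to make such a comparison concrete is sketched below in
    Python. It assumes both languages are transcribed with a shared symbol
    set, and it uses Euclidean distance between relative-frequency profiles
    purely as an illustrative choice; the message above does not name a
    particular measure. The function names and the toy data are
    hypothetical.

```python
from collections import Counter
import math


def relative_frequencies(phones):
    """Turn a sequence of phoneme symbols into relative frequencies."""
    counts = Counter(phones)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def euclidean_distance(freq_a, freq_b):
    """Distance between two frequency profiles; missing symbols count as 0."""
    symbols = set(freq_a) | set(freq_b)
    return math.sqrt(
        sum((freq_a.get(s, 0.0) - freq_b.get(s, 0.0)) ** 2 for s in symbols)
    )


# Hypothetical toy "transcriptions" standing in for two languages.
lang_a = relative_frequencies(list("katakata"))
lang_b = relative_frequencies(list("tikitiki"))

distance = euclidean_distance(lang_a, lang_b)
print(round(distance, 4))
```

    A distance of zero means the two profiles are identical; larger values
    mean the relative phoneme frequencies differ more.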
