RE: Corpora: Is corpus phonetics a part of corpus linguistics?

From: Van den Heuvel M, Mev
Date: Mon Jan 07 2002 - 07:35:05 MET


    Offhand it may seem strange to use derived orthographic indications to
    perform the sort of phonemic calculations we're talking about. However, in
    cases where there are no speech corpora at all for the languages you wish to
    study, starting out with calculations of phonemic occurrence based on
    orthographic strings is a very good way of designing corpus content to
    elicit the complete phonetic inventory of a language. Based on a normalised
    frequency score on a fairly large text (minimum 50 000 words) from the
    language, you can determine the frequency and contexts of /p/, and be sure
    to include it in all possible contexts in the items to be collected for a
    speech corpus. This way, you can be sure to cover the allophonic variation
    of /p/ comprehensively. The usefulness of this sort of calculation is,
    however, limited. The interesting corpus phonetics starts when you get your
    hands on the actual data that you've collected! :-)
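
    To make the calculation concrete, here is a minimal sketch in Python. It
    assumes, very roughly, that one letter stands for one phoneme, which will
    not hold for most orthographies, and the function name and the "#"
    word-boundary marker are hypothetical choices made only for illustration.
    It counts how often a target letter occurs per 1000 letters and tallies
    the left and right neighbours it occurs with.

```python
from collections import Counter


def grapheme_contexts(text, target, per=1000):
    """Return (normalised frequency, Counter of (left, right) contexts).

    Assumption: each alphabetic character is treated as one segment,
    and "#" marks a word boundary. Real orthographies need a proper
    grapheme-to-phoneme step before counting.
    """
    contexts = Counter()
    total_letters = 0
    for word in text.lower().split():
        # Keep alphabetic characters only, dropping punctuation and digits.
        letters = [c for c in word if c.isalpha()]
        total_letters += len(letters)
        padded = ["#"] + letters + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == target:
                contexts[(padded[i - 1], padded[i + 1])] += 1
    count = sum(contexts.values())
    # Occurrences of the target per `per` letters of running text.
    freq = per * count / total_letters if total_letters else 0.0
    return freq, contexts


sample = "Peter put a cap upon the pup"
freq, contexts = grapheme_contexts(sample, "p")
for (left, right), n in contexts.most_common():
    print(f"{left}_{right}: {n}")
print(f"p per 1000 letters: {freq:.1f}")
```

    Contexts that never turn up in the tally for a large text are the ones
    to add by hand to the list of items to be recorded.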

    Maritza den Heuvel


    -----Original Message-----
    From: Alex Chengyu Fang
    Sent: 04 January 2002 16:52
    To: Yuri Tambovtsev;
    Subject: Re: Corpora: Is corpus phonetics a part of corpus linguistics?

    I'd understand "corpus phonetics" as a study
    based on a corpus of "recorded speech", therefore a
    rather redundant expression since phonetics is
    traditionally much of a field study. You may find
    interesting an ICAME article by Halliday which mentions
    the pioneering efforts by his teacher Wang Li to
    construct a corpus of recorded Cantonese Chinese.

    Your own study seems to be one based on derived
    indications from orthography or transcribed speech
    that relate themselves indirectly to phonetic
    features. If so, it's certainly part of corpus
    linguistics but needs a more self-evident name,
    something like "text-based phonetics", which,
    admittedly, sounds a bit mutually exclusive.

