Re: Corpora: a particular type of sloppiness

From: Marco Antonio Esteves da Rocha (marcor@cce.ufsc.br)
Date: Wed Apr 11 2001 - 07:08:14 MET DST


    On Tue, 10 Apr 2001, Tadeusz Piotrowski wrote:

    > These arguments are double-edged. There is sloppiness and there is
    > sloppiness. My last name is structurally similar to that of Ken Litkowski,
    > and yet he is (most likely) a native speaker of English, I am not. Will
    > native speakers of English stand my sloppiness? Harold Somers' attitude
    > shows some of them will not, if I used 'corpi', or 'corpus are', or 'corpora
    > is'. Whatever. Well, actually, I think I might err rather by being
    > hypercorrect than otherwise. No split infinitives for me... But even my
    > non-native self looks with a patronizing (?), pitying (?), hurt(?) attitude
    > at some of the mail here.... and there. Should you (native speakers of
    > English) /we (members of this list) struggle with the form, hoping the
    > contents will be illuminating? Or -- the dustbin?
    >
    > But in fact I wanted to report on an interesting type of sloppiness in a
    > language with diacritics. Polish has nine diacritics, or eighteen, when
    > capital letters are counted separately. The point is that very few people
    > bother about diacritics in e-mails; they use what is sometimes called pidgin
    > Polish: only the Latin (or English) characters are used. (You have to press
    > two keys at the same time when you want to use diacritics; you press one
    > when you do not. Economy of language...).
    > A very (VERY) careful writer will use diacritics, or you can tell somebody
    > was writing offline seeing diacritics in his/her mail. In fact, we have a
    > nice gradation: a proper letter with diacritics, a proper letter without
    > diacritics, a casual letter, etc. This device tells you a lot about the
    > speaker(?)/writer.
    > I wonder what people do in other diacritic-rich languages. German?
    > French? Czech? Is it the same as in Polish?
    > Regards
    > Tadeusz Piotrowski
    > ***************************************************************
    > mailing address
    > Department of English
    > Opole University Chrobrego 20
    > Oleska 48 PL-55-020 Zorawina (Zórawina)
    > Opole
    > POLAND
    > phone/fax (+48)71-3165847
    > mobile (+48)607159263
    >
    >

    Curious idea. The absence of diacritics in Portuguese is what disturbs me,
    not their inclusion. It is difficult to be sure whether people on the
    other end of the message have the equipment and configuration to actually
    see those diacritics on screen in their e-mail editor. In fact, what
    appears on different screens around the world when you type diacritics
    on your own equipment is quite unpredictable and may be unreadable for the
    recipient. So people writing messages in Portuguese often choose to
    leave them out, to be safe.

    But it makes me feel very uncomfortable. The feeling is not that of using
    pidgin Portuguese but of writing in a different language, especially
    because some very common words - such as the preposition/conjunction "e"
    ("and") and the verb form "e'" (equivalent to "is") - can only be
    distinguished by the diacritic. This forces the writer to resort to
    nonexistent spellings - such as "eh" instead of "e'" - because these
    words are crucial for understanding and almost certain to appear in any
    message longer than fifty words.

    Imagine if you had to interpret a sentence in which there was no clear
    graphic distinction between

    Corpora and corpus is the same

    AND

    Corpora is corpus and the same

    :)

    Marco Rocha


