Re: [Corpora-List] Date: Wed, 11 Sep 2002 15:16:20 +0200

From: Jean Veronis (Jean.Veronis@mailup.univ-mrs.fr)
Date: Wed Sep 11 2002 - 18:31:31 MET DST

    At 15:22 11/09/2002 +0200, maria_rzewuska@mail.ukie.gov.pl wrote:
    >Hi, I have been reading the list for a while and lately I took a closer
    >look at some bilingual corpus projects and I noticed a relatively flexible
    >use of terms: translation corpus, parallel corpus, comparable corpus, but
    >mainly between the first two. Maybe someone could tell me whether there is
    >any difference, or are they simply mixed up? In the composition of the corpora I
    >did not find any difference which could explain the terminological
    >difference. Any book or clever article that I should read?
    >thanks

    The terminology is used in different ways by different groups of people.
    The situation is so confusing that I had to include the following foreword
    in my book:

    Véronis, J. (Ed.). (2000). Parallel Text Processing: Alignment and use of
    translation corpora. Dordrecht: Kluwer Academic Publishers.

    http://www.up.univ-mrs.fr/veronis/parallel-book.html

    ------------------------

    Terminological note

    As the book was in its final writing stages, Alan Melby made us aware of a
    terminological difficulty concerning the expression parallel text. This
    term is well established within the computational linguistics community, as
    witnessed by its consistent use throughout this book and in the numerous
    publications listed in the bibliography, where it refers to texts
    accompanied by their translation in one or several other languages. It is
    used in a different way among the translation theory and terminology
    circles, where it means texts in different languages and in the same
    domain, but not necessarily being translations of each other (the
    computational linguistics community uses the term comparable for such texts).

    We were therefore faced with a dilemma: either change the title of the
    book--and the terminology used in all the chapters--and risk a complete
    lack of understanding from the computational linguistics community, or stay
    with the usage of the term established by computational linguists and risk
    severe criticism from translation theorists and terminologists.

    We decided for the latter since, after all, computational linguists are
    likely to make up the main readership of the book. Hopefully, this
    terminological note will suffice to clarify matters.


