English-German translation corpus

Josef.Schmied@phil.tu-chemnitz.de
Thu, 14 Dec 1995 09:57:27 -0100

We are presently compiling an English-German, German-English
translation corpus. We have so far concentrated on the English-German
part, of which over half a million words have been computerized to date.
The English-German part is divided into a core corpus and a
subcorpus. Core corpus parameters include British English, written and
non-literary, with the texts ranging from academic textbooks from various
domains (e.g. history, philosophy, the arts, economics and
physics), to publications by the European Community/Union and a selection
of tourist brochures. Our subcorpus consists of two categories
which deviate from the main parameters of the core corpus,
namely contemporary British literature on the one hand and scripted public
speeches - language which is written to be spoken - on the other.
Since it is not always possible to find parallel categories
for the two directions, the German-English part will consist at least of
academic textbooks, literature and tourist brochures.
We would be grateful for ideas and suggestions on what else we could
include in our corpus. Since copyright problems are a major obstacle to
corpus compilation, we are particularly interested in material
which is freely accessible and perhaps even available in
machine-readable form. More specifically, are there any newspapers
that publish translated articles regularly, or are there any special
translation pages on the internet?
More information on the project will be available on the internet
shortly at http://www.tu-chemnitz.de/~ehe/public/real.htm

Prof. Dr. Josef Schmied
English Language and Linguistics
Reichenhainer Str.39/223
TU Chemnitz-Zwickau
09107 Chemnitz
Tel.: +49 371 531-4226
+49 371 531-4279 (Secretary)
Fax.: +49 371 531-4233
e-mail: Josef.Schmied@phil.tu-chemnitz.de