Corpora: German corpora.

Gunter Lorenz (Gunter.Lorenz@Phil.Uni-augsburg.de)
Tue, 12 Jan 1999 17:01:51 +0100

Dear all,

reading Peter Falvey's recent message to corpora, I regret not having
posted a summary of the responses to a similar query of mine three months
ago. I did thank those who responded to me individually, but did not think
the overall results were worth everyone's while - simply because corpora
for German are not nearly as large, as numerous or as elaborate as those
for English.

Since there seems to be more widespread interest, I would like to save my
respondents the trouble of answering again.

Arne Fitschen pointed me to the "European Corpus Initiative" (ECI)
http://www.hcrc.ed.ac.uk/Site/ECI.html
and to "Projekt Gutenberg" for non-journalistic texts
http://gutenberg.aol.de/

Valérie Mapelli recommended the ELRA catalogue
http://www.icp.grenet.fr/ELRA/home.html

and Oliver Strunk mentioned the various corpora at the IDS in Mannheim, an
inventory of which can be found under
http://www.ids-mannheim.de/kt/corpora.html

At the time I asked specifically about large monitor corpora, i.e. dynamic
corpora that are constantly being enlarged and enable micro-diachronic
investigation. The only one I found which vaguely resembles those available
for English is the "Mannheimer Morgen Korpus", which should by now have
added up to some 30 million running words.

Hope this helps.

Thanks again to the above colleagues & best wishes - Gunter.