A number of duplicate articles (approx. 3%) that might distort
quantitative analyses were encountered in the Frankfurter Rundschau
Corpus on the Multilingual Corpus 1 CD-ROM of the European Corpus
Initiative.
A list of the duplicate articles and a Unix shell script to remove
the duplicates from the corpus are available from:
http://www.sfs.nphil.uni-tuebingen.de/~feldweg/fr-dups.html
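
For readers who want to check their own copy of the corpus, the following
is a minimal sketch of one way to find exact duplicates. It is not the
shell script mentioned above, and the criterion that script uses is not
stated here. The sketch assumes the articles have already been split into
(id, text) pairs and treats two articles as duplicates only if their text
is identical after whitespace normalization. The names `normalize` and
`find_duplicates` are hypothetical.

```python
import hashlib


def normalize(text):
    # Collapse runs of whitespace so line-wrapping differences do not matter.
    return " ".join(text.split())


def find_duplicates(articles):
    """Return (duplicate_id, first_seen_id) pairs.

    `articles` is an iterable of (article_id, text) pairs in corpus order.
    Assumption: only articles whose normalized text is identical count
    as duplicates; near-duplicates are not detected.
    """
    first_seen = {}   # digest of normalized text -> id of first article
    duplicates = []
    for article_id, text in articles:
        digest = hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()
        if digest in first_seen:
            duplicates.append((article_id, first_seen[digest]))
        else:
            first_seen[digest] = article_id
    return duplicates
```

The first occurrence of each article is kept and every later copy is
reported, so the returned ids could be used as a removal list.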
-- Helmut Feldweg
------------------------------------------------------------------------
Seminar für Sprachwissenschaft, Universität Tübingen
Wilhelmstr. 113, D-72074 Tübingen, Germany
Tel: +49 7071 294279
Fax: +49 7071 550520
E-mail: Helmut.Feldweg@uni-tuebingen.de
feldweg@sfs.nphil.uni-tuebingen.de
------------------------------------------------------------------------