Corpora: Sentence splitter

Martin Wynne (wynne@solaris3.ids-mannheim.de)
Tue, 5 Oct 1999 12:12:50 +0100 (WET DST)

Dear Corporans,

I have plain-text parallel corpora in French, German, Spanish and
English which I would like to align automatically. However, all of the
alignment programs that I have access to require sentence tags in the
texts. Can anyone recommend a good sentence splitter, either for plain
running text files or for files with minimal SGML markup (we need to
split those too)? Ideally it would be free, easy to install and run
under Unix (although DOS/Windows programs could also be used), and
would work for all four languages.

Many thanks for any suggestions,

Martin

**********************************************************************
Martin Wynne                    Multilinguale Forschung
Visiting Research Fellow        Abteilung LEXIK
wynne@ids-mannheim.de           Institut fuer deutsche Sprache
Tel: +49 621 1581 427           R5, 6-13
Fax: +49 621 1581 415           D-68161 Mannheim
+49 621 1581 200
**********************************************************************