Re: Corpora: MS Word to text

Tony Berber Sardinha (tony4@uol.com.br)
Fri, 3 Sep 1999 14:54:48 -0300

Hi

Download.com has these two utilities that might do the trick:

SingleSource 1.0
Turn your Word documents into help files and HTML.
OS: Windows 95/98/NT   License: Shareware

HTML to Text 1.0
Convert HTML documents to plain text.
OS: Windows 3.x   License: Shareware
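
For anyone who would rather script the second step than install a shareware tool, here is a minimal Python sketch of the HTML-to-text conversion. It assumes the Word files have already been saved as HTML (for example with a utility like the first one above). The names html_to_text and convert_directory are illustrative only, and the Latin-1 encoding is an assumption that may not match your files.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text content, skipping scripts/styles and breaking at block tags."""

    SKIP = {"script", "style"}
    BLOCK = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &aacute;
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one non-empty line per block."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    raw_lines = "".join(parser.parts).splitlines()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in raw_lines)
    return "\n".join(line for line in lines if line) + "\n"


def convert_directory(src_dir, encoding="latin-1"):
    """Write a .txt file next to every .htm/.html file in src_dir (not recursive).

    The default encoding is an assumption; change it to match your files.
    """
    written = []
    for html_path in sorted(Path(src_dir).glob("*.htm*")):
        html = html_path.read_text(encoding=encoding, errors="replace")
        txt_path = html_path.with_suffix(".txt")
        txt_path.write_text(html_to_text(html), encoding=encoding, errors="replace")
        written.append(txt_path)
    return written
```

Calling convert_directory on a folder of exported HTML files would handle the whole batch in one go, so nothing has to be converted one by one. Characters that the chosen encoding cannot represent are written as "?".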

cheers
tony
-------------------------------
Dr Tony Berber Sardinha
Catholic University of Sao Paulo, Brazil
tony4@uol.com.br
http://sites.uol.com.br/tony4/homepage.html
http://homepages.infoseek.com/~corpuslinguistics/homepage.html
-------------------------------

----------
> From: Marco Antonio Esteves da Rocha <marcor@cce.ufsc.br>
> To: CORPORA@HD.UIB.NO
> Subject: Corpora: MS Word to text
> Date: 02 September 1999 20:25
>
> Dear all,
> Someone has collected a sizable corpus of literary works and documents
> written in Brazilian Portuguese throughout the nineteenth century. It is a
> valuable asset for us here, and it has all been typed in MS Word, so it is
> impossible to use all those software resources you all know. Does anyone
> know of a way to transform these .doc files into ASCII text files
> without having to do that one by one? If you feel tempted to suggest
> sitting on the curb and crying, please don't.
> Marco Rocha
> marcor@cce.ufsc.br
>