Re: Corpora: MS Word to text

Philip Resnik (resnik@umiacs.umd.edu)
Fri, 3 Sep 1999 14:19:23 -0400 (EDT)

Mike, thanks for posting a solution to the problem!

However, I was hoping there might be a solution to a closely related
problem: how to convert MS Word documents to text on a Unix platform!
Word e-mail attachments are becoming more and more frequent, and
people seem to be assuming that the world is running on PCs. We may
indeed reach that point, but regardless (let's not get *that*
discussion started!), at least for the moment it would be nice if
either (a) people would send documents around in a *portable* format,
or (b) there were at least a convenient way to convert .DOC to .txt on
a Unix platform. I have little hope for (a) so I'm hoping someone can
help with (b). (I was unable to find a solution via Web searching.)

I realize this question is a bit beyond the purview of the corpora
list, though I imagine enough people in the research community work
under Unix that it may be an issue... Still, let me suggest that
people reply to me personally by e-mail [no Word attachments, please!
;-)] and I will post a summary of useful replies I receive.

Best,

Philip
----------------------------------------------------------------
Philip Resnik, Assistant Professor
Department of Linguistics and Institute for Advanced Computer Studies

1401 Marie Mount Hall            UMIACS phone:      (301) 405-6760
University of Maryland           Linguistics phone: (301) 405-8903
College Park, MD 20742 USA       Fax:               (301) 405-7104
http://umiacs.umd.edu/~resnik    E-mail:            resnik@umiacs.umd.edu