Corpora: Tools to convert HTML files into plain text

From: htakashi@mse.biglobe.ne.jp
Date: Mon Mar 27 2000 - 15:45:17 MET DST


    Jean Veronis wrote:
    >I have a related question. What tools do you use once you have downloaded
    >the HTML files to (batch-)convert them in reasonably clean "plain" text?

    I use my own tool, "DeHTML".
    DOS, Win16, Win32, and OS/2 versions are now available at
     http://www2d.biglobe.ne.jp/~htakashi/software/DEHTML_E.HTM or
     http://www2d.biglobe.ne.jp/~htakashi/software/DEHTML_J.HTM (Japanese version)
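
    For readers who want to see what such a batch conversion involves, here
    is a minimal sketch in Python using only the standard library. It is not
    how DeHTML works internally; the names TextExtractor, html_to_text and
    convert_directory are hypothetical, and the default UTF-8 input encoding
    is an assumption (older pages may need "latin-1" or "shift_jis").

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    SKIP = {"script", "style"}
    # Tags treated as line breaks in the output (an illustrative choice).
    BLOCK = {"p", "br", "div", "li", "tr", "title",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Line breaks inside running text are just whitespace in HTML.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_text(html):
    """Return the text of one HTML document, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (re.sub(r" +", " ", line).strip()
             for line in "".join(parser.parts).split("\n"))
    return "\n".join(line for line in lines if line)


def convert_directory(src_dir, dst_dir, encoding="utf-8"):
    """Convert every .html/.htm file in src_dir to a .txt file in dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.iterdir()):
        if path.suffix.lower() in {".html", ".htm"}:
            html = path.read_text(encoding=encoding, errors="replace")
            out = dst / (path.stem + ".txt")
            out.write_text(html_to_text(html), encoding="utf-8")
            written.append(out)
    return written
```

    Calling convert_directory("downloaded", "plain") would write one .txt
    file per HTML file. A sketch like this ignores tables, frames and
    malformed markup, which is where a dedicated tool earns its keep.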

    --
    = HAMAGUCHI Takashi(KOBE, Japan)            htakashi@mse.biglobe.ne.jp =
    =[ http://www2d.biglobe.ne.jp/~htakashi/ ]  NBC03301@nifty.ne.jp ==
    


