Re: Corpora: Help please - downloading text from the Web

From: Dave Braze (davebraze@uconn.cted.net)
Date: Mon Mar 27 2000 - 03:07:29 MET DST


    Knut Hofland wrote:

    > On Thu, 23 Mar 2000, Geoff Wilkins wrote:
    >
    > > I'm looking for software - preferably freeware or shareware - to
    > > use to download text from Web sites, for use in a corpus.
    >
    > I have used w3mir
    > http://www.math.uio.no/~janl/w3mir/
    > and
    > SiteSnagger
    > http://hotfiles.zdnet.com/cgi-bin/texis/swlib/hotfiles/info.html?fcode=000P7Z
    > Both have shortcomings, but I have downloaded gigabytes of HTML files
    > with the programs.

    There is also wget:

    http://www.interlog.com/~tcharron/wgetwin.html

    I've only used it a little, but it seems serviceable enough.
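
    For anyone who wants to script this, here is a minimal sketch of how a
    recursive wget fetch might be driven from Python. It assumes GNU wget is
    installed and on the PATH; the function names, the default values, and
    the example.org URL are hypothetical placeholders, and older or ported
    builds of wget may not accept every option shown.

```python
import shutil
import subprocess


def build_wget_command(start_url, dest_dir="corpus_raw", depth=2, wait_seconds=1):
    """Build an argument list for a polite, recursive wget fetch of HTML pages.

    All defaults here are illustrative assumptions, not recommended settings.
    """
    return [
        "wget",
        "--recursive",                     # follow links found on fetched pages
        f"--level={depth}",                # limit how deep the recursion goes
        "--no-parent",                     # never ascend above the start directory
        "--accept=html,htm",               # keep only files with these suffixes
        f"--wait={wait_seconds}",          # pause between requests, in seconds
        f"--directory-prefix={dest_dir}",  # save everything under dest_dir
        start_url,
    ]


def fetch_site(start_url, **options):
    """Run wget if it is available.

    Returns wget's exit code, or None when wget is not found on the PATH.
    """
    if shutil.which("wget") is None:
        return None
    return subprocess.run(build_wget_command(start_url, **options)).returncode


if __name__ == "__main__":
    # Placeholder URL: only prints the command, does not download anything.
    print(" ".join(build_wget_command("http://www.example.org/texts/")))
```

    Building the argument list separately from running it makes it easy to
    inspect the command before pointing it at a real site. The downloaded
    files are still raw HTML, so markup would need to be stripped before the
    text goes into a corpus.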

    -Dave

    --
    Dave Braze
    Linguistics Department, U-1145
    University of Connecticut
    Storrs, CT 06269-1145 USA
    


