RE: [Corpora-List] Legal aspects of compiling corpora

From: peetm (peet.morris@comlab.ox.ac.uk)
Date: Tue Jun 17 2003 - 16:22:42 MET DST

    One possible reason Google has not been sued so far is that it is a
    private company, i.e. not worth anything like as much as it would be if it
    floated (in fact, if it's private, is it worth anything in 'real terms'?)

    My own research, building specific corpora in real time using grid
    computing, uses the web as its data source, so I will be allowing academics
    access to entire texts. I consulted with lawyers at the Oxford Internet
    Institute (www.oii.ox.ac.uk) and the bottom line was an opinion (isn't it
    always) that I wouldn't be sued. 'Couldn't' wasn't brought up, however.

    peetm

    www.clg.ox.ac.uk

    -----Original Message-----
    From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no] On
    Behalf Of Mark Davies
    Sent: 17 June 2003 16:55
    To: corpora@hd.uib.no
    Subject: RE: [Corpora-List] Legal aspects of compiling corpora

    When I was compiling the 100 million word Corpus del Español
    (www.corpusdelespanol.org), I consulted two professors from the US who are
    experts on copyright law, as applied to the Internet. I explained to them
    that in my corpus, at least, end users wouldn't have access to entire
    paragraphs of text, much less an entire text itself. Both were in agreement
    that it would be quite unlikely that there would be any copyright problems.

    What has me intrigued with search engines like Google, however, is their
    "cached web page" functionality, in which they are in essence reproducing
    an entire web page -- and all of the web pages of a given site (assuming no
    use of robots.txt). It seems that this is much more than the limited
    context that I (and others) make available in our corpora, and yet there
    has been no legal challenge.

    On the other hand, both of the professors who I consulted mentioned that
    it's still a very murky issue with little or no clearly defined legal
    precedent -- at least in the US.

    Mark Davies

    =================================================
    Mark Davies
    Assoc. Prof., Spanish Linguistics
    Illinois State University
    http://mdavies.for.ilstu.edu/

    ** Corpus design and use // Web-database scripting **
    ** Historical and dialectal Spanish and Portuguese syntax **
    =================================================


