[Corpora-List] Legal aspects of compiling corpora

From: Larry Spitz (spitz@docrec.com)
Date: Fri Jun 13 2003 - 23:16:43 MET DST


    I find this discussion very interesting and I want to add an extra twist
    which probably is not of much interest to the Corpora list in general, but
    unfortunately I do not know of a better forum.

    Beyond the legal aspects of collecting text, there are the legal aspects of
    collecting scanned images of documents. For those of us who are interested
    in the analysis of document images, obtaining databases of images is quite
    difficult, particularly generally available databases against which the
    results of individual research can be compared.

    Since the University of Washington and the University of Nevada, Las Vegas
    have stopped publishing such databases, I do not know of anyone who is in
    the process of doing so.

    One of the real problems is getting copyright permission on document
    images. To many authors it is an incomprehensible concept. Does the holder
    of a copyright also de facto hold a copyright on the image of the text?
    Since the goal is not to photocopy, or otherwise reproduce, that image but
    to use it as a basis for research, does copyright law even apply?

    In general, we in the document image analysis community are not
    particularly interested in the document content as intellectual property,
    though we are interested in being able to reproduce it (OCR), understand
    its structure, or find it (IR).

    To some of us, getting copyright permission on document images does not
    seem to be a rational (moral, if you will) requirement.

    I would be interested in a discussion as to how copyright law and practice
    with respect to images fit in with text corpus collection.

    Cheers,

    Larry

    -- 
           	 DocRec Ltd   http://www.docrec.com/
          phone: +64-3-545-2105 fax: +64-3-545-2106
    34 Strathaven Place, Atawhai, Nelson 7001, New Zealand
    


