Re: [Corpora-List] Looking for source of images of completed personnel forms

From: Eric Atwell (eric@comp.leeds.ac.uk)
Date: Tue Jun 24 2003 - 11:03:21 MET DST


    Peter,

    My guess is that you'll find it hard to come by personnel files, as
    they're confidential! However, you could try trawling the web for online
    CV pages. In fact, rather than "trawling randomly", you could start with
    established job-hunting sites, e.g. www.elsnet.org (European language and
    speech network) has a subpage where jobseekers can advertise their CVs.

    You *could* ask a commercial recruitment agency for access to their
    files, e.g. doctorjob.co.uk helps Leeds students find graduate jobs;
    however, I suspect they'll say their files are confidential and not
    available for research (at least not without paying the recruiter a
    commercial access fee...)

    You might say "I want scanned forms, not HTML online CVs, to test
    data-mining" - but you *could* artificially (re)create forms from
    the info in CVs. This has the advantage that you "know" the
    information you will be trying to data-mine, so you can evaluate your
    learning-system against known "annotations". We did something similar
    when testing student plagiarism/copying detectors: we "deliberately
    plagiarised" some courseworks and fed these copies into the trials,
    to see if the systems under evaluation could find them...
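
A minimal sketch of the "recreate forms from known information" idea, under stated assumptions: the field names, the sample record, the plain-text form layout (standing in for a scanned image) and the `naive_extractor` system under test are all invented here for illustration and are not from any real dataset or tool. Because each form is generated from a known record, that record doubles as the gold "annotation" when scoring.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Field types mentioned in the original request; the exact list is an assumption.
FIELDS = ["name", "address", "organization", "date"]


def render_form(record: Record) -> str:
    """Lay out known field values as a plain-text form (stand-in for a scanned image)."""
    return "\n".join(f"{field.upper()}: {record[field]}" for field in FIELDS)


def naive_extractor(form_text: str) -> Record:
    """Hypothetical system under test: reads 'LABEL: value' lines back out."""
    extracted: Record = {}
    for line in form_text.splitlines():
        label, sep, value = line.partition(":")
        if sep:  # skip lines that contain no label separator
            extracted[label.strip().lower()] = value.strip()
    return extracted


def field_accuracy(records: List[Record], extractor: Callable[[str], Record]) -> float:
    """Fraction of fields the extractor recovers exactly, scored against the source records."""
    correct = 0
    total = 0
    for record in records:
        predicted = extractor(render_form(record))
        for field in FIELDS:
            total += 1
            if predicted.get(field) == record[field]:
                correct += 1
    return correct / total if total else 0.0


# Invented sample record; in practice these values would come from the CVs.
SAMPLE_RECORDS: List[Record] = [
    {
        "name": "Jane Example",
        "address": "1 Example Street, Leeds",
        "organization": "Example Ltd",
        "date": "2003-06-24",
    }
]

if __name__ == "__main__":
    print(field_accuracy(SAMPLE_RECORDS, naive_extractor))
```

In a real trial, `render_form` would be replaced by whatever produces the form images, and `naive_extractor` by the data-mining system being evaluated; the scoring step stays the same.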

    good luck with your hunt!

    Eric Atwell, Leeds University

    On Mon, 23 Jun 2003, Peter Viechnicki wrote:

    > Dear List Members,
    >
    > I've been an interested 'lurker' for a few months now, but now would like
    > to pose a question of my own to you all. Does anyone know of any sources
    > of publicly-available scanned versions of personnel forms or similar
    > forms? We're doing a project on data mining from personnel forms, and
    > would like to identify test data if it exists. What we need are image
    > files (real exemplars, filled out) of forms which contain names,
    > addresses, organizations, dates, and similar information. Any suggestions
    > would be greatly appreciated. Please reply to me directly, and I will
    > post a summary.
    >
    > Thanks in advance,
    >
    > -Peter Viechnicki
    > Vredenburg Corp.
    > pviechnicki@vredenburg.com

    -- 
    Eric Atwell, CVL: Computer Vision and Language research group
    Distributed Multimedia Systems MSc Tutor & SOCRATES/JYA Tutor
    School of Computing, University of Leeds, LEEDS LS2 9JT
    TEL: 0113-3435761  MOBILE: 0775-1039104 FAX: 0113-3435468
    WWW: http://www.comp.leeds.ac.uk/eric  EMAIL: eric@comp.leeds.ac.uk
    Visit http://www.computingLEEDS.ac.uk - our newsletter for industry
    


