[Corpora-List] Looking for source of images of completed personnel forms

Eric Atwell eric at comp.leeds.ac.uk
Tue Jun 24 09:03:21 UTC 2003


Peter,

My guess is that you'll find it hard to come by personnel files, as
they're confidential!  However, you could try trawling the WWW for online
CV pages.  In fact, rather than "trawling randomly", you could start with
established job-hunting sites, e.g. www.elsnet.org (European Language and
Speech Network) has a subpage where jobseekers can advertise their CVs.

You *could* ask a commercial recruitment agency for access to their
files, e.g. doctorjob.co.uk helps Leeds students find graduate jobs;
however, I suspect they'll say their files are confidential and not
available for research (at least not without paying the recruiter a
commercial access fee...)

You might say "I want scanned forms, not HTML online CVs, to test
data-mining" - but you *could* artificially (re)create forms from
the info in CVs.  This has the advantage that you "know" the
information you will be trying to data-mine, so you can evaluate your
learning-system against known "annotations".  We did something similar
when testing student plagiarism/copying detectors: we "deliberately
plagiarised" some courseworks and fed these copies into the trials,
to see if the systems under evaluation could find them...
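
As a rough illustration of the idea above, here is a minimal Python sketch:
fill a form template from CV fields, keep the known values as "annotations",
and score an extractor against them.  Everything in it is an assumption made
up for illustration (the field list, the function names, the sample record
and the stand-in extractor), and it produces a plain-text form only; turning
that into a scanned-looking image is a further step not shown here.

```python
# Hypothetical sketch: build a synthetic "form" from CV fields and keep the
# ground-truth annotations so an extraction system can be scored against them.

# Field types taken from the original request; the exact list is an assumption.
FIELDS = ["name", "address", "organization", "date"]


def make_form(record):
    """Return (form_text, annotations) for one CV record given as a dict."""
    lines = []
    annotations = {}
    for field in FIELDS:
        value = record.get(field, "")
        lines.append(f"{field.upper():<14}: {value}")
        annotations[field] = value
    return "\n".join(lines), annotations


def score(extracted, annotations):
    """Fraction of annotated fields that the extractor got exactly right."""
    if not annotations:
        return 0.0
    correct = sum(1 for k, v in annotations.items() if extracted.get(k) == v)
    return correct / len(annotations)


def naive_extractor(form_text):
    """Stand-in for the system under test: parses 'LABEL : value' lines."""
    result = {}
    for line in form_text.splitlines():
        label, _, value = line.partition(":")
        result[label.strip().lower()] = value.strip()
    return result


# Invented sample record, not real personnel data.
record = {
    "name": "A. N. Other",
    "address": "1 Example Road, Leeds",
    "organization": "Example Ltd",
    "date": "2003-06-24",
}
form_text, gold = make_form(record)
print(form_text)
print("accuracy:", score(naive_extractor(form_text), gold))
```

In a real trial the stand-in extractor would be replaced by the data-mining
system under evaluation, and the score by whatever measure suits the task.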

good luck with your hunt!


Eric Atwell, Leeds University


On Mon, 23 Jun 2003, Peter Viechnicki wrote:

> Dear List Members,
>
> I've been an interested 'lurker' for a few months now, but now would like
> to pose a question of my own to you all.  Does anyone know of any sources
> of publicly-available scanned versions of personnel forms or similar
> forms?  We're doing a project on data mining from personnel forms, and
> would like to identify test data if it exists.  What we need are image
> files (real exemplars, filled out) of forms which contain names,
> addresses, organizations, dates, and similar information.  Any suggestions
> would be greatly appreciated.  Please reply to me directly, and I will
> post a summary.
>
> Thanks in advance,
>
> -Peter Viechnicki
> Vredenburg Corp.
> pviechnicki at vredenburg.com

--
Eric Atwell, CVL: Computer Vision and Language research group
Distributed Multimedia Systems MSc Tutor & SOCRATES/JYA Tutor
School of Computing, University of Leeds, LEEDS LS2 9JT
TEL: 0113-3435761  MOBILE: 0775-1039104 FAX: 0113-3435468
WWW: http://www.comp.leeds.ac.uk/eric  EMAIL: eric at comp.leeds.ac.uk
Visit http://www.computingLEEDS.ac.uk - our newsletter for industry


