[Corpora-List] Job: Text mining at The Open University and Natural History Museum UK

Alistair Willis A.G.Willis at open.ac.uk
Fri Oct 3 15:05:48 UTC 2008


The Open University and the Natural History Museum are looking for a
postdoctoral researcher to work on concept extraction from scanned
taxonomic literature.

Scanned texts contain errors introduced by imperfect OCR and other
sources, so techniques are required that are robust in the face of such
errors. The successful applicant will develop techniques that use
typographical and contextual cues to identify and tag relevant document
content.

The ideal candidate would have a PhD (or equivalent experience), and
experience in one or more of the following:
- natural language processing/information extraction/information
  retrieval, in particular from noisy data;
- image analysis and feature extraction;
- document layout (reverse-engineering a DTD);
- XML for mark-up and term annotation;
- broad familiarity with biological systematics.

Good programming skills are essential, as is the ability to learn
quickly. Applications from candidates with a background in the
biological sciences who can demonstrate appropriate computing skills are
encouraged.

For detailed information and how to apply, go to
www3.open.ac.uk/employment, or email the Recruitment Secretary at
MCS-Recruitment at open.ac.uk, quoting the reference number. Closing date:
16th October 2008.

For enquiries about the research project, please contact: David Morse
[d.r.morse at open.ac.uk].

--
Dr David R. Morse
Computing Department, The Open University, Walton Hall, Milton Keynes
MK7 6AA, UK.
Email: D.R.Morse at open.ac.uk | Phone: +44 (0)1908 858463
Fax: +44 (0)1908 652140

---------------------------------
The Open University is incorporated by Royal Charter (RC 000391), an 
exempt charity in England & Wales and a charity registered in Scotland 
(SC 038302).
