[Corpora-List] Postdoc in machine learning and web document classification

Serge Sharoff s.sharoff at leeds.ac.uk
Fri Jan 30 15:02:17 UTC 2009


The Centre for Translation Studies and the School of Computing at the
University of Leeds, UK, are inviting applications for the post of
Research Assistant to work on a 12-month research project funded by a
Google Research Award to Serge Sharoff and Katja Markert.

The project aims to classify web pages into genres in multiple
languages, since different pages have quite different uses and
characteristics.  Our research questions in this project are:

      * Which features of webpages are useful for their automatic
        classification? 
      * How language-specific are those features? 
      * How can we build efficient classifiers with minimal annotation
        data?

The RA will be doing research on feature selection, development of
classifiers and evaluation.  The ideal candidate has recently completed
or is nearing completion of a PhD in computational linguistics or machine
learning with an emphasis on weakly-supervised methods in natural
language processing.  Knowledge of graph-based machine learning or
linguistic genre theory and at least one language in addition to English
is desirable.  Suitable candidates with an MSc/MA degree can also be
considered.

Salary will be on University Grade 6 (GBP24,152-28,839 p.a.)
depending on qualifications and experience.

For informal enquiries about this job, please contact Dr Serge
Sharoff (s.sharoff at leeds.ac.uk, tel. +44(0)113 3437287).
Information about the Centre for Translation Studies can be found at
http://www.leeds.ac.uk/cts/. Information about the School of Computing
can be found at http://www.comp.leeds.ac.uk/.

To download the application form and job details please visit
http://www.hr.leeds.ac.uk/jobs/ViewJob.aspx?search=313343&JId=83

The closing date for applications is 27 February 2009.




_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


