[Corpora-List] Human Language Technology for Corpus Lexicography

Amy Neale a.neale at itri.brighton.ac.uk
Tue Jan 28 14:40:56 UTC 2003


8-day Short Course in Human Language Technology for Corpus Lexicography
25 - 28 Feb; 3 - 6 March 2003
ITRI, University of Brighton,
UK

This eight-day course offers those working in linguistic disciplines the chance to discover how language technologies can add to their research capabilities.

The course teaches the language technologies that can be used to process text corpora. It also examines existing lexical resources produced by, or for, language technology, and the dominant formalisms in use.

Course Details:
On completing this course, students will be able to:

   1. Describe the ways in which language corpora can be enriched using
      a variety of language technologies.
   2. Critically evaluate these technologies, and determine their
      usefulness for linguistic research and lexicography.
   3. Work with different algorithms and strategies for lemmatisation,
      part-of-speech tagging, parsing and word sense disambiguation.
   4. Describe and evaluate other computational lexical resources that
      are available.
   5. Interpret data in a variety of leading formalisms for lexical
      representation.

Course Content:

    * Lemmatisation, for English and for languages with more complex
      morphology
    * Local grammars for proper names, dates, places, etc.
    * Part-of-speech tagging for English and other languages: tagsets
      and training corpora; manual rule-writing approaches
    * Grammars and Parsing: history; context-free grammars; dependency
      grammars; deep and shallow parsing; parser evaluation
    * Word sense disambiguation; word senses, norms and exploitations;
      dictionary-based methods; supervised training methods; senses and
      domains; evaluation
    * Feature structures as a way of holding lexical information
    * Lexical entries in Head-Driven Phrase Structure Grammar
    * Key initiatives in lexical resource development and
      standardisation: EAGLES, SIMPLE, WordNets, FrameNet
    * Machine learning strategies, including Bayesian approaches,
      Markov Models, Maximum Entropy, Transformation-Based Learning,
      and decision trees and decision lists

Course Dates and Venue:
Human Language Technology for Corpus Lexicography will run from 25 - 28
February, and 3 - 6 March, 2003 at the Information Technology Research
Institute (ITRI) at the University of Brighton, East Sussex, U.K. ITRI
is an internationally known centre of excellence in the field of Human
Language Technology. Brighton is a lively, cosmopolitan city on England's
south coast, one hour from London by train, and 30 minutes from London
Gatwick Airport.

Course Fees:
The full fee for this two-week course is £1645.00 (including VAT) for
the first delegate. Second and subsequent delegates from the same
institution qualify for a reduced rate of £1292.50. Places are limited
and early registration is recommended.

For more information and details of how to register, please visit
http://www.itri.bton.ac.uk/courses/CPDLex/modules/LCM07.html or contact
us at itel at brighton.ac.uk




