[Corpora-List] 2nd UIMA at GSCL Workshop - Final Call for Participation

Thu Sep 17 05:46:27 UTC 2009

=======================================================================

Final Call for Participation

Unstructured Information Management Architecture (UIMA)
2nd UIMA at GSCL Workshop

October 1st, 2009
Potsdam, Germany

http://www.ling.uni-potsdam.de/acl-lab/gscl09/workshops.en.html
========================================================================

-------------------
Program
-------------------

09:00 - 10:00    -    UIMA Tutorial, Graham Wilcock

10:00 - 10:30    -    Coffee Break

10:30 - 10:45    -    Opening

10:45 - 11:15    -    ClearTK: A Framework for Statistical Natural 
Language Processing (Philip V. Ogren, Philipp G. Wetzler, and Steven J. 
Bethard)
11:15 - 11:45    -    Multimedia Feature Extraction in the SAPIR Project 
(Aaron Kaplan, Jonathan Mamou, Francesco Gallo, and Benjamin Sznajder)
11:45 - 12:15    -    TextMarker: A Tool for Rule-Based Information 
Extraction (Peter Kluegl, Martin Atzmueller, and Frank Puppe)

12:15 - 13:00    -    Lunch Break

13:00 - 13:30    -    LuCas - A Lucene CAS Indexer (Erik Faessler, Rico 
Landefeld, Katrin Tomanek, and Udo Hahn)
13:30 - 14:00    -    Abstracting the types away from a UIMA type system 
(Karin Verspoor, William Baumgartner Jr., Christophe Roeder, and 
Lawrence Hunter)

14:00 - 14:30    -    Poster Session

14:30 - 15:00    -    Round Table/Discussion

-----------------------------
Workshop Description
-----------------------------

For many decades, NLP has suffered from low software engineering 
standards causing a limited degree of re-usability of code and 
interoperability of different modules within larger NLP systems. While 
this did not really hamper success in limited task areas (such as 
implementing a parser), it caused serious problems for the emerging 
field of language technology where the focus is on building complex 
integrated software systems, e.g., for information extraction or machine 
translation. This lack of integration has led to duplicated software 
development, work-arounds for programs written in different (versions 
of) programming languages, and ad-hoc tweaking of interfaces between 
modules developed at different sites.

In recent years, the Unstructured Information Management Architecture 
(UIMA) framework has been proposed as a middleware platform which offers 
integration by design through common type systems and standardized 
communication methods for components analysing streams of unstructured 
information, such as natural language. The UIMA framework offers a solid 
processing infrastructure that allows developers to concentrate on the 
implementation of the actual analytics components. A growing number of 
members of the NLP community have therefore adopted UIMA as a platform 
for creating reusable NLP components that can be assembled to address 
different NLP tasks depending on their order, combination and 
configuration.

This workshop aims at bringing together members of the NLP community 
who are users, developers or providers of either UIMA components or 
UIMA-related tools in order to explore and discuss the opportunities and 
challenges in using UIMA as a platform for modern, well-engineered NLP. 
In the context of an emerging NLP-oriented UIMA community, the challenge 
to create not only reusable, but also interoperable components raises 
particular interest. From a methodological perspective, interoperability 
relies largely on UIMA type systems. Technically, it includes issues 
related to the packaging and distribution of UIMA components. Also, 
tools are important, for example to assemble complex processing 
workflows, to manage the bodies of data that are to be analysed and to 
visualize, explore, and further deploy the analysis results. Finally, 
interoperability is also affected by legal issues, such as potentially 
incompatible licenses of components and tools.

The availability of ready-to-use components plays a major role in 
choosing UIMA over other alternatives. To accentuate this, the workshop 
puts a focus on UIMA-based components and tools that are freely 
available for research.

--------------
Topics
--------------

Participants are invited to present applications realized using UIMA, 
general experiences using UIMA as a platform for natural language 
processing, as well as technical papers on particular aspects of the 
UIMA framework. Comparisons of UIMA with alternative frameworks (e.g., 
GATE or LingPipe) are of interest, too. More specifically, workshop 
topics include, but are not limited to:

• UIMA components with a special focus on genericity and type-system 
independence
• repositories of ready-to-use UIMA-based components
• (generic) type systems for UIMA
• distribution of UIMA components: documentation, licensing and packaging
• sophisticated tools to build and manage complex processing pipelines
• experience reports combining UIMA-based components from different 
sources, as well as solutions to interoperability issues
• processing of very large data collections: scale-out, parallelization, 
and performance optimization
• analysis of results: exploration, evaluation, visualization, and 
statistical analysis
• developing for UIMA: simplified APIs, debugging, unit testing, and 
limitations of UIMA

---------------------------------
Organizers and Contact
---------------------------------

• JULIE Lab, Friedrich-Schiller-Universität Jena
   • Udo Hahn
   • Katrin Tomanek
• UKP Lab, Technische Universität Darmstadt
   • Iryna Gurevych
   • Richard Eckart de Castilho

Please address any inquiries regarding the workshop to:
uima.gscl2009 at googlemail.com

---------------------------------
Program Committee
---------------------------------

• Anni R. Coden, IBM T.J. Watson Research Center, USA
• Branimir K. Boguraev, IBM T.J. Watson Research Center, USA
• Graham Wilcock, University of Helsinki, Finland
• Iryna Gurevych, Technische Universität Darmstadt, Germany
• Katrin Tomanek, Friedrich-Schiller-Universität Jena, Germany
• Leo Ferres, University of Concepcion, Chile
• Michael Tanenblatt, IBM T.J. Watson Research Center, USA
• Nicolas Hernandez, Université de Nantes, France
• Philipp Cimiano, Delft University of Technology, Netherlands
• Richard Eckart de Castilho, Technische Universität Darmstadt, Germany
• Sophia Ananiadou, University of Manchester, Great Britain
• Stefan Geißler, TEMIS GmbH, Germany
• Udo Hahn, Friedrich-Schiller-Universität Jena, Germany
