11.727, Confs: Large Corpora and Annotation Standards

The LINGUIST Network linguist at linguistlist.org
Fri Mar 31 00:55:05 UTC 2000


LINGUIST List:  Vol-11-727. Thu Mar 30 2000. ISSN: 1068-4875.

Subject: 11.727, Confs: Large Corpora and Annotation Standards

Moderators: Anthony Rodrigues Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
            Andrew Carnie, U. of Arizona <carnie at linguistlist.org>

Reviews: Andrew Carnie: U. of Arizona <carnie at linguistlist.org>

Associate Editors:  Ljuba Veselinova, Stockholm U. <ljuba at linguistlist.org>
		    Scott Fults, E. Michigan U. <scott at linguistlist.org>
		    Jody Huellmantel, Wayne State U. <jody at linguistlist.org>
		    Karen Milligan, Wayne State U. <karen at linguistlist.org>

Assistant Editors:  Lydia Grebenyova, E. Michigan U. <lydia at linguistlist.org>
		    Naomi Ogasawara, E. Michigan U. <naomi at linguistlist.org>
		    James Yuells, Wayne State U. <james at linguistlist.org>

Software development: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
                      Sudheendra Adiga, Wayne State U. <sudhi at linguistlist.org>
                      Qian Liao, E. Michigan U. <qian at linguistlist.org>

Home Page:  http://linguistlist.org/

The LINGUIST List is funded jointly by Eastern Michigan University,
Wayne State University, and donations from subscribers and publishers.

Editor for this issue: Jody Huellmantel <jody at linguistlist.org>
 ==========================================================================
Please keep conference announcements as short as you can; LINGUIST
will not post conference announcements which in our opinion are
excessively long.


=================================Directory=================================

1)
Date:  Thu, 30 Mar 2000 17:27:47 -0500
From:  "Nancy M. Ide" <ide at cs.vassar.edu>
Subject:  Large Corpora and Annotation Standards

-------------------------------- Message 1 -------------------------------

Date:  Thu, 30 Mar 2000 17:27:47 -0500
From:  "Nancy M. Ide" <ide at cs.vassar.edu>
Subject:  Large Corpora and Annotation Standards


                    Large Corpora and Annotation Standards

              http://www.cs.vassar.edu/~ide/ANLP-NAACL2000.html

                   Held in conjunction with ANLP/NAACL'00
                            Seattle, Washington
                            4 May 2000, 1-6pm

           This meeting is intended to bring together researchers and
           developers from a variety of domains in text, speech,
           video, etc., to look broadly at the technical issues that
           bear on the development of software systems and standards
           for the annotation and exploitation of linguistic
           resources. The goal is to lay the groundwork for the
           definition of a data and system architecture to support
           corpus annotation and exploitation that can be widely
           adopted within the community.

           Among the issues to be addressed are:

               -   layered data architectures
               -   system architectures for distributed databases
               -   support for plurality of annotation schemes
               -   impact and use of XML/XSL
               -   support for multimedia, including speech and video
               -   tools for creation, annotation, query and access of
                   corpora
               -   mechanisms for linkage of annotation and primary
                   data
               -   applicability of semi-structured data models, search
                   and query systems, etc.
               -   evaluation/validation of systems and annotations

           The motivation for this meeting is the American National
           Corpus (ANC) effort, which should begin corpus creation
           within the year. We anticipate that the ANC will provide a
           significant resource for natural language processing, and
           we therefore seek to identify state-of-the-art methods for
           its creation, annotation, and exploitation. Also, because
           the ANC will be a national and freely available resource,
           its data and system architecture is likely to become a de
           facto standard. We therefore hope to draw together leading
           researchers and developers to establish a basis for the
           design of a system to support the creation and use of the
           ANC.


                               Provisional Program

                  Overview of the American National Corpus Effort
                     Nancy Ide and Catherine Macleod

                  Searching Linguistically Annotated Corpora
                     Chris Brew

                  Considerations for Large Corpus Annotation:
                  Intercoder Reliability
                     Rebecca Bruce and Janyce Wiebe

                  The XML Framework and Its Implications for Large
                  Corpus Access
                     Nancy Ide

                  The ATLAS System
                     John Henderson

                  Annotation Standards and Their Impact on Large
                  Corpus Development
                     Nicoletta Calzolari

                  A Framework for Multi-level Linguistic Annotation
                     Patrice Lopez and Laurent Romary

                  Discussion: Requirements for the ANC



           A related workshop will be held at the LREC conference on
           May 29-30, 2000.  http://www.cs.vassar.edu/~ide/anc/lrec.html

           Organizer:

           Nancy Ide
           Professor and Chair
           Department of Computer Science
           Vassar College
           Poughkeepsie, NY 12604-0520 USA
           Tel: +1 914 437-5988 Fax: +1 914 437-7498
           ide at cs.vassar.edu

---------------------------------------------------------------------------
LINGUIST List: Vol-11-727