14.30, Calls: Multilingual Corpora/Consequences of Mobility

LINGUIST List linguist at linguistlist.org
Tue Jan 7 19:57:13 UTC 2003


LINGUIST List:  Vol-14-30. Tue Jan 7 2003. ISSN: 1068-4875.

Subject: 14.30, Calls: Multilingual Corpora/Consequences of Mobility

Moderators: Anthony Aristar, Wayne State U.<aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.



Editor for this issue: Karolina Owczarzak <karolina at linguistlist.org>
 ==========================================================================

As a matter of policy, LINGUIST discourages the use of abbreviations
or acronyms in conference announcements unless they are explained in
the text.

=================================Directory=================================

1)
Date:  Mon, 06 Jan 2003 12:39:52 +0100
From:  Silvia Hansen <hansen at CoLi.Uni-SB.DE>
Subject:  Workshop on Multilingual Corpora, UK

2)
Date:  Fri, 03 Jan 2003 08:50:30 +0100
From:  Hartmut Haberland <hartmut at ruc.dk>
Subject:  Consequences of Mobility, Denmark

-------------------------------- Message 1 -------------------------------

Date:  Mon, 06 Jan 2003 12:39:52 +0100
From:  Silvia Hansen <hansen at CoLi.Uni-SB.DE>
Subject:  Workshop on Multilingual Corpora, UK


                        ** CALL FOR PAPERS **


                        Multilingual Corpora:
          Linguistic Requirements and Technical Perspectives

   A pre-conference workshop to be held at
                      Corpus Linguistics 2003


       Lancaster, 27 March 2003

           http://www.comp.lancs.ac.uk/ucrel/cl2003




ORGANIZED BY:

Stella Neumann (Department of Applied Linguistics, Translation and
Interpreting)
Silvia Hansen (Department of Computational Linguistics)

Saarland University, Saarbrücken, Germany


TOPIC AND MOTIVATION:

How do researchers go about building multilingual corpora? For the
development of a linguistically interpreted corpus based on more than one
language there seem to be two methods. In the first, the multilingual
corpus is split up into monolingual sub-corpora which are then annotated
independently. In the second, one language serves as the basis for
building up and interpreting the multilingual corpus, whereas the others
have to be adapted. Both methods, however, are rather problematic. They do
not take sufficiently into account the differences and commonalities
between the languages in question at each stage of corpus-based research:
the comparability of the corpus design, the different kinds of
segmentation, the diverging annotation schemes, the corpus representations
and, finally, querying, which has to converge again across the different
languages. Mistakes or inconsistencies introduced at one stage of
multilingual corpus development carry over to the following steps and
result in worse mistakes or inconsistencies. These problems not only arise
at each methodological step; they also multiply with the growing
complexity of the research design. If the research aims at interpreting
linguistic data on several levels, cross-linguistic comparability has to be
taken into account on each level.

The goal of the workshop is to bring together researchers who formulate
specific requirements for working with corpora from a linguistic
perspective and engineers who can offer technical solutions but need the
input of users to adapt their tools to the needs of linguists. Within
this context, questions like the following are to be discussed:
- What happens if the units under investigation diverge on the different
levels?
- At present, the preferred solution is to use XML at all stages and on all
layers. But is this really practicable?
- Do linguists get along with stand-off mark-up?
- Is this maybe a technical compromise?

The workshop should result in a catalogue of requirements combined with
technical solutions. It could thus serve as a starting point for the
development of an annotation typology which takes into account different
languages as well as different annotation layers. On the basis of this
typology, the comparability of a multilingual multi-layer annotated corpus
can be guaranteed. With this in mind, a multilingual corpus builder should
be able to cope with possible problems in each of the steps of corpus
development explained above.

Papers are invited on the following topics:
- linguistic requirements in the different methodological steps
- state-of-the-art technical solutions
- international standards which facilitate the development and exchange of
multilingual corpora


WORKSHOP PROFILE:
The workshop will take a full day and comprise about 8-10 papers.
Presentations are expected to be short, leaving enough time for discussion
and assessment of the methodologies used as well as the development of
possible solutions. This already points to the workshop agenda: the first
third will deal with linguistic fundamentals, the second third will discuss
the technical aspects and the last third will provide a platform for
integrating both perspectives. Workshop proceedings will be produced.


PROGRAMME COMMITTEE:

Silvia Bernardini, Bologna
Sabine Brants, Palo Alto
Andreas Eisele, Saarbrücken
Stefan Evert, Stuttgart
Silvia Hansen, Saarbrücken
Tony Hartley, Leeds
Natalie Kübler, Paris
Stella Neumann, Saarbrücken
Mick O'Donnell, Madrid
Maeve Olohan, Manchester
Elke Teich, Saarbrücken
Spela Vintar, Ljubljana
Federico Zanettin, Bologna

SCHEDULE:

20 January 2003: Deadline for submitted papers
21 February 2003: Notification of acceptance
7  March 2003: Camera ready copy
27 March 2003: Workshop


REGISTRATION:

Please refer to the main conference web page
(http://www.comp.lancs.ac.uk/ucrel/cl2003) for registration details.


SUBMISSIONS:

Please send submissions in English as RTF or plain text files (preferably
by email) to the address below. Paper length should be 8-10 pages,
formatted in the same way as for the main conference
(see http://www.comp.lancs.ac.uk/ucrel/cl2003/style.html
for paper format guidelines).

Stella Neumann (st.neumann at mx.uni-saarland.de)
Department of Applied Linguistics, Translation and Interpreting (FR 4.6)
Saarland University
Postfach 15 11 50
66041 Saarbrücken
Germany


-------------------------------- Message 2 -------------------------------

Date:  Fri, 03 Jan 2003 08:50:30 +0100
From:  Hartmut Haberland <hartmut at ruc.dk>
Subject:  Consequences of Mobility, Denmark


Extension of deadline for abstracts


The deadline for submission of abstracts and workshop proposals for the
conference


  The Consequences of Mobility:
  Linguistic and Sociocultural Contact Zones

organized by the Research group on Sociolinguistics, Language Pedagogy
and Sociocultural issues, Department of Language and Culture, Roskilde
University, Roskilde, Denmark and to be held on May 23 and 24, 2003
(plenary speakers: Peter Auer, Freiburg and Lesley Milroy, Ann Arbor),
has been extended to January 15, 2003.

For further information on the conference, see
http://www.ruc.dk/isok/Konferencer/Consequences_of_Mobility/

---------------------------------------------------------------------------
LINGUIST List: Vol-14-30


