19.2820, Calls: Comp Ling,Text/Corpus Ling/France; General Ling/USA

Tue Sep 16 15:42:37 UTC 2008

LINGUIST List: Vol-19-2820. Tue Sep 16 2008. ISSN: 1068 - 4875.

Subject: 19.2820, Calls: Comp Ling,Text/Corpus Ling/France; General Ling/USA

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews: Randall Eggert, U of Utah  
         <reviews at linguistlist.org> 

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Kate Wu <kate at linguistlist.org>
================================================================  

As a matter of policy, LINGUIST discourages the use of abbreviations
or acronyms in conference announcements unless they are explained in
the text.

To post to LINGUIST, use our convenient web form at 
http://linguistlist.org/LL/posttolinguist.html. 

===========================Directory==============================  

1)
Date: 16-Sep-2008
From: Serge Heiden < slh at ens-lsh.fr >
Subject: 1st Cataloguing and Encoding of Spoken Language Data 

2)
Date: 15-Sep-2008
From: Tom Recht < trecht at berkeley.edu >
Subject: Berkeley Linguistics Society, 35th Annual Meeting

-------------------------Message 1 ---------------------------------- 
Date: Tue, 16 Sep 2008 11:35:35
From: Serge Heiden [slh at ens-lsh.fr]
Subject: 1st Cataloguing and Encoding of Spoken Language Data  	 

Full Title: 1st Cataloguing and Encoding of Spoken Language Data 
Short Title: CatCod 2008 

Date: 04-Dec-2008 - 05-Dec-2008
Location: Orléans, France 
Contact Person: Serge Heiden
Meeting Email: catcod2008 at ens-lsh.fr
Web Site: http://www.catcod.org 

Linguistic Field(s): Computational Linguistics; General Linguistics; Language
Documentation; Text/Corpus Linguistics 

Call Deadline: 30-Oct-2008 

Meeting Description:

CatCod 2008
First International Workshop on Cataloguing and Encoding of Spoken Language Data
December 4 - 5 2008
Université d'Orléans, France
http://www.catcod.org

[Français]
Premières rencontres internationales pour le Catalogage et le Codage de corpus oraux
4 - 5 Décembre 2008
à l'Université d'Orléans, France 

Second Call For Papers

Deadline extended to 30 October

[version française plus bas]

The number of spoken recordings that are digitized and available for the study
and description of language remained quite small for a long time, and their
distribution was largely confined to specialized agencies. However, the
development of the Web and its associated storage, distribution and processing
technologies has now made it both practically and economically feasible for
many smaller structures, such as individual research laboratories, to
distribute spoken resources themselves.
We are thus entering a critical phase. It is now possible to capitalize on the
efforts of projects which have digitized linguistic data in order to form the
empirical basis for entirely new research projects. However, many such
existing projects in France and Europe show great heterogeneity in their
conformance to established coding and cataloguing standards for this type of
resource. Even though these same projects were often set up with the aims of
facilitating access to data and of sharing and preserving data, we observe
that the diversity of formats, encodings and protocols they use hinders
exactly these objectives.
In this symposium, we aim to report on the major initiatives within the field
of digital document management which will potentially have an important
influence on standardization. We would like to stress two specific aspects:
the cataloguing of spoken resources and their encoding.

Cataloguing:
The recent and rapid growth in the number of spoken recordings available on
the Web needs to be accompanied by a significant effort of description and
referencing if these data are to be easily accessible and manageable, rather
than buried in the mass of available data. Some cataloguing practices aim
solely to track the life cycle of a resource-creating project. Others
explicitly aim to guide the exploitation, preservation, and distribution of
the resource in the long term.
Such cataloguing activity is all the more important and urgent given the rapid
increase in operations performed on this mass of data, such as data exchange,
enhancement, and research. Some research communities are well organized around
established standards such as the Dublin Core for Web-based resources, the TEI
Header, or the MARC standards maintained by the Library of Congress for the
description of bibliographical resources. More recently, smaller linguistic
communities have established cataloguing proposals (OLAC, IMDI). There is now
enough experience in the use of these proposals to criticize them and propose
improvements. With these new insights, it should be possible to establish a
minimal charter to be respected by those who wish to get involved in the
distribution of spoken language data, in order to facilitate their exchange
and their more general use in research.

Encoding:
If cataloguing is essential for the identification of resources and for rapid
comparisons amongst them, encoding is essential to the description of the
interpretation of their content and to their exploitation for specific
studies. If encoding the transcription of video or audio material is indeed
the clarification of an interpretation, then here too one observes a great
variety of practices.
The inventories made during the successive EAGLES, MATE and ISLE initiatives
have demonstrated how difficult it is to grasp fully the extent of the various
encoding systems. The ISLE project suggested that only the specification of a
universal software tool for annotation could lead to resources encoded in a
standardized way. But this does not make it any less necessary for us to
attempt a communal and consensual activity, aiming to categorize, name, and
organize the phenomena found within spoken resources, if we hope to achieve
true interoperability of the data with a view to multiple and future
exploitations. We must now start an exercise for the encoding of oral corpora
similar to what has already been undertaken for written corpora by the TEI.

Quality Control:
Assuming that we can reach agreement on the encoding and cataloguing of spoken
data, it will then be necessary to define rules and develop tools to check the
conformance of specific datasets to our agreed principles. This symposium will
therefore also report on quality control practices and techniques.

[Français]

Deuxième Appel à Communications

Le  nombre  d'enregistrements  oraux  numérisés  et  disponibles  pour l'étude
et la description des langues est longtemps resté relativement faible et 
ceux-ci étaient confinés dans des  agences spécialisées qui en  assuraient le
partage.  Avec l'essor  du web  et des  capacités de stockage, de diffusion et
de  traitement, il est devenu abordable pour des plus petites structures (par
ex. des laboratoires de recherche) de diffuser elles-mêmes leurs ressources 
orales. Nous sommes désormais à une étape  clé où  la capitalisation des 
efforts de  numérisation des données linguistiques  devient possible, ceci  afin
de former  la base empirique de nouveaux projets de recherche.
L'observation  des normes de  codage et  de catalogage  de ce  type de
ressources  dans les  différents  projets existants  en  France et  en Europe,
montre  une grande hétérogénéité des pratiques.  Alors que ces mêmes  projets se
sont  montés dans  le but  de faciliter  l'accès, le partage  ou  la 
conservation  des  données,  on  constate  que  cette diversité des formats, des
codages et des protocoles utilisés limitent justement ces objectifs.

Nous souhaitons  dans ce colloque  faire le point sur  les initiatives majeures
dans le  monde de la gestion des  documents numériques, ayant potentiellement
une  influence importante pour  la standardisation, en mettant  l'accent  sur 
deux   aspects  particuliers  qui  sont  :  le catalogage d'une ressource orale
et son codage.

Le catalogage :
La  croissance récente  et  rapide du  nombre d'enregistrements  oraux
disponibles  sur  le  web  demande  à  être  accompagnée  d'un  effort important
 de description  et de  référencement afin  que  ces données soient accessibles
facilement,  ne soient pas noyées dans  le masse et que  la  gestion  en  soit 
facilitée.   Il convient  à  ce  titre  de distinguer des pratiques de
catalogage  qui ont pour vocation le suivi du cycle de  vie d'un projet de
constitution  de ressources, de celles qui ont pour  vocation à guider
l'exploitation, ou  la conservation et la diffusion de ces ressources.
Cette activité  de catalogage est d'autant plus  importante et urgente que  les
 opérations  de  manipulations  sur cette  masse  de  données augmentent   
elles    aussi    (échange,   maintenance,    recherche, etc.). Certaines
communautés se  sont déjà fortement organisées et ont parfois établi  des normes
comme  le Dublin-Core pour ce  qui concerne les ressources  sur le web, ou 
depuis plus longtemps  les normes MARC maintenues  par la  bibliothèque du 
congrès pour  la  description des ressources  bibliographiques. Plus  récemment
 enfin, des  communautés plus  restreintes  en  linguistique  ont établi  des 
propositions  de catalogage   (OLAC,   IMDI).   Il   existe   maintenant  
suffisamment d'expériences dans l'utilisation de  ces propositions pour en faire
la critique, proposer des améliorations,  des pistes de réflexion et pour
établir  une  charte minimale  à  respecter  par  ceux qui  souhaitent s'engager
dans  la diffusion  de ressources orales  linguistiques afin d'en   faciliter  
l'échange   et  plus   généralement   l'utilisation scientifique.

Le codage :
Si le catalogage est essentiel  à l'identification des ressources et à la 
comparaison  rapide  entre  elles,  le codage  est  pour  sa  part essentiel  à
 la description  de  l'interprétation  du  contenu de  la ressource   elle-même,
 et   à  son   exploitation  pour   des  études particulières. Si  le codage 
d'une transcription  de  vidéo ou  de son  est bien  un travail d'explicitation
 d'une interprétation établie du  point de vue d'une  discipline  d'un objet  de
 recherches,  alors  on constate  là également  une très  grande diversité  de
pratiques.  Les recensements opérés  lors des initiatives  successives EAGLES, 
MATE puis  ISLE ont démontré la difficulté d'appréhender  l'étendue des divers
systèmes de codage.  La spécification d'un  outil logiciel  d'annotation
universel peut  être  une  voie  d'accès   à  des  ressources  codées  de  façon
standardisée, comme  cela a  été suggéré par  ISLE. Mais cela  ne nous dispense
 pas   de  faire  le  travail   communautaire  consensuel  de catégorisation, de
dénomination et  de structuration des phénomènes se trouvant  au   sein  des 
ressources  si  l'on   souhaite  une  réelle interopérabilité  des  données  en
 vue d'exploitations  multiples  et futures. Il s'agit donc de  commencer le
travail de standardisation du codage des  corpus oraux  comme cela a  déjà
commencé pour  les corpus textuels avec la TEI.

Contrôle qualité :
En supposant obtenus  un compromis sur le catalogage  et sur le codage des
données orales, il est alors nécessaire de se donner des règles et des outils 
de vérification de la conformité  de données particulières aux principes 
établis. Nous souhaitons donc également  faire le point dans  ce colloque  sur
les  pratiques de  contrôle de  la  qualité des ressources.

Topics of interest / Thématiques

- description and cataloguing of spoken resources
- distribution
- specification of tools
- research applications
- archiving
- publishing of language corpora
- annotation
- version control
- cataloguing and coding standards
- comparison of resources
- multimodal and multimedia transcription
- annotation schemes
- interoperability
- evaluation, quality control

- description et référencement des données orales
- diffusion
- spécification d'outils
- exploitation scientifique
- conservation
- édition de corpus
- annotation
- versionning
- standards de catalogage et de codage
- comparaison des ressources
- transcription multimodale et multimédia
- schémas d'annotation
- interopérabilité
- évaluation, contrôle qualité

Important dates / Calendrier

Initial Call for papers / Date de l'appel à communication : 11 July /
juillet 2008
Submission deadline / Date de soumission des résumés : 30 September /
septembre 2008
Evaluation deadline / Réponse de l'évaluation : 30 October / octobre 2008

Workshop date and place / Date et lieu du colloque :

4-5 December / décembre 2008  at Université d'Orléans

Submissions

- Paper submissions should not exceed 2 pages in length.
- The abstract should be sent as an attachment in WORD, PDF or RTF
  format. If this is not possible, send the abstract to the postal
  address shown below.
- At the top of the abstract, outside the typing area, put the title.
- Your name should appear only in the e-mail message carrying the attached
  abstract.
- Special fonts: If your abstract uses any special fonts, there are two
  options:
  i. In addition to the document in WORD or RTF format, send a PDF document.
  ii. Send a paper copy to the address shown below.
- When sending the email submission, please follow this format (use
  the numbering system given below):
1. Title of abstract:
2. Name:
3. Address:
4. Affiliation:
5. Status (faculty, student):
6. Email address:
7. Fax:
8. Phone numbers:

Send abstracts to: catcod2008 at ens-lsh.fr .

If you are unable to send an abstract in an electronic format, mail it to:

CatCod 2008
s/c M. Plisson
Laboratoire LLL
Université d'Orléans - UFR Lettres, Langues et Sciences Humaines
10 Rue de Tours - BP 46527 - 45065 ORLEANS Cedex 2 FRANCE

Propositions de communication

- les résumés des communications ne doivent pas dépasser deux pages.
- les résumés sont à envoyer au format WORD, PDF ou RTF.
Si ce n'est pas possible par voie électronique, envoyez votre document
à l'adresse postale mentionnée plus bas.
- en entête du résumé, mentionner le titre de votre communication.
- votre nom ne doit apparaître que dans le courriel accompagnant votre
résumé.
- si vous utilisez des caractères spéciaux dans votre résumé, il y deux
solutions :
i. en plus du document WORD ou RTF, envoyez un document PDF
ii. envoyez un document papier au Comité Catcod
- dans  le   courriel  qui  accompagne  votre   résumé,  indiquez  les
  information suivantes en respectant la numérotation :
1. Titre du résumé
2. Nom de l'auteur (ou des auteurs)
3. Adresse
4. Organisme
5. Statut (Etudiant, Chercheur, etc.)
6. Adresse électronique
7. N° de fax
8. N° de téléphone

Envoyez votre résumé à catcod2008 at ens-lsh.fr .

Si vous ne pouvez pas envoyer le résumé par voie électronique, envoyez
votre courrier à :

CatCod 2008
s/c M. Plisson
Laboratoire LLL
Université d'Orléans - UFR Lettres, Langues et Sciences Humaines
10 Rue de Tours - BP 46527 - 45065 ORLEANS Cedex 2 FRANCE

Program Committee / Comité de programme

Jean-Yves Antoine (Université F. Rabelais Tours)
Claude Barras     (LIMSI-CNRS)
Steven Bird       (University of Melbourne & LDC University of Pennsylvania)
Lou Burnard       (Oxford University Computing Services)
Pascal Cordereix  (BNF, Paris)
Benoît Habert     (ENS-LSH, Lyon)
Serge Heiden      (ENS-LSH, Lyon)
Nancy Ide         (Vassar College)
Michel Jacobson   (Ministère de la Culture, Paris)
Laurent Romary    (MPI Berlin-INRIA)
Emmanuel Schang   (Université d'Orléans)
Richard Walter    (CNRS, Université d'Orléans)
Peter Wittenburg  (Max-Planck-Institute for Psycholinguistics, Nijmegen)

Organisation Committee / Comité d'organisation

Serge Heiden    (ENS-LSH, Lyon)
Michel Jacobson (Ministère de la Culture, Paris)
Emmanuel Schang (Université d'Orléans)
Richard Walter  (CNRS, Université d'Orléans)

Sponsors

Agence Nationale pour la Recherche (ANR) : projet VARILING

Information and Contact / Informations et Contact

Email: catcod2008 at ens-lsh.fr
Web: http://www.catcod.org

_____________________________________________________________
Serge Heiden, slh at ens-lsh.fr, https://weblex.ens-lsh.fr
ENS-LSH/CNRS - ICAR UMR5191, Institut de Linguistique Française
15, parvis René Descartes 69342 Lyon BP7000 Cedex, tél. +33(0)622003883

-------------------------Message 2 ---------------------------------- 
Date: Tue, 16 Sep 2008 11:35:42
From: Tom Recht [trecht at berkeley.edu]
Subject: Berkeley Linguistics Society, 35th Annual Meeting

Full Title: Berkeley Linguistics Society, 35th Annual Meeting 
Short Title: BLS 35 

Date: 14-Feb-2009 - 16-Feb-2009
Location: Berkeley, CA, USA 
Contact Person: Tom Recht
Meeting Email: bls at berkeley.edu
Web Site: http://www.linguistics.berkeley.edu/BLS/ 

Linguistic Field(s): General Linguistics 

Call Deadline: 13-Nov-2008 

Meeting Description:

The 35th annual meeting of the Berkeley Linguistics Society will take place at
the University of California, Berkeley, on February 14-16, 2009. The meeting
will consist of a General Session, a Parasession, and a Special Session. 

Call for Papers

General Session 
The General Session will cover all areas of linguistic interest. We encourage
proposals from diverse theoretical frameworks and also welcome papers on
language-related topics from disciplines such as anthropology, cognitive
science, literature, neuroscience, and psychology. 

Invited Speakers 
- William Croft, University of New Mexico
- William Hanks, University of California, Berkeley

Parasession: Negation
The Parasession invites papers on negation from various theoretical perspectives
(cognitive, historical, typological, generative, functional, etc.) and in
relation to any linguistic sub-domain (semantic, pragmatic, syntactic,
morpho-phonological, lexical).  Submissions that introduce new data from
understudied languages are especially welcome.

Invited Speakers 
- Laurence Horn, Yale University
- William Ladusaw, University of California, Santa Cruz

Special Session: Non-speech modalities
The Special Session will explore non-speech linguistic modalities such as sign
language and gesture, whether independently or in relation to spoken language.
We invite papers on grammatical topics as well as those dealing with issues of
language acquisition, sociolinguistics, psycholinguistics and cognitive science.
Papers with historical and typological perspectives are also encouraged.

Invited Speakers 
- Ulrike Zeshan, University of Central Lancashire
- Adam Kendon, University of Pennsylvania

Submission Guidelines 

Deadline 
Abstracts must be received electronically by 5:00 p.m. Pacific Standard Time,
Thursday, 13 November 2008. No late submissions can be accepted. Authors will be
notified of (non-) acceptance by mid-December. 

Guidelines 
An author may submit at most one single-authored and one joint abstract. In the case of
joint authorship, one address should be designated for communication with BLS.
Abstracts should be as specific as possible, with a statement of topic,
approach, and conclusions, and must fit on one page in 12-point font with 1"
margins. So that the review process may remain anonymous, authors should not
include their names or otherwise reveal their identity anywhere on this page.
Data and examples must be given within the body of the text rather than at the
end, though references may be included on a separate page if necessary. 

Submissions 
All abstracts must be submitted electronically as PDF documents. Further
instructions for electronic submissions are available at
http://linguistics.berkeley.edu/BLS/abstracts.html. 

Presentation and Publication 
Presentations are allotted 20 minutes plus 10 minutes for questions. Presented
papers are published in the BLS Proceedings. Authors agree to provide
camera-ready copy (up to 12 pages) by 15 May 2009.
