Appel: Colloque La cooccurrence, du fait statistique au fait textuel

Fri Jun 10 20:25:37 UTC 2011

Date: Fri, 10 Jun 2011 10:46:42 +0200
From: Sylvie Mellet <Sylvie.Mellet at unice.fr>
Message-ID: <4DF1D9F2.8080505 at unice.fr>

*Appel à communications*

*/La cooccurrence : du fait statistique au fait textuel/*

Besançon, 8-9-10 février 2012

(/english version below/)

Depuis 60 ans et l'historique affirmation de Firth « you shall know a
word by the company it keeps », les travaux sur la cooccurrence se
sont multipliés dans le monde pour devenir un champ à part entière de
la linguistique de corpus et de la linguistique textuelle.

Au-delà des perspectives très diverses (études phraséologiques,
extraction d'expressions idiomatiques, traduction automatique,
désambiguïsation sémantique d'homographes, fouille de textes,
description et modélisation textuelles, mise à jour des thématiques du
discours, etc.), au-delà aussi des variantes terminologiques pas
uniquement imputables à des effets de traduction (/cooccurrence/,
/collocation/, /colligation/, /corrélats/, /associations/, etc.), ces
travaux foisonnants reposent sur une posture contextualisante commune
et sur une approche probabiliste du langage partagée [London school
(Firth et Halliday) puis Birmingham school (Sinclair), Laboratoire de
Saint-Cloud (Tournier), etc.]. Le postulat est en effet ici que le
sens naît toujours en contexte, qu'il se construit à partir du
co-texte, et la cooccurrence représente la seule forme objectivable,
minimale mais calculable, de ce co-texte.

Le colloque « La cooccurrence : du fait statistique au fait textuel »
(Besançon, 8-10 février 2012) veut aujourd'hui faire le point sur les
études en cours, confronter les méthodologies, analyser les enjeux.

Sans exclure les études lexicologiques, phraséologiques ou le
traitement automatique de la langue dont les avancées théoriques aussi
bien que les nouveaux algorithmes de calcul, les outils d'extraction
et de représentation des réalités cooccurrentielles pourront être
présentés, on portera une attention particulière, dans la voie
lointainement initiée par Halliday & Hasan, à *la cooccurrence comme
facteur primordial de la textualité*.

Partant du simple calcul des paires de mots statistiquement
cooccurrentes dans le corpus ou du simple repérage des mots
statistiquement associés à un mot-pôle donné, les contributions
devront chercher à apporter un éclairage novateur sur des phénomènes
cooccurrentiels plus complexes comme les réseaux cooccurrentiels
intriqués qui structurent un texte, la cooccurrence généralisée
(Viprey) ou la poly-cooccurrence (Martinez) qui construisent des
structures d'équivalence, ou de résonance, non obvies, ou encore sur
les cooccurrences indirectes (A => B => C) ou de deuxième génération
qui à force d'itération prétendent épuiser le texte. Si le texte est
aujourd'hui perçu comme une entité réticulaire avec ses récurrences,
ses échos, ses rhizomes, la cooccurrence doit permettre de le décrire
et le modéliser.

L'axe syntagmatique et la linéarité orientée du texte pourront aussi
être problématisés. D'abord purement statistique, la définition de la
cooccurrence comme coprésence régulière de deux unités linguistiques
dans une fenêtre textuelle donnée pourra être enrichie par des
contraintes distributionnelles dans la tradition harrissienne ou des
contraintes de contiguïté, d'orientation, de place ou d'enchaînement
pour rejoindre la notion de /pattern/, de /segment répété/ (Salem) ou
de /motif/ (Longrée et Mellet).

Enfin, une attention particulière pourra être portée aux cooccurrences
sur des niveaux linguistiques étagés, c'est-à-dire à celles qui
concernent non pas seulement deux mots comme c'est habituellement le
cas dans le cadre du traitement du vocabulaire, mais deux éléments
grammaticaux, ou, de manière croisée, des éléments lexicaux et
grammaticaux, des lemmes et des étiquettes morphosyntaxiques, etc.

Certaines communications pourront porter sur l'histoire proprement
dite de cette dimension dans la constitution et l'évolution de la
linguistique textuelle, dans les domaines francophone, anglo-saxon,
scandinave, mais aussi dans le reste du monde.

À l'exception de ces dernières, les propositions de communication
devront prendre appui sur l'étude de corpus précisément définis et,
par-delà le traitement et l'analyse des données, mettre fortement en
évidence l'apport méthodologique et/ou théorique de la contribution.

Les langues du colloque seront le français et l'anglais.

*Mots clés :*

Cooccurrence
Segment répété
Collocation
Colligation
Corrélats
Association lexicale
Contextualisation sémantique
Linguistique de corpus
Textualité

*Comité scientifique* : Gaëtane Dostie, Serge Heiden, Margareta
Kastberg, Jean-Marc Leblanc, Dominique Legallois, Dominique Longrée,
Lita Lundqvist, Damon Mayaffre, Sylvie Mellet, Henning Nølke, Max
Silberstein, Jean Véronis, Jean-Marie Viprey.

*Calendrier :*

- Colloque : 8-10 février 2012

- Date limite de soumission : 10 septembre 2011

- Avis d'acceptation aux auteurs : 15 octobre 2011

Les actes du colloque seront publiés, après un nouveau processus de
soumission, relecture et sélection, dans le numéro 11 de la revue
/CORPUS/, à paraître en novembre 2012.

*Conference call*

*/Co-occurrence: from a statistical to a textual phenomenon/*

*Besançon, 8-9-10 February 2012*

Call for papers

60 years ago, Firth declared, "you shall know a word by the company
it keeps." Since he made that historic statement, studies on
co-occurrence patterns have multiplied worldwide and have now
become a field in their own right, existing within the larger fields
of corpus linguistics and text linguistics.

Work on co-occurrence patterns has adopted a great variety of
perspectives, addressing, for example, phraseological phenomena,
automatic extraction of idioms, machine translation, homograph
disambiguation, text mining, textual description and modelling, and
retrieval of discourse themes. The terminology has also displayed
notable diversity (which may not be entirely explained by translation
effects): the phenomena examined have been referred to as
co-occurrence patterns, collocations, colligations, correlates,
associations, and other terms. Despite these visible differences,
studies on co-occurrence patterns all share a common postulate
regarding the essential role of context in the elaboration of word
meaning, and they all bring a probabilistic approach to the study of
language [see London school (Firth and Halliday), Birmingham school
(Sinclair), Laboratoire de Saint-Cloud (Tournier), etc.]. In brief,
they assume that (i) meaning always arises in context and is
elaborated on the basis of the co-text; and (ii) co-occurrence is the
only objectifiable form of that co-text, minimal yet calculable.

The colloquium "Co-occurrence: from a statistical to a textual
phenomenon", to be held in Besançon from 8-10 February 2012, intends
to take stock of current studies, to compare methodologies, and to
analyse the scientific stakes involved.

Lexicological and phraseological studies, as well as work on natural
language processing, may be submitted. Contributions may also present
new algorithms, or new tools for extracting and modelling
co-occurrence phenomena. Work that embraces the path opened long ago
by Halliday and Hasan, /i.e./ that considers *textuality as being
primarily built by co-occurrence facts*, would be most appreciated.

Whether the contributions rely on the simple computation of word pairs
that statistically co-occur within a given corpus, or on the simple
identification of words that are statistically associated with a
given pole word, it is expected that they will shed light on more
complex structures such as the intricate co-occurrence networks
organizing a text, generalised co-occurrences (Viprey), or
polyco-occurrences (Martinez), which contribute to the establishment
of underlying equivalence or resonance structures. Contributions may
also address indirect co-occurrence patterns A => B => C (also called
second-generation co-occurrence patterns) which, if iterated several
times, help reveal the content of a text. Indeed, since a text is now
widely considered as a reticular entity formed by recurrent patterns,
echoes, and rhizomes, co-occurrence patterns should make text
description and modelling possible.
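
By way of illustration only, the indirect pattern A => B => C can be
sketched in Python as a two-step walk over a graph of direct
co-occurrences. The helper names `build_graph` and `second_generation`
are hypothetical, and the input pairs are assumed to have already been
selected by some statistical test that is not shown here.

```python
from collections import defaultdict


def build_graph(pairs):
    """Turn (word, word) pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def second_generation(graph, pole):
    """Words reached through a direct co-occurrent of `pole`,
    excluding the pole itself and its direct co-occurrents."""
    direct = graph.get(pole, set())
    indirect = set()
    for neighbour in direct:
        indirect |= graph.get(neighbour, set())
    return indirect - direct - {pole}


# Hypothetical direct co-occurrence pairs (assumed already filtered).
graph = build_graph([("A", "B"), ("B", "C"), ("A", "D")])
print(second_generation(graph, "A"))
```

Applying the same step again to the resulting set would give a third
generation, and so on, which is one way to read the iteration
mentioned above.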

The issues of the syntagmatic axis and the linear orientation of a
text may also be debated. Co-occurrence has first been defined as a
purely statistical phenomenon, /i.e./ as the regular co-presence of
two linguistic units within a given textual window; but this
definition may be enriched by adding distributional constraints
(following the tradition of Harris's work) or constraints affecting
adjacency relations, or the orientation, position and/or sequencing of
given units; work of this kind would then meet the notions of
/pattern/, /repeated segments/ (Salem) or /motifs/ (Longrée and
Mellet).
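
As a purely illustrative sketch of the window-based definition above
(not a method prescribed by the organisers), the following Python
function counts pairs of units that are co-present within a window.
The name `cooccurrences`, the `window` size and the `ordered` flag are
assumptions introduced here; `ordered=True` keeps the left-to-right
orientation of each pair, one of the constraints mentioned above. Only
raw counts are produced: a real study would add a statistical
association measure on top of them.

```python
from collections import Counter


def cooccurrences(tokens, window=5, ordered=False):
    """Count pairs of distinct tokens at most `window` positions apart.

    With ordered=False, (a, b) and (b, a) are merged into one sorted pair;
    with ordered=True, the pair keeps its order of appearance in the text.
    """
    counts = Counter()
    for i, left in enumerate(tokens):
        # Look only to the right so that each pair is counted once.
        for right in tokens[i + 1 : i + 1 + window]:
            if left == right:
                continue
            pair = (left, right) if ordered else tuple(sorted((left, right)))
            counts[pair] += 1
    return counts


tokens = "a b c a b".split()
print(cooccurrences(tokens, window=1))
print(cooccurrences(tokens, window=1, ordered=True))
```

With `window=1` the function reduces to counting adjacent pairs, which
is the adjacency constraint in its strictest form.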

Finally, special attention may also be given to multi-level
co-occurrence patterns, /i.e./ patterns connecting two grammatical
units (and not only two lexical items, as is usually the case when one
analyses the vocabulary of a given corpus); studies may also cross
linguistic levels and focus on the co-occurrence of lexical units with
grammatical units, lemmas with morpho-syntactic tags, etc.
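
One possible reading of such cross-level co-occurrence, given here
only as a hedged Python sketch, pairs each lemma with the
morpho-syntactic tags of its neighbours. The input format (a list of
`(lemma, tag)` tuples), the tag labels and the function name
`lemma_tag_cooccurrences` are all assumptions made for this example.

```python
from collections import Counter


def lemma_tag_cooccurrences(tagged, window=2):
    """Count (lemma, neighbouring tag) pairs within `window` positions
    on either side of each lemma."""
    counts = Counter()
    for i, (lemma, _own_tag) in enumerate(tagged):
        start = max(0, i - window)
        stop = min(len(tagged), i + window + 1)
        for j in range(start, stop):
            if j != i:  # skip the lemma's own tag
                counts[(lemma, tagged[j][1])] += 1
    return counts


# Hypothetical lemmatised and tagged sequence.
tagged = [("the", "DET"), ("cat", "NOUN"), ("sleep", "VERB")]
print(lemma_tag_cooccurrences(tagged, window=1))
```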

A certain number of submissions concerned with the history of the
notion of /co-occurrence/ and its role in the emergence and evolution
of textual linguistics will be accepted; the tradition may be
investigated in French-speaking countries, in the Anglo-Saxon and
Scandinavian areas, or in the rest of the world.

With the exception of this last (historical) orientation, submissions
must rely on precisely defined corpora, and, beyond the processing and
analysis of the data, they must emphasize the methodological and/or
theoretical contribution to the field.

The languages of the conference are French and English.

*Key words*

Co-occurrences
Repeated segments
Collocations
Colligations
Correlates
Lexical associations
Semantic contextualization
Corpus linguistics
Textuality

*Scientific committee*

Gaëtane Dostie, Serge Heiden, Margareta Kastberg, Jean-Marc Leblanc,
Dominique Legallois, Dominique Longrée, Lita Lundqvist, Damon
Mayaffre, Sylvie Mellet, Henning Nølke, Max Silberstein, Jean Véronis,
Jean-Marie Viprey.

*Calendar*

Conference: 8-9-10 February 2012

Deadline for paper submission: 10 September 2011

Notification of acceptance: 15 October 2011

Conference proceedings will be published, after a new process of
submission, review and selection, in /Corpus/ (11) to appear in
November 2012.

-------------------------------------------------------------------------
Message diffuse par la liste Langage Naturel <LN at cines.fr>
Informations, abonnement : http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

La liste LN est parrainee par l'ATALA (Association pour le Traitement
Automatique des Langues)
Information et adhesion  : http://www.atala.org/
-------------------------------------------------------------------------