16.414, Qs: Internet Chat Corpora; Typology Island Constraints
LINGUIST List
linguist at linguistlist.org
Thu Feb 10 21:34:21 UTC 2005
LINGUIST List: Vol-16-414. Thu Feb 10 2005. ISSN: 1068 - 4875.
Subject: 16.414, Qs: Internet Chat Corpora; Typology Island Constraints
Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>
Reviews (reviews at linguistlist.org)
Sheila Collberg, U of Arizona
Terry Langendoen, U of Arizona
Homepage: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Steven Moran <steve at linguistlist.org>
================================================================
We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.
In addition to posting a summary, we'd like to remind people that it
is usually a good idea to personally thank those individuals who have
taken the trouble to respond to the query.
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.
===========================Directory==============================
1)
Date: 09-Feb-2005
From: Stuart McCaul < mccauls at tcd.ie >
Subject: Internet Chat Corpora
2)
Date: 09-Feb-2005
From: Inbal Arnon < inbalar at stanford.edu >
Subject: The Typology of Island Constraints
-------------------------Message 1 ----------------------------------
Date: Thu, 10 Feb 2005 16:32:34
From: Stuart McCaul < mccauls at tcd.ie >
Subject: Internet Chat Corpora
My project is to implement an Internet chat-room text filter, based on
Bayesian maths, that will categorise chat-room conversations. The filter
must be trained on pre-categorised text, and I am having trouble finding a
categorised corpus of chat-room conversations.
Do you know of such a corpus? Or do you have any other advice for me?
In the event that such a collection of conversations is not available, I
will use the Reuters corpus or a similar collection to train my filter.
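For readers unfamiliar with this kind of filter, here is a minimal sketch of
one common reading of "Bayesian" text categorisation: a multinomial naive
Bayes classifier with add-one smoothing. This is an assumption, not the
poster's actual design. The class name NaiveBayesFilter, the whitespace
tokenizer, the category labels and the four training messages are all
hypothetical, invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayesFilter:
    """Hypothetical multinomial naive Bayes filter with add-one smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # messages seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, category):
        """Update the counts with one pre-categorised message."""
        self.doc_counts[category] += 1
        for word in tokenize(text):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def log_scores(self, text):
        """Return log P(category) + sum of log P(word | category)."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[category] = score
        return scores

    def classify(self, text):
        """Pick the category with the highest log score."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, made up for this sketch (not from any real corpus).
chat_filter = NaiveBayesFilter()
chat_filter.train("anyone up for a game of chess tonight", "games")
chat_filter.train("my chess opening keeps losing", "games")
chat_filter.train("the new album by that band is great", "music")
chat_filter.train("which band is playing the concert tonight", "music")

print(chat_filter.classify("is the band playing tonight"))
print(chat_filter.classify("chess game"))
```

Working in log space avoids numeric underflow on long messages. Training on
a newswire collection such as Reuters would use the same train() calls, but
chat-room vocabulary may differ enough from newswire text that accuracy on
real chat data would need to be checked separately.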
I am a 4th year undergraduate student at Trinity College, Dublin, studying
for a bachelor's degree in Information and Communications Technology.
Linguistic Field(s): Computational Linguistics
Text/Corpus Linguistics
Subject Language(s): English (ENG)
-------------------------Message 2 ----------------------------------
Date: Thu, 10 Feb 2005 16:32:37
From: Inbal Arnon < inbalar at stanford.edu >
Subject: The Typology of Island Constraints
I am part of a research group that is investigating island constraints and
their relation to processing difficulty. We are currently gathering
information about cross-linguistic differences in island constraints and
the effects of finiteness, lexical choice (e.g. bridge verbs), etc. on
extraction difficulty, e.g. contrasts like the following (>/ = 'at least as acceptable as'):
Which book were they wondering whether or not to read? >/
Which book were they wondering whether or not they should read? >/
Which book were they wondering whether or not he had read?
Which symphony did Schubert die before finishing? >
Which symphony did Schubert die before he finished?
We'd welcome pointers to relevant data from any language, or to literature
discussing cross-linguistic differences in island phenomena. We're
interested in cross-linguistic variation of all sorts, including subjacency
or superiority effects and violations of the coordinate structure
constraint, the left branch condition, etc.
Linguistic Field(s): Syntax
Typology
-----------------------------------------------------------
LINGUIST List: Vol-16-414