16.414, Qs: Internet Chat Corpora; Typology Island Constraints

LINGUIST List linguist at linguistlist.org
Thu Feb 10 21:34:21 UTC 2005


LINGUIST List: Vol-16-414. Thu Feb 10 2005. ISSN: 1068 - 4875.

Subject: 16.414, Qs: Internet Chat Corpora; Typology Island Constraints

Moderators: Anthony Aristar, Wayne State U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org)
        Sheila Collberg, U of Arizona
        Terry Langendoen, U of Arizona

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Steven Moran <steve at linguistlist.org>
================================================================

We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.

In addition to posting a summary, we'd like to remind people that it
is usually a good idea to personally thank those individuals who have
taken the trouble to respond to the query.

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.


===========================Directory==============================

1)
Date: 09-Feb-2005
From: Stuart McCaul < mccauls at tcd.ie >
Subject: Internet Chat Corpora

2)
Date: 09-Feb-2005
From: Inbal Arnon < inbalar at stanford.edu >
Subject: The Typology of Island Constraints

	
-------------------------Message 1 ----------------------------------
Date: Thu, 10 Feb 2005 16:32:34
From: Stuart McCaul < mccauls at tcd.ie >
Subject: Internet Chat Corpora


My project is to implement an Internet chat-room text filter, based on
Bayesian methods, that will categorise chat-room conversations. The filter
must be trained on pre-categorised text, and I am having trouble finding a
categorised corpus of chat-room conversations.

Do you know of such a corpus? Or do you have any other advice for me?

If no such collection of conversations is available, I will train my
filter on the Reuters corpus or a similar collection.

I am a fourth-year undergraduate student at Trinity College, Dublin, studying
for a bachelor's degree in Information and Communications Technology.
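
A minimal sketch of the kind of filter described above, assuming a
multinomial naive Bayes model with add-one smoothing (one common reading of
"Bayesian" text categorisation, not necessarily the poster's design). The
category labels, sample messages, and function names are hypothetical and
stand in for a real pre-categorised corpus.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; real chat text would need more care.
    return text.lower().split()


def train(labelled_messages):
    """Collect per-category word counts from (category, text) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for category, text in labelled_messages:
        doc_counts[category] += 1
        word_counts[category].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the category with the highest posterior log-probability."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, float("-inf")
    for category in doc_counts:
        # Log prior: share of training messages in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = word_counts[category][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical pre-categorised training messages.
training = [
    ("sport", "did you see the match last night"),
    ("sport", "great goal in the second half"),
    ("music", "that new album is brilliant"),
    ("music", "the band played a great gig"),
]
model = train(training)
print(classify("what a goal in the match", model))
```

With only four training messages the sketch is purely illustrative; the
same train and classify steps would apply unchanged to a larger categorised
collection such as the Reuters corpus mentioned above.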

Linguistic Field(s): Computational Linguistics
                     Text/Corpus Linguistics

Subject Language(s): English (ENG)



	
-------------------------Message 2 ----------------------------------
Date: Thu, 10 Feb 2005 16:32:37
From: Inbal Arnon < inbalar at stanford.edu >
Subject: The Typology of Island Constraints

	

I am part of a research group that is investigating island constraints and
their relation to processing difficulty. We are currently gathering
information about cross-linguistic differences in island constraints and
about the effects of finiteness, lexical choice (e.g. bridge verbs), etc. on
extraction difficulty, e.g. contrasts like the following (where >/ means
'at least as acceptable as'):

Which book were they wondering whether or not to read? >/
Which book were they wondering whether or not they should read? >/
Which book were they wondering whether or not he had read?

Which symphony did Schubert die before finishing? >
Which symphony did Schubert die before he finished?

We'd welcome pointers to relevant data from any language, or to literature
discussing cross-linguistic differences in island phenomena. We're
interested in cross-linguistic variation of all sorts, including subjacency
or superiority effects, violations of the coordinate structure constraint,
the left branch condition, etc.

Linguistic Field(s): Syntax
                     Typology






-----------------------------------------------------------
LINGUIST List: Vol-16-414	