[Corpora-List] Segmenting dialogue corpora (Yorick Wilks)

Maria Georgescul Maria.Georgescul at eti.unige.ch
Mon Oct 27 13:29:05 UTC 2008


Hello,

We ran experiments with discriminative and generative machine
learning techniques for automatically structuring text into linear,
non-overlapping thematic episodes. In particular, we investigated
topic segmentation performance on multi-party dialogues using the
ICSI data.
Here are a few references regarding the results we obtained:

- M. Georgescul, A. Clark and S. Armstrong, "An Analysis of Quantitative 
Aspects in the Evaluation of Thematic Segmentation Algorithms", The 7th 
SIGdial Workshop on Discourse and Dialogue, Sydney, 144-152, 2006.

- M. Georgescul, A. Clark and S. Armstrong, "Word Distributions for 
Thematic Segmentation in a Support Vector Machine Approach", The 10th 
Conference on Computational Natural Language Learning (CoNLL-X), 
101-109, New York, USA, 2006.

- M. Georgescul, A. Clark and S. Armstrong, "Exploiting Structural 
Meeting-Specific Features for Topic Segmentation", Actes de la 14ème 
Conférence sur le Traitement Automatique des Langues Naturelles, 
Association pour le Traitement Automatique des Langues, Toulouse, 
France, 2007.

- M. Georgescul, A. Clark and S. Armstrong, "A Comparative Study of 
Mixture Models for Automatic Topic Segmentation of Multiparty 
Dialogues", The 3rd International Joint Conference on Natural Language 
Processing (IJCNLP), Hyderabad, India, January 7-12, 2008.


Best regards,
Maria Georgescul
--
ISSCO/TIM, ETI
University of Geneva



-------- Original Message --------
Subject: Corpora Digest, Vol 16, Issue 23
Date: Sat, 25 Oct 2008 15:00:15 +0200
From: corpora-request at uib.no
Reply-To: corpora at uib.no
To: corpora at uib.no

Today's Topics:

    1. Re:  Free POS tagger (Niels Ott)
    2. Re:  Segmenting dialogue corpora (Yorick Wilks)
    3. Re:  Segmenting dialogue corpora (John Niekrasz)
    4. Re:  Segmenting dialogue corpora (John Niekrasz)


----------------------------------------------------------------------

Message: 1
Date: Fri, 24 Oct 2008 17:01:41 +0200
From: Niels Ott <nott at sfs.uni-tuebingen.de>
Subject: Re: [Corpora-List] Free POS tagger
Cc: corpora at uib.no

Hi,

Niels Ott wrote:
> I know that there are some models available. but the person asking was 
> interested in POS tagging English and at least me and myself can't find 
> an OpenNLP model for English tagging at the download site. 
> http://opennlp.sourceforge.net/models/english/

As it turned out, the models are located in the parser directory:
http://opennlp.sourceforge.net/models/english/parser/

Sorry for any inconvenience I may have created.

Best,

     Niels


-- 
Niels Ott
Computational Linguist (B.A.)
http://www.drni.de/niels/



------------------------------

Message: 2
Date: Fri, 24 Oct 2008 16:40:15 +0100
From: Yorick Wilks <Yorick at dcs.shef.ac.uk>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: CORPORA List <corpora at uib.no>

Does anyone out there have experience with, or recommendations on, attempts
to segment dialogue corpora into topic-coherent "tiles" (by methods like
Marti Hearst's)? We have not found applying her (prose) methods very
productive for the dialogue corpora we have, and I would be glad to hear of
any positive experiences researchers have had in doing this.

Yorick Wilks
Sheffield University



> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>




------------------------------

Message: 3
Date: Fri, 24 Oct 2008 22:03:23 +0100
From: "John Niekrasz" <john.niekrasz at gmail.com>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: "Yorick Wilks" <Yorick at dcs.shef.ac.uk>
Cc: CORPORA List <corpora at uib.no>

Maybe try Michel Galley's LCSeg software.

John Niekrasz
Edinburgh University




------------------------------

Message: 4
Date: Fri, 24 Oct 2008 22:08:47 +0100
From: "John Niekrasz" <john.niekrasz at gmail.com>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: "Yorick Wilks" <Yorick at dcs.shef.ac.uk>
Cc: CORPORA List <corpora at uib.no>

The relevant publication describing it is:

Michel Galley, Kathleen McKeown, Eric Fosler-Lussier, Hongyan Jing
(2003). Discourse Segmentation of Multi-Party Conversation.  In
Proceedings of the 41st Annual Meeting of the Association for
Computational Linguistics (ACL 2003). July 21-26, 2003. Sapporo,
Japan.

http://www-nlp.stanford.edu/~mgalley/papers/mtgseg.pdf

John




----------------------------------------------------------------------
End of Corpora Digest, Vol 16, Issue 23
***************************************


