[Corpora-List] Segmenting dialogue corpora (Yorick Wilks)
Maria Georgescul
Maria.Georgescul at eti.unige.ch
Mon Oct 27 13:29:05 UTC 2008
Hello,
We ran experiments with discriminative and generative machine learning
techniques for automatically structuring text into linear,
non-overlapping thematic episodes. In particular, we investigated
topic segmentation performance on multi-party dialogues using the
ICSI data.
Here are a few references regarding the results we obtained:
- M. Georgescul, A. Clark and S. Armstrong, "An Analysis of Quantitative
Aspects in the Evaluation of Thematic Segmentation Algorithms", The 7th
SIGdial Workshop on Discourse and Dialogue, Sydney, 144-152, 2006.
- M. Georgescul, A. Clark and S. Armstrong, "Word Distributions for
Thematic Segmentation in a Support Vector Machine Approach", The 10th
Conference on Computational Natural Language Learning (CoNLL-X),
101-109, New York, USA, 2006.
- M. Georgescul, A. Clark and S. Armstrong, "Exploiting Structural
Meeting-Specific Features for Topic Segmentation", Actes de la 14ème
Conférence sur le Traitement Automatique des Langues Naturelles,
Association pour le Traitement Automatique des Langues, Toulouse,
France, 2007.
- M. Georgescul, A. Clark and S. Armstrong, "A Comparative Study of
Mixture Models for Automatic Topic Segmentation of Multiparty
Dialogues", The 3rd International Joint Conference on Natural Language
Processing (IJCNLP), Hyderabad, India, January 7-12, 2008.
Best regards,
Maria Georgescul
--
ISSCO/TIM, ETI
University of Geneva
-------- Original Message --------
Subject: Corpora Digest, Vol 16, Issue 23
Date: Sat, 25 Oct 2008 15:00:15 +0200
From: corpora-request at uib.no
Reply-To: corpora at uib.no
To: corpora at uib.no
Today's Topics:
1. Re: Free POS tagger (Niels Ott)
2. Re: Segmenting dialogue corpora (Yorick Wilks)
3. Re: Segmenting dialogue corpora (John Niekrasz)
4. Re: Segmenting dialogue corpora (John Niekrasz)
----------------------------------------------------------------------
Message: 1
Date: Fri, 24 Oct 2008 17:01:41 +0200
From: Niels Ott <nott at sfs.uni-tuebingen.de>
Subject: Re: [Corpora-List] Free POS tagger
Cc: corpora at uib.no
Hi,
Niels Ott wrote:
> I know that there are some models available, but the person asking was
> interested in POS tagging English, and I, at least, can't find
> an OpenNLP model for English tagging at the download site.
> http://opennlp.sourceforge.net/models/english/
As it turned out, the models are located in the parser directory:
http://opennlp.sourceforge.net/models/english/parser/
Sorry for any inconvenience I may have created.
Best,
Niels
--
Niels Ott
Computational Linguist (B.A.)
http://www.drni.de/niels/
------------------------------
Message: 2
Date: Fri, 24 Oct 2008 16:40:15 +0100
From: Yorick Wilks <Yorick at dcs.shef.ac.uk>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: CORPORA List <corpora at uib.no>
> Does anyone out there have experience or recommendations on attempts
> to segment dialogue corpora into "tiles" (by methods like Marti
> Hearst's) into topic-coherent segments. We have not found applying
> her (prose) methods very productive for the dialogue corpora we have
and I would be glad to hear of any positive experiences of
researchers in doing this.
Yorick Wilks
Sheffield University
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
------------------------------
Message: 3
Date: Fri, 24 Oct 2008 22:03:23 +0100
From: "John Niekrasz" <john.niekrasz at gmail.com>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: "Yorick Wilks" <Yorick at dcs.shef.ac.uk>
Cc: CORPORA List <corpora at uib.no>
Maybe try Michel Galley's LCSeg software.
John Niekrasz
Edinburgh University
On Fri, Oct 24, 2008 at 4:40 PM, Yorick Wilks <Yorick at dcs.shef.ac.uk> wrote:
>> Does anyone out there have experience or recommendations on attempts
>> to segment dialogue corpora into "tiles" (by methods like Marti
>> Hearst's) into topic-coherent segments. We have not found applying
>> her (prose) methods very productive for the dialogue corpora we have
> and I would be glad to hear of any positive experiences of
> researchers in doing this.
> Yorick Wilks
> Sheffield University
------------------------------
Message: 4
Date: Fri, 24 Oct 2008 22:08:47 +0100
From: "John Niekrasz" <john.niekrasz at gmail.com>
Subject: Re: [Corpora-List] Segmenting dialogue corpora
To: "Yorick Wilks" <Yorick at dcs.shef.ac.uk>
Cc: CORPORA List <corpora at uib.no>
The relevant publication describing it is:
Michel Galley, Kathleen McKeown, Eric Fosler-Lussier, Hongyan Jing
(2003). Discourse Segmentation of Multi-Party Conversation. In
Proceedings of the 41st Annual Meeting of the Association for
Computational Linguistics (ACL 2003). July 21-26, 2003. Sapporo,
Japan.
http://www-nlp.stanford.edu/~mgalley/papers/mtgseg.pdf
John
On Fri, Oct 24, 2008 at 10:03 PM, John Niekrasz
<john.niekrasz at gmail.com> wrote:
> Maybe try Michel Galley's LCSeg software.
>
> John Niekrasz
> Edinburgh University
>
> On Fri, Oct 24, 2008 at 4:40 PM, Yorick Wilks <Yorick at dcs.shef.ac.uk> wrote:
>>> Does anyone out there have experience or recommendations on attempts
>>> to segment dialogue corpora into "tiles" (by methods like Marti
>>> Hearst's) into topic-coherent segments. We have not found applying
>>> her (prose) methods very productive for the dialogue corpora we have
>> and I would be glad to hear of any positive experiences of
>> researchers in doing this.
>> Yorick Wilks
>> Sheffield University
----------------------------------------------------------------------
End of Corpora Digest, Vol 16, Issue 23
***************************************