[Corpora-List] Final call: Treebanks for spoken language, NoDaLiDa05

Janne Bondi Johannessen j.b.johannessen at ilf.uio.no
Mon Feb 7 15:04:01 UTC 2005


	!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
	FINAL CALL FOR PAPERS
  	!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

  	NODALIDA 2005: http://phon.joensuu.fi/nodalida2005/
  	SPECIAL SESSION ON TREEBANKS:
	http://www.hf.uio.no/tekstlab/treebank_workshop

	SPECIAL SESSION ON TREEBANKS FOR
	SPOKEN LANGUAGE AND DISCOURSE

  	JOENSUU, FINLAND, THURSDAY MAY 19, 2005

  	ORGANIZED BY THE NORDIC TREEBANK NETWORK

  	Treebanks are a language resource that provides annotations of
  	natural languages at various levels: at the morpheme level, the
  	word level, the phrase level, the discourse level, and the level
  	of functor-argument structure. Treebanks have become crucially
  	important for the development of data-driven approaches to natural
  	language processing, human language technologies, grammar
  	extraction and linguistic research in general.

  	Existing spoken language treebanks include the Switchboard section
  	of the Penn Treebank [1], and the CHRISTINE [2] and ICE-GB [3]
  	treebanks for English; the VERBMOBIL [4] treebanks for English,
  	German, and Japanese; and the CGN [5] treebank for Dutch. Existing
  	discourse treebanks include the English RST Corpus [6] and the
  	Penn Discourse Treebank [7]. The DAMSL project [8] and the
  	Gothenburg Dialogue Coding Schemas [9] address the problem of
  	annotating dialogues with speech act relations between utterances.

  	The special NODALIDA session on treebanks aims to provide a forum
  	where researchers and advanced students with an interest in
  	treebanks can exchange ideas, in particular on how to extend
  	treebanks from syntactic annotations of written language to
  	treebanks that also include annotations of the structure of spoken
  	language with respect to syntax, discourse structure, and/or
  	speech acts.


  	TOPICS OF INTEREST

	Invited speaker: Bonnie Webber, University of Edinburgh.

	We invite submission of papers on topics relevant to treebanks
	in general, and spoken language and discourse treebanks in
	particular, including but not limited to:

  		* design principles and annotation schemes for annotating
  		  spoken language and discourse treebanks with respect to
  		  syntax, discourse structure, and/or speech acts;

  		* automatic tools for creating spoken language and discourse
  		  treebanks, and how to adapt tools designed for creating
  		  written language treebanks to spoken language and discourse;

  		* comparing spoken language and discourse annotations with
  		  written language annotations, and identifying the most
  		  important challenges in spoken language and discourse
  		  annotation.

	While we particularly encourage submissions on spoken language
	and discourse treebanks, we also encourage submissions on other
	treebank topics.



  	SUBMISSIONS

  	We invite extended abstracts (approximately 1500 words) describing
  	existing research connected to the topics of the special session.
  	Submissions are non-anonymous and should include: title;
  	author(s); affiliation(s); and contact author's e-mail address,
  	postal address, telephone and fax numbers.

  	Abstracts should be sent to: mtk at id.cbs.dk

  	Each presentation at the workshop will be 30 minutes long (20
  	minutes for the talk and 10 minutes for questions and
  	discussion). The final version of an accepted paper may not
  	exceed 12 A4 pages.


  	A SAMPLE SPOKEN LANGUAGE AND DISCOURSE TREEBANK

  	We strongly encourage participants and speakers in the special
  	session on spoken language and discourse treebanks to contribute
  	a small sample treebank, which should preferably:

  		* be based on a small corpus of spontaneous spoken dialogue
  		  consisting of 500-1500 words in any language;
  		* contain English glosses to ensure that the treebank is
  		  accessible to a wider audience;
  		* include annotations of discourse relations, speech acts, or
  		  similar relations that connect sentences and utterances made
  		  by different speakers into larger units;
  		* contain annotated examples of overlapping dialogue,
  		  including utterances where one speaker completes an
  		  utterance started by another speaker.

  	The sample treebank should be submitted by sending the following
  	three files to mtk at id.cbs.dk before 20th February 2005:

  		* a plain text abstract of 50-200 words that briefly describes
  		  how the sample treebank was created, possibly with
  		  hyperlinks to more detailed information about the treebank;
  		* a PDF file containing a human-readable visualization of the
  		  treebank;
  		* optionally, the source files for the sample treebank,
  		  preferably encoded in TIGER-XML format.

  	The sample treebanks will be made publicly available before the
  	NODALIDA conference.


  	IMPORTANT DATES

  		Deadline for submission of abstracts and treebank samples
  		to the treebank session
  		February 20, 2005

  		Notification of acceptance
  		March 25, 2005

		Special session on treebanks
  		Thursday, May 19, 2005

  		Final version of paper for proceedings
  		June 20, 2005


  	PROCEEDINGS
  	Papers presented at the workshop will be invited to appear in
  	the workshop proceedings (after a reviewing process).

  	PROGRAM COMMITTEE

  	Matthias Trautner Kromann (mtk at id.cbs.dk)
  	Peter Juel Henrichsen (pjuel at id.cbs.dk)
  	Janne Bondi Johannessen (jannebj at ilf.uio.no)


IMPORTANT WEBSITES:

SPECIAL TREEBANK SESSION: http://www.hf.uio.no/tekstlab/treebank_workshop
NORDIC TREEBANK NETWORK: http://w3.msi.vxu.se/~nivre/research/nt.html
NODALIDA: http://phon.joensuu.fi/nodalida2005/



  	LINKS

  	[1] http://www.cis.upenn.edu/~treebank/home.html
  	[2] http://www.grsampson.net/RChristine.html
  	[3] http://www.ucl.ac.uk/english-usage/ice-gb
  	[4] http://verbmobil.dfki.de/cgi-bin/verbmobil/htbin/doc-access.cgi
  	[5] http://lands.let.kun.nl/cgn/doc_English/topics/version_1.0/annot/syntax/info.htm
  	[6] http://www.isi.edu/~marcu/discourse
  	[7] http://www.cis.upenn.edu/~pdtb
  	[8] http://www.cs.rochester.edu/research/cisd/resources/damsl/
  	[9] http://www.ling.gu.se/~jens/publications/docs076-100/093.pdf


