Corpora: corpora: driver navigation

LDC Office ldc at unagi.cis.upenn.edu
Fri Feb 25 18:36:58 UTC 2000


Dear Angela,

The Linguistic Data Consortium (LDC) distributes two corpora
which might be of interest.  The HCRC Map Task Corpus
contains a total of about 18 hours of spontaneous speech that
was recorded from 128 two-person conversations, involving 64
different speakers, who were all students at the University
of Glasgow, 61 of them being native Scots. The conversations
were carried out in an experimental setting, in which each
participant has a schematic map in front of them, not visible
to the other.  Each map consists of an outline and
roughly a dozen labelled features (e.g. "a white cottage",
"an oak forest", "Green Bay", etc.). Most features are common
to the two maps, but not all. One map has a route drawn in,
the other does not. The task is for the participant without
the route to draw one on the basis of discussion with the
participant with the route. In addition to the conversations,
each speaker provides a wordlist reading, consisting of the
major vocabulary items contained in the conversations.

The Road Rally corpus was designed for the development and
testing of word-spotting systems and was collected in a
conversational domain using a road rally planning task as the
topic. The corpus actually consists of 2 sub-corpora,
"Stonehenge" and "Waterloo". The Stonehenge corpus contains
road rally planning conversations as well as some read speech
collected using high quality microphones and a
telephone-simulating filter. The Waterloo corpus contains
read speech from the road rally planning domain, which was
collected using actual telephone lines.

The Stonehenge corpus was collected from subjects
using telephone handsets which were modified to contain a
high quality microphone. To gather conversational data, 2
talkers were located in separate rooms, given a road map and
asked to participate in a road rally planning task. Their
objective was to form a path between two locations on the map
which would maximize their road rally point score. They were
also given a time limit in which to complete the task to
increase their responsiveness.

The Waterloo corpus was collected as an extension to
Stonehenge to provide similar domain speech under different
conditions. The corpus was collected from subjects using
conventional telephones and dial-up telephone lines in the
Massachusetts area.

For further information on LDC resources please visit our
Catalog at http://morph.ldc.upenn.edu/Catalog/.

Please feel free to contact me if you have any questions
regarding the corpora.

Best,
Shannon Sears
Manager, Intellectual Property Rights and Membership
----------------------------------------------------------------------
Linguistic Data Consortium          Phone: (215) 898-0464
3615 Market Street                  Fax:   (215) 573-2175
Suite 200                           email: ssears at ldc.upenn.edu
Philadelphia, PA 19104-2608         www: http://www.ldc.upenn.edu

> Date: Fri, 25 Feb 2000 10:16:22 -0800
> From: Angela Kessell <akessell at isl.hrl.hac.com>
> Organization: HRL Laboratories
> To: corpora at hd.uib.no
> Subject: Corpora: corpora:  driver navigation
> Precedence: bulk
>
> Dear All,
>
> Can anyone tell me if corpora exist of people giving directions or
> helping each other navigate a car or drive to a specific location?
>
> Thank you for your help,
> Angela
>
>


