8.1209, FYI: LDC Collection, American Dialect Soc.
The LINGUIST List
linguist at linguistlist.org
Thu Aug 21 05:05:45 UTC 1997
LINGUIST List: Vol-8-1209. Thu Aug 21 1997. ISSN: 1068-4875.
Subject: 8.1209, FYI: LDC Collection, American Dialect Soc.
Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at linguistlist.org>
Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>
T. Daniel Seely: Eastern Michigan U. <seely at linguistlist.org>
Review Editor: Andrew Carnie <carnie at linguistlist.org>
Associate Editors: Ljuba Veselinova <ljuba at linguistlist.org>
Ann Dizdar <ann at linguistlist.org>
Assistant Editor: Martin Jacobsen <marty at linguistlist.org>
Software development: John H. Remmers <remmers at emunix.emich.edu>
Zhiping Zheng <zzheng at online.emich.edu>
Home Page: http://linguistlist.org/
Editor for this issue: Martin Jacobsen <marty at linguistlist.org>
=================================Directory=================================
1)
Date: Wed, 20 Aug 1997 17:55:55 EDT
From: LDC Office <ldc at unagi.cis.upenn.edu>
Subject: New Collection from the Linguistic Data Consortium
2)
Date: Wed, 20 Aug 1997 09:24:56 -0600
From: Andrew & Diane Lillie <andrewl at byu.edu>
Subject: New American Dialect Society URL
-------------------------------- Message 1 -------------------------------
Date: Wed, 20 Aug 1997 17:55:55 EDT
From: LDC Office <ldc at unagi.cis.upenn.edu>
Subject: New Collection from the Linguistic Data Consortium
Announcing a NEW RELEASE from the
LINGUISTIC DATA CONSORTIUM
CALLHOME Collection in Six Languages
The objective of the CALLHOME project is the creation of a
multi-lingual speech corpus that will support the development of Large
Vocabulary Conversational Speech Recognition (LVCSR) technology. The
collection covers six languages: American English, Egyptian Arabic,
German, Japanese, Mandarin Chinese, and Spanish.
Each CALLHOME language includes telephone speech, transcripts and
tables, and a lexicon. Each language can be distributed as a complete
set of speech, transcripts, and lexicon (lexicons to be released in
the near future) or the components can be ordered separately.
The telephone speech consists of either 100 or 120 unscripted
telephone conversations between native speakers of the specific
language. All calls, which lasted up to 30 minutes, originated in
North America. Participants typically called family members or close
friends. Most calls were placed to various locations overseas, but
some participants placed calls within North America.
The transcripts cover a contiguous 5 or 10 minute segment taken from a
recorded conversation. The transcripts are timestamped by speaker
turn for alignment with the speech signal, and are provided in
standard orthography.
The lexicons, which are not yet available, contain tab-separated
information fields with orthographic, morphological, phonological,
stress, source, and frequency information for each word. The lexicons
will be covered by a special license agreement.
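As a rough illustration of how a tab-separated lexicon of this kind might be read, here is a minimal Python sketch. The column order is an assumption taken from the order in which the fields are listed above; the released files may be laid out differently. The sample line is invented for the example and is not LDC data, and the names LexiconEntry and read_lexicon are hypothetical.

```python
import csv
import io
from typing import NamedTuple


class LexiconEntry(NamedTuple):
    # Assumed column order; the actual lexicon layout may differ.
    orthography: str
    morphology: str
    phonology: str
    stress: str
    source: str
    frequency: str


def read_lexicon(stream):
    """Parse tab-separated lines into LexiconEntry records."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    entries = []
    for row in reader:
        if len(row) != len(LexiconEntry._fields):
            continue  # skip lines that do not have exactly six fields
        entries.append(LexiconEntry(*row))
    return entries


# Invented sample line, for illustration only (not real lexicon content).
sample = io.StringIO("casa\tcasa+N\tk a s a\t1 0\ttranscripts\t42\n")
entries = read_lexicon(sample)
print(entries[0].orthography, entries[0].frequency)
```

Fields are kept as strings because the announcement does not say how stress or frequency values are encoded.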
Institutions that have membership in the LDC during the 1997
Membership Year will be able to receive the CALLHOME materials at no
additional charge, in the same manner as all other text and speech
corpora published by the LDC. Due to a delayed release, 1996 members
are entitled to CALLHOME Japanese, Mandarin Chinese, and Spanish.
Nonmembers can purchase CALLHOME materials for research purposes only.
The cost of the CALLHOME collection is $3000 per language. The
various components of this collection can be purchased separately:
speech databases are $1000, transcripts are $500, and lexicons are
$1500 each. If you would like to order a copy of this corpus, please
email your request to ldc at unagi.cis.upenn.edu. If you need additional
information before placing your order, or would like to inquire about
membership in the LDC, please send email or call (215) 898-0464.
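The nonmember prices quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the prices are the per-language figures from the announcement, while the names COMPONENT_PRICES and order_cost are made up for the example.

```python
# Per-language nonmember prices in US dollars, as quoted in the announcement.
COMPONENT_PRICES = {"speech": 1000, "transcripts": 500, "lexicon": 1500}


def order_cost(components):
    """Return the total cost of the named components for one language."""
    return sum(COMPONENT_PRICES[name] for name in components)


# Ordering all three components matches the $3000 per-language price.
full_set = order_cost(COMPONENT_PRICES)
print(full_set)  # 3000
```

For example, order_cost(["speech", "transcripts"]) gives 1500 for a single language without its lexicon.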
Further information about the LDC and its available corpora can be
accessed on the Linguistic Data Consortium WWW Home Page at URL
http://www.ldc.upenn.edu/. Information is also available via ftp at
ftp.cis.upenn.edu under pub/ldc; for ftp access, please use
"anonymous" as your login name, and give your email address when asked
for a password.
Language          Speech     Transcripts  Lexicon              Membership
                  ($1000)    ($500)       ($1500)              year
----------------------------------------------------------------------
American English  LDC97S42   LDC97T14     LDC97L20 (PRONLEX)   97
Egyptian Arabic   LDC97S45   LDC97T19     LDC97L19             97
German            LDC97S43   LDC97T15     LDC97L18             97
Japanese          LDC96S37   LDC96T18     LDC96L17             96/97
Mandarin Chinese  LDC96S34   LDC96T16     LDC96L15             96/97
Spanish           LDC96S35   LDC96T17     LDC96L16             96/97
----------------------------------------------------------------------
-------------------------------- Message 2 -------------------------------
Date: Wed, 20 Aug 1997 09:24:56 -0600
From: Andrew & Diane Lillie <andrewl at byu.edu>
Subject: New American Dialect Society URL
Dear colleagues,
Because of continuing server problems, the American Dialect Society
webpage has moved. You can now find it at:
http://www.et.byu.edu/~lilliek/ads/index.htm
I apologize to those who have contacted me about the web site problems
and thank you for your patience.
Diane Lillie
ADS Webmaster
---------------------------------------------------------------------------
LINGUIST List: Vol-8-1209