8.1209, FYI: LDC Collection, American Dialect Soc.

The LINGUIST List linguist at linguistlist.org
Thu Aug 21 05:05:45 UTC 1997


LINGUIST List:  Vol-8-1209. Thu Aug 21 1997. ISSN: 1068-4875.

Subject: 8.1209, FYI: LDC Collection, American Dialect Soc.

Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at linguistlist.org>
            Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>
            T. Daniel Seely: Eastern Michigan U. <seely at linguistlist.org>

Review Editor:     Andrew Carnie <carnie at linguistlist.org>

Associate Editors: Ljuba Veselinova <ljuba at linguistlist.org>
                   Ann Dizdar <ann at linguistlist.org>
Assistant Editor:  Martin Jacobsen <marty at linguistlist.org>

Software development: John H. Remmers <remmers at emunix.emich.edu>
                      Zhiping Zheng <zzheng at online.emich.edu>

Home Page:  http://linguistlist.org/


Editor for this issue: Martin Jacobsen <marty at linguistlist.org>

=================================Directory=================================

1)
Date:  Wed, 20 Aug 1997 17:55:55 EDT
From:  LDC Office <ldc at unagi.cis.upenn.edu>
Subject:  New Collection from the Linguistic Data Consortium

2)
Date:  Wed, 20 Aug 1997 09:24:56 -0600
From:  Andrew & Diane Lillie <andrewl at byu.edu>
Subject:  New American Dialect Society URL

-------------------------------- Message 1 -------------------------------

Date:  Wed, 20 Aug 1997 17:55:55 EDT
From:  LDC Office <ldc at unagi.cis.upenn.edu>
Subject:  New Collection from the Linguistic Data Consortium

                Announcing a NEW RELEASE from the
                   LINGUISTIC DATA CONSORTIUM

	       CALLHOME Collection in Six Languages

The objective of the CALLHOME project is the creation of a
multi-lingual speech corpus that will support the development of Large
Vocabulary Conversational Speech Recognition (LVCSR) technology. The
collection covers six languages: American English, Egyptian Arabic,
German, Japanese, Mandarin Chinese, and Spanish.

Each CALLHOME language includes telephone speech, transcripts and
tables, and a lexicon.  Each language can be distributed as a complete
set of speech, transcripts, and lexicon (lexicons to be released in
the near future) or the components can be ordered separately.

The telephone speech consists of either 100 or 120 unscripted
telephone conversations between native speakers of the specific
language.  All calls, which lasted up to 30 minutes, originated in
North America.  Participants typically called family members or close
friends.  Most calls were placed to various locations overseas, but
some participants placed calls within North America.

Each transcript covers a contiguous 5- or 10-minute segment taken from
a recorded conversation.  The transcripts are timestamped by speaker
turn for alignment with the speech signal, and are provided in
standard orthography.

The lexicons, which are not yet available, contain tab-separated
information fields with orthographic, morphological, phonological,
stress, source, and frequency information for each word.  The lexicons
will be covered by a special license agreement.
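
As a rough illustration of the tab-separated layout described above,
the following Python sketch splits one lexicon line into named fields.
The field order, the field names, and the sample entry are assumptions
made for illustration only; the announcement does not specify them, and
the released lexicons may be laid out differently.

```python
from typing import NamedTuple


class LexiconEntry(NamedTuple):
    # Hypothetical field order, following the order in which the
    # announcement lists the information types; not a confirmed layout.
    orthography: str
    morphology: str
    phonology: str
    stress: str
    source: str
    frequency: str


def parse_lexicon_line(line: str) -> LexiconEntry:
    """Split one tab-separated line into six named fields."""
    fields = line.rstrip("\n").split("\t")
    expected = len(LexiconEntry._fields)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    return LexiconEntry(*fields)


# Invented sample line, for illustration only.
sample = "casa\tcasa+N\tk a s a\t1 0\ttranscripts\t42"
entry = parse_lexicon_line(sample)
print(entry.orthography, entry.frequency)
```

Values are kept as strings because the announcement does not say how
each field is encoded.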

Institutions that have membership in the LDC during the 1997
Membership Year will be able to receive the CALLHOME materials at no
additional charge, in the same manner as all other text and speech
corpora published by the LDC.  Due to a delayed release, 1996 members
are entitled to CALLHOME Japanese, Mandarin Chinese, and Spanish.

Nonmembers can purchase CALLHOME materials for research purposes only.
The cost of the CALLHOME collection is $3000 per language.  The
various components of this collection can be purchased separately:
speech databases are $1000, transcripts are $500, and lexicons are
$1500 each.  If you would like to order a copy of this corpus, please
email your request to ldc at unagi.cis.upenn.edu. If you need additional
information before placing your order, or would like to inquire about
membership in the LDC, please send email or call (215) 898-0464.
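
The nonmember pricing above can be checked with a short Python sketch.
The prices are those quoted in this announcement; the function name
order_cost is hypothetical, and the sketch assumes the component prices
simply add up with no discounts or other fees.

```python
# Per-language nonmember prices quoted in the announcement (US dollars).
COMPONENT_PRICES = {"speech": 1000, "transcripts": 500, "lexicon": 1500}
FULL_SET_PRICE = 3000


def order_cost(components, languages=1):
    """Return the nonmember cost of the chosen components for `languages` languages."""
    chosen = set(components)
    unknown = chosen - set(COMPONENT_PRICES)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    return languages * sum(COMPONENT_PRICES[c] for c in chosen)


# The three components together cost the same as the full set.
assert order_cost(COMPONENT_PRICES) == FULL_SET_PRICE

# Speech and transcripts for two languages.
print(order_cost(["speech", "transcripts"], languages=2))
```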

Further information about the LDC and its available corpora can be
accessed on the Linguistic Data Consortium WWW Home Page at URL
http://www.ldc.upenn.edu/. Information is also available via ftp at
ftp.cis.upenn.edu under pub/ldc; for ftp access, please use
"anonymous" as your login name, and give your email address when asked
for a password.
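
The anonymous ftp login described above might be scripted along the
following lines with Python's standard ftplib module.  This is only a
sketch: the function name list_ldc_files and its ftp_factory parameter
are hypothetical, the email address shown is a placeholder, and the
server named in this announcement may no longer be reachable.

```python
from ftplib import FTP


def list_ldc_files(email, host="ftp.cis.upenn.edu", path="pub/ldc",
                   ftp_factory=FTP):
    """Log in anonymously, using `email` as the password, and list `path`."""
    ftp = ftp_factory(host)  # connects when a host is given
    try:
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(path)
        return ftp.nlst()
    finally:
        ftp.close()


# Example call (requires network access; not executed here):
# names = list_ldc_files("you@example.org")
```

The ftp_factory parameter exists only so that the function can be
exercised without a live server.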

Language          Speech    Transcripts  Lexicon    Membership
                  $1000     $500         $1500      year
---------------------------------------------------------------------
American English  LDC97S42  LDC97T14     LDC97L20   97
                                         (PRONLEX)
Egyptian Arabic   LDC97S45  LDC97T19     LDC97L19   97
German            LDC97S43  LDC97T15     LDC97L18   97
Japanese          LDC96S37  LDC96T18     LDC96L17   96/97
Mandarin Chinese  LDC96S34  LDC96T16     LDC96L15   96/97
Spanish           LDC96S35  LDC96T17     LDC96L16   96/97
---------------------------------------------------------------------


-------------------------------- Message 2 -------------------------------

Date:  Wed, 20 Aug 1997 09:24:56 -0600
From:  Andrew & Diane Lillie <andrewl at byu.edu>
Subject:  New American Dialect Society URL

Dear colleagues,

Because of continuing server problems, the American Dialect Society
webpage has moved.  You can now find it at:
http://www.et.byu.edu/~lilliek/ads/index.htm

I apologize to those who have contacted me about the web site problems
and thank you for your patience.

Diane Lillie
ADS Webmaster


---------------------------------------------------------------------------
LINGUIST List: Vol-8-1209