9.982, FYI: Indo-Iranian, Report, LDC release, LDC corpus

Mon Jun 29 18:20:40 UTC 1998

LINGUIST List:  Vol-9-982. Mon Jun 29 1998. ISSN: 1068-4875.

Subject: 9.982, FYI: Indo-Iranian, Report, LDC release, LDC corpus

Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at linguistlist.org>
            Helen Dry: Eastern Michigan U. <hdry at linguistlist.org>

Review Editor:     Andrew Carnie <carnie at linguistlist.org>

Editors:    Brett Churchill <brett at linguistlist.org>
            Martin Jacobsen <marty at linguistlist.org>
            Elaine Halleck <elaine at linguistlist.org>
            Anita Huang <anita at linguistlist.org>
            Ljuba Veselinova <ljuba at linguistlist.org>
            Julie Wilson <julie at linguistlist.org>

Software development: John H. Remmers <remmers at emunix.emich.edu>
                      Zhiping Zheng <zzheng at online.emich.edu>

Home Page:  http://linguistlist.org/

Editor for this issue: Brett Churchill <brett at linguistlist.org>

=================================Directory=================================

1)
Date:  Fri, 26 Jun 1998 13:50:30 +0300
From:  "Daniel Baum" <msdbaum at mscc.huji.ac.il>
Subject:  Indo-Iranian linguistics mailing list

2)
Date:   Mon, 29 Jun 1998 11:42:28 +0400
From:  "Sergei V. Rjabchikov" <srjabchikov at hotmail.com>
Subject:  a report about my new studies

3)
Date:  Mon, 29 Jun 1998 15:46:31 EDT
From:  LDC Office <ldc at unagi.cis.upenn.edu>
Subject:  A New Release From the LDC

4)
Date:  Mon, 29 Jun 1998 15:47:34 EDT
From:  LDC Office <ldc at unagi.cis.upenn.edu>
Subject:  A New Corpus from the LDC

-------------------------------- Message 1 -------------------------------

Date:  Fri, 26 Jun 1998 13:50:30 +0300
From:  "Daniel Baum" <msdbaum at mscc.huji.ac.il>
Subject:  Indo-Iranian linguistics mailing list

After receiving a good response to my previous postings, this is to announce
the formation of a new Indo-Iranian linguistics mailing list.

All those who answered the previous posting should have received a personal
invitation to join.

List description
============

This is a list for the discussion of Indo-Iranian linguistics. While the
main focus of the list will be Vedic and Avestan, discussion of any
Indo-Iranian linguistic topic will be welcome.

All aspects of these languages, e.g. phonology, morphology, syntax, text
linguistics, and historical and comparative linguistics may be discussed,
while any other language, whether non-IE Indian, or other branches of IE,
will be considered off-topic unless it is relevant in some way to
Indo-Iranian.

All linguistic "schools" are welcome, as long as the topic of discussion
remains Indo-Iranian.

To subscribe, send an empty message to indo_iranian-subscribe at makelist.com

Daniel Baum
msdbaum at mscc.huji.ac.il
Home Page http://www.angelfire.com/il/dbaum
Tel: ++972-2-583-6634; Mob. ++972-51-972-829

-------------------------------- Message 2 -------------------------------

Date:   Mon, 29 Jun 1998 11:42:28 +0400
From:  "Sergei V. Rjabchikov" <srjabchikov at hotmail.com>
Subject:  a report about my new studies

Dear Editor,

I have published "The Linear A and the Phaistos Disk: A Slavonic Key" on the
World Wide Web: <http://www.openweb.ru/windows/rongo/disk.htm>.

I have published the "RONGORONGO, Easter Island Writing" home page
(articles on the decipherment of the Easter Island writing system)
on the World Wide Web: <http://www.openweb.ru/windows/rongo/index.htm>.
Contents:

  - The Rapanui Chant "He Timo te Akoako": Origin and Interpretation
  - Rongorongo Script: Reading of Some Records
  - The Glyphs on the Spanish Treaty
  - Rongorongo: The Milky Way and Antares
  - "The Numerals" in the Easter Island Vocabulary: An Astronomical Report
  - Linguistic Evidence of Early Peruvian-Rapanui Contacts
  - Dr Schuhmacher's Renunciation?
  - Bibliography

Yours sincerely,

Sergei V. Rjabchikov
srjabchikov at hotmail.com

http://www.openweb.ru/windows/rongo/index.htm
http://www.kuban.ru/users/Rjabchikov/index.htm

-------------------------------- Message 3 -------------------------------

Date:  Mon, 29 Jun 1998 15:46:31 EDT
From:  LDC Office <ldc at unagi.cis.upenn.edu>
Subject:  A New Release From the LDC

	Announcing a NEW RELEASE from the
	  Linguistic Data Consortium

************************************************
TAIWANESE PUTONGHUA SPEECH AND TRANSCRIPT CORPUS
************************************************

This set of data on Taiwanese accented Putonghua
(PTH) was recorded in Taiwan from December 1994 to
January 1995. Taiwanese accented PTH refers to PTH
spoken by people who were born in Taiwan and whose
first language is Taiwanese (Southern Min). A total
of 40 speakers, varying in age, education, birth
place, and family dialect, were recorded. There were
5 two-speaker dialogues and 30 single-speaker
monologues. The dialogues were about 20 minutes each
and the monologues were about 10 minutes each.
Dialogues were recorded on two tracks, one for each
speaker. Monologues were recorded on one track.
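
As a rough cross-check of the figures above, the short Python sketch
below adds up the speakers and the approximate recording time. It uses
only the counts and the "about 20" and "about 10" minute lengths stated
in the announcement; the constant and function names are made up for
illustration, and the real total will differ somewhat because the
lengths are approximate.

```python
# Counts and approximate lengths as stated in the announcement.
DIALOGUES = 5            # two-speaker recordings
MONOLOGUES = 30          # single-speaker recordings
DIALOGUE_MINUTES = 20    # "about 20 minutes each"
MONOLOGUE_MINUTES = 10   # "about 10 minutes each"


def corpus_totals():
    """Return (number of speakers, approximate total minutes)."""
    speakers = DIALOGUES * 2 + MONOLOGUES
    minutes = DIALOGUES * DIALOGUE_MINUTES + MONOLOGUES * MONOLOGUE_MINUTES
    return speakers, minutes


speakers, minutes = corpus_totals()
print(f"{speakers} speakers, about {minutes} minutes "
      f"({minutes / 60:.1f} hours) of recordings")
```

The speaker count agrees with the stated total of 40, and the nominal
lengths add up to roughly 400 minutes of recordings.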

The recordings were done in ordinary but quiet
rooms. The speakers were asked in advance to speak in
a conversational style, without notes, on any topic
they chose, or on no topic at all. Most speakers spoke
spontaneously and the topic drifted freely. Some
speakers talked about their professional work in a
rather formal way. One speaker (#20, a public health
official) used notes. We consider this variation in
speech style a merit of the data.

The recording equipment consisted of a portable DAT
recorder (Teac), which recorded at a 44.1 kHz sampling
rate with 16-bit linear quantization. The microphones
were AudioTechnica lapel microphones with a preamp and
an XLR connection to the DAT recorder. The XLR
connection helped keep recording noise low. The
AudioTechnica microphones provided wide bandwidth and
a flat response over the speech range of interest,
were unidirectional to minimize cross-talk, and were
very light in comparison with standard microphones.
Both single-speaker monologues and two-speaker
dialogues were recorded using this system on standard
DAT tape.

Before recording, all speakers read and signed the
'Informed Consent Form', which was written in Chinese
and which largely followed the standard format
approved by the Human Subject Committee of the
University of Michigan. The form stated that
participation in the recording was entirely voluntary
and that the speech might be used for linguistic
teaching and research purposes.

The speech data are accompanied by transcripts. The
monologues have start and end time stamps. The 5
dialogues are time stamped by speaker turn.

Institutions that have membership in the LDC during
the 1998 Membership Year will be able to receive this
corpus in the same manner as all other text and
speech corpora published by the LDC.

Nonmembers can receive a copy of the Taiwanese
Putonghua Speech and Transcript Corpus for $750.

If you would like to order a copy of this corpus,
please email your request to
<ldc at unagi.cis.upenn.edu>. If you need additional
information before placing your order, or would like
to inquire about membership in the LDC, please send
email or call (215) 898-0464.

Further information about the LDC and its available
corpora can be accessed on the Linguistic Data
Consortium WWW Home Page at URL:

http://www.ldc.upenn.edu/

Information is also available via ftp at
ftp.cis.upenn.edu under pub/ldc; for ftp access,
please use "anonymous" as your login name, and give
your email address when asked for a password.

-------------------------------- Message 4 -------------------------------

Date:  Mon, 29 Jun 1998 15:47:34 EDT
From:  LDC Office <ldc at unagi.cis.upenn.edu>
Subject:  A New Corpus from the LDC

	Announcing a NEW RELEASE from the
	  Linguistic Data Consortium

****************************************
1997 Spanish Broadcast News (HUB-4NE)
****************************************

This corpus contains a portion of the acoustic data
designated as the training set for the 1997 DARPA HUB-4
Spanish Benchmark.  It contains speech and transcripts of
30 hours of broadcast news from the following sources:

VOA
Univision
Televisa

All acoustic files are in NIST SPHERE format, without compression.
The sample data are 16-bit linear PCM, 16-kHz sampling frequency,
single channel.  Most files contain 30 minutes of recorded
material, and some contain 60 or 120 minutes (approximately); the
sampling format requires roughly 2 megabytes (MB) per minute of
recording, so the file sizes are typically around 60 MB, with some
files ranging up to 120 or 240 MB.
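
The size estimate above follows directly from the sample format. The
Python sketch below works through it, assuming 1 MB = 1,000,000 bytes
and ignoring the small SPHERE file header; the constant and function
names are made up for illustration.

```python
# Sample format as stated in the announcement.
SAMPLE_RATE_HZ = 16_000   # 16-kHz sampling frequency
BYTES_PER_SAMPLE = 2      # 16-bit linear PCM
CHANNELS = 1              # single channel


def pcm_megabytes(minutes, bytes_per_mb=1_000_000):
    """Approximate size of uncompressed audio, excluding any file header."""
    n_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * 60 * minutes
    return n_bytes / bytes_per_mb


for minutes in (1, 30, 60, 120):
    print(f"{minutes:>3} min -> {pcm_megabytes(minutes):.1f} MB")
```

One minute comes to 1.92 MB, which is the "roughly 2 megabytes" quoted
above; 30, 60, and 120 minutes come to about 58, 115, and 230 MB,
slightly under the rounded 60, 120, and 240 MB figures.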

The transcripts are in SGML format, using the same markup
conventions that have been applied to the other 1997 Broadcast
News speech corpora (in English and Mandarin). They are distributed
by ftp rather than on the CD-ROMs that carry the speech data.

Because of restrictions imposed by the copyright holders, this
corpus is available to 1998 LDC members only.

If you would like to order a copy of this corpus, please email
your request to <ldc at unagi.cis.upenn.edu>.  If you need additional
information before placing your order, or would like to inquire
about membership in the LDC, please send email or call (215)
898-0464.

Further information about the LDC and its available corpora can be
accessed on the Linguistic Data Consortium WWW Home Page at URL:

http://www.ldc.upenn.edu/

Information is also available via ftp at ftp.cis.upenn.edu under
pub/ldc; for ftp access, please use "anonymous" as your login
name, and give your email address when asked for a password.

---------------------------------------------------------------------------
LINGUIST List: Vol-9-982