13.2968, Software: New LDC Publications

LINGUIST List linguist at linguistlist.org
Fri Nov 15 17:13:11 UTC 2002


LINGUIST List:  Vol-13-2968. Fri Nov 15 2002. ISSN: 1068-4875.

Subject: 13.2968, Software: New LDC Publications

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Consulting Editor:
        Andrew Carnie, U. of Arizona <carnie at linguistlist.org>

Editors (linguist at linguistlist.org):
	Karen Milligan, WSU 		Naomi Ogasawara, Arizona U.
	James Yuells, EMU		Marie Klopfenstein, WSU
	Michael Appleby, EMU		Heather Taylor, EMU
	Ljuba Veselinova, Stockholm U.	Richard John Harvey, EMU
	Dina Kapetangianni, EMU		Renee Galvis, WSU
	Karolina Owczarzak, EMU		Anita Huang, EMU
	Tomoko Okuno, EMU		Steve Moran, EMU
	Lakshmi Narayanan, EMU		Sarah Murray, WSU
	Marisa Ferrara, EMU

Software: Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>
          Zhenwei Chen, E. Michigan U. <chen at linguistlist.org>
	  Prashant Nagaraja, E. Michigan U. <prashant at linguistlist.org>

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.



Editor for this issue: James Yuells <james at linguistlist.org>

=================================Directory=================================

1)
Date:  Wed, 13 Nov 2002 14:45:55 -0500
From:  LDC Office <ldc at ldc.upenn.edu>
Subject:  New LDC Publications

-------------------------------- Message 1 -------------------------------

Date:  Wed, 13 Nov 2002 14:45:55 -0500
From:  LDC Office <ldc at ldc.upenn.edu>
Subject:  New LDC Publications


  *   Buckwalter Arabic Morphological Analyzer Version 1.0   *

               *   Voicemail Corpus Part II   *

             *   1997 HUB5 German Evaluation   *


The Linguistic Data Consortium (LDC) is pleased to announce the
availability of three new publications.


1.  The Buckwalter Arabic Morphological Analyzer Version 1.0 was
created by Tim Buckwalter at Qamus for POS-tagging Arabic text.
The analyzer consists primarily of three Arabic-English lexicon files:
prefixes, suffixes, and stems.  The lexicons are supplemented by three
morphological compatibility tables used for controlling prefix-stem
combinations, stem-suffix combinations, and prefix-suffix combinations.
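As a rough illustration of how three lexicons and three compatibility
tables can work together, here is a minimal Python sketch.  It is not
the distributed analyzer: the lexicon entries, the category names, the
table contents, and the analyze() function are all hypothetical toy
data in a romanized transliteration, and the real file formats may
differ.  The idea shown is only that a word is split into every
possible prefix + stem + suffix, and a split is accepted when all
three pieces are in the lexicons and all three category pairs are
permitted.

```python
from typing import Dict, List, Set, Tuple

# Toy lexicons (hypothetical): surface form -> list of (category, gloss).
Lexicon = Dict[str, List[Tuple[str, str]]]

PREFIXES: Lexicon = {
    "": [("Pref-0", "")],
    "wa": [("Pref-Wa", "and")],
    "al": [("NPref-Al", "the")],
}
STEMS: Lexicon = {
    "kitAb": [("N", "book")],
    "katab": [("PV", "write")],
}
SUFFIXES: Lexicon = {
    "": [("Suff-0", "")],
    "a": [("PVSuff-a", "he/it")],
    "u": [("NSuff-u", "[nom.]")],
}

# Toy compatibility tables (hypothetical): sets of permitted category pairs.
PREFIX_STEM: Set[Tuple[str, str]] = {
    ("Pref-0", "N"), ("Pref-0", "PV"),
    ("Pref-Wa", "N"), ("Pref-Wa", "PV"),
    ("NPref-Al", "N"),
}
STEM_SUFFIX: Set[Tuple[str, str]] = {
    ("N", "Suff-0"), ("N", "NSuff-u"), ("PV", "PVSuff-a"),
}
PREFIX_SUFFIX: Set[Tuple[str, str]] = {
    ("Pref-0", "Suff-0"), ("Pref-0", "NSuff-u"), ("Pref-0", "PVSuff-a"),
    ("Pref-Wa", "Suff-0"), ("Pref-Wa", "NSuff-u"), ("Pref-Wa", "PVSuff-a"),
    ("NPref-Al", "Suff-0"), ("NPref-Al", "NSuff-u"),
}


def analyze(word: str) -> List[Tuple[str, str, str, str]]:
    """Return every (prefix, stem, suffix, gloss) split the tables allow."""
    results = []
    n = len(word)
    for i in range(n + 1):            # end of the prefix
        for j in range(i, n + 1):     # end of the stem
            pre, stem, suf = word[:i], word[i:j], word[j:]
            if pre not in PREFIXES or stem not in STEMS or suf not in SUFFIXES:
                continue
            for pcat, pgloss in PREFIXES[pre]:
                for scat, sgloss in STEMS[stem]:
                    if (pcat, scat) not in PREFIX_STEM:
                        continue
                    for xcat, xgloss in SUFFIXES[suf]:
                        if ((scat, xcat) in STEM_SUFFIX
                                and (pcat, xcat) in PREFIX_SUFFIX):
                            parts = [g for g in (pgloss, sgloss, xgloss) if g]
                            results.append((pre, stem, suf, " + ".join(parts)))
    return results


if __name__ == "__main__":
    for w in ("alkitAbu", "wakataba", "alkataba"):
        print(w, analyze(w))
```

In this toy setup the definite-article prefix is listed as compatible
with noun stems only, so "alkataba" (article + verb stem) yields no
analysis even though each piece appears in a lexicon.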

The LDC is releasing this software under the GNU General Public License:

http://www.gnu.org/copyleft/gpl.html

For information on commercial use, please visit:

http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2002L49

The Buckwalter Arabic Morphological Analyzer can be downloaded free of
charge from the above link.  If you would like a copy placed on CD-ROM,
please note that there is a $100 media charge.


2. The Voicemail Corpus Part II is the second voicemail corpus created
by Mukund Padmanabhan, Brian Kingsbury et al. at International Business
Machines.  This single-disc publication consists of speech and
transcript files, divided into training and evaluation data.
The training data consists of 2048 voicemail messages and the
corresponding transcript files; the evaluation data consists of 50
voicemail messages and 50 transcripts.

For further information, please visit:

http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2002S35

Institutions that have membership in the LDC during the 2002 Membership
Year will be able to receive this corpus free of charge.  As a 'Members
Only' publication, the corpus is not available to nonmembers.


3.  The 1997 Hub5 Non-English evaluation is part of an ongoing series of
periodic evaluations conducted by NIST. These evaluations provide an
important contribution to the direction of research efforts and the
calibration of technical capabilities. They are intended to be of
interest to all researchers working on the general problem of
conversational speech recognition.

The Hub5 Non-English evaluation focuses on the task of transcribing
conversational telephone speech into text.  The 1997 HUB5 German
Evaluation is a single-disc publication containing nine hours of
speech data.  Transcripts are not included.

For more information, please visit:

http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2002S24

Institutions that have membership in the LDC during the 2002 Membership
Year will be able to receive this corpus free of charge.  Nonmembers may
purchase this publication for $1000.


			    *


If you need additional information before placing your order, or
would like to inquire about membership in the LDC, please send email to
<ldc at ldc.upenn.edu> or call (215) 573-1275.


------------------------------------------------------------------
Linguistic Data Consortium          Phone: (215) 573-1275
3600 Market Street                  Fax:   (215) 573-2175
Suite 810                           email: ldc at ldc.upenn.edu
Philadelphia, PA 19104-2653         www: http://www.ldc.upenn.edu

---------------------------------------------------------------------------
LINGUIST List: Vol-13-2968


