12.2055, FYI: ELRA, Disappearing accents, Reuters Corpus
The LINGUIST List
linguist at linguistlist.org
Fri Aug 17 00:52:33 UTC 2001
LINGUIST List: Vol-12-2055. Thu Aug 16 2001. ISSN: 1068-4875.
Subject: 12.2055, FYI: ELRA, Disappearing accents, Reuters Corpus
Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
Andrew Carnie, U. of Arizona <carnie at linguistlist.org>
Reviews (reviews at linguistlist.org):
Simin Karimi, U. of Arizona
Terence Langendoen, U. of Arizona
Editors (linguist at linguistlist.org):
Karen Milligan, WSU
Naomi Ogasawara, EMU
Lydia Grebenyova, EMU
Jody Huellmantel, WSU
James Yuells, WSU
Michael Appleby, EMU
Marie Klopfenstein, WSU
Ljuba Veselinova, Stockholm U.
Heather Taylor-Loring, EMU
Dina Kapetangianni, EMU
Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>
Home Page: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Jody Huellmantel <jody at linguistlist.org>
=================================Directory=================================
1)
Date: Tue, 14 Aug 2001 15:15:10 +0200
From: Magali Duclaux <duclaux at elda.fr>
Subject: ELRA news 2/2
2)
Date: Tue, 14 Aug 2001 10:12:09 -0500
From: Michael Bernstein <michael at cascadilla.com>
Subject: Disappearing accents in PDF
3)
Date: Tue, 14 Aug 2001 11:48:24 +0100
From: Tony.Rose at reuters.com
Subject: Reuters Corpus
-------------------------------- Message 1 -------------------------------
Date: Tue, 14 Aug 2001 15:15:10 +0200
From: Magali Duclaux <duclaux at elda.fr>
Subject: ELRA news 2/2
*************************************************************
ELRA
European Language Resources Association
ELRA News
*************************************************************
We are happy to announce new resources available via ELRA:
ELRA S0034 Verbmobil (new resources added)
A description of each database is given below:
VM CD 53.1 - VM53.1 (BAS-Edition)
German - 16 spontaneous dialogues (16 close mic,
8 room mic, 8 phone line (GSM) recordings) - 1771 turns,
transliteration (VM II Format).
VM CD 60.1 - VM60.1 (BAS-Edition)
Japanese - 10 spontaneous dialogues (10 close mic,
0 room mic, 0 phone line (GSM) recordings) - 501 turns,
transliteration (VM II Format).
VM CD 61.1 - VM61.1 (BAS-Edition)
Japanese - 19 spontaneous dialogues (19 close mic,
0 room mic, 0 phone line (GSM) recordings) - 946 turns,
transliteration (VM II Format).
VM CD 62.1 - VM62.1 (BAS-Edition)
Japanese - 21 spontaneous dialogues (21 close mic,
0 room mic, 0 phone line (GSM) recordings) - 981 turns,
transliteration (VM II Format).
VM CD 51.1 - VM51.1 (BAS-Edition)
Multilingual German/English with human interpreter
(3 channels) - 15 spontaneous dialogues (15 close mic,
0 room mic, 0 phone line (GSM) recordings) - 873 turns,
transliteration (VM II Format).
VM CD 52.1 - VM52.1 (BAS-Edition)
Multilingual German/English with human interpreter
(3 channels) - 13 spontaneous dialogues (13 close mic,
0 room mic, 0 phone line (GSM) recordings) - 728 turns,
transliteration (VM II Format).
VM CD 55.1 - VM55.1 (BAS-Edition)
Multilingual German/English with human interpreter
(3 channels) - 11 spontaneous dialogues (11 close mic,
0 room mic, 0 phone line (GSM) recordings) - 518 turns,
transliteration (VM II Format).
VM CD 56.1 - VM56.1 (BAS-Edition)
Multilingual German/English with human interpreter
(3 channels) - 12 spontaneous dialogues (12 close mic,
0 room mic, 0 phone line (GSM) recordings) - 620 turns,
transliteration (VM II Format).
VM CD 57.1 - VM57.1 (BAS-Edition)
Multilingual German/Japanese with 2 human interpreters
(4 channels) - 11 spontaneous dialogues (11 close mic,
0 room mic, 0 phone line (GSM) recordings) - 702 turns,
transliteration (VM II Format).
VM CD 58.1 - VM58.1 (BAS-Edition)
Multilingual German/Japanese with 2 human interpreters
(4 channels) - 7 spontaneous dialogues (7 close mic,
0 room mic, 0 phone line (GSM) recordings) - 421 turns,
transliteration (VM II Format).
VM CD 59.1 - VM59.1 (BAS-Edition)
Multilingual German/Japanese with 2 human interpreters
(4 channels) - 7 spontaneous dialogues (7 close mic,
0 room mic, 0 phone line (GSM) recordings) - 354 turns,
transliteration (VM II Format).
VM CD 63.0 - VM63.0 (original edition)
German - 14 WOZ dialogues designed to evoke emotions
(mainly anger) - transliteration, emotion labeling.
VM CD 64.0 - VM64.0 (original edition)
German - 13 WOZ dialogues designed to evoke emotions
(mainly anger) - transliteration, emotion labeling.
VM CD 65.0 - VM65.0 (original edition)
German - 13 WOZ dialogues designed to evoke emotions
(mainly anger) - transliteration, emotion labeling.
============
For further information, please contact:
ELRA/ELDA
55-57 rue Brillat-Savarin
F-75013 Paris, France
Tel.: +33 01 43 13 33 33
Fax: +33 01 43 13 33 30
Email: mapelli at elda.fr
or consult our catalogue at the following address:
http://www.icp.grenet.fr/ELRA/home.html
or http://www.elda.fr
-------------------------------- Message 2 -------------------------------
Date: Tue, 14 Aug 2001 10:12:09 -0500
From: Michael Bernstein <michael at cascadilla.com>
Subject: Disappearing accents in PDF
Have you noticed accents disappearing when you print a PDF file,
even though the accents are there on screen? If this is happening
with Acrobat Reader 3 or 4 for Windows, you can fix the problem by
upgrading to Acrobat Reader 5. That's a free download from:
http://www.adobe.com/products/acrobat/readstep2.html
We've posted a little more information about this Acrobat bug at:
http://www.cascadilla.com/faq/faq-viewingpdf.html#disappear
If you distribute PDF files of your research and you use accents
from any phonetic fonts (such as the SIL fonts) that appear over
the preceding or following character, you may want to add a note to
your web page suggesting that Windows users use Acrobat Reader 5
to make sure that the accents print out correctly.
Yours,
Michael Bernstein
Cascadilla Press
michael at cascadilla.com
-------------------------------- Message 3 -------------------------------
Date: Tue, 14 Aug 2001 11:48:24 +0100
From: Tony.Rose at reuters.com
Subject: Reuters Corpus
Reuters, the global information, news and technology group, is for the
first time making large quantities of archived Reuters news stories
available free of charge for use by research communities around the
world. The first Reuters Corpus archive includes over 800,000
English language news stories, equivalent to the annual global news
output of Reuters. All the news stories are fully referenced using a
total of 775 different category codes for topic, geography and
industry sector.
Although this Corpus has been available for some time, it has not yet
been widely publicised. We are now happy to distribute it more widely
within the research community. Further details can be found at:
http://about.reuters.com/researchandstandards/corpus/
For discussion and queries regarding this corpus and future Reuters
releases, please refer to the ReutersCorpora mailing list, which can be
found at:
http://groups.yahoo.com/group/ReutersCorpora
Best wishes,
Tony
==========
Dr TG Rose
Leader of Language Technology
Reuters Limited, 85 Fleet Street, London EC4P 4AJ
Email: Tony.Rose at reuters.com
---------------------------------------------------------------
Visit our Internet site at http://www.reuters.com
Any views expressed in this message are those of the individual
sender, except where the sender specifically states them to be
the views of Reuters Ltd.
---------------------------------------------------------------------------
LINGUIST List: Vol-12-2055