Corpora: Broadcast corpus

Raman Chandrasekar ramanc at microsoft.com
Mon Jan 17 17:36:44 UTC 2000


LDC does have transcribed broadcast news. See
<http://morph.ldc.upenn.edu/Catalog/by_type.html> under the heading
"Broadcast text". You'll see the following:


Broadcast text
LDC98T31  1996 CSR Hub-4 Language Model
          <http://morph.ldc.upenn.edu/Catalog/LDC98T31.html>
LDC97T22  1996 English Broadcast News Transcripts (Hub-4)
          <http://morph.ldc.upenn.edu/Catalog/LDC97T22.html>
LDC98T28  1997 English Broadcast News Transcripts (Hub-4)
          <http://morph.ldc.upenn.edu/Catalog/LDC98T28.html>
LDC98T24  1997 Mandarin Broadcast News Transcripts (Hub-4NE)
          <http://morph.ldc.upenn.edu/Catalog/LDC98T24.html>
LDC98T29  1997 Spanish Broadcast News Transcripts (Hub-4NE)
          <http://morph.ldc.upenn.edu/Catalog/LDC98T29.html>
LDC99T36  USC Marketplace Broadcast News Transcripts
          <http://morph.ldc.upenn.edu/Catalog/LDC99T36.html>

However, access to these collections may require LDC membership. I'm
cc'ing LDC on this; hopefully they'll get back to you directly.

Regards,

   -- Raman Chandrasekar


-----Original Message-----
From: Mirjam Sepesy Maucec [mailto:mirjam.sepesy at uni-mb.si]
Sent: Sunday, January 16, 2000 10:41 PM
To: corpora at hd.uib.no
Subject: Corpora: Broadcast corpus


Hi,

My research topic is domain-based adaptation of language models. For my work
I badly need a text corpus with topic tags.
The Broadcast corpus seems appropriate. Where can I get it? I can't find
it in the LDC catalog. I also wrote two e-mails to Primary Source Media
asking for information, but got no answer.

Please help!


Mirjam

--

_____________________________________________________________

Mirjam Sepesy Maucec
Faculty of Electrical Engineering and Computer Science
University of Maribor
Smetanova 17
2000 MARIBOR
tel: ++386 (062) 220 7225
e-mail: mirjam.sepesy at uni-mb.si

