[Corpora-List] Question concerning audio file search

Adam Kilgarriff adam at lexmasterclass.com
Wed Dec 20 21:24:55 UTC 2006


Briony,

Not directly an answer, but do you know http://podzinger.com?  This
astonishing website has vast quantities of podcasts, automatically
transcribed and text-searchable.  (Just the day before, I had been
confidently declaring that this level of transcription was beyond the state
of the art.)  I encountered it because John Milton, in Hong Kong, has
integrated it into his English Language Teaching tools so students can hear
a word or phrase they are learning.

Adam

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Briony Williams
Sent: 20 December 2006 15:18
To: CORPORA at uib.no
Subject: [Corpora-List] Question concerning audio file search

sato hiroaki wrote:
> I've just made a software tool for using DVD movies as a multimedia
> corpus.

Following the publicising of this useful-looking tool, I wonder whether any 
members of the Corpora list could help me with a related task, concerning 
audio rather than video files. I'm asking on behalf of someone else.

Is there an existing Windows-based software application (preferably free of
charge, and without a large memory requirement) that will do the following?

1) Given: several very long .wav sound files (possibly an hour long), with 
associated text transcript files (.trs files) as produced using the 
"Transcriber" software application.

2) User input: User types one or more words to search for in the 
transcription files.

3) Output: Software returns each chunk of text that contains the search 
string (where "chunk" could be a phrase, sentence, paragraph, topic, or 
larger, depending on the granularity of the transcription files).

4) User input: User selects one of the search results.

5) Output: Software plays back the portion of the (large) sound file 
corresponding to the chunk selected by the user.

I can think of a way to do this using Cygwin and Edinburgh Speech Tools, but
does anyone know of an existing solution using a Windows graphical interface?
My contact seems to prefer a point-and-click interface if possible.
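
For illustration only, steps 1) to 5) above could be sketched in Python
roughly as follows. This is not an existing tool and not the Cygwin / Speech
Tools approach mentioned above. It assumes that the .trs files follow
Transcriber's XML layout (Turn elements carrying startTime and endTime
attributes, containing Sync markers with a time attribute, each followed by
the transcribed text), and that the audio is uncompressed PCM .wav. The
function names load_segments, search and extract_clip are hypothetical.

```python
import wave
import xml.etree.ElementTree as ET


def load_segments(trs_source):
    """Return a list of (start_sec, end_sec, text), one per Sync-delimited chunk."""
    segments = []

    def flush(start, end, parts):
        text = " ".join("".join(parts).split())  # collapse runs of whitespace
        if text:
            segments.append((start, end, text))

    for turn in ET.parse(trs_source).getroot().iter("Turn"):
        start = float(turn.get("startTime"))
        parts = [turn.text or ""]
        for child in turn:
            if child.tag == "Sync":
                # A Sync marker closes the current chunk and opens the next one.
                t = float(child.get("time"))
                flush(start, t, parts)
                start, parts = t, []
            # Text following any child element (Sync, Event, ...) is its tail.
            parts.append(child.tail or "")
        flush(start, float(turn.get("endTime")), parts)
    return segments


def search(segments, query):
    """Steps 2-3: case-insensitive substring search over the chunk texts."""
    q = query.lower()
    return [seg for seg in segments if q in seg[2].lower()]


def extract_clip(wav_path, start, end, out_path):
    """Step 5: copy only the frames between start and end into a small file."""
    with wave.open(wav_path, "rb") as src:
        rate = src.getframerate()
        first = min(int(start * rate), src.getnframes())
        last = min(int(end * rate), src.getnframes())
        src.setpos(first)  # seek, so the long file is never read in full
        frames = src.readframes(last - first)
        with wave.open(out_path, "wb") as dst:
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            dst.writeframes(frames)
```

Because extract_clip seeks to the start of the chunk, the memory needed
depends on the length of the chunk rather than on the hour-long file. On
Windows, the resulting clip could then be played with the standard-library
winsound module; a point-and-click front end would still have to be built
on top.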

Thanks in advance for any responses.

Best regards

Briony Williams

-- 
Briony Williams

Arweinydd Tîm Technoleg Lleferydd / Speech Technology Team Leader
Uned Technolegau Iaith            / Language Technologies Unit
Canolfan Bedwyr                   / Canolfan Bedwyr
Prifysgol Cymru                   / University of Wales
Bangor                            / Bangor
Gwynedd LL57 2EN, UK              / Gwynedd LL57 2EN, UK

E-Bost / E-Mail : b.williams at bangor.ac.uk
Gwe (Cymraeg)   : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.cy
Web (English)   : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.en
Ffôn / Tel      : +44 (0) 1506 200862
Rhithfro / Blog : http://murmur.bangor.ac.uk
....................................................................


-- 
This email and any attachments may contain confidential material and
is solely for the use of the intended recipient(s).  If you have
received this email in error, please notify the sender immediately
and delete this email.  If you are not the intended recipient(s), you
must not use, retain or disclose any information contained in this
email.  Any views or opinions are solely those of the sender and do
not necessarily represent those of the University of Wales, Bangor.
The University of Wales, Bangor does not guarantee that this email or
any attachments are free from viruses or 100% secure.  Unless
expressly stated in the body of the text of the email, this email is
not intended to form a binding contract - a list of authorised
signatories is available from the University of Wales, Bangor Finance
Office.  www.bangor.ac.uk


