[Lexicog] audio component

Sat Apr 27 13:46:07 UTC 2013

Dear Jan,

I have used the following method. Extract a script from your lexicon of the
headwords you want to have spoken. Record a speaker saying them in order.
Enter the script into time-aligning software such as Transcriber or ELAN and
align each word to the relevant segment (this is actually quite quick if you
use the waveform display to help you decide how to chunk the audio file).
There is software that will align a script and audio automatically, but I
think that is only available for large languages.
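
As a rough illustration of the first step (extracting a script of headwords),
here is a minimal Python sketch. It assumes the lexicon is a Toolbox
Standard Format text file in which each entry starts with an \lx field, and
that the unique ID is stored in a field called \id. That field name is only
a guess for illustration, as are the placeholder headwords and ID numbers in
the sample, so adjust them to match your database.

```python
import re

def extract_script(toolbox_text, headword_marker="lx", id_marker="id"):
    """Return (entry_id, headword) pairs in the order they appear.

    headword_marker and id_marker are assumptions about the database
    layout; "id" in particular is a hypothetical field name.
    """
    entries = []
    current = None
    for line in toolbox_text.splitlines():
        # A field line looks like: \marker value
        match = re.match(r"\\(\w+)\s*(.*)", line)
        if not match:
            continue
        marker, value = match.group(1), match.group(2).strip()
        if marker == headword_marker:
            # A headword field opens a new entry.
            current = {"headword": value, "id": None}
            entries.append(current)
        elif marker == id_marker and current is not None:
            current["id"] = value
    return [(e["id"], e["headword"]) for e in entries]

# Made-up sample data, not taken from any real lexicon.
sample = "\\lx alpha\n\\ps n\n\\id 00042\n\n\\lx beta\n\\ps n\n\\id 00043\n"
script = extract_script(sample)
for entry_id, headword in script:
    print(f"{entry_id}\t{headword}")
```

Keeping the ID next to each headword in the script means it is still
available later if you want to name the audio files by ID rather than by
word.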

Once you have finished, you need to export the information from Transcriber
so that it is in the form 'timecode, tab, text'. This can then be imported
into Audacity as 'labels' for the imported audio file. Once you see all the
words as labels in the Audacity file, you can select 'Export Multiple' and
Audacity will proceed to chop the file up into small files (you choose
whether you want them as .wav or .mp3), each named after its label.
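
If the exported file needs reshaping before Audacity will accept it, a small
script can do it. The sketch below is only an illustration under some
assumptions: that each exported line holds a start time in seconds, a tab,
and the text; that each segment ends where the next one begins; and that you
know the total length of the recording for the final segment. Audacity label
files are plain text with one 'start, tab, end, tab, label' line per label,
with times in seconds. The function name and sample values are made up.

```python
def to_audacity_labels(lines, total_duration):
    """Convert 'timecode<TAB>text' lines into Audacity label lines.

    Assumes timecodes are start times in seconds and that each segment
    runs until the next one starts; the last one runs to total_duration.
    """
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        timecode, text = line.split("\t", 1)
        rows.append((float(timecode), text.strip()))

    labels = []
    for i, (start, text) in enumerate(rows):
        end = rows[i + 1][0] if i + 1 < len(rows) else total_duration
        labels.append(f"{start:.3f}\t{end:.3f}\t{text}")
    return labels

# Made-up example: two words in a 2.5 second recording.
example = ["0.0\talpha", "1.25\tbeta"]
for label in to_audacity_labels(example, total_duration=2.5):
    print(label)
```

Since the exported files are named after the labels, putting the entry ID in
the text column instead of the headword should give you files named by ID,
which also sidesteps special characters in file names.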

It is magic to watch Audacity plough through the file creating new mp3
files!

I hope this helps,

Nick

On 27 April 2013 10:48, Jan Ullrich <jfu at lakhota.org> wrote:

> Dear Colleagues,
>
> I would like to ask your advice regarding an audio component of dictionary
> entries.
>
> We are hoping to eventually record around 30,000 word entries of our
> Lakota dictionary. The dictionary is currently in a Toolbox database,
> although we also have an online MySQL version. We also have an additional
> Toolbox field in each entry where we store a unique ID number.
>
> We do have a plan for a semi-automated procedure, but I am wondering if
> there is a software utility or a recommended procedure for cutting the long
> audio file(s) containing a sequence of words into individual files, one per
> word, and naming them after the respective entry word or, preferably, the
> entry ID.
>
> In the second phase of the project we would also like to create the audio
> component for the 40,000 example sentences and collocations. These
> currently do not have ID numbers, so I think we will have to add those.
>
> Best regards,
>
> Jan