[RNLD] Archiving SayMore files

Thu Mar 15 15:47:02 UTC 2018

Two questions:

1. If the archived version of the audio is "converted" from three tracks
down to one track, even if all the audio content is present, how is one
to restore the project from the archive back to a future user's instance of
SayMore in such a way as to have three independent tracks? The three-track
paradigm is an original part of the "created corpus". I'm not sure I
understand, from the linked post and the software's readme file, how we are
preserving the corpus so that it is usable in the same context in which it
was created.

2. John Hatton's screenshot presents a workable paradigm, as does
Nick's proposed solution, if one is working solely with list-based
elicitation data. In some languages (for example, in Mexico) silence during
connected speech is actually meaningful and contrastive. So, when working
with discourse units larger than a word, I fail to see how Nick's solution
preserves the original audio in its original form. Perhaps I am missing
something, or do not understand his methodology and what the perceived
archival object is.

- All the best,
- Hugh Paterson III

Full disclosure: I was trained by SIL to use SayMore, but gave it up
because (1) it did not enable me to work on a LAN with multiple
collaborators annotating a corpus (my work usually involves several native
and non-native speakers working together), and (2) it didn't run on OS X or
Linux. Occasionally I do pull it out and look at it, but mostly to read the
methodology files in the application's help.

On Wed, Mar 14, 2018 at 4:18 PM, John Hatton <john_hatton at sil.org> wrote:

> Note that SayMore already automatically creates a single file that
> combines all the annotations, and places it, indented, below the .eaf
> (ELAN) file. If you have just speech followed by careful speech, you get a
> wav with the original and careful taking turns. If you also record a
> translation, then you get 3 tracks, this time with the original, careful,
> and translation all taking turns, one after the other. Here's an annotated
> screenshot from Audacity of the SayMore-generated "Oral Annotations" file
> to illustrate that:
>
> I was thinking such a combined file was a good archival artifact capturing
> the whole BOLD package. Probably we need some kind of note when you export
> the IMDI telling people that your language archive is not going to want the
> contents of the folder with all the little recordings, but that all that
> information is encoded in this single file.
>
> It's exciting to hear that at least one researcher is doing BOLD. We'd
> love to hear experiences or announcements on the SayMore forum,
> https://community.software.sil.org/c/saymore.
>
> Regards,
>
> John Hatton
>
> Senior Software Engineer/Program Manager
> Language Software Development
> SIL International
>
>
> On Wed, Mar 14, 2018 at 3:08 PM, Nick Thieberger <thien at unimelb.edu.au>
> wrote:
>
>> If you have used SayMore for creating spoken annotations of a recording
>> (the BOLD method) then you may have found that it has created hundreds, or
>> thousands, of small audio files. When you come to archive this mass of
>> data, you may want to try the tool our colleagues built for us. It
>> rejoins the files and inserts silence in the master file so it is all
>> synced up, and playable as a single file. On behalf of digital language
>> archives I ask that you do not archive all of the small files created by
>> SayMore, but that you use this method to produce a good archival form of
>> the data.
>>
>> Details here:
>> http://www.paradisec.org.au/blog/2018/03/merging-saymore-audio-snippets-into-a-single-wav-file/
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 7586 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/resource-network-linguistic-diversity/attachments/20180315/76def7f1/attachment.png>