Toolbox as Transcriber

Andrew Margetts apmargetts at IPRIMUS.COM.AU
Thu Nov 11 11:41:32 UTC 2010


Dear RNLDrs,

Many people like to use Transcriber for chunking and transcription 
(perhaps mainly because of its simple interface, and because it is 
relatively easy to export the data to Toolbox).

A limitation that deters some potential users, however, is that although
Transcriber is Unicode (UTF-8) compliant, the program does not work well
with Keyman. This post is addressed mainly to this non-user group.

There are two common solutions to this limitation, but they have drawbacks:

1) Use a 'working orthography' combined with subsequent search and 
replace operations on the completed Transcriber file. This can work for 
some situations but is a bit clunky and can be undesirable when you want 
your transcribers to be able to use the real thing.

2) Use ELAN instead, since it has options for mapping different
character sets, including via Keyman. This is fine, but can introduce
new problems. Firstly, because the interface for ELAN is considerably
more complex than that for Transcriber, it is harder to teach local
transcribers to use the software (and to support them afterwards,
particularly via long-distance, sporadic letters/phone calls). Secondly,
there can still be problems setting up a project in ELAN such that you
can get the data out and into Toolbox in a useful way.
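
For readers weighing up option 1, the search-and-replace pass might look
something like the Python sketch below. The mapping is entirely
hypothetical (the character pairs are made up for illustration), and the
function is meant to be run on the transcription text only, not on the
XML tags of the Transcriber file. It replaces the longest sequences
first, in a single pass, so that one substitution cannot interfere with
another.

```python
import re

# Hypothetical working-orthography -> real-orthography pairs.
# These are illustrative only; substitute the pairs for your own language.
WORKING_TO_REAL = {
    "ng": "ŋ",
    "e'": "é",
    "N": "ɲ",
}


def convert_orthography(text, mapping):
    """Replace working-orthography sequences in one pass, longest first."""
    # Sort keys by length so that "ng" is tried before a shorter overlapping key.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)
```

Even with a script like this you still have to invent and teach the
working orthography, which is the clunky part.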


Is there a way to keep the simplicity of Transcriber and the link to 
Toolbox, but make it work for Unicode input? I believe there is a case 
for actually using Toolbox as the prime transcribing tool rather than,
say, Transcriber or ELAN. I should make clear at the outset that this
technique still relies on Transcriber to some extent (i.e. for the 
chunking, but no longer for the transcription).

This is probably not a new idea to people who have been faced with this 
problem, but the following are a few suggestions for making this approach
workable.

The key issue is to present an interface to Toolbox that is no more 
complex than Transcriber itself. Fortunately this is (in my view) 
achievable by combining a number of unrelated features of Toolbox. But 
first you have to prepare a suitable Toolbox file and corresponding 
'database type' file. These are the steps:

1) Chunk the sound file in Transcriber.

2) Convert the resulting Transcriber file to Toolbox format through my 
online converter - http://linguisticsoftwareconverters.zong.mine.nu/ 
(you see this entire post is just a plug), or equivalent. At this stage 
all the chunks - which are now Toolbox records - will be empty. If using 
the website, download also the matching Toolbox 'type' file from the 
website (you can use this in the next step).

3) Set up a simple Toolbox project which includes the file, its matching 
audio file in wav format (these two files should be in the same folder), 
and the appropriate type file (this goes in the settings folder, along 
with the project itself of course).

4) Check that you can play the sound in Toolbox (Shift + F4) then 
deliver the whole Toolbox project package to the transcriber/s. They can 
play the sound for each chunk and then type the transcription - using
Keyman or a Toolbox-internal keyboard if desired - into the 
Transcription marker.
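
To give a rough idea of what the conversion in step 2 involves, here is
a minimal Python sketch. It is not the code behind the online converter.
It assumes the usual Transcriber .trs layout (Turn elements with an
endTime attribute, containing Sync elements with a time attribute). The
marker names \ref, \sound and \tx, the 'Text' database type and the
"file start end" layout of the sound field are all assumptions made for
illustration; they must match whatever your own type file defines.

```python
import xml.etree.ElementTree as ET


def trs_to_toolbox(trs_xml, wav_name, db_type="Text"):
    """Turn each Transcriber segment into an empty Toolbox record."""
    root = ET.fromstring(trs_xml)
    # Header line naming the database type (assumed to be 'Text' here).
    lines = [f"\\_sh v3.0  400  {db_type}", ""]
    n = 0
    for turn in root.iter("Turn"):
        # Segment boundaries: each Sync time, closed off by the Turn's endTime.
        times = [float(s.get("time")) for s in turn.iter("Sync")]
        times.append(float(turn.get("endTime")))
        for start, end in zip(times, times[1:]):
            n += 1
            lines += [
                f"\\ref {n:03d}",                              # hypothetical Ref marker
                f"\\sound {wav_name} {start:.3f} {end:.3f}",   # hypothetical Play-sound marker
                "\\tx",                                        # empty Transcription marker
                "",
            ]
    return "\n".join(lines)
```

Each chunk comes out as a record with a reference number, a timecode
field and an empty transcription field, which is the state of the file
described at the end of step 2.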


From your experience with Toolbox, you might be thinking "isn't that
horribly complex?" (delivering all those files), and "isn't that 
horribly confusing and fragile?" (all those markers that can be 
misunderstood and screwed up inadvertently by inexperienced users). Here 
are the Toolbox commands and techniques that I believe deal with these 
very valid fears:

1) Hide all the fields you don't need (View menu). That can even include 
the Play-sound marker (i.e. the one with the timecode in it that enables 
Toolbox to play the chunks). Basically you only need the Ref marker and 
the Transcription marker to show at this stage (unless you want to live 
dangerously and ask for a free translation at the same time).

2) Lock down the project to an appropriate level (see 'Advanced
Consultant Features' in the Toolbox help files for details of how to do
this). For instance, Level 5 would still allow the user to minimise,
maximise or restore windows (which means you could perhaps set up a
number of texts to be transcribed in the one Toolbox project), but not 
to close them (which means you might get all of them done without too 
much drama). This system of locking is not completely foolproof but it 
does help to make a more user-friendly skin for inexperienced users who 
only need to do one specific task (transcribing in this case, but of 
course the technique can be used for editing/reviewing, translating and 
so on).

3) Wrap the whole project with a copy of Toolbox in the settings folder 
(using the technique described under 'Portability' in the help files). 
Optionally also define an 'internal keyboard' (again see the help file). 
Distribute by, e.g. copying to a USB stick or CD/DVD, or zipping and 
emailing. The point here is to be able to supply the complete package as 
one lump together with a simple instruction to "navigate to folder X, 
click on file Y.prj" (or a shortcut to Y.prj - although this can be 
problematic because of change of drive-letter issues).
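
If you distribute by zipping, the bundling in step 3 can be scripted. The
sketch below is only an illustration: the function name and the file
names passed to it are hypothetical, and it does nothing Toolbox-specific.
It simply checks that the text file and its wav file sit in the same
folder (as the project setup requires) and then zips the whole project
folder into one lump.

```python
import shutil
from pathlib import Path


def package_project(project_dir, text_name, wav_name):
    """Check the text/audio pair sit together, then zip the whole folder."""
    project = Path(project_dir).resolve()
    texts = list(project.rglob(text_name))
    if not texts:
        raise FileNotFoundError(f"{text_name} not found under {project}")
    if not (texts[0].parent / wav_name).is_file():
        raise FileNotFoundError(f"{wav_name} must sit beside {text_name}")
    # Writes <project folder name>.zip next to the project folder itself.
    archive = shutil.make_archive(
        str(project), "zip", root_dir=project.parent, base_dir=project.name
    )
    return Path(archive)
```

For example, package_project("MyProject", "text1.txt", "text1.wav") would
produce MyProject.zip alongside the MyProject folder, ready to copy or
email.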


My hunch is that a Toolbox project delivered in such a way need be no 
more intimidating than (though admittedly different to) a Transcriber 
file, and easier to handle than an ELAN file. The only thing missing is 
video - but maybe that is a good thing in the interests of keeping 
things simple.

The result from this process is of course just a regular Toolbox text 
file - completely ready for further processing, e.g. interlinearising in 
Toolbox or importing into ELAN.

Andrew Margetts



More information about the Resource-network-linguistic-diversity mailing list