Toolbox as Transcriber
Andrew Margetts
apmargetts at IPRIMUS.COM.AU
Thu Nov 11 11:41:32 UTC 2010
Dear RNLDrs,
Many people like to use Transcriber for chunking and transcription
(perhaps mainly because of its simple interface, and because it is
relatively easy to export the data to Toolbox).
A limitation that deters some potential users, however, is that although
Transcriber is Unicode (UTF-8) compliant, the program does not work well
with Keyman. This post is addressed mainly to this non-user group.
There are two common solutions to this limitation, but they have drawbacks:
1) Use a 'working orthography' combined with subsequent search and
replace operations on the completed Transcriber file. This can work for
some situations but is a bit clunky and can be undesirable when you want
your transcribers to be able to use the real thing.
2) Use ELAN instead, since it has options for mapping different
character sets, including via Keyman. This is fine, but can introduce
new problems. Firstly, because the interface for ELAN is considerably
more complex than that for Transcriber, it is harder to teach local
transcribers to use the software (and to support them afterwards,
particularly over long distances via sporadic letters/phone calls).
Secondly, there can still be problems setting up a project in ELAN such
that you can get the data out and into Toolbox in a useful way.
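For anyone who does go with option 1, the search and replace step can be
sketched roughly as below. This is only an illustration: the
working-orthography pairs in the table are invented, and the function
name is my own. Longer keys are tried first, and the text is scanned in
a single pass so that already-replaced characters are not touched again.

```python
import re

# Hypothetical working-orthography table: ASCII stand-ins typed in
# Transcriber, mapped to the real characters. Invented for illustration.
WORKING_TO_REAL = {
    "ng": "ŋ",
    "e'": "ɛ",
    "o'": "ɔ",
}

def to_real_orthography(text, table=WORKING_TO_REAL):
    """Replace working-orthography sequences with the real characters."""
    # Try longer keys first so a longer sequence is never split by a
    # shorter key that happens to be its prefix.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # One pass over the text: replaced output is never re-scanned.
    return pattern.sub(lambda m: table[m.group(0)], text)

print(to_real_orthography("nge' o'"))
```

Note that a table like this cannot tell a digraph from a genuine
sequence of two letters (a real n followed by g, say), which is probably
part of what makes the approach feel clunky.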
Is there a way to keep the simplicity of Transcriber and the link to
Toolbox, but make it work for Unicode input? I believe there is a case
for actually using Toolbox as the prime transcribing tool rather than,
say, Transcriber or ELAN. I should make clear at the outset that this
technique still relies on Transcriber to some extent (i.e. for the
chunking, but no longer for the transcription).
This is probably not a new idea to people who have been faced with this
problem, but here are a few suggestions for making this approach
workable.
The key issue is to present an interface to Toolbox that is no more
complex than Transcriber itself. Fortunately this is (in my view)
achievable by combining a number of unrelated features of Toolbox. But
first you have to prepare a suitable Toolbox file and corresponding
'database type' file. These are the steps:
1) Chunk the sound file in Transcriber.
2) Convert the resulting Transcriber file to Toolbox format through my
online converter - http://linguisticsoftwareconverters.zong.mine.nu/
(you see this entire post is just a plug), or equivalent. At this stage
all the chunks - which are now Toolbox records - will be empty. If using
the website, download also the matching Toolbox 'type' file from the
website (you can use this in the next step).
3) Set up a simple Toolbox project which includes the file, its matching
audio file in wav format (these two files should be in the same folder),
and the appropriate type file (this goes in the settings folder, along
with the project itself of course).
4) Check that you can play the sound in Toolbox (Shift + F4), then
deliver the whole Toolbox project package to the transcriber/s. They can
play the sound for each chunk and then type the transcription - using
Keyman or a Toolbox-internal keyboard if desired - into the
Transcription marker.
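For those who want an "equivalent" for step 2, here is a minimal sketch
of the idea in Python. It is not the code behind the online converter.
It assumes the usual Transcriber .trs layout (Turn elements with
startTime/endTime attributes, containing Sync elements with a time
attribute). The marker names (\ref, \sound, \tx, \ft), the layout of the
sound field (file name, start, end), and the "Text" type name in the
header line are assumptions on my part; they would have to match
whatever your own type file defines.

```python
import xml.etree.ElementTree as ET

def trs_to_toolbox(trs_xml, wav_name, ref_prefix="text"):
    """Return Toolbox text with one empty record per Transcriber chunk.

    Marker names and header line are assumptions; adjust to your type file.
    """
    root = ET.fromstring(trs_xml)
    # Assumed header line; "Text" must be the name of the database type.
    lines = ["\\_sh v3.0  400  Text", ""]
    n = 0
    for turn in root.iter("Turn"):
        # Each <Sync time="..."/> starts a chunk; the last chunk in a
        # turn runs up to the turn's endTime.
        starts = [float(s.get("time")) for s in turn.iter("Sync")]
        ends = starts[1:] + [float(turn.get("endTime"))]
        for start, end in zip(starts, ends):
            n += 1
            lines += [
                f"\\ref {ref_prefix}.{n:03d}",
                f"\\sound {wav_name} {start:.3f} {end:.3f}",
                "\\tx",   # transcription, left empty for the transcriber
                "\\ft",   # free translation, left empty
                "",
            ]
    return "\n".join(lines)
```

For a real file, passing the raw bytes (e.g. the result of
open("story.trs", "rb").read(), where story.trs is a made-up name) lets
the XML parser deal with the encoding declaration itself.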
From your experience with Toolbox, you might be thinking "isn't that
horribly complex?" (delivering all those files), and "isn't that
horribly confusing and fragile?" (all those markers that can be
misunderstood and screwed up inadvertently by inexperienced users). Here
are the Toolbox commands and techniques that I believe deal with these
very valid fears:
1) Hide all the fields you don't need (View menu). That can even include
the Play-sound marker (i.e. the one with the timecode in it that enables
Toolbox to play the chunks). Basically you only need the Ref marker and
the Transcription marker to show at this stage (unless you want to live
dangerously and ask for a free translation at the same time).
2) Lock down the project to an appropriate level (see 'Advanced
Consultant Features' in the Toolbox help files for details of how to do
this). For instance, Level 5 would still allow the user to minimise,
maximise or restore windows (which means you could perhaps set up a
number of texts to be transcribed in the one Toolbox project), but not
to close them (which means you might get all of them done without too
much drama). This system of locking is not completely foolproof but it
does help to make a more user-friendly skin for inexperienced users who
only need to do one specific task (transcribing in this case, but of
course the technique can be used for editing/reviewing, translating and
so on).
3) Wrap the whole project with a copy of Toolbox in the settings folder
(using the technique described under 'Portability' in the help files).
Optionally also define an 'internal keyboard' (again see the help file).
Distribute by, e.g., copying to a USB stick or CD/DVD, or zipping and
emailing. The point here is to be able to supply the complete package as
one lump together with a simple instruction to "navigate to folder X,
click on file Y.prj" (or a shortcut to Y.prj - although this can be
problematic because the drive letter may differ on another machine).
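If you go the zipping route in point 3, the packaging can be scripted.
The following is a rough sketch only: the function name and the checks
are my own, and it assumes the whole project (settings folder, Toolbox
copy, text and wav files) already sits under one top-level folder. It
archives the folder itself rather than just its contents, so that it
unpacks as a single "folder X".

```python
import shutil
from pathlib import Path

def package_project(project_dir, out_stem):
    """Zip a complete Toolbox project folder; return the archive path.

    out_stem is the archive path without ".zip" and should lie outside
    project_dir so the archive does not try to include itself.
    """
    project_dir = Path(project_dir).resolve()
    # Refuse to package something that is obviously incomplete.
    if not any(project_dir.rglob("*.prj")):
        raise FileNotFoundError(f"no .prj file under {project_dir}")
    if not any(project_dir.rglob("*.wav")):
        raise FileNotFoundError(f"no .wav file under {project_dir}")
    # root_dir/base_dir make every path in the archive start with the
    # project folder's own name.
    return shutil.make_archive(
        str(out_stem),
        "zip",
        root_dir=str(project_dir.parent),
        base_dir=project_dir.name,
    )
```

Because everything inside the archive is stored relative to the project
folder, the result does not depend on any particular drive letter.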
My hunch is that a Toolbox project delivered in such a way need be no
more intimidating than (though admittedly different to) a Transcriber
file, and easier to handle than an ELAN file. The only thing missing is
video - but maybe that is a good thing in the interests of keeping
things simple.
The result from this process is of course just a regular Toolbox text
file - completely ready for further processing, e.g. interlinearising in
Toolbox or importing into ELAN.
Andrew Margetts