More on Transcriber and Unicode input
Bruce Cox
bruce_cox at SIL.ORG
Sat Nov 20 15:51:03 UTC 2010
There is another option for keyboarding special characters, based on the
AutoHotkey program, that I succeeded in getting to work with Transcriber.
An IPA keyboard built on this system can be found at
http://scripts.sil.org/UniIPAKeyboard. In compatibility mode, it
effectively pastes the Unicode character into the target window. If you
don't like the bindings you can (in principle) write your own script.
Having said that, I don't use it myself. For some reason, the only script
I could get to work in Transcriber was an alternative one for a different
keyboard, which I wrote on the basis of the one I downloaded several
months ago. But that did work, which is at least a proof of concept.
Cheers,
bruce
On 20/11/2010 7:15 AM, Andrew Margetts wrote:
> This is a follow up to my earlier post, 'Toolbox as Transcriber'.
> Several people responded with the suggestion that an alternative
> strategy might be to use Microsoft Keyboard Layout Creator (MSKLC)
> with Transcriber to facilitate the direct input of Unicode special
> characters (using UTF-8). MSKLC is freely available from
> http://msdn.microsoft.com/en-us/goglobal/bb964665.aspx
>
> This sounded like a great idea (albeit one that rather negated my
> own), so I had a look at it. Unfortunately, as far as I can see, MSKLC
> doesn't work in Transcriber, (at least with version 1.5.1 on Windows
> XP Professional). If anybody knows otherwise I would be very
> interested to hear.
>
> The good news is that MSKLC is easy to set-up and use, and does work
> well in (among others):
> Toolbox
> ELAN
> Notepad
>
> Regarding what IS possible in Transcriber:
> 1) you can do search-and-replace operations within Transcriber, but it
> seems you must paste special characters to the dialog box from an
> external editor that can handle the required input. Therefore it is
> really simpler to just do all such editing in a text editor, after
> completing the Transcriber file. Notepad can be used for this task -
> i.e. it can handle UTF-8. (Avoid WordPad and Word, which just introduce
> problems; Notepad is reliable because it is purely a text editor).
> 2) you can also paste text strings which include special characters
> directly into Transcriber units.
>
> In any case, it is crucial to explicitly set the encoding in
> Transcriber to UTF-8 thus:
> 'Options > General > Encoding > Unicode(UTF-8)'
>
> The result is that the top line in the .trs file will read:
> <?xml version="1.0" encoding="UTF-8"?>
> rather than
> <?xml version="1.0" encoding="ISO-8859-1"?>
> which is the default.
>
> This technique, however, does not always work well on existing
> Transcriber files (you have to at least make a change to the file so
> that you can save it); but of course you can instead just make the
> substitution in a text editor rather than using the Transcriber commands.
>
> For good measure I suggest also doing in Transcriber:
> 'Options > Save configuration'
> to keep UTF-8 as the default encoding for new Transcriber files.
>
> Failure to do this may result in Transcriber discarding all your
> Unicode characters on save, close and re-open - which is really very
> annoying. If you are having this problem check that the top line is
> correct!
>
> To summarise this as a work-flow, in case you do wish to use
> search-and-replace techniques with MSKLC I suggest:
> 1) define and load your custom MSKLC keyboard - it will show up as one
> of the options in the 'Language bar' (usually present in the Windows
> Taskbar at the bottom of the screen - if it is not there you will have
> to enable it via 'Control Panel > Regional and Language Options >
> Languages > Details > Settings > Language Bar > Show the language bar
> on the desktop').
> 2) set Transcriber to encode as UTF-8, but then use a working
> orthography in Transcriber.
> 3) open each finished file in Notepad (or other text editor) and
> transform to the real orthography with search-and-replace, using the
> default keyboard for the 'search' term and switching to your custom
> keyboard (using the Language bar) for the 'replace' term.
> 4) (Optionally reopen the file in Transcriber to see what it should
> have looked like all along).
>
> Incidentally, the on-line Transcriber to Toolbox converter can handle
> (and display) UTF-8 so you should have no problem using it to convert
> such a Transcriber file to an accurate Toolbox representation. As
> mentioned above, both Toolbox and ELAN support MSKLC keyboards.
>
> If you subsequently import a Toolbox file that uses UTF-8 special
> characters into ELAN you must use 'File > Import > Toolbox File...',
> rather than 'File > Import > Shoebox File...', and you must tick the
> box 'All markers are Unicode'. Similarly, you should use 'File >
> Export As > Toolbox File(UTF-8)...'
>
> I hope these notes save someone some pain.
>
> Andrew Margetts
>
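
The check on the top line of the .trs file and the search-and-replace step
described in the quoted message could also be scripted rather than done by
hand in Notepad. What follows is only a minimal Python sketch, not something
either poster used. The function names, the sample mapping (N and S) and the
sample data are hypothetical. It assumes the working-orthography symbols
never occur inside the XML markup of the file, because a plain replacement
would alter tag and attribute names as well.

```python
import re

# Matches the encoding named in an XML declaration such as
# <?xml version="1.0" encoding="ISO-8859-1"?>
XML_DECL = re.compile(rb'^<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']')


def declared_encoding(data: bytes):
    """Return the encoding named on the top line, or None if absent."""
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        data = data[3:]
    match = XML_DECL.match(data)
    return match.group(1).decode("ascii") if match else None


def convert_to_utf8(data: bytes) -> bytes:
    """Decode with the declared encoding, relabel as UTF-8, re-encode."""
    # Assumption: a file with no declaration is already UTF-8.
    text = data.decode(declared_encoding(data) or "UTF-8")
    text = re.sub(
        r'(<\?xml[^>]*?encoding=)(["\'])[A-Za-z0-9._-]+\2',
        r"\g<1>\g<2>UTF-8\g<2>",
        text,
        count=1,
    )
    return text.encode("utf-8")


def to_real_orthography(text: str, mapping: dict) -> str:
    """Replace working-orthography sequences in a single pass.

    Longer sequences are tried first, and replaced text is never
    scanned again, so one substitution cannot feed into another.
    """
    if not mapping:
        return text
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


if __name__ == "__main__":
    # Hypothetical sample: a Latin-1 file typed in a working orthography.
    raw = b'<?xml version="1.0" encoding="ISO-8859-1"?>\nbaNga Sa'
    print(declared_encoding(raw))
    utf8_text = convert_to_utf8(raw).decode("utf-8")
    print(to_real_orthography(utf8_text, {"N": "\u014b", "S": "\u0283"}))
```

If declared_encoding() reports anything other than UTF-8 for a file that
should contain special characters, that is the situation the quoted message
warns about. Working on a copy of the file is advisable.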