[RNLD] ELAN tiers and types

Wed Sep 12 10:57:40 UTC 2012

Hi Aidan,

The short answer is that Nick Thieberger recently posted a collection of templates, produced by Andrea Berez, to this list:

http://www.rnld.org/software

or

http://paradisec.org.au/elansampletemplates.zip

You could use these and modify them.

Alternatively, I create tiers in the following way:

First, here is an etf template file that I use to start a new transcription:

https://www.sugarsync.com/pf/D138718_8778877_6506503

To adapt it to your needs, open it in a text editor, search for all instances of "Tom" and replace them with the name of the person entering the data (e.g. you). Replace "XXX" with the name of the language you are transcribing, and "YYY" with the language of your translation. Replace "English" with the language you are making utterance-level comments in, if it isn't English. Save the file.

Now replace all instances of "Unknown" with the name of the person you are transcribing. Save a copy of the file in a templates folder somewhere and rename it with the speaker's name. Repeat the process for each speaker you are transcribing. In this way I amass a template file for each speaker that I have transcribed, so when I want to start a new transcription I simply import all the relevant template files along with my audio file and, hey presto, away I go.
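
If you have many speakers, the search-and-replace steps above could be scripted. What follows is only a minimal Python sketch, assuming the placeholder strings in the template are exactly "Tom", "XXX", "YYY", "English" and "Unknown" as described; the function names and any file or speaker names you pass in are hypothetical. A plain string replacement like this will also hit any other occurrence of those strings in the file, so check the result in a text editor.

```python
import re


def personalise_template(text, annotator, language, translation_language,
                         speaker, comment_language="English"):
    """Return template text with the placeholder strings filled in.

    All placeholders are replaced in a single pass, so a replacement value
    (e.g. a translation language of "English") is never replaced again.
    """
    replacements = {
        "Tom": annotator,               # person entering the data
        "XXX": language,                # language being transcribed
        "YYY": translation_language,    # language of the translation
        "English": comment_language,    # language of utterance-level comments
        "Unknown": speaker,             # person being transcribed
    }
    pattern = re.compile("|".join(re.escape(key) for key in replacements))
    return pattern.sub(lambda match: replacements[match.group(0)], text)


def write_speaker_template(template_path, output_path, **names):
    """Read the base .etf file and save a personalised copy for one speaker."""
    with open(template_path, encoding="utf-8") as source:
        text = source.read()
    with open(output_path, "w", encoding="utf-8") as target:
        target.write(personalise_template(text, **names))
```

Calling write_speaker_template once per speaker, with a different output file name each time, would give you the same folder of per-speaker templates as doing it by hand.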

This template provides per-speaker tiers for transcription, translation and comments. The transcription is time-aligned, while the translation and comments are nested underneath it. There is also a single tier for time-aligned comments (i.e. general comments that don't align with speakers' utterances).

I use tier names that are compatible with exporting to Toolbox. "tx" is for transcriptions, "ft" is for translations, "cm" is for comments. The @ + participant name bit at the end of each tier name is a trick to aid exporting to Toolbox. If you're not exporting to Toolbox, then rename these as you like (e.g. make them more verbose), but remember that you can't have separate tiers for each speaker that have the same name, so it's best to use a similar strategy of including the speaker's name in the tier name.
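
To make the naming scheme concrete, here is a small Python sketch that builds the "prefix@speaker" names and checks a list of tier names for clashes. It is only an illustration: the function names are hypothetical, and the prefixes are simply the three mentioned above.

```python
from collections import Counter


def tier_names(speaker, prefixes=("tx", "ft", "cm")):
    """Build Toolbox-friendly tier names such as 'tx@Mary' for one speaker."""
    return [f"{prefix}@{speaker}" for prefix in prefixes]


def duplicate_names(names):
    """Return the tier names that occur more than once, sorted."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

Because the speaker's name is part of every tier name, combining the lists for two different speakers gives no duplicates, whereas bare names like "tx" would clash as soon as there is a second speaker.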

Naming and establishing tiers and types consistently across a corpus becomes really important as the corpus grows. It allows you to search across the corpus and to narrow your searches to just one tier, or type, using the extremely powerful "Structured Search Multiple EAF". Say, for instance, you wanted to limit your search to a particular speaker: you would search for values within that speaker's tier across multiple files. Or say you wanted to search the translations for all speakers: you would then search within the type "YYY translation" (where YYY is the language you entered). Always establishing tiers and types from templates really helps to ensure the consistency that makes these kinds of searches possible.
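
To show why consistent names matter, here is a rough Python sketch of the same idea done outside ELAN. It is not ELAN's own search, just an illustration that assumes the usual EAF layout as I understand it: TIER elements carrying TIER_ID and LINGUISTIC_TYPE_REF attributes, with the text of each annotation in an ANNOTATION_VALUE element. The function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def search_tree(root, pattern, tier_id=None, linguistic_type=None):
    """Return (tier name, annotation text) pairs whose text matches pattern.

    The search can be narrowed to one tier name, one linguistic type, or both.
    """
    regex = re.compile(pattern)
    hits = []
    for tier in root.iter("TIER"):
        if tier_id is not None and tier.get("TIER_ID") != tier_id:
            continue
        if (linguistic_type is not None
                and tier.get("LINGUISTIC_TYPE_REF") != linguistic_type):
            continue
        for value in tier.iter("ANNOTATION_VALUE"):
            text = value.text or ""  # empty annotations have no text node
            if regex.search(text):
                hits.append((tier.get("TIER_ID"), text))
    return hits


def search_files(paths, pattern, **filters):
    """Run search_tree over several .eaf files, keeping track of the file."""
    results = []
    for path in paths:
        root = ET.parse(path).getroot()
        for tier_name, text in search_tree(root, pattern, **filters):
            results.append((str(path), tier_name, text))
    return results
```

If the type is called "English translation" in every file, a single linguistic_type filter finds the translations for all speakers; if the names drift from file to file, neither this sketch nor ELAN's search can be narrowed that easily.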

For your particular needs I would create two more tiers using the same per-speaker naming strategy, and create separate types for those tiers as well, both using the stereotype "symbolic association" (this is really important for the nesting). Then save the file as a template, and create all the individual versions for each speaker.
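
Since you mentioned editing the XML directly, here is a hedged Python sketch of what adding one nested tier and its type to a template might look like. It assumes the EAF/ETF element and attribute names as I understand them (TIER with TIER_ID, PARENT_REF and LINGUISTIC_TYPE_REF; LINGUISTIC_TYPE with CONSTRAINTS="Symbolic_Association"), and that the file already declares that constraint, as files written by ELAN normally do. The function names and any prefixes or type names you pass in (e.g. "mg" with a type "morpheme gloss") are hypothetical. Work on a copy and open the result in ELAN to check it.

```python
import xml.etree.ElementTree as ET


def insert_after_last(root, tag, element):
    """Insert element after the last child of root with this tag, else append."""
    positions = [i for i, child in enumerate(root) if child.tag == tag]
    if positions:
        root.insert(positions[-1] + 1, element)
    else:
        root.append(element)


def add_dependent_tier(root, prefix, speaker, parent_prefix, type_id):
    """Add a tier nested under '<parent_prefix>@<speaker>' with its own type."""
    existing = {lt.get("LINGUISTIC_TYPE_ID") for lt in root.iter("LINGUISTIC_TYPE")}
    if type_id not in existing:
        ling_type = ET.Element("LINGUISTIC_TYPE", {
            "LINGUISTIC_TYPE_ID": type_id,
            "CONSTRAINTS": "Symbolic_Association",  # the stereotype
            "TIME_ALIGNABLE": "false",              # inherits the parent's times
            "GRAPHIC_REFERENCES": "false",
        })
        insert_after_last(root, "LINGUISTIC_TYPE", ling_type)
    tier = ET.Element("TIER", {
        "TIER_ID": f"{prefix}@{speaker}",
        "PARENT_REF": f"{parent_prefix}@{speaker}",  # this is the nesting
        "LINGUISTIC_TYPE_REF": type_id,
        "PARTICIPANT": speaker,
    })
    insert_after_last(root, "TIER", tier)
    return tier
```

The new elements are placed next to the existing TIER and LINGUISTIC_TYPE elements rather than at the end of the file, on the assumption that the format expects them grouped in that order.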

I translate into both Tok Pisin and English, so I have separate tiers with separate types for both of those. I also have a "tidied text" tier which is less representative of the original but, for instance, replaces accidental code-switching (at my consultants' request) and tidies up in other ways. I have a close phonetic transcription tier for when I transcribe word lists, again with a separate type. I only mention these extra tiers because it really helps to add them from the start, as going back and adding them afterwards is a pain.

Hope this helps,

Cheers,
Tom

On 12/09/2012, at 6:06 PM, Aidan Wilson <aidan.wilson at unimelb.edu.au> wrote:

> Hi all,
> 
> I'm having immense trouble with a collection of transcripts I'm building at the moment. I have never been successful in creating hierarchies of tiers. Ideally, I want to have a hierarchy like this:
> 
> -[initials]
> 	-transcription
> 	-morpheme gloss
> 	-free translation
> 	-action
> 
> for each participant. In reading through the manuals and so forth, I've created linguistic types like 'group' (for the outermost parent tier), transcription, morpheme gloss and so on, but what's happening when I create my tiers is that when I try to select a parent tier, the 'linguistic type' pull-down menu disappears and the 'add' button goes grey. It only allows me to create tiers (or change existing tiers) if I don't plan on nesting them, it seems.
> 
> Has someone got a link to instructions, or even better, a working template that I can reverse-engineer?
> 
> What I'm hoping to do is go through my transcripts with a text editor and manually nest them by editing the xml (unless I can figure out how to retrospectively nest them), but I need to know the structure of the eaf file.
> 
> This is bringing me to tears, as it were.
> 
> -- 
> Aidan Wilson
> 
> Dept of Linguistics and Applied Linguistics
> The University of Melbourne
> 
> +61428 458 969
> aidan.wilson at unimelb.edu.au
