Batch conversion of CHAT files to Praat TextGrids

Brian MacWhinney macw at cmu.edu
Sun Aug 24 20:48:59 UTC 2008


Dear Fredrik,
     Rather than worrying about getting a batch conversion to work, I
think it would be best to focus first on whether you can get the
conversion to work on the problematic file without running batch.
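
     As a first step in looking at the single problematic file, it may
help to see exactly where the control character from your log occurs.
The following is only a minimal sketch in Python, not part of CLAN. It
assumes that the "15" in the chat2praat message is the hexadecimal byte
0x15 (the ^U you saw), and the function name nak_report is made up for
illustration. It lists each line that contains the byte and how many
times it appears there.

```python
# Sketch only: locate 0x15 bytes in a CHAT file before trying to strip them.
# Assumption: the "15" reported by chat2praat is hexadecimal (byte 0x15, ^U).

NAK = b"\x15"


def nak_report(data: bytes):
    """Return a list of (line_number, count) for lines containing 0x15."""
    report = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        count = line.count(NAK)
        if count:
            report.append((number, count))
    return report


def nak_report_for_file(path: str):
    """Read the file as raw bytes so no encoding guess alters the data."""
    with open(path, "rb") as handle:
        return nak_report(handle.read())
```

Calling nak_report_for_file("liean11a.1.cha") would return pairs such as
(16, 2) for a line carrying the character twice. Note also that the perl
substitution in your message has no /g flag, so it removes only the
first such character on each line; a line like your line 16, which has
one at each end, would keep the second.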
     Also, may I suggest that you would have a much easier time of all
of this if you would grab your copy of the ESF files not from the MPI
server, but from the TalkBank server.  I have recently updated all of
the TalkBank materials and they all pass CHECK and conform to the
TalkBank XML standard.  This is decidedly not true for the copies of
the files on the MPI server.
     Unfortunately, not all of the files on the MPI server have
corresponding audio.  In particular, the audio is missing for all of
the Swedish and most of the German.  Whether someone at the MPI will
ever make an attempt to correct this is unclear.  I have written to
them repeatedly on the matter, but received no reply, so I am assuming
this is all that is available.
    Good luck with this.

--Brian MacWhinney

On Aug 22, 2008, at 11:57 AM, Fredrik Karlsson wrote:

>
> Hi,
>
> I want to convert all the .cha files I downloaded from the ESF
> Corpus made public through the MPI archive system. I have tried CLAN's
> chat2praat program, but it fails. This is the log:
>
>> dir *.cha
> cha liean11b.1.cha liean12e.1.cha liean13g.1.cha
> liean14a.1.cha liean14a.2.cha liean16a.1.cha liean16k.1.cha
> liean17l.1.cha liean18a.1.cha liean18c.1.cha liean22a.1.cha
> liean22e.1.cha liean22g.1.cha liean23c.1.cha liean24a.1.cha
> liean24i.1.cha liean25a.1.cha liean25k.1.cha liean27l.1.cha
> liean31a.1.cha liean31d.1.cha liean32a.1.cha liean32e.1.cha
> liean32g.1.cha liean32h.1.cha liean34q.1.cha liean35j.1.cha
> test.cha unixfile.cha
>
> 30 files, 0 directories
>
>> chat2praat +e.wav liean11a.1.cha
> chat2praat +e.wav liean11a.1.cha
> Thu Aug 21 18:01:19 2008
> chat2praat (28-Jul-2008) is conducting analyses on:
> ALL speaker tiers
> and those speakers' ALL dependent tiers
> and ALL header tiers
> ****************************************
> From file <liean11a.1.cha> to file <liean11a.1.textGrid>
>
>
> *** File "liean11a.1.cha": line 16.
> Illegal speaker character found: 15.
>
> CURRENT OUTPUT FILE "liean11a.1.textGrid" IS INCOMPLETE.
>
> Line 16 of the file is:
>
> ^U%snd: "liean11a.wav" 9383 13613^U
>
> where ^U seems to be the control character \x15 (U+0015). Removing
> the character with perl
>
>> perl -nae 's/\x15//;print'  liean11a.1.cha > test_liean11a.1.cha
>
> makes the file unreadable for CLAN:
>
> .....
> *** File "test_liean11a.1.cha": line 1481. <- Lots of messages like
> this one
> It is illegal to have Header Tiers '@' inside the transcript
>
>
>    NO BULLETS FOUND IN THE FILE
>
>
> Done with file <test_liean11a.1.textGrid>
>
> So, what do I do? The chat2elan seems to behave the same way.
>
> I would, of course, appreciate any help I could get.
>
> /Fredrik

