Batch conversion of CHAT files to Praat TextGrids
Brian MacWhinney
macw at cmu.edu
Tue Aug 26 20:55:23 UTC 2008
Dear Fredrik,
If you run CHAT2PRAAT on the ESL files, even those in TalkBank, you
will get a report of a large number of missing bullets. These are
missing because they were not inserted systematically by the people
creating the CHAT corpora at the MPI in Nijmegen. For example, in the
snippet you give below, there are missing bullets on 3 of the 5 lines.
Successful conversion to Praat requires that each utterance have a
bullet. Adding bullets to existing transcripts is not a terribly
difficult matter, if you use the F5 function. I have found that I can
do this in about the time required to play the transcript. So, I would
recommend that you do this before exporting to Praat. If you do this,
the errors will surely go away.
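
To see which utterances still need bullets before running CHAT2PRAAT, a
rough Python sketch along these lines might help. It is not part of CLAN,
and the function name utterances_without_bullets is hypothetical. It
assumes that bullets appear in the text form shown in the snippet below
(%snd:"name"_start_end), that main-tier lines start with "*", and that
the bullet sits on the same line as the speaker code, so utterances
wrapped onto continuation lines are not handled.

```python
import re

# Assumed bullet shape, copied from the snippet quoted in this message:
#   %snd:"liean11a"_149686_150197   (media name, start ms, end ms)
BULLET = re.compile(r'%snd:"[^"]+"_(\d+)_(\d+)')


def utterances_without_bullets(chat_text):
    """Return (line_number, line) pairs for main-tier lines lacking a bullet."""
    missing = []
    for number, line in enumerate(chat_text.splitlines(), start=1):
        # Main-tier (speaker) lines start with '*'; '@' headers and
        # '%' dependent tiers are skipped.
        if line.startswith("*") and not BULLET.search(line):
            missing.append((number, line))
    return missing


# The five lines from the snippet quoted in this message.
sample = '''*SAN: # mhm. %snd:"liean11a"_149686_150197
*INM: and in the kitchen the sound will bounce # on the [>] table .
*SAN: mm mm [<].
*INM: and everything so I think this is [>] this is very good.
*SAN: mhm mhm [<]. %snd:"liean11a"_150520_155392'''

for number, line in utterances_without_bullets(sample):
    print(number, line)
```

On the five-line snippet this reports the second, third, and fourth
lines, which are the three utterances without bullets.
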
In the best of all possible worlds, I would find time to do this
myself. However, I typically have many other projects that require my
attention, so this type of clean up work tends to move to the back
burner. If you manage to get the bullets inserted, please send me a
copy of whatever you do and I will update the database.
Sorry about the hassle. If the MPI-Nijmegen people would ever reply to
my messages about their corpus, I would be more willing to devote the
resources needed to get everything linked in CHAT. It is a very
important and unique corpus, but I need their help to allow it to
reach its full potential.
--Brian MacWhinney
On Aug 26, 2008, at 6:55 AM, Fredrik Karlsson wrote:
>
> Hi,
>
> Thank you so much for replying. Using your suggestions did take me
> closer to the goal, although I have not been entirely successful.
> Indeed, the TalkBank version of the .cha files seems to be better
> formed, but it seems that there may still be a problem, either
> with the files or with the chat2praat program.
>
> Using the EngIta/An/liean11a.cha, I have this portion in the
> original file:
>
> *SAN: # mhm. %snd:"liean11a"_149686_150197
> *INM: and in the kitchen the sound will bounce # on the [>] table .
> *SAN: mm mm [<].
> *INM: and everything so I think this is [>] this is very good.
> *SAN: mhm mhm [<]. %snd:"liean11a"_150520_155392
>
> which becomes this in the TextGrid file:
>
>
> intervals [55]
> xmin = 149.686
> xmax = 150.197
> text = "# mhm."
> intervals [56]
> xmin = 150.197
> xmax = -0.002 <---- Strange...
> text = "mm mm [<]."
> intervals [57]
> xmin = -0.002
> xmax = 150.520
> text = ""
> intervals [58]
> xmin = 150.520
> xmax = 155.392
> text = "mhm mhm [<]."
>
> Obviously, Praat does not like an interval ending 150.199 s before it
> starts, so the file is not usable. An xmin < 0 is also something you
> would not expect. I don't know enough about CHAT files to be sure
> about this, but it seems that the program is doing something wrong
> here?
>
> /Fredrik
>
> --
> "Life is like a trumpet - if you don't put anything into it, you don't
> get anything out of it."
>