CA coding in CHAT
Brian MacWhinney
macw at cmu.edu
Mon Apr 17 02:24:22 UTC 2006
Dear Info-Chibolts,
Over the last few months, I have been working with Johannes Wagner to
produce increased compatibility between CA (Conversation Analysis)
transcription and CHAT. This work has involved extending CHAT to
allow for a variety of CA features such as arrows for marking up and
down jumps in pitch, changes in tempo, changes in volume, and so on.
Each of these marks is entered by pressing the F1 function key together
with another character, which produces a specific Unicode character. A
summary of the relevant codes is at http://talkbank.org/ca/codes.html.
CA marks overlaps in a radically different way from CHAT, and we now
support this by using special Unicode characters for the beginnings
and ends of overlaps.
These changes have brought CHAT and CA much closer, but it is
still not possible to achieve a complete unification because of the
ambiguity of some CA codes and conflicts in the meanings of some
basic symbols such as parentheses.
However, to bridge this gap further, Leonid has written a program
that reformats this compromise version of CA into a purer CA version
for display. This program is called CHAT2CA.
The workflow we envision is that CA transcribers would transcribe in
CA/CHAT, where they can run CHECK to verify accuracy. In this mode
they can also use the F5 function to link transcripts to media. Once
this CHAT-based work is done, researchers can run the INDENT and
CHAT2CA programs to produce output that looks like good CA. This
output has the file extension .ca instead of .cha. The .ca output can
be edited, but the edits cannot be converted back to CHAT, so it is
best to keep the .cha file as the master copy.
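
For anyone managing a folder of transcripts outside CLAN, here is a
minimal Python sketch of the master-copy convention described above.
It is not part of CLAN. The helper names ca_output_path and
orphan_ca_files are hypothetical, and the sketch assumes only what is
stated here: that each .ca display file sits next to a .cha file with
the same base name.

```python
from pathlib import Path


def ca_output_path(cha_path):
    """Return the .ca display-file path that corresponds to a .cha master.

    Hypothetical helper: assumes the .ca file shares the base name and
    folder of the .cha file.
    """
    p = Path(cha_path)
    if p.suffix.lower() != ".cha":
        raise ValueError(f"expected a .cha file, got {p.name}")
    return p.with_suffix(".ca")


def orphan_ca_files(directory):
    """List .ca files in a folder that have no .cha master beside them.

    Because .ca edits cannot be converted back to CHAT, a .ca file
    without its .cha master cannot be regenerated or re-checked.
    """
    folder = Path(directory)
    return sorted(
        ca for ca in folder.glob("*.ca")
        if not ca.with_suffix(".cha").exists()
    )
```

Running orphan_ca_files on a project folder gives a quick list of
display files whose master copy has gone missing.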
I would like to encourage transcribers to make full use of CA-type
codes in CHAT. As long as your files pass CHECK, all of these codes
are in full conformity with the CHILDES requirements for inclusion in
the database and for MOR analysis.
--Brian MacWhinney