CA coding in CHAT

Mon Apr 17 02:24:22 UTC 2006

Dear Info-Chibolts,

     Over the last few months, I have been working with Johannes Wagner
to improve compatibility between CA (Conversation Analysis)
transcription and CHAT.  This work has involved extending CHAT to
allow for a variety of CA features, such as arrows marking upward and
downward jumps in pitch, changes in tempo, changes in volume, and so on.
Each of these marks is entered by using the F1 function key together
with another character, which produces the corresponding Unicode
symbol.  A summary of the relevant codes is at
http://talkbank.org/ca/codes.html.
     CA marks overlaps in a radically different way from CHAT, and we
now support this by using special Unicode characters for the
beginnings and ends of overlaps.
     These changes have brought CHAT and CA much closer, but a
complete unification is still not possible, because some CA codes are
ambiguous and some basic symbols, such as parentheses, have
conflicting meanings in the two systems.  To bridge this gap further,
Leonid has written a program called CHAT2CA that reformats this
compromise version of CA into a purer CA version for display.
     The workflow we envision is that CA transcribers transcribe in
CA/CHAT, where they can run CHECK to verify accuracy.  In this mode
they can also use the F5 function to link transcripts to media.  Once
this CHATish work is done, researchers can run the INDENT and CHAT2CA
programs to produce output that looks like good CA.  This output has
the file extension .ca instead of .cha.  The .ca output can be edited,
but the edits cannot be converted back to CHAT, so it is best to use
the .cha file as the master copy.
     I would like to encourage transcribers to make full use of
CA-type codes in CHAT.  As long as a file passes CHECK, all of these
codes fully conform to the CHILDES requirements for inclusion in the
database and for MOR analysis.

--Brian MacWhinney