chat to text conversion (accents in Spanish)

Brian MacWhinney macw at andrew.cmu.edu
Thu Oct 28 18:06:54 UTC 2021


Dear Elnaz,
     To display non-ASCII characters such as é, along with other diacritics, CLAN relies on the Arial Unicode font, which supports not only special European characters but also Chinese, Sinhalese, and other scripts, because it covers a very wide range of Unicode.  If you convert a CHAT file that displays correctly in Unicode to .txt format, you will keep seeing what you are seeing now unless the editor you use for the .txt file lets you load a Unicode font.
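
     If it helps to check the converted file outside CLAN, here is a minimal Python sketch. It rests on assumptions that are not confirmed anywhere in this thread: that the .txt output is UTF-8 encoded, and that the editor is reading those bytes in a legacy encoding such as Windows-1252. The sample line, the file names, and the helper names read_as_utf8 and add_utf8_bom are made up for illustration and are not part of CLAN.

```python
import tempfile
from pathlib import Path


def read_as_utf8(path):
    """Return the file's text decoded as UTF-8, or None if the bytes are not valid UTF-8."""
    try:
        return Path(path).read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return None


def add_utf8_bom(src, dst):
    """Copy src to dst with a UTF-8 byte-order mark, which some editors use to detect the encoding."""
    text = Path(src).read_text(encoding="utf-8-sig")  # also tolerates an existing BOM
    Path(dst).write_text(text, encoding="utf-8-sig")


# Hypothetical sample line, not taken from a real transcript.
sample = "*STU:\tporque él es muy bueno ."
utf8_bytes = sample.encode("utf-8")

# What an editor might show if it decodes the UTF-8 bytes as Windows-1252.
garbled = utf8_bytes.decode("cp1252")

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample.txt"
    dst = Path(tmp) / "sample_bom.txt"
    src.write_bytes(utf8_bytes)

    round_trip = read_as_utf8(src)  # the accents survive when decoded as UTF-8
    add_utf8_bom(src, dst)
    has_bom = dst.read_bytes().startswith(b"\xef\xbb\xbf")

print(garbled)
print(round_trip)
print(has_bom)
```

     If read_as_utf8 returns the text with the accents intact, the file itself is probably fine and the problem is in how the editor displays it. In that case, choosing UTF-8 in the editor's open dialog, or opening the copy that carries the byte-order mark, may be enough.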
 
— Brian MacWhinney
Teresa Heinz Professor of Cognitive Psychology, 
Computational Linguistics, 
and Modern Languages, CMU

> On Oct 28, 2021, at 1:27 PM, Elnaz Kia <ek325 at nau.edu> wrote:
> 
> Hi Everyone,
> 
> I have a question about accents in Spanish. So, here is what I have on a .cha transcript file:
> 
> *STU: yo [:: _] me gusta ver(lo) él [:: _] porque él es muy bien
> [:: bueno] en el xxx eh porque es el goleador .
> 
> And when I convert the file to text, this happens:
> *STU:	yo [:: _] me gusta ver(lo) él [:: _] porque él es muy bien 
> 	[:: bueno] en el xxx eh porque es el goleador .
> 
> And this is how I convert .cha files to .txt files:
> 
> chstring +re +cbullets.cut *.cha
> ren -f +re *.chstr.cex *.txt 
> 
> My question is, how can I avoid this problem?
> 
> Thanks,
> Elnaz
> 
> 
> 
> 
> Elnaz Kia, Ph.D. (she, her, hers)
> Post-Doctoral Research Associate <https://l2trec.utah.edu/about/staff-directory.php>
> Second Language Teaching and Research Center (L2TReC) <https://l2trec.utah.edu/>
> University of Utah
> LinkedIn <https://www.linkedin.com/in/elnaz-kia?lipi=urn%3Ali%3Apage%3Ad_flagship3_profile_view_base_contact_details%3B%2FResGVF2SMGMAxhocdPnRw%3D%3D>
> Personal Website <https://elnazkia.weebly.com/>
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com <mailto:chibolts+unsubscribe at googlegroups.com>.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/CAOwOJYkD%3DarGj%2B7sDYxHzyZikHs_Jd3wHJmCALbCfJCWe69T1A%40mail.gmail.com <https://groups.google.com/d/msgid/chibolts/CAOwOJYkD%3DarGj%2B7sDYxHzyZikHs_Jd3wHJmCALbCfJCWe69T1A%40mail.gmail.com?utm_medium=email&utm_source=footer>.
