Chinese spaces

Brian MacWhinney macw at cmu.edu
Thu Jul 6 21:30:03 UTC 2017


Dear ChiBolts,
  Hang Jiang from Emory University has written a Python script that inserts spaces between words in Mandarin Chinese text in CHAT files.  It uses a word segmentation library called jieba.  This is not a CLAN program, but it works well on CHAT files, although only for Mandarin, not Cantonese, Hakka, etc.  So, if you have transcribers who are unsure of their ability to add spaces to Mandarin, they could transcribe without spaces, and we could then run this program to add the spaces that the MOR program and the rest of CLAN need.  Many thanks to Hang for writing this program.
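
  Hang's script is not reproduced in this message.  As a rough illustration of the general idea only, a minimal sketch might look like the following.  It assumes that only main tiers (lines starting with "*") should be changed, that the speaker code and the utterance are separated by a tab, and that only runs of Han characters need segmenting.  The function names (jieba_segment, space_han, space_chat_line) are made up for this example; the only jieba call used is jieba.lcut.  Continuation lines (lines that begin with a tab) are not handled.

```python
import re
from typing import Callable, List

# Runs of CJK Unified Ideographs. Everything else (CHAT codes,
# punctuation, Latin text) is left exactly as transcribed.
HAN_RUN = re.compile(r"[\u4e00-\u9fff]+")


def jieba_segment(text: str) -> List[str]:
    """Segment Mandarin text with jieba (third-party: pip install jieba)."""
    import jieba  # imported lazily so the rest of the sketch runs without it
    return jieba.lcut(text)


def space_han(text: str, segment: Callable[[str], List[str]]) -> str:
    """Replace each run of Han characters with its space-separated words."""
    return HAN_RUN.sub(lambda m: " ".join(segment(m.group(0))), text)


def space_chat_line(line: str,
                    segment: Callable[[str], List[str]] = jieba_segment) -> str:
    """Add spaces to one CHAT main tier line.

    Assumption: headers (@...) and dependent tiers (%...) are returned
    unchanged, and a main tier looks like '*CHI:<tab>utterance'.
    """
    if not line.startswith("*"):
        return line
    speaker, tab, utterance = line.partition("\t")
    if not tab:
        return line  # no tab found, so this is not a well-formed main tier
    return speaker + tab + space_han(utterance, segment)


def space_chat_file(in_path: str, out_path: str) -> None:
    """Write a spaced copy of a CHAT file; the input file is not modified."""
    with open(in_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(space_chat_line(line.rstrip("\n")) + "\n")
```

  With jieba installed, something like space_chat_file("input.cha", "output.cha") would write a spaced copy to a new file (both file names are placeholders).  Taking the segmenter as a parameter is a design choice of this sketch: it lets the line handling be checked without jieba, and it would let a different segmenter be swapped in for other varieties of Chinese.  The segmentation should still be checked by someone who reads Mandarin before running MOR.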
  
  --Brian MacWhinney
